
DataWorks: Permission management for data in compute engines

Last Updated: Jul 16, 2024

Before the members of a workspace can use the data sources in that workspace, they must be granted the permissions required to access the data in the corresponding compute engines. This topic describes how to manage permissions on data in compute engines in DataWorks.

Prerequisites

Permissions required to access data in different types of data sources

The following table describes the permissions that are required to access data in different types of compute engines and the methods that can be used to grant the permissions to the members in a workspace.

Data source type | Permission description | References

MaxCompute

Built-in workspace-level role

The built-in workspace-level roles of DataWorks are mapped to the roles of a MaxCompute compute engine. If you assign a built-in workspace-level role to a RAM user, the RAM user is automatically granted the permissions of the mapped role of the MaxCompute compute engine in the development environment.

  • MaxCompute compute engine in the development environment:

    By default, built-in workspace-level roles have specific permissions on a MaxCompute compute engine in the development environment. Users that are assigned built-in workspace-level roles can access MaxCompute tables in the development environment.

  • MaxCompute compute engine in the production environment:

    Built-in workspace-level roles do not have permissions on a MaxCompute compute engine in the production environment. To access MaxCompute tables in the production environment, you must request the permissions in Security Center. For more information about Security Center, see Overview.

Custom workspace-level role

If you create a custom workspace-level role and map the role to a role of a MaxCompute compute engine, the custom workspace-level role has the permissions of the mapped role of the MaxCompute compute engine.

EMR cluster

You can configure mappings between the members in a workspace and the accounts of the EMR cluster that is registered to DataWorks. This way, the members in the workspace are granted the permissions of the accounts of the EMR cluster.

Cloudera's Distribution Including Apache Hadoop (CDH) or Cloudera Data Platform (CDP) cluster

When you register a CDH or CDP cluster to DataWorks, you can configure mappings between the members in your workspace and Linux or Kerberos accounts of the CDH or CDP cluster. This way, the members in the workspace are granted the permissions on the CDH or CDP cluster.

References: Register a CDH or CDP cluster to DataWorks

Hologres

You can grant the members in a workspace permissions on a Hologres compute engine by using the authorization policies that Hologres supports. To grant members permissions on a Hologres data source that is added to a workspace, follow the authorization instructions in the Hologres documentation.

References: Permission management overview

Other types of data sources

The permissions on the data sources are determined by the scheduling access identities that are specified for different environments when you add the data sources to a workspace.

Note
  • When you add a data source other than the preceding types of data sources to a workspace, you must specify scheduling access identities of the data source in the development environment and production environment. For example, you must specify the username and password for database access in each environment when you add an AnalyticDB for PostgreSQL data source to a workspace.

  • Users that are assigned built-in or custom workspace-level roles use the specified scheduling access identity to run tasks on a data source. Permissions on a compute engine other than a MaxCompute compute engine are not directly granted to workspace-level roles. The permissions are determined based on the scheduling access identity that you specify when you add the data source to your workspace.
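
The rules in the preceding table can be summarized as a small lookup. The following Python sketch is for illustration only: `Env`, `permission_source`, and the returned strings are hypothetical names that restate the table, not part of any DataWorks API or SDK.

```python
from enum import Enum


class Env(Enum):
    DEV = "development"
    PROD = "production"


def permission_source(source_type: str, env: Env, custom_role: bool = False) -> str:
    """Describe where a workspace member's data permissions come from.

    Hypothetical helper for illustration only; it is not a DataWorks API.
    """
    kind = source_type.strip().lower()
    if kind == "maxcompute":
        if custom_role:
            # The table does not say whether this differs by environment,
            # so this sketch does not distinguish dev from prod here.
            return "MaxCompute role mapped to the custom workspace-level role"
        if env is Env.DEV:
            return "MaxCompute role mapped to the built-in workspace-level role"
        return "permissions requested in Security Center"
    if kind == "emr":
        return "mapped EMR cluster account"
    if kind in ("cdh", "cdp"):
        return "mapped Linux or Kerberos account"
    if kind == "hologres":
        return "policies granted in Hologres"
    # Any other data source: the scheduling access identity set per environment.
    return f"scheduling access identity for the {env.value} environment"


if __name__ == "__main__":
    for source in ("MaxCompute", "EMR", "Hologres", "AnalyticDB for PostgreSQL"):
        for env in Env:
            print(f"{source:28} {env.value:12} {permission_source(source, env)}")
```

For example, `permission_source("MaxCompute", Env.PROD)` returns the Security Center case, while any data source type that the table does not list falls through to the scheduling access identity of the given environment.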
