DataWorks: Create and manage workspaces

Last Updated: Nov 18, 2024

You can create, delete, and disable workspaces in the DataWorks console. On the Workspace page in SettingCenter, you can manage and configure the properties of a specific workspace and add data sources, such as MaxCompute projects and E-MapReduce (EMR) clusters, to a workspace for data development. This topic describes the basic operations that you can perform on a workspace.

Entry points for operations

The following table describes the operations that you can perform on a workspace and the entry points for the operations.

Operation

Description

Entry point

Create a workspace

A workspace is the basic unit in which you can manage tasks and members, assign roles, and grant permissions. All tasks are developed in specific workspaces.

DataWorks console

Delete or disable a workspace

If you no longer need to use a workspace, you can delete or disable the workspace.

  • If you delete a workspace, the code in the workspace is also deleted.

  • If you disable a workspace, the code in the workspace is retained but the workspace becomes unavailable.

Add a data source to a workspace

DataWorks allows you to add various types of data sources, such as MaxCompute, EMR, and Realtime Compute for Apache Flink, to a workspace and synchronize data between different data sources. In addition, DataWorks allows you to run computing tasks on MaxCompute, Hologres, AnalyticDB for PostgreSQL, AnalyticDB for MySQL, and ClickHouse data sources in DataStudio to manage the data stored in the data sources.

SettingCenter

View and modify the configurations of a workspace

After you create a workspace, you can view and modify the configurations of the workspace.

Add members, assign roles, and view permissions

During data development, you need to add RAM users to a workspace as members and assign roles such as Workspace Administrator, Develop, O&M, and Visitor to the members to implement collaborative data development.

Members who are assigned different roles have different permissions on DataWorks services. If built-in workspace-level roles cannot meet your business requirements, you can create custom roles.

Limits

  • Only an Alibaba Cloud account and RAM users to which the AliyunDataWorksFullAccess policy is attached can perform operations in the DataWorks console.

  • Only users who are assigned the Workspace Administrator role can perform operations on the Workspace page in SettingCenter.
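The two limits above amount to a simple access rule. The following Python sketch models that rule locally. The `User` class and both helper functions are hypothetical names introduced for illustration and are not part of any DataWorks SDK. Only the policy name AliyunDataWorksFullAccess and the Workspace Administrator role name come from this topic.

```python
from dataclasses import dataclass, field

FULL_ACCESS_POLICY = "AliyunDataWorksFullAccess"
ADMIN_ROLE = "Workspace Administrator"


@dataclass
class User:
    """Hypothetical local model of an account; not a DataWorks API object."""
    name: str
    is_alibaba_cloud_account: bool = False
    ram_policies: set[str] = field(default_factory=set)
    # Maps a workspace name to the roles the user holds in that workspace.
    workspace_roles: dict[str, set[str]] = field(default_factory=dict)


def can_use_console(user: User) -> bool:
    """Alibaba Cloud accounts, or RAM users with the full-access policy attached."""
    return user.is_alibaba_cloud_account or FULL_ACCESS_POLICY in user.ram_policies


def can_use_setting_center(user: User, workspace: str) -> bool:
    """Only users with the Workspace Administrator role in that workspace."""
    return ADMIN_ROLE in user.workspace_roles.get(workspace, set())
```

For example, a RAM user that has the policy attached but holds only the Develop role in a workspace passes the first check and fails the second.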

Create a workspace

A workspace is the basic unit in which you can manage tasks and members, assign roles, and grant permissions. All tasks are developed in specific workspaces. Before you develop tasks, you must create a workspace.

Prerequisites

  • DataWorks is activated. For more information, see Activate DataWorks.

  • The account that you want to use to create a workspace is prepared.

    • If you want to create a workspace by using an Alibaba Cloud account, prepare an Alibaba Cloud account. For more information, see Prepare an Alibaba Cloud account.

    • If you want to create a workspace as a RAM user, prepare a RAM user. For more information, see Prepare a RAM user.

    • If you want to create a workspace as a RAM user, grant the CreateWorkspace permission to the RAM user. For more information about how to grant permissions to a RAM user, see Grant permissions to a RAM user.

Preparations before you create a workspace

Before you create a workspace, plan the workspace configurations and select a suitable workspace mode. The following table describes how to make the preparations.

Operation

Description

References

Plan a workspace

A workspace is the largest business unit supported by DataWorks. Before you create a workspace, you must understand how workspaces work and plan a workspace for your business scenario.

Plan a workspace

Select a workspace mode

DataWorks allows you to create a workspace that is in basic mode or standard mode.

  • Basic mode: A workspace in basic mode provides only the production environment and uses data sources in the production environment. Permissions on data cannot be isolated, and the development environment and production environment cannot be isolated.

  • Standard mode: A workspace in standard mode provides the development environment and production environment and uses data sources in the development environment and production environment. Permissions on data can be isolated, and the development environment and production environment can be isolated.

Note

We recommend that you develop a task in a workspace in standard mode.

Differences between workspaces in basic mode and workspaces in standard mode
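The difference between the two modes can be summarized as a small lookup. The following Python sketch is illustrative only: the `WorkspaceMode` enum and the function names are assumptions made for this example, not DataWorks identifiers.

```python
from enum import Enum


class WorkspaceMode(Enum):
    """Hypothetical names for the two modes described in this topic."""
    BASIC = "basic"
    STANDARD = "standard"


def environments(mode: WorkspaceMode) -> tuple[str, ...]:
    """Return the environments that a workspace in the given mode provides."""
    if mode is WorkspaceMode.STANDARD:
        return ("development", "production")
    return ("production",)


def isolates_environments(mode: WorkspaceMode) -> bool:
    """Isolation is possible only when a separate development environment exists."""
    return len(environments(mode)) > 1


for mode in WorkspaceMode:
    print(mode.value, environments(mode), isolates_environments(mode))
```

The sketch makes the trade-off explicit: a workspace in basic mode has nowhere to test a task except production, which is why standard mode is recommended for task development.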

Procedure

  1. Select a region.

    1. Log on to the DataWorks console.

    2. Select a region in the top navigation bar.

      Workspaces are created based on regions. You must select a region based on the region where your business data is used. Then, you can create workspaces in the selected region.

      Note
      • Check whether the current region is the desired region. After you create a workspace, you cannot change the region in which the workspace resides.

      • If the region in which your tasks reside observes daylight saving time, we recommend that you read the Scenario: Impacts exerted by the switching of daylight saving time on the running of tasks topic so that daylight saving time switches do not affect the running of tasks in your workspace.

      • The time zone of the region that you select is automatically used as the time zone for scheduling. This means that the scheduling time that you specify for a task is interpreted in that time zone.

      • DataWorks allows you to change the time zone for scheduling in workspaces that reside in specific regions. For more information, see Scenario: Change the time zone for scheduling.

  2. Create a workspace.

    1. In the left-side navigation pane of the DataWorks console, click Workspace.

    2. On the Workspaces page, click Create Workspace.

      On the page that appears, configure the parameters as instructed.

      The following table describes the parameters.

      Parameter

      Description

      Workspace Name

      The name of the workspace. The name uniquely identifies the workspace and cannot be changed after the workspace is created.

      Display Name

      The display name of the workspace. We recommend that you specify a name based on your business attributes.

      Isolate Development and Production Environments

      The mode of the workspace.

      • If you want to isolate production and development environments, select Yes. In this case, the workspace that you create is in standard mode.

      • If you do not want to isolate production and development environments, select No. In this case, the workspace that you create is in basic mode.

      For more information about the modes of workspaces, see Differences between workspaces in basic mode and workspaces in standard mode. You can specify the mode of the workspace based on your business requirements.

      Workspace Administrator

      The administrator of the workspace.

      By default, the current logon account is used as the administrator of the workspace. You can specify a member in the workspace as an administrator to help manage the workspace. For more information about how to add a RAM user to a workspace as a member, see Add a RAM user to a workspace as a member and assign roles to the member.

      Alibaba Cloud Resource Group

      Select a resource group created in Alibaba Cloud Resource Management. By default, Default Resource Group is selected.

      If you purchase various Alibaba Cloud resources, you can create resource groups in Resource Management, specify an administrator for each resource group, and manage the resources by group.

      Important

      The selected resource group is created in the Resource Group service provided by Resource Management. The Resource Group service allows you to sort resources owned by your Alibaba Cloud account. This simplifies resource and permission management within your Alibaba Cloud account. The resource group is different from the resource group that is used to run tasks in DataWorks.

      Schedule PAI Nodes

      If you want the system to schedule Platform for AI (PAI) tasks on a regular basis, turn on this switch. If you do not turn on this switch when you create a workspace, you can go to SettingCenter and turn on this switch on the Basic Settings tab of the Workspace page after you create the workspace.

      Note

      This switch cannot be turned off after it is turned on. Therefore, turn on this switch based on your business requirements. For more information about PAI tasks, see What is PAI?

      Description

      The description of the workspace. The description can help you identify the workspace. You can specify the purpose of the workspace in the description.
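The constraints in the parameter table can be captured in a small local model. The following Python sketch is a planning aid under stated assumptions, not a DataWorks API call: the `WorkspaceSettings` class and its field names are hypothetical, and the sample values are invented. It encodes three rules from the table: Workspace Name cannot be changed after creation, the isolation choice determines the mode, and the Schedule PAI Nodes switch cannot be turned off once it is on.

```python
from dataclasses import dataclass

DEFAULT_RESOURCE_GROUP = "Default Resource Group"


@dataclass
class WorkspaceSettings:
    """Hypothetical local model of the Create Workspace form."""
    workspace_name: str
    display_name: str
    isolate_dev_and_prod: bool
    administrator: str
    resource_group: str = DEFAULT_RESOURCE_GROUP
    schedule_pai_nodes: bool = False
    description: str = ""

    def __setattr__(self, name, value):
        # Workspace Name is fixed once it has been set for the first time.
        if name == "workspace_name" and name in self.__dict__:
            raise AttributeError("Workspace Name cannot be changed after creation")
        # Schedule PAI Nodes can be turned on later but never turned off.
        if name == "schedule_pai_nodes" and self.__dict__.get(name) and not value:
            raise ValueError("Schedule PAI Nodes cannot be turned off once it is on")
        super().__setattr__(name, value)

    @property
    def mode(self) -> str:
        """Yes means standard mode, No means basic mode."""
        return "standard" if self.isolate_dev_and_prod else "basic"


settings = WorkspaceSettings(
    workspace_name="sales_dw",        # invented sample value
    display_name="Sales warehouse",   # invented sample value
    isolate_dev_and_prod=True,
    administrator="admin_user",       # invented sample value
)
print(settings.mode, settings.resource_group)
```

Writing the settings down this way before you open the console can help you catch decisions that cannot be reversed later, such as the workspace name and the PAI switch.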

Manage workspaces

You can go to SettingCenter and perform the following operations on a specific workspace on the Workspace page.

View and modify basic information about the workspace

  • In the Basic Properties section of the Basic Settings tab, you can view and modify basic information about the workspace.

    Parameter

    Description

    Workspace ID and Workspace Name

    The unique identifier and the name of the workspace. You cannot change these values after the workspace is created.

    Status

    The status of the workspace. Valid values: Normal, Deleted, Initializing, Initialization Failed, Manual Disable, Deleting, Deletion Failed, Suspended (Overdue), Updating, and Update Failed.

    Note
    • If a workspace fails to be created, the workspace enters the Initialization Failed state. In this case, you can recreate the workspace.

    • A workspace administrator can disable a workspace that is in the Normal state. After the workspace is disabled, none of the features in the workspace can be used, but data in the workspace is retained. Instances that are already generated and scheduled to run on the current day still run at their scheduled time. Instances are not automatically scheduled on the next day, and you cannot access the workspace to view information about the instances.

    • A workspace administrator can click Enable in the Actions column of a disabled workspace on the Workspaces page to restore the workspace to the Normal state.

    Display Name

    The display name of the workspace. You can use an account that is assigned the Workspace Administrator role to modify the display name.

    Mode

    The mode of the workspace. Valid values: Basic Mode and Standard Mode.

    Note
    • The configurations of a DataWorks workspace vary based on the mode of the DataWorks workspace. You must configure the parameters for the production and development environments of a DataWorks workspace that is in standard mode.

    • You can upgrade a DataWorks workspace from basic mode to standard mode in the DataWorks console. Only an Alibaba Cloud account can perform the upgrade. For more information, see Scenario: Upgrade a workspace from the basic mode to the standard mode.

    Owner

    The owner of the workspace. You cannot change the value of this parameter after a workspace is created. The owner of a workspace has the permissions to delete and disable the workspace.

  • In the Security Settings section of the Basic Settings tab, you can configure security settings for the workspace. The following table describes the parameters.

    Parameter

    Description

    Download SELECT Result

    Specifies whether the query results that are returned by SELECT statements in DataStudio can be downloaded. If you turn off this switch, the query results cannot be downloaded.

    Note

    Only a workspace administrator has the permissions to turn on or off this switch for a workspace.

    Change Node Owner By RAM User

    Specifies whether RAM users can change the owners of their own nodes.

    Sandbox Whitelist (The Whitelist Contains IP Addresses Or Domain Names That Can Be Accessed By Shell Tasks.)

    The IP addresses or domain names that can be accessed by a Shell task that runs on the shared resource group.

    Note

    You must specify public IP addresses or domain names that are accessible. For internal services, we recommend that you use exclusive resource groups to ensure network accessibility. For more information, see Exclusive resource group mode.
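The Status row in the Basic Properties table lists ten states but describes only two administrator actions: disabling a workspace that is in the Normal state and enabling a disabled one. The following Python sketch models just those two transitions. The enum and function names are hypothetical, and transitions between the other states are deliberately left out because this topic does not describe them.

```python
from enum import Enum


class WorkspaceStatus(Enum):
    """The status values listed in this topic."""
    NORMAL = "Normal"
    DELETED = "Deleted"
    INITIALIZING = "Initializing"
    INITIALIZATION_FAILED = "Initialization Failed"
    MANUAL_DISABLE = "Manual Disable"
    DELETING = "Deleting"
    DELETION_FAILED = "Deletion Failed"
    SUSPENDED_OVERDUE = "Suspended (Overdue)"
    UPDATING = "Updating"
    UPDATE_FAILED = "Update Failed"


# Only the administrator actions that this topic describes.
ADMIN_ACTIONS = {
    ("disable", WorkspaceStatus.NORMAL): WorkspaceStatus.MANUAL_DISABLE,
    ("enable", WorkspaceStatus.MANUAL_DISABLE): WorkspaceStatus.NORMAL,
}


def apply_action(status: WorkspaceStatus, action: str) -> WorkspaceStatus:
    """Return the resulting status, or raise if the action is not described."""
    try:
        return ADMIN_ACTIONS[(action, status)]
    except KeyError:
        raise ValueError(
            f"'{action}' is not described for status '{status.value}'"
        ) from None


print(apply_action(WorkspaceStatus.NORMAL, "disable").value)
```

A workspace in the Initialization Failed state has no entry in the table above because the documented remedy is to recreate the workspace rather than to change its status.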

Manage workspace members and roles

On the Workspace Members tab, you can add RAM users to the current workspace as members, remove members from the current workspace, and assign roles to members. On the Workspace Roles tab, you can view and manage roles in the current workspace.

  1. Add a workspace member

    You can add a RAM user to the current workspace as a member and assign workspace-level roles to the member. This way, the member has all permissions of the workspace-level roles. For more information, see Add a RAM user to a workspace as a member and assign roles to the member.

    Note

    You can assign workspace-level custom roles or built-in roles to RAM users. Workspace-level custom roles can be created only by the workspace administrator on the Workspace Roles tab. Users who are assigned different roles have different permissions on workspace-level services. For more information, see Manage permissions on workspace-level services.

  2. Manage member roles

    You can view built-in or custom roles in the current workspace. If built-in roles cannot meet your business requirements, you can create custom roles. You can allow a custom role to have permissions on specific workspace-level services. You can also configure permission mappings between custom roles and MaxCompute project roles based on your business requirements. For more information, see Manage permissions on workspace-level services.

    Note

    Only an Alibaba Cloud account or a RAM user to which the Admin or Super_Administrator role of a MaxCompute project is assigned can configure permission mappings.
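The membership rules above can be sketched as a small in-memory model. The `Workspace` class and its methods are hypothetical and exist only for illustration. The built-in role set contains only the four roles that this topic names as examples, so it is not a complete list, and the user and role names in the usage lines are invented.

```python
ADMIN_ROLE = "Workspace Administrator"
# Only the built-in roles named as examples in this topic; not a complete list.
BUILT_IN_ROLES = {ADMIN_ROLE, "Develop", "O&M", "Visitor"}


class Workspace:
    """Hypothetical in-memory model of workspace members and roles."""

    def __init__(self, administrator: str) -> None:
        self.custom_roles: set[str] = set()
        self.members: dict[str, set[str]] = {administrator: {ADMIN_ROLE}}

    def _is_admin(self, user: str) -> bool:
        return ADMIN_ROLE in self.members.get(user, set())

    def create_custom_role(self, actor: str, role: str) -> None:
        # Custom roles can be created only by a workspace administrator.
        if not self._is_admin(actor):
            raise PermissionError("only a workspace administrator can create custom roles")
        self.custom_roles.add(role)

    def add_member(self, ram_user: str, roles: set[str]) -> None:
        # A member can hold built-in roles, custom roles, or both.
        unknown = roles - BUILT_IN_ROLES - self.custom_roles
        if unknown:
            raise ValueError(f"unknown roles: {sorted(unknown)}")
        self.members.setdefault(ram_user, set()).update(roles)

    def remove_member(self, ram_user: str) -> None:
        self.members.pop(ram_user, None)


ws = Workspace("admin_user")
ws.create_custom_role("admin_user", "Auditor")
ws.add_member("audit_user", {"Auditor", "Visitor"})
print(sorted(ws.members["audit_user"]))
```

The sketch does not model permission mappings to MaxCompute project roles, which depend on roles defined in the MaxCompute project itself.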

Add data sources

DataWorks allows you to add various types of data sources, such as MaxCompute, EMR, and Realtime Compute for Apache Flink, to a workspace and synchronize data between different data sources. In addition, DataWorks allows you to run computing tasks on MaxCompute, Hologres, AnalyticDB for PostgreSQL, AnalyticDB for MySQL, and ClickHouse data sources in DataStudio to manage the data stored in the data sources.

For more information about how to add a data source, see Add and manage data sources.

View permissions

On the Permissions tab, you can view the permissions of each built-in role. For more information, see Permissions of built-in workspace-level roles.

Delete or disable a workspace

On the Workspaces page in the DataWorks console, you can move the pointer over the More icon in the Actions column of a workspace and select Delete Workspace to delete the workspace or Disable Workspace to disable the workspace.

  • Delete Workspace: After you delete a workspace, you cannot recover it. We recommend that you delete a workspace only when necessary.

  • Disable Workspace:

    • After you disable a workspace, the system no longer generates instances for auto triggered tasks in the workspace. The instances that are generated before you disable the workspace are automatically scheduled at the specified time. However, you cannot access the workspace to view information about these instances.

    • After you disable a workspace, data sources that are added to the workspace still exist, and you may still be charged for the data sources that you use to store data. These fees are charged by the Alibaba Cloud services to which the data sources belong, not by DataWorks. If you have questions about billing, contact the technical support of those services.
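The scheduling behavior of a disabled workspace can be read as a filter over task instances. The following Python sketch is a simplified interpretation of the description above, not the scheduler's actual logic: the `Instance` class, the function name, and all timestamps are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Instance:
    """Hypothetical record of one instance of an auto triggered task."""
    task: str
    generated_at: datetime
    scheduled_at: datetime


def instances_that_still_run(instances, disabled_at: datetime):
    """Keep instances generated before the disable time and not yet due."""
    return [
        i for i in instances
        if i.generated_at < disabled_at and i.scheduled_at >= disabled_at
    ]


disabled_at = datetime(2024, 11, 18, 12, 0)  # invented sample time
pending = Instance("daily_report", datetime(2024, 11, 17, 23, 30), datetime(2024, 11, 18, 18, 0))
finished = Instance("early_load", datetime(2024, 11, 17, 23, 30), datetime(2024, 11, 18, 6, 0))
print([i.task for i in instances_that_still_run([pending, finished], disabled_at)])
```

In this reading, an instance for the following day never appears in the input list at all, because the system stops generating instances once the workspace is disabled.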

What to do next

You have learned how to create and manage workspaces. During data development, you also need to perform other operations. For example, you need to associate a resource group with a workspace, add RAM users to a workspace as members, and add data sources.

  • After you activate DataWorks, you must purchase a resource group to use resources in data synchronization, data scheduling, or DataService Studio. For more information, see Overview.

  • If you want to use DataWorks to synchronize data between data sources, you must add the data sources to DataWorks and configure information about the data sources. This way, when you configure a data synchronization task, you can select the database to read from and the database to write to by data source name. For more information, see Add and manage data sources.

  • If you want to collaborate with other RAM users to perform data development operations in a workspace, you can add the RAM users to the workspace as members and assign different roles to them for collaborative development. For more information, see Overview of the DataWorks permission management system.