This topic describes how to create a workspace, add a resource to the workspace, and configure Object Storage Service (OSS) buckets for storing code and user data in Data Workstation.
Step 1: Create and go to a workspace
- Log on to the DMS console V5.0.
Click the icon in the upper-left corner and choose
.
Note: If you are not using the Data Management (DMS) console in simple mode, choose
in the top navigation bar.
Click Create Workspace. In the Create Workspace dialog box, configure the Workspace Name and Region parameters and click OK.
Note: The workspace name can contain letters, digits, and underscores (_).
You can select only the Singapore region.
Click Go to Workspace in the Actions column of the workspace to go to the workspace.
Note: By default, only the workspace creator can access a workspace. For collaborative development, the workspace creator must grant development permissions to the users who need to access the workspace.
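The naming rule in Step 1 can be checked before you open the Create Workspace dialog box. The following is a minimal sketch, not part of DMS: the function name is hypothetical, and it assumes ASCII letters and a non-empty name. DMS may enforce additional limits, such as a maximum length, that are not described in this topic.

```python
import re

# Assumption: the only rule taken from this topic is the allowed character set
# (letters, digits, and underscores). ASCII-only letters and a non-empty name
# are assumptions made for this sketch.
_WORKSPACE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if name contains only letters, digits, and underscores."""
    return _WORKSPACE_NAME.fullmatch(name) is not None


print(is_valid_workspace_name("sales_analysis_2024"))  # True
print(is_valid_workspace_name("sales-analysis"))       # False (hyphen)
```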
Step 2: Add workspace members
If a workspace has multiple users, the users must be assigned different roles.
The users to whom you want to assign roles must already be added to DMS. For more information, see Manage users.
Step 3: Add a resource and configure an OSS bucket for storing code
You can use Notebook to query and analyze data only after you add and start resources.
On the tab, click Resource Configuration.
Click Add Resource and configure resource-related information.
| Parameter | Description |
| --- | --- |
| Resource Name | The name of the resource. Enter a name that is easy to understand and use. |
| Resource Introduction | The description of the resource. |
| Image | Valid values: Spark 3.5+Python 3.9, Spark 3.3+Python 3.9, and Python 3.9. |
| AnalyticDB Instance | The AnalyticDB for MySQL cluster that you want to use. Note: If you select Spark 3.3 or Spark 3.5 for the Image parameter, you must also select an AnalyticDB for MySQL cluster. If the cluster that you want to use is not found, check whether the cluster is added to DMS. For more information, see Register an Alibaba Cloud database instance. |
| AnalyticDB Resource Group | The resource group that you want to use in the AnalyticDB for MySQL cluster. |
| Executor Specifications | The resource specifications of the Spark executor. Each type corresponds to distinct specifications. For more information, see the Type column in Spark resource specifications. |
| Executors | The number of executors in the Spark configurations. |
| Driver Specifications | The resource specifications of the Spark driver. Valid values: General_XSmall_v1 (2 CPU cores, 8 GB memory), General_Small_v1 (4 CPU cores, 16 GB memory), General_Medium_v1 (8 CPU cores, 32 GB memory), and General_Large_v1 (16 CPU cores, 64 GB memory). |
| Notebook Specifications | This parameter appears if you select Python 3.9 for the Image parameter. Valid values: General_XSmall_v1 (2 CPU cores, 8 GB memory), General_Small_v1 (4 CPU cores, 16 GB memory), General_Medium_v1 (8 CPU cores, 32 GB memory), and General_Large_v1 (16 CPU cores, 64 GB memory). |
| VPC ID | The virtual private cloud (VPC) in which the resource resides. |
| Zone ID | The zone of the VPC. |
| VSwitch ID | The vSwitch in the VPC. |
| Security Group ID | The ID of the security group. |
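The dependencies between the Image parameter and the other parameters can be written down as a small pre-flight check. The following is a sketch under stated assumptions: the dictionary keys (image, analyticdb_instance, notebook_spec) and the function name are hypothetical and do not correspond to a DMS API. The sketch encodes only the two rules stated in the table: Spark images require an AnalyticDB for MySQL cluster, and Notebook Specifications applies only to the Python 3.9 image.

```python
from typing import Dict, List, Optional

# Image names as listed for the Image parameter.
SPARK_IMAGES = {"Spark 3.5+Python 3.9", "Spark 3.3+Python 3.9"}
PYTHON_IMAGE = "Python 3.9"


def check_resource_config(config: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of problems found in a resource configuration.

    The keys used here are hypothetical names chosen for this sketch.
    """
    problems: List[str] = []
    image = config.get("image")

    if image in SPARK_IMAGES:
        # Spark images must be paired with an AnalyticDB for MySQL cluster.
        if not config.get("analyticdb_instance"):
            problems.append("Spark images require an AnalyticDB for MySQL cluster.")
        # Notebook Specifications is shown only for the Python 3.9 image.
        if config.get("notebook_spec"):
            problems.append("Notebook Specifications applies only to the Python 3.9 image.")
    elif image != PYTHON_IMAGE:
        problems.append(f"Unknown image: {image!r}")

    return problems


# A Spark image without a cluster yields one problem.
print(check_resource_config({"image": "Spark 3.5+Python 3.9"}))
```

Returning a list of problems instead of raising on the first one lets you review every issue in a configuration at once.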
Click Save.
Start the resource.
Find the resource that you want to start, click Start in the Actions column, and then click OK. You are navigated to the Storage Configuration page.
On the Storage Configuration page, click the icon to the right of Code Storage Space.
In the Select OSS Directory dialog box, select the bucket that you want to use for the Bucket parameter.
The selected bucket must be in the same region as the workspace, and the storage class of the bucket must be Standard. If no buckets are available in the specified region, create one in the OSS console. For more information, see Create a bucket.
Click OK.
On the Resource Configuration page, manually start the resource again.
Note: It usually takes approximately 1 minute to start a resource.
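The two requirements for the code storage bucket in Step 3 (same region as the workspace, Standard storage class) can be expressed as a simple check. The following is an illustrative sketch: the BucketInfo class and the can_store_code function are hypothetical, and the region and storage class strings in the example are assumed values that you would replace with the values reported for your own bucket.

```python
from dataclasses import dataclass


@dataclass
class BucketInfo:
    """Hypothetical container for the bucket properties that matter here."""
    name: str
    region: str
    storage_class: str


def can_store_code(bucket: BucketInfo, workspace_region: str) -> bool:
    """Return True if the bucket meets both requirements described in Step 3."""
    same_region = bucket.region == workspace_region
    is_standard = bucket.storage_class == "Standard"
    return same_region and is_standard


# Example values are assumptions made for illustration only.
workspace_region = "ap-southeast-1"
ok_bucket = BucketInfo("example-code-bucket", "ap-southeast-1", "Standard")
ia_bucket = BucketInfo("example-archive-bucket", "ap-southeast-1", "IA")

print(can_store_code(ok_bucket, workspace_region))  # True
print(can_store_code(ia_bucket, workspace_region))  # False (not Standard)
```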
Step 4: Configure an OSS bucket for storing user data
When you use Data Workstation features, you may need to read data that is not stored in DMS Notebook. For this purpose, DMS allows you to specify multiple OSS buckets from which to read data.
After you go to a workspace, click Storage Management on the tab.
In the User Storage Space area, configure an OSS path.
Note: A mount path must start with /mnt/.
Click the icon to save the OSS path.
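The mount path rule in Step 4 can be verified with a one-line check before you save the OSS path. The following is a minimal sketch with a hypothetical function name. The only rule taken from this topic is the /mnt/ prefix. Requiring at least one character after the prefix is an assumption made for this sketch.

```python
MOUNT_PREFIX = "/mnt/"


def is_valid_mount_path(path: str) -> bool:
    """Return True if path starts with /mnt/ and names something below it.

    The non-empty remainder is an assumption; this topic states only that a
    mount path must start with /mnt/.
    """
    return path.startswith(MOUNT_PREFIX) and len(path) > len(MOUNT_PREFIX)


print(is_valid_mount_path("/mnt/user_data"))  # True
print(is_valid_mount_path("/data/user"))      # False (wrong prefix)
```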
Step 5: View data
After you go to a workspace, click the tab.
Perform the following operations in SQL Console:
Query data
You can use Copilot to generate SQL statements or directly enter SQL statements. The SQL syntax must be the same as that for a logical data warehouse.
Note: You can use the MySQL syntax to query tables that are located in different databases, such as AnalyticDB for MySQL and ApsaraDB RDS for MySQL. DMS automatically converts and optimizes your SQL statements.
When you use Copilot to generate SQL statements, Copilot can automatically obtain business knowledge based on your feedback and the metadata of databases, tables, and columns. If the obtained knowledge is inaccurate, you can edit the knowledge to improve its reference values. This way, Copilot can offer higher accuracy when it answers similar questions later. For more information, see Use Copilot to generate SQL statements.
If the SQL statements generated by Copilot meet your business requirements, you can like them. This feedback improves the accuracy of the SQL statements that Copilot generates later.
View the usage notes of a table
DMS automatically generates table descriptions based on the metadata of databases, tables, and columns. You can click a database to show all tables in the database, find the table that you want to view, and then double-click the table name to go to the table details page. On this page, you can view or edit the table description on the Usage Notes tab.
What to do next
Manage resources
After you add resources, you can edit resources, stop resources, and start stopped resources on the Resource Configuration page.