This topic describes how to create a workspace, add a resource to the workspace, and configure Object Storage Service (OSS) buckets for storing code and user data in Notebook.
Step 1: Create and go to a workspace
- Log on to the DMS console V5.0.
Click the icon in the upper-left corner and choose
.
Note: If you are not using the DMS console in simple mode, choose
in the top navigation bar.
Click Create Workspace. In the Create Workspace dialog box, configure the Workspace Name and Region parameters, and then click OK.
Note: The workspace name can contain letters, digits, and underscores (_).
You can select only the Singapore region.
Click Go to Workspace in the Actions column of the workspace to go to the workspace.
Note: By default, only the workspace creator can access a workspace. If collaborative development is required, the workspace creator must grant development permissions to each user who needs to access the workspace.
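The naming rule in Step 1 can be checked before you open the Create Workspace dialog box. The following is a minimal sketch, not part of DMS: the function name `is_valid_workspace_name` is hypothetical, and the sketch assumes that only ASCII letters are allowed and that an empty name is rejected. The console may enforce additional rules, such as a length limit, that this topic does not state.

```python
import re

# Assumption: ASCII letters, digits, and underscores only; at least one character.
# Rules that this topic does not state (for example, a length limit) are not checked.
_WORKSPACE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if name consists only of letters, digits, and underscores."""
    return _WORKSPACE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("sales_notebook_01", "sales-notebook", ""):
        print(repr(candidate), is_valid_workspace_name(candidate))
```

A name such as `sales-notebook` fails this check because it contains a hyphen.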
Step 2: Add workspace members
If a workspace has multiple users, you must assign a role to each user.
The users to which you want to assign roles must have been added to DMS. For more information, see Manage users.
Step 3: Configure an OSS bucket for storing code
After you go to a workspace, click Storage Management on the tab.
On the Storage Management page, click the icon to the right of Code Storage Space.
In the Select OSS Directory dialog box, select the bucket that you want to use for the Bucket parameter.
The selected bucket must be in the same region as the workspace, and the storage class of the bucket must be Standard.
Note: If no buckets are available in the specified region, create one in the OSS console. For more information, see Create a bucket.
Click OK.
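The two bucket requirements in Step 3 (same region as the workspace, Standard storage class) can be expressed as a simple filter. The following sketch works on plain records and does not call OSS. The `BucketInfo` type, the `eligible_code_buckets` function, and the bucket names are hypothetical, and the region IDs are illustrative values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BucketInfo:
    """Hypothetical record of the bucket properties that matter in Step 3."""
    name: str
    region: str         # illustrative value: "ap-southeast-1" for Singapore
    storage_class: str  # for example "Standard" or "IA"


def eligible_code_buckets(buckets: Iterable[BucketInfo],
                          workspace_region: str) -> List[BucketInfo]:
    """Keep buckets in the workspace region that use the Standard storage class."""
    return [
        b for b in buckets
        if b.region == workspace_region and b.storage_class == "Standard"
    ]


if __name__ == "__main__":
    candidates = [
        BucketInfo("notebook-code", "ap-southeast-1", "Standard"),
        BucketInfo("notebook-archive", "ap-southeast-1", "IA"),
        BucketInfo("notebook-eu", "eu-central-1", "Standard"),
    ]
    for bucket in eligible_code_buckets(candidates, "ap-southeast-1"):
        print(bucket.name)
```

If the filter returns nothing, the Note in Step 3 applies: create a bucket in the OSS console first.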
Step 4: Add a resource
You can use Notebook to query and analyze data only after you add and start resources.
On the tab, click Resource Configuration.
Click Add Resource and configure resource-related information.
| Parameter | Description |
| --- | --- |
| Resource Name | The name of the resource. Enter a name that is easy to understand and use. |
| Resource Introduction | The description of the resource. |
| Image | Valid values: Spark 3.5+Python 3.9, Spark 3.3+Python 3.9, and Python 3.9. |
| AnalyticDB Instance | The AnalyticDB for MySQL cluster that you want to use. Note: If you select Spark 3.3 or Spark 3.5 for the Image parameter, you must also select an AnalyticDB for MySQL cluster. If the cluster that you want to use is not found, check whether the cluster is added to DMS. For more information, see Register an Alibaba Cloud database instance. |
| AnalyticDB Resource Group | The resource group that you want to use in the AnalyticDB for MySQL cluster. |
| Executor Spec | The resource specifications of the Spark executor. Each type corresponds to distinct specifications. For more information, see the Type column in Spark application configuration parameters. |
| Executor Count | The number of executors in the Spark configurations. Note: During the public preview stage, you can add a maximum of six executors for resources in each notebook. If you want to add more executors, contact DMS technical support. |
| Driver Specifications | The resource specifications of the Spark driver. Valid values: General_XSmall_v1 (2 CPU cores, 8 GB memory), General_Small_v1 (4 CPU cores, 16 GB memory), General_Medium_v1 (8 CPU cores, 32 GB memory), and General_Large_v1 (16 CPU cores, 64 GB memory). |
| NotebookQuantity | This parameter appears if you select Python 3.9 for the Image parameter. Valid values: General_XSmall_v1 (2 CPU cores, 8 GB memory), General_Small_v1 (4 CPU cores, 16 GB memory), General_Medium_v1 (8 CPU cores, 32 GB memory), and General_Large_v1 (16 CPU cores, 64 GB memory). |
| VPC ID | The virtual private cloud (VPC) in which the resource resides. |
| Zone ID | The zone of the VPC. |
| VSwitch ID | The vSwitch in the VPC. |
| Security Group ID | The ID of the security group. |
Click Save.
Start the resource.
Find the resource that you want to start, click Start in the Action column, and then click OK.
Note: It takes about 1 minute to start the resource. After the resource is started, it enters the Running state.
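The constraints among the parameters in Step 4 can be summarized as a pre-flight check that you run on planned values before you fill in the form. The following is a sketch under stated assumptions: `ResourceConfig` and `validate` are hypothetical names and are not DMS API types, the cluster ID is a placeholder, and only the rules stated above are checked (a Spark image requires an AnalyticDB for MySQL cluster, at most six executors during the public preview stage, and the listed specification values).

```python
from dataclasses import dataclass
from typing import List, Optional

# Driver and notebook specifications listed above: (CPU cores, memory in GB).
SPECS = {
    "General_XSmall_v1": (2, 8),
    "General_Small_v1": (4, 16),
    "General_Medium_v1": (8, 32),
    "General_Large_v1": (16, 64),
}
SPARK_IMAGES = {"Spark 3.5+Python 3.9", "Spark 3.3+Python 3.9"}
ALL_IMAGES = SPARK_IMAGES | {"Python 3.9"}
MAX_EXECUTORS_PUBLIC_PREVIEW = 6


@dataclass
class ResourceConfig:
    """Hypothetical container for the form fields; not a DMS API type."""
    name: str
    image: str
    spec: str                          # driver spec (Spark) or notebook spec (Python)
    adb_cluster: Optional[str] = None  # required only for Spark images
    executor_count: int = 0


def validate(cfg: ResourceConfig) -> List[str]:
    """Return a list of problems; an empty list means these checks pass."""
    problems = []
    if cfg.image not in ALL_IMAGES:
        problems.append(f"unknown image: {cfg.image}")
    if cfg.spec not in SPECS:
        problems.append(f"unknown specification: {cfg.spec}")
    if cfg.image in SPARK_IMAGES:
        if not cfg.adb_cluster:
            problems.append("a Spark image requires an AnalyticDB for MySQL cluster")
        if cfg.executor_count > MAX_EXECUTORS_PUBLIC_PREVIEW:
            problems.append("more than six executors: contact DMS technical support")
    return problems


if __name__ == "__main__":
    cfg = ResourceConfig("etl", "Spark 3.3+Python 3.9", "General_Small_v1",
                         executor_count=8)
    for problem in validate(cfg):
        print(problem)
```

The sketch does not check the Executor Spec values, because this topic refers to Spark application configuration parameters for that list.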
Step 5: Configure an OSS bucket for storing user data
When you use Notebook, you may need to read data that is stored outside DMS Notebook. In this case, you can specify multiple OSS buckets from which to read the data.
After you go to a workspace, click Storage Management on the tab.
In the User Storage Space area, configure an OSS path.
Note: A mount path must start with /mnt/.
Click the icon to save the OSS path.
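The mount path rule in Step 5 can be checked before you enter the path. The following is a minimal sketch: the function name `is_valid_mount_path` is hypothetical, and beyond the stated `/mnt/` prefix it assumes that a bare `/mnt/` and paths that contain `..` are not usable. Those two extra checks are assumptions, not documented DMS behavior.

```python
from pathlib import PurePosixPath


def is_valid_mount_path(path: str) -> bool:
    """Return True if path starts with /mnt/ and names a location below it."""
    if not path.startswith("/mnt/"):
        return False
    # Assumption: "/mnt/" alone and paths that climb out with ".." are rejected.
    parts = PurePosixPath(path).parts  # "/mnt/sales" gives ('/', 'mnt', 'sales')
    return len(parts) >= 3 and ".." not in parts


if __name__ == "__main__":
    for candidate in ("/mnt/sales_data", "/data/sales", "/mnt/"):
        print(candidate, is_valid_mount_path(candidate))
```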
Step 6: View data
After you go to a workspace, click the tab.
Perform the following operations in SQL Console:
Query data
You can use Copilot to generate SQL statements or directly enter SQL statements. The SQL syntax must be the same as that for a logical data warehouse.
Note: You can use the MySQL syntax to query tables that are located in different databases, such as AnalyticDB for MySQL and ApsaraDB RDS for MySQL. DMS automatically converts and optimizes your SQL statements.
When Copilot generates SQL statements, it automatically derives business knowledge from your feedback and from the metadata of databases, tables, and columns. If the derived knowledge is inaccurate, you can edit it so that it serves as a better reference. Copilot can then answer similar questions more accurately.
If the SQL statements that Copilot generates meet your business requirements, you can like them. This improves the accuracy of the SQL statements that Copilot generates later.
View the usage notes of a table
DMS automatically generates table descriptions based on the metadata of databases, tables, and columns. You can click a database to show all tables in the database, find the table that you want to view, and then double-click the table name to go to the table details page. On this page, you can view or edit the table description on the Usage Notes tab.
What to do next
Manage Notebook resources
You can perform the following operations to manage added resources on the Resource Configuration page. For more information about how to go to the Resource Configuration page, see the Step 4: Add a resource section of this topic.
Manually stop a resource
Edit the resource information
Note: You can edit the resource information only after the resource stops running.
Manually start a stopped resource
Automatically release a resource
When all kernels in the notebook are exited, the kernels enter the idle state. The resource is automatically released when the maximum idle period elapses.
View the historical Spark jobs of a resource
Note: You can go to the Spark UI page only when the default resource is used and the resource uses a Spark image.
On the Resource Configuration page, find the resource that you want to manage and click SparkUI in the Action column to go to the History Server page.
Click the ID of the application that you want to manage to view the Spark jobs.
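The automatic release rule described above (no running kernels, then the maximum idle period elapses) can be pictured as a single condition. The following sketch is illustrative only: DMS evaluates this internally, the function name `should_release` and its parameters are hypothetical, and the 30-minute idle period is an example value, not a documented default.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_release(active_kernels: int,
                   idle_since: Optional[datetime],
                   max_idle: timedelta,
                   now: datetime) -> bool:
    """Return True once no kernel is running and the idle period has elapsed."""
    if active_kernels > 0 or idle_since is None:
        return False
    return now - idle_since >= max_idle


if __name__ == "__main__":
    idle_start = datetime(2024, 1, 1, 12, 0)
    max_idle = timedelta(minutes=30)  # example value, not a documented default
    print(should_release(0, idle_start, max_idle, idle_start + timedelta(minutes=10)))
    print(should_release(0, idle_start, max_idle, idle_start + timedelta(minutes=31)))
```

In this model, a running kernel always prevents the release, regardless of how long the resource has existed.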