Data Management: Create a workspace and import resources

Last Updated: Nov 08, 2024

This topic describes how to create a workspace, add a resource to the workspace, and configure Object Storage Service (OSS) buckets for storing code and user data in Notebook.

Step 1: Create and go to a workspace

  1. Log on to the DMS console V5.0.
  2. Click the icon in the upper-left corner and choose All Features > Data Development > Notebook.

    Note

    If you are not using the DMS console in simple mode, choose Data Development > Notebook in the top navigation bar.

  3. Click Create Workspace. In the Create Workspace dialog box, configure the Workspace Name and Region parameters and click OK.

    Note
    • The workspace name can contain letters, digits, and underscores (_).

    • You can select only the Singapore region.

  4. Click Go to Workspace in the Actions column of the workspace to go to the workspace.

    Note

    By default, only the workspace creator can access a workspace. If collaborative development is required, the workspace creator must grant the development permissions to specific users who need to access the workspace.
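The naming rule in Step 1 can be checked before you submit the dialog box. The following is a minimal sketch, not part of DMS: the function name is hypothetical, and it assumes that "letters, digits, and underscores" means ASCII characters and that the name cannot be empty. The console may apply additional rules, such as a length limit.

```python
import re

# Assumption: only ASCII letters, digits, and underscores are allowed,
# and the name must contain at least one character.
_WORKSPACE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if the name uses only letters, digits, and underscores."""
    return _WORKSPACE_NAME.fullmatch(name) is not None
```

For example, `is_valid_workspace_name("team_notebook_01")` passes this check, whereas a name that contains a hyphen or a space does not.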

Step 2: Add workspace members

If a workspace has multiple users, you must assign a role to each user.

The users to which you want to assign roles must have been added to DMS. For more information, see Manage users.

Step 3: Configure an OSS bucket for storing code

  1. After you go to a workspace, click Storage Management on the image tab.

  2. On the Storage Management page, click the icon to the right of Code Storage Space.

  3. In the Select OSS Directory dialog box, select the bucket that you want to use for the Bucket parameter.

    The selected bucket must be in the same region as the workspace, and the storage class of the bucket must be Standard.

    Note

    If no buckets are available in the specified region, create one in the OSS console. For more information, see Create a bucket.

  4. Click OK.
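The two bucket requirements in Step 3 (same region as the workspace, Standard storage class) can be expressed as a simple filter. This sketch works on plain data rather than calling any OSS API. The `BucketInfo` type, the function name, and the region ID strings are assumptions for illustration; populate the records from whatever bucket listing you have available.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BucketInfo:
    """Hypothetical record describing a bucket; not an OSS SDK type."""
    name: str
    region: str
    storage_class: str


def eligible_code_buckets(buckets: Iterable[BucketInfo], workspace_region: str) -> List[str]:
    """Return names of buckets that can be selected for code storage.

    A bucket qualifies if it is in the workspace region and its
    storage class is Standard.
    """
    return [
        b.name
        for b in buckets
        if b.region == workspace_region and b.storage_class == "Standard"
    ]
```

If the filter returns an empty list, no bucket in the region meets the requirements, and you need to create one in the OSS console.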

Step 4: Add a resource

You can use Notebook to query and analyze data only after you add and start resources.

  1. On the image tab, click Resource Configuration.

  2. Click Add Resource and configure resource-related information.

    Parameter: Description

    Resource Name: The name of the resource. Enter a name that is easy to understand and use.

    Resource Introduction: The description of the resource.

    Image: The image of the resource. Valid values:

      • Spark 3.5+Python 3.9

      • Spark 3.3+Python 3.9

      • Python 3.9

    AnalyticDB Instance: The AnalyticDB for MySQL cluster that you want to use.

      Note
      • If you select Spark 3.3 or Spark 3.5 for the Image parameter, you must also select an AnalyticDB for MySQL cluster.

      • If the cluster that you want to use is not found, check whether the cluster is added to DMS. For more information, see Register an Alibaba Cloud database instance.

    AnalyticDB Resource Group: The resource group that you want to use in the AnalyticDB for MySQL cluster.

    Executor Spec: The resource specifications of the Spark executor. Each type corresponds to distinct specifications. For more information, see the Type column in Spark application configuration parameters.

    Executor Count: The number of executors in the Spark configurations.

      Note

      During the public preview stage, you can add a maximum of six executors for resources in each notebook. If you want to add more executors, contact DMS technical support.

    Driver Specifications: The resource specifications of the Spark driver. Valid values:

      • General_XSmall_v1 (2 CPU cores, 8 GB memory)

      • General_Small_v1 (4 CPU cores, 16 GB memory)

      • General_Medium_v1 (8 CPU cores, 32 GB memory)

      • General_Large_v1 (16 CPU cores, 64 GB memory)

    NotebookQuantity: This parameter appears if you select Python 3.9 for the Image parameter. Valid values:

      • General_XSmall_v1 (2 CPU cores, 8 GB memory)

      • General_Small_v1 (4 CPU cores, 16 GB memory)

      • General_Medium_v1 (8 CPU cores, 32 GB memory)

      • General_Large_v1 (16 CPU cores, 64 GB memory)

    VPC ID: The virtual private cloud (VPC) in which the resource resides.

    Zone ID: The zone of the VPC.

    VSwitch ID: The vSwitch in the VPC.

    Security Group ID: The ID of the security group.

  3. Click Save.

  4. Start the resource.

    Find the resource that you want to start, click Start in the Action column, and then click OK.

    Note

    It takes about 1 minute to start the resource. After the resource is started, the resource enters the Running state.
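Several of the parameters in Step 4 constrain each other: a Spark image requires an AnalyticDB for MySQL cluster, the driver specification must be one of the listed values, and the executor count is capped at six during the public preview. The following sketch collects those rules in one place so that a planned configuration can be checked before you fill in the form. It is illustrative only: `ResourceRequest` and `validate` are hypothetical names, not a DMS API, and the lower bound of one executor is an assumption. Executor Spec values are not validated because this topic does not list them.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values copied from the parameter table: spec name -> (CPU cores, memory in GB).
DRIVER_SPECS = {
    "General_XSmall_v1": (2, 8),
    "General_Small_v1": (4, 16),
    "General_Medium_v1": (8, 32),
    "General_Large_v1": (16, 64),
}
SPARK_IMAGES = {"Spark 3.5+Python 3.9", "Spark 3.3+Python 3.9"}
ALL_IMAGES = SPARK_IMAGES | {"Python 3.9"}
MAX_EXECUTORS_PUBLIC_PREVIEW = 6


@dataclass
class ResourceRequest:
    """Hypothetical container for the form fields; not a DMS API type."""
    name: str
    image: str
    driver_spec: str = "General_XSmall_v1"
    executor_count: int = 1
    analyticdb_instance: Optional[str] = None


def validate(req: ResourceRequest) -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if req.image not in ALL_IMAGES:
        problems.append(f"unknown image: {req.image}")
    if req.image in SPARK_IMAGES:
        if not req.analyticdb_instance:
            problems.append("a Spark image requires an AnalyticDB for MySQL cluster")
        if req.driver_spec not in DRIVER_SPECS:
            problems.append(f"unknown driver spec: {req.driver_spec}")
        # Assumption: at least one executor is required.
        if not 1 <= req.executor_count <= MAX_EXECUTORS_PUBLIC_PREVIEW:
            problems.append("executor count must be between 1 and 6 during public preview")
    return problems
```

A request for a Spark image with eight executors and no cluster selected returns two problems, whereas a Python 3.9 request with the default values returns none.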

Step 5: Configure an OSS bucket for storing user data

When you use Notebook, you may need to read data that is stored outside DMS Notebook. In this case, you can specify one or more OSS buckets for Notebook to read data from.

  1. After you go to a workspace, click Storage Management on the image tab.

  2. In the User Storage Space area, configure an OSS path.

    Note

    A mount path must start with /mnt/.

  3. Click the icon to save the OSS path.
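The mount path rule in Step 5 is easy to get wrong when paths are typed by hand. A minimal check, assuming that the path must also name a directory below /mnt/ (this topic states only the prefix requirement), might look as follows. The function name is hypothetical.

```python
def is_valid_mount_path(path: str) -> bool:
    """Return True if the path starts with /mnt/ and names something below it.

    Assumption: a bare "/mnt/" with nothing after it is rejected.
    """
    prefix = "/mnt/"
    return path.startswith(prefix) and len(path) > len(prefix)
```

For example, `/mnt/sales_data` passes the check, whereas `/data/sales` and `mnt/sales_data` do not.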

Step 6: View data

  1. After you go to a workspace, click the image tab.

  2. Perform the following operations in SQL Console:

    • Query data

      You can use Copilot to generate SQL statements or directly enter SQL statements. The SQL syntax must be the same as that for a logical data warehouse.

      Note

      You can use the MySQL syntax to query tables that are located in different databases, such as AnalyticDB for MySQL and ApsaraDB RDS for MySQL. DMS automatically converts and optimizes your SQL statements.

      When you use Copilot to generate SQL statements, Copilot automatically obtains business knowledge based on your feedback and the metadata of databases, tables, and columns. If the obtained knowledge is inaccurate, you can edit it to improve its reference value. This way, Copilot can answer similar questions more accurately later.

      If the SQL statements generated by Copilot meet your business requirements, you can like them. This improves the accuracy of SQL statements that are generated later.

    • View the usage notes of a table

      DMS automatically generates table descriptions based on the metadata of databases, tables, and columns. You can click a database to show all tables in the database, find the table that you want to view, and then double-click the table name to go to the table details page. On this page, you can view or edit the table description on the Usage Notes tab.

What to do next

Use Notebook to query and analyze data

Manage Notebook resources

You can perform the following operations to manage added resources on the Resource Configuration page. For more information about how to go to the Resource Configuration page, see the Step 4: Add a resource section of this topic.

  • Manually stop a resource

  • Edit the resource information

    Note

    You can edit the resource information only after the resource stops running.

  • Manually start a stopped resource

  • Automatically release a resource

    When all kernels in the notebook are exited, the kernels enter the idle state. The resource is automatically released when the maximum idle period elapses.

  • View the historical Spark jobs of a resource

    Note

    You can go to the Spark UI page only when the default resource is used and the resource uses a Spark image.

    1. On the Resource Configuration page, find the resource that you want to manage and click SparkUI in the Action column to go to the History Server page.

    2. Click the ID of the application that you want to manage to view the Spark jobs.
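The automatic release behavior described above can be thought of as a timer that starts when the last kernel exits. The following sketch models that rule only to make the timing explicit. It is not how DMS implements the feature: the function and parameter names are hypothetical, and this topic does not state the length of the maximum idle period, so the caller must supply it.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_release(
    all_kernels_exited_at: Optional[datetime],
    now: datetime,
    max_idle: timedelta,
) -> bool:
    """Return True once the idle time reaches the maximum idle period.

    all_kernels_exited_at is None while at least one kernel is still running.
    """
    if all_kernels_exited_at is None:
        return False
    return now - all_kernels_exited_at >= max_idle
```

Under this model, restarting a kernel before the period elapses resets the timer, because the exit timestamp becomes None again.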