
AnalyticDB: Notebook editor

Last Updated: Jun 18, 2024

AnalyticDB for MySQL Data Lakehouse Edition (V3.0) provides notebook development on an interactive data analysis and development platform. The feature supports job editing, data analysis, and data visualization. In the notebook editor, you can write Spark applications in Spark SQL and Python.
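
For example, a Python paragraph typically uses the SparkSession that the notebook kernel provides. The following is a minimal sketch; the sample data and column names are hypothetical.

    # The notebook kernel typically provides a SparkSession named `spark`;
    # you do not need to create one yourself.
    df = spark.createDataFrame(
        [("2024-01-01", 100), ("2024-01-02", 250)],
        ["order_date", "amount"],
    )
    # Aggregate the sample data and print the result to the paragraph output.
    df.groupBy("order_date").sum("amount").show()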

Prerequisites

  • An AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster is created. For more information, see Create a Data Lakehouse Edition cluster.

  • A job resource group whose maximum computing resources are at least 8 AnalyticDB compute units (ACUs) is created. For more information, see Create a resource group.

  • A database account of the AnalyticDB for MySQL cluster is associated with a Resource Access Management (RAM) user. For more information, see Associate or disassociate a database account with or from a RAM user.

  • An Object Storage Service (OSS) bucket is created in the same region and within the same Alibaba Cloud account as the AnalyticDB for MySQL cluster.

  • AnalyticDB for MySQL is granted the AliyunADBSparkProcessingDataRole permission. For more information, see Perform authorization.

Usage notes

The notebook development feature is no longer offered to new users as of May 10, 2024.

  • If you created notebooks before May 10, 2024, you can continue to use the feature.

  • If you did not create notebooks before May 10, 2024, you cannot use the feature in the AnalyticDB for MySQL console.

Create a notebook

  1. Log on to the AnalyticDB for MySQL console.

  2. In the upper-left corner of the console, select a region.

  3. In the left-side navigation pane, click Clusters.

  4. On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

  5. In the left-side navigation pane, choose Job Development > Notebook Development.

  6. On the Notebook Development page, click Create Notebook in the upper-right corner.

  7. In the Log Settings dialog box, select Default or Custom for the log path and click OK.

    Note

    The first time you create a notebook, the system checks whether a log path is configured to store Spark runtime logs. If a log path is already configured, this step is skipped. Otherwise, the Log Settings dialog box appears.
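
    For example, a custom log path is an OSS directory such as oss://<your-bucket-name>/spark-logs/; the bucket name here is a placeholder.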

  8. In the Create Notebook panel, configure the parameters that are described in the following table.

    Parameter: Resource Group
    Description: The job resource group that the notebook uses. Select a job resource group from the drop-down list. The resource group must meet the following requirements:
      • The resource group is in the running state.
      • The maximum amount of computing resources is greater than or equal to 8 ACUs.
    Example: (Job) notebook

    Parameter: Name
    Description: The name of the notebook. The name must meet the following requirements:
      • The name can be up to 64 characters in length.
      • The name can contain letters, digits, underscores (_), and hyphens (-).
      • The name must be unique.
    Example: notebook-test

    Parameter: Description
    Description: The description of the notebook.
    Example: Feature testing

  9. Click OK.

Develop a notebook

On the Notebook Development page, click the name of the notebook that you want to develop, and then write and run code in the Paragraph section.


The following sections describe the four areas of the Notebook Development page.

① Menu bar

    Menu item: Resource Group > Create Resource Group
    Description: Allows you to create a job resource group. For more information, see Create a resource group.

    Menu item: Resource Group > Change Resource Group
    Description: Allows you to change the resource group of the notebook. Before you change the resource group of the notebook, make sure that the following requirements are met:
      • The resource group is of the job type.
      • The resource group is in the running state.
      • The maximum amount of computing resources is greater than or equal to 8 ACUs.
    Important: The resource group change causes the notebook kernel to restart. The restart process requires approximately 3 minutes to complete, and the running notebook jobs fail during the process.

    Menu item: Kernel > Restart Kernel
    Description: Allows you to restart the notebook kernel. The restart process requires approximately 3 minutes to complete, and the running notebook jobs fail during the process.

    Menu item: Kernel > Kill Kernel
    Description: Allows you to kill the notebook kernel. If you kill the notebook kernel, the running notebook jobs fail.

② Toolbar

    Button: Save (保存)
    Description: Allows you to save notebook jobs. The following methods are also available to save notebook jobs:
      • The system automatically saves notebook jobs every 5 seconds.
      • You can press Ctrl+S to save notebook jobs.

    Button: Create Paragraph (新增段落)
    Description: Allows you to create a paragraph. You can also move the pointer over the area between two paragraphs and click +Create Paragraph.

    Button: Run Code (运行代码)
    Description: Allows you to run the code in all paragraphs.

    Button: Pause (暂停运行)
    Description: Allows you to pause the execution of the code in all paragraphs.
    Important: Notebook jobs that are already in the running state cannot be paused.

    Button: Clear (清除)
    Description: Allows you to clear the results of all paragraphs.

    Button: Settings (设置)
    Description: Allows you to configure startup parameters for a notebook job. For more information, see Spark application configuration parameters. Example:

    {
        "spark.driver.resourceSpec": "small",
        "spark.executor.instances": 2,
        "spark.executor.resourceSpec": "small",
        "spark.adb.eni.vswitchId": "vsw-bp14pj8h0k5p0kwu3****",
        "spark.adb.eni.securityGroupId": "sg-bp14qrdskvwnzels****",
        "spark.hadoop.hive.metastore.uris": "thrift://192.168.XX.XX:9083"
    }

    Important: The configuration of startup parameters for a notebook job causes the notebook kernel to restart. The restart process requires approximately 3 minutes to complete, and the running notebook jobs fail during the process.
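
    In this example, the spark.adb.eni.* parameters are typically used to let the Spark job access resources in your virtual private cloud (VPC) through the specified vSwitch and security group, and spark.hadoop.hive.metastore.uris points the job at an external Hive metastore. Whether you need these parameters depends on your environment; see Spark application configuration parameters for the authoritative list.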

③ Status bar

    Item: Saved (保存成功)
    Description: The save status of notebook jobs. The system automatically saves notebook jobs every 5 seconds.

    Item: Kernel status
    Description: The status of the notebook kernel. The system refreshes the status of the notebook kernel every 5 seconds. Valid values:
      • Kernel Not Started (未启动): The notebook kernel is not started. You can choose Kernel > Restart Kernel to start the notebook kernel.
      • Kernel Idle (空闲): The notebook kernel is idle and available to run notebook jobs.
      • Kernel Starting (启动中): The notebook kernel is being started. You can try again later.
      • Kernel Busy (作业较多): The notebook kernel is running a large amount of code. You can try again later.
      • Kernel Error (启动错误): A startup error occurred on the notebook kernel. You can try again later or restart the notebook kernel.
      • Kernel Invalid (kernel已失效): The notebook kernel is invalid. You can restart the notebook kernel.
      • Kernel Killed (kernel已销毁): The notebook kernel is killed. You can restart the notebook kernel.
      • Kernel Unknown (状态未知): The status of the notebook kernel is unknown. You can restart the notebook kernel.

④ Paragraph

The following list describes the elements of the Paragraph section.

  • Handle ID: The ID that uniquely identifies a running statement. You can use the ID to identify issues.

  • Code editor: The box in which you write code. Syntax keywords are automatically highlighted. Spark SQL and Python are supported.

  • Paragraph toolbar: Allows you to switch languages, format code, run code, pause code execution, clear results, and delete paragraphs.
    • Language drop-down list (下拉框): allows you to switch between Spark SQL and Python.
    • Format (格式化): allows you to format code. Only Spark SQL code can be formatted.
    • Run (运行): allows you to run the code in the current paragraph.
    • Cancel (取消): allows you to pause the execution of the code in the current paragraph.
    • Clear (清空): allows you to clear the results of the current paragraph.
    • Delete (删除): allows you to delete the current paragraph.

  • Results: The result section of the current paragraph. Spark SQL results are displayed as a table, and results of other languages are displayed as text; see the sketch after this list.

  • Status bar: The status bar of the current paragraph, which shows the running status, the execution duration, and the most recent update time.
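
The following Python sketch shows both result formats in a single paragraph. It assumes only the SparkSession named spark that the notebook kernel typically provides; the temporary view name numbers is hypothetical, and no pre-existing data is required.

    # Build a small DataFrame in memory; no pre-existing tables are assumed.
    df = spark.range(5).withColumnRenamed("id", "n")
    # Register a temporary view so that the data can be queried with SQL.
    df.createOrReplaceTempView("numbers")
    # Run a Spark SQL query from Python; show() prints the result as text.
    # Running the same query in a Spark SQL paragraph renders it as a table.
    spark.sql("SELECT n, n * n AS n_squared FROM numbers").show()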

Error codes

Error code | Error message | Solution
Console.NotebookNamingDuplicate | The notebook name already exists. | Specify another notebook name.
Console.NotebookParagraphNotRunning | The notebook code is not run. | Run the notebook code.
Console.NotebookParagraphMissingProgramCode | No program code is found in the notebook paragraph. | Write program code in the notebook paragraph.
Console.NotebookKernelNotStartup | The notebook kernel is not started. | Start the notebook kernel.
Spark.NotebookKernelStarting | The notebook kernel is starting. | Try again later.
Spark.NotebookKernelBusy | The notebook kernel contains a large amount of code that is pending execution. | Try again later.
Spark.NotebookKernelExpired | The notebook kernel is expired. | Restart the notebook kernel.
Spark.NotebookKernelInvalidStatus | The notebook kernel is invalid. | Restart the notebook kernel.
Spark.GetNotebookKernelFailed | Failed to start the notebook kernel. | Contact technical support.
Spark.GetNotebookKernelStateFailed | Failed to query the status of the notebook kernel. | Contact technical support.
Spark.ExecuteNotebookStatementFailed | Failed to run the notebook code. | Contact technical support.
Spark.CancelNotebookStatementFailed | Failed to pause the execution of the notebook code. | Contact technical support.
Spark.GetNotebookStatementResultFailed | Failed to query the notebook code results. | Contact technical support.
Spark.CloseNotebookKernelFailed | Failed to shut down the notebook kernel. | Contact technical support.
Console.NotebookNotFound | The created notebook cannot be found. | Contact technical support.
Console.NotebookCreateFailed | Failed to create the notebook. | Contact technical support.
Console.NotebookParagraphNotFound | The notebook paragraph cannot be found. | Contact technical support.
Console.NotebookParagraphCreateFailed | Failed to create the notebook paragraph. | Contact technical support.