Use the notebook editor

AnalyticDB for MySQL provides notebook development on an interactive data analysis and development platform. The notebook editor supports job editing, data analysis, and data visualization, and lets you write Spark applications in Spark SQL and Python.

Prerequisites

An AnalyticDB for MySQL Data Lakehouse Edition cluster is created.
A job resource group whose maximum computing resources are at least 8 AnalyticDB compute units (ACUs) is created. For more information, see Create a resource group.
A database account is created for the AnalyticDB for MySQL cluster.
- If you use an Alibaba Cloud account, you only need to create a privileged account. For more information, see the "Create a privileged account" section of the Create a database account topic.
- If you use a Resource Access Management (RAM) user, you must create a privileged account and a standard account and associate the standard account with the RAM user. For more information, see Create a database account and Associate or disassociate a database account with or from a RAM user.
An Object Storage Service (OSS) bucket is created in the same region and belongs to the same Alibaba Cloud account as the AnalyticDB for MySQL cluster.
AnalyticDB for MySQL is authorized to assume the AliyunADBSparkProcessingDataRole role to access other cloud resources. For more information, see Perform authorization.

Usage notes

As of May 10, 2024, the notebook development feature is no longer available to new users.

If you created notebooks before May 10, 2024, you can still use the feature.
If you did not create notebooks before May 10, 2024, you cannot use the feature in the AnalyticDB for MySQL console.

Create a notebook

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.
In the left-side navigation pane, choose Job Development > Notebook Development.
On the Notebook Development page, click Create Notebook in the upper-right corner.
In the Log Settings dialog box, select Default or Custom for the log path and click OK.
Note
The first time you create a notebook, the system checks whether you have configured a log path to store Spark runtime logs. If you have configured a log path, this step is skipped. Otherwise, a dialog box appears.

In the Create Notebook panel, configure the parameters that are described in the following table.

Parameter	Description	Example
Resource Group	The resource group that is used to create a notebook. Select a job resource group from the drop-down list. The job resource group must meet the following requirements: The resource group is in the running state. The maximum amount of computing resources is greater than or equal to 8 ACUs.	(Job) notebook
Name	The name of the notebook. The name can be up to 64 characters in length. The name can contain letters, digits, underscores (_), and hyphens (-). The name must be unique.	notebook-test
Description	The description of the notebook.	Feature testing

Click OK.
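
The requirements in the preceding table can be checked before you open the console. The following is a minimal sketch, not part of any AnalyticDB SDK: the `ResourceGroup` class and the `check_notebook_request` function are hypothetical names, and the pattern assumes that "letters" means ASCII letters, which the console may not enforce in exactly this way.

```python
import re
from dataclasses import dataclass

# Assumption: "letters" means ASCII letters; the console may accept more.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")
MIN_ACUS = 8  # minimum value of the maximum computing resources, per the table


@dataclass
class ResourceGroup:
    """Hypothetical local description of a resource group (not an SDK type)."""
    name: str
    group_type: str  # for example "job"
    state: str       # for example "running"
    max_acus: int


def check_notebook_request(name, group, existing_names=()):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if NAME_PATTERN.fullmatch(name) is None:
        problems.append("name must be 1-64 letters, digits, underscores, or hyphens")
    if name in existing_names:
        problems.append("name must be unique")
    if group.group_type != "job":
        problems.append("resource group must be a job resource group")
    if group.state != "running":
        problems.append("resource group must be in the running state")
    if group.max_acus < MIN_ACUS:
        problems.append(f"resource group needs at least {MIN_ACUS} ACUs")
    return problems


group = ResourceGroup(name="notebook", group_type="job", state="running", max_acus=8)
print(check_notebook_request("notebook-test", group))
print(check_notebook_request("bad name!", group, ["bad name!"]))
```

The function returns all problems at once instead of stopping at the first one, so a single pass shows everything that needs to be corrected.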

Develop a notebook

On the Notebook Development page, click the name of the notebook that you want to develop and perform development in the Paragraph section.


The following tables describe the sections and their parameters on the Notebook Development page.

① Menu bar

Parameter		Description
Resource Group	Create Resource Group	Allows you to create a job resource group. For more information, see Create a resource group.
Resource Group	Change Resource Group	Allows you to change the resource group of the notebook. Before you change the resource group, make sure that the new resource group meets the following requirements: The resource group is of the job type. The resource group is in the running state. The maximum amount of computing resources is greater than or equal to 8 ACUs. Important: Changing the resource group causes the notebook kernel to restart. The restart process requires approximately 3 minutes to complete, and the running notebook jobs fail during the process.
Kernel	Restart Kernel	Allows you to restart the notebook kernel. The restart process requires approximately 3 minutes to complete, and the running notebook jobs fail during the process.
Kernel	Kill Kernel	Allows you to kill the notebook kernel. If you kill the notebook kernel, the running notebook jobs fail.

② Toolbar

Parameter	Description
	Allows you to save notebook jobs. The following methods are also available to save notebook jobs: The system saves notebook jobs at an interval of 5 seconds. You can press Ctrl+S to save notebook jobs.
	Allows you to create a paragraph. You can also move the pointer over the middle part between two paragraphs and click +Create Paragraph.
	Allows you to run code in all paragraphs.
	Allows you to pause the execution of code in all paragraphs. Important: You cannot pause notebook jobs that are already in the running state.
	Allows you to clear the results of all paragraphs.
	Allows you to configure startup parameters for a notebook job. For more information, see Spark application configuration parameters. Example: `{ "spark.driver.resourceSpec": "small", "spark.executor.instances": 2, "spark.executor.resourceSpec": "small", "spark.adb.eni.vswitchId":"vsw-bp14pj8h0k5p0kwu3**", "spark.adb.eni.securityGroupId": "sg-bp14qrdskvwnzels", "spark.hadoop.hive.metastore.uris": "thrift://192.168.XX.XX:9083" }` Important: Configuring startup parameters for a notebook job causes the notebook kernel to restart. The restart process requires approximately 3 minutes to complete, and the running notebook jobs fail during the process.
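
Because saving startup parameters restarts the kernel, it can help to assemble and check the JSON locally first. The following sketch uses only the parameter keys from the example above. The angle-bracket values are placeholders that you replace with your own IDs and addresses, and the `find_placeholders` helper is a hypothetical name introduced for illustration.

```python
import json

# Keys are taken from the example above; angle-bracket values are placeholders.
startup_params = {
    "spark.driver.resourceSpec": "small",
    "spark.executor.instances": 2,
    "spark.executor.resourceSpec": "small",
    "spark.adb.eni.vswitchId": "<your-vswitch-id>",
    "spark.adb.eni.securityGroupId": "<your-security-group-id>",
    "spark.hadoop.hive.metastore.uris": "thrift://<metastore-host>:9083",
}


def find_placeholders(params):
    """Return the keys whose string values still contain a <placeholder>."""
    return [
        key
        for key, value in params.items()
        if isinstance(value, str) and "<" in value and ">" in value
    ]


# json.dumps produces valid JSON that can be pasted into the parameter editor.
rendered = json.dumps(startup_params, indent=2)
print(rendered)
print("still to fill in:", find_placeholders(startup_params))
```

Generating the text with `json.dumps` avoids hand-written quoting mistakes such as trailing commas, which would otherwise only surface after the kernel restart has been triggered.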

③ Status bar

Parameter	Description
Saved successfully	The saving status of notebook jobs. The system saves notebook jobs at an interval of 5 seconds.
kernel	The status of the notebook kernel. The system refreshes the status of the notebook kernel at an interval of 5 seconds. Valid values:

(Kernel Not Started): The notebook kernel is not started. You can choose Kernel > Restart Kernel to restart the notebook kernel.
(Kernel Idle): The notebook kernel is idle and available to run notebook jobs.
(Kernel Starting): The notebook kernel is being started. You can try again later.
(Kernel Busy): The notebook kernel is running large amounts of code. You can try again later.
(Kernel Error): A startup error occurs on the notebook kernel. You can try again later or restart the notebook kernel.
(Kernel Invalid): The notebook kernel is invalid. You can restart the notebook kernel.
(Kernel Killed): The notebook kernel is killed. You can restart the notebook kernel.
(Kernel Unknown): The status of the notebook kernel is unknown. You can restart the notebook kernel.
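
The status list above amounts to a small decision table: run when the kernel is idle, wait while it is starting or busy, and restart otherwise. The following sketch expresses that logic under stated assumptions. The `get_status` callable is hypothetical (the document does not describe a programmatic way to read the kernel status), the 5-second poll mirrors the console refresh interval, and the 180-second timeout mirrors the approximate 3-minute restart time.

```python
import time

# Suggested action per status, paraphrasing the list above.
ACTIONS = {
    "Kernel Not Started": "restart",
    "Kernel Idle": "run",
    "Kernel Starting": "wait",
    "Kernel Busy": "wait",
    "Kernel Error": "restart",  # the text also allows trying again later
    "Kernel Invalid": "restart",
    "Kernel Killed": "restart",
    "Kernel Unknown": "restart",
}


def wait_until_idle(get_status, poll_seconds=5.0, timeout_seconds=180.0,
                    sleep=time.sleep):
    """Poll the kernel status until an action other than waiting is advised.

    Returns "run" when the kernel is idle, "restart" when a restart is
    advised, and "timeout" if the kernel is still starting or busy after
    timeout_seconds. Unrecognized statuses are treated as "restart", which
    is an assumption of this sketch.
    """
    waited = 0.0
    while True:
        action = ACTIONS.get(get_status(), "restart")
        if action != "wait":
            return action
        if waited >= timeout_seconds:
            return "timeout"
        sleep(poll_seconds)
        waited += poll_seconds


# Demonstration with a scripted status sequence and a no-op sleep.
statuses = iter(["Kernel Starting", "Kernel Starting", "Kernel Idle"])
print(wait_until_idle(lambda: next(statuses), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays. In actual use the default `time.sleep` applies.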

④ Paragraph


The following table describes the parameters in the Paragraph section.

Parameter	Description
①	The handle ID that uniquely identifies a running statement. You can use the ID to troubleshoot issues.
②	The code editing box. Syntax keywords are automatically highlighted. Spark SQL and Python are supported.
③	The toolbar of the current paragraph. The buttons allow you to switch languages between Spark SQL and Python, format code (Spark SQL only), run the code in the current paragraph, pause the execution of the code in the current paragraph, clear the results of the current paragraph, and delete the current paragraph.
④	The result section of the current paragraph. Execution results are displayed as a table for Spark SQL and as text for other languages.
⑤	The status bar of the current paragraph that contains the running status, the execution duration, and the most recent update time.

Error codes

Error code	Error message	Solution
Console.NotebookNamingDuplicate	The notebook name already exists.	Specify another notebook name.
Console.NotebookParagraphNotRunning	The notebook code is not run.	Run the notebook code.
Console.NotebookParagraphMissingProgramCode	No program code is found in the notebook paragraph.	Write program code in the notebook paragraph.
Console.NotebookKernelNotStartup	The notebook kernel is not started.	Start the notebook kernel.
Spark.NotebookKernelStarting	The notebook kernel is starting.	Try again later.
Spark.NotebookKernelBusy	The notebook kernel contains a large amount of code that is pending execution.	Try again later.
Spark.NotebookKernelExpired	The notebook kernel is expired.	Restart the notebook kernel.
Spark.NotebookKernelInvalidStatus	The notebook kernel is invalid.	Restart the notebook kernel.
Spark.GetNotebookKernelFailed	Failed to start the notebook kernel.	Contact technical support.
Spark.GetNotebookKernelStateFailed	Failed to query the status of the notebook kernel.	Contact technical support.
Spark.ExecuteNotebookStatementFailed	Failed to run the notebook code.	Contact technical support.
Spark.CancelNotebookStatementFailed	Failed to pause the execution of the notebook code.	Contact technical support.
Spark.GetNotebookStatementResultFailed	Failed to query the notebook code results.	Contact technical support.
Spark.CloseNotebookKernelFailed	Failed to shut down the notebook kernel.	Contact technical support.
Console.NotebookNotFound	The created notebook cannot be found.	Contact technical support.
Console.NotebookCreateFailed	Failed to create the notebook.	Contact technical support.
Console.NotebookParagraphNotFound	The notebook paragraph cannot be found.	Contact technical support.
Console.NotebookParagraphCreateFailed	Failed to create the notebook paragraph.	Contact technical support.
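
The error codes in the preceding table fall into a few groups: errors that you can fix yourself, errors that clear up after a wait, errors that require a kernel restart, and errors that require technical support. The following sketch shows one way to encode that grouping. The `suggest_solution` function is a hypothetical helper, and the fallback to technical support for codes that are not listed with another solution is an assumption of this sketch.

```python
# Codes grouped by the solution listed in the table above.
RETRY_LATER = {
    "Spark.NotebookKernelStarting",
    "Spark.NotebookKernelBusy",
}
RESTART_KERNEL = {
    "Spark.NotebookKernelExpired",
    "Spark.NotebookKernelInvalidStatus",
}
USER_FIXES = {
    "Console.NotebookNamingDuplicate": "Specify another notebook name.",
    "Console.NotebookParagraphNotRunning": "Run the notebook code.",
    "Console.NotebookParagraphMissingProgramCode":
        "Write program code in the notebook paragraph.",
    "Console.NotebookKernelNotStartup": "Start the notebook kernel.",
}


def suggest_solution(error_code):
    """Map an error code to the suggested solution text."""
    if error_code in USER_FIXES:
        return USER_FIXES[error_code]
    if error_code in RETRY_LATER:
        return "Try again later."
    if error_code in RESTART_KERNEL:
        return "Restart the notebook kernel."
    # Assumption: any other code is escalated to technical support.
    return "Contact technical support."


for code in ("Spark.NotebookKernelBusy", "Console.NotebookNamingDuplicate",
             "Spark.GetNotebookKernelFailed"):
    print(code, "->", suggest_solution(code))
```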