AnalyticDB for MySQL Data Lakehouse Edition (V3.0) provides a notebook development feature on an interactive data analysis and development platform. The feature supports job editing, data analysis, and data visualization. In the notebook editor, you can write Spark applications in Spark SQL and Python.
Prerequisites
An AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster is created. For more information, see Create a Data Lakehouse Edition cluster.
A job resource group that has at least 8 AnalyticDB compute units (ACUs) of maximum computing resources is created. For more information, see Create a resource group.
A database account of the AnalyticDB for MySQL cluster is associated with a Resource Access Management (RAM) user. For more information, see Associate or disassociate a database account with or from a RAM user.
An Object Storage Service (OSS) bucket is created in the same region as the AnalyticDB for MySQL cluster and belongs to the same Alibaba Cloud account.
AnalyticDB for MySQL is granted the AliyunADBSparkProcessingDataRole permission. For more information, see Perform authorization.
Usage note
The notebook development feature is no longer available to new users as of May 10, 2024.
If you created notebooks before May 10, 2024, you can still use the feature.
If you did not create notebooks before May 10, 2024, you cannot use the feature in the AnalyticDB for MySQL console.
Create a notebook
Log on to the AnalyticDB for MySQL console.
In the upper-left corner of the console, select a region.
In the left-side navigation pane, click Clusters.
On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.
In the left-side navigation pane, choose Job Development > Notebook Development.
On the Notebook Development page, click Create Notebook in the upper-right corner.
In the Log Settings dialog box, select Default or Custom for the log path and click OK.
Note: The first time you create a notebook, the system checks whether you have configured a log path to store Spark runtime logs. If you have configured a log path, this step is skipped. Otherwise, a dialog box appears.
In the Create Notebook panel, configure the parameters that are described in the following table.
Parameter | Description | Example |
Resource Group | The resource group that is used to create the notebook. Select a job resource group from the drop-down list. The job resource group must meet the following requirements: the resource group is in the running state, and its maximum amount of computing resources is greater than or equal to 8 ACUs. | |
Name | The name of the notebook. The name can be up to 64 characters in length, can contain letters, digits, underscores (_), and hyphens (-), and must be unique. | notebook-test |
Description | The description of the notebook. | Feature testing |
Click OK.
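The constraints in the parameter table can be checked before you fill in the Create Notebook panel. The following is a minimal sketch rather than console behavior: the helper names `check_notebook_name` and `check_resource_group` are hypothetical, "letters" is assumed to mean ASCII letters, and the state string `running` is an assumed spelling of the running state.

```python
import re

MAX_NAME_LENGTH = 64   # "The name can be up to 64 characters in length."
MIN_MAX_ACUS = 8       # minimum value for the group's maximum computing resources

# Assumption: "letters" means ASCII letters; the console may accept more.
_ALLOWED_CHARS = re.compile(r"[A-Za-z0-9_-]+")


def check_notebook_name(name, existing_names=()):
    """Return a list of problems with the name; an empty list means it passes."""
    problems = []
    if not name:
        problems.append("name is empty")
        return problems
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name is longer than 64 characters")
    if not _ALLOWED_CHARS.fullmatch(name):
        problems.append("name contains characters other than letters, digits, _ and -")
    if name in existing_names:
        problems.append("name is already in use")
    return problems


def check_resource_group(state, max_acus):
    """Return a list of problems with a job resource group for notebook use."""
    problems = []
    # Assumption: the running state is reported as the string "running".
    if state.strip().lower() != "running":
        problems.append("resource group is not in the running state")
    if max_acus < MIN_MAX_ACUS:
        problems.append("maximum computing resources are below 8 ACUs")
    return problems


print(check_notebook_name("notebook-test"))
print(check_resource_group("Running", 8))
```

Uniqueness can only be checked against names you already know about, which is why `existing_names` is passed in by the caller.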
Develop a notebook
On the Notebook Development page, click the name of the notebook that you want to develop and perform development in the Paragraph section.
The following tables describe the sections and their parameters on the Notebook Development page.
① Menu bar
Menu | Option | Description |
Resource Group | Create Resource Group | Allows you to create a job resource group. For more information, see Create a resource group. |
Resource Group | Change Resource Group | Allows you to change the resource group of the notebook. Before you change the resource group, make sure that the new resource group meets the requirements described for the Resource Group parameter in the Create a notebook section. Important: Changing the resource group causes the notebook kernel to restart. The restart takes approximately 3 minutes, and notebook jobs that are running during the restart fail. |
Kernel | Restart Kernel | Allows you to restart the notebook kernel. The restart takes approximately 3 minutes, and notebook jobs that are running during the restart fail. |
Kernel | Kill Kernel | Allows you to kill the notebook kernel. If you kill the notebook kernel, the running notebook jobs fail. |
② Toolbar
Function | Description |
Save | Allows you to save notebook jobs. The system also saves notebook jobs automatically at an interval of 5 seconds. |
Create paragraph | Allows you to create a paragraph. You can also move the pointer over the area between two paragraphs and click +Create Paragraph. |
Run all | Allows you to run code in all paragraphs. |
Pause all | Allows you to pause the execution of code in all paragraphs. Important: You cannot pause notebook jobs that are in the running state. |
Clear all results | Allows you to clear the results of all paragraphs. |
Configure startup parameters | Allows you to configure startup parameters for a notebook job. For more information, see Spark application configuration parameters. Important: Configuring startup parameters causes the notebook kernel to restart. The restart takes approximately 3 minutes, and notebook jobs that are running during the restart fail. |
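As an illustration of startup parameters, the following minimal sketch builds a small set of Spark configuration items and prints them as JSON. The parameter names `spark.driver.resourceSpec`, `spark.executor.resourceSpec`, and `spark.executor.instances` are AnalyticDB for MySQL Spark configuration parameters. The values are illustrative only, and the JSON key-value layout is an assumption about what the configuration dialog accepts.

```python
import json

# Illustrative values only; choose specifications that fit your workload and
# the computing resources of the job resource group.
startup_params = {
    "spark.driver.resourceSpec": "medium",
    "spark.executor.resourceSpec": "medium",
    "spark.executor.instances": 2,
}

# Assumption: the startup parameters are entered as JSON key-value pairs.
startup_params_json = json.dumps(startup_params, indent=4)
print(startup_params_json)
```

Because saving startup parameters restarts the kernel, it can be worth collecting all changes and applying them in one pass.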
③ Status bar
Item | Description |
Saving status | The saving status of notebook jobs. The system saves notebook jobs at an interval of 5 seconds. |
Kernel status | The status of the notebook kernel. The system refreshes the status of the notebook kernel at an interval of 5 seconds. |
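A kernel restart takes approximately 3 minutes, and the kernel status is refreshed every 5 seconds. If you script around a restart, a polling loop with the same interval is one way to wait for it. The following is a sketch under stated assumptions: this document does not describe an API for querying kernel status, so `kernel_is_ready` is a hypothetical callable that you supply, and the 300-second timeout is an arbitrary margin above the 3-minute estimate.

```python
import time


def wait_until_ready(kernel_is_ready, timeout_s=300.0, interval_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll kernel_is_ready() until it returns True or timeout_s elapses.

    kernel_is_ready is a hypothetical zero-argument callable supplied by the
    caller. clock and sleep are injectable so the loop can be tested without
    real waiting. Returns True if the kernel became ready, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if kernel_is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Polling faster than the 5-second refresh interval is unlikely to help, because the displayed status does not change more often than that.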
④ Paragraph
The following table describes the parameters in the Paragraph section.
No. | Description |
① | The ID of the handle that uniquely identifies a running statement. The ID is used to troubleshoot issues. |
② | The code editing box. Syntax keywords are automatically highlighted. Spark SQL and Python are supported. |
③ | The toolbar that allows you to switch languages, format code, run code, pause code execution, clear results, and delete paragraphs. |
④ | The result section of the current paragraph. Execution results are displayed in table format for Spark SQL and in text format for other languages. |
⑤ | The status bar of the current paragraph, which shows the running status, the execution duration, and the most recent update time. |
Error codes
Error code | Error message | Solution |
Console.NotebookNamingDuplicate | The notebook name already exists. | Specify another notebook name. |
Console.NotebookParagraphNotRunning | The notebook code is not run. | Run the notebook code. |
Console.NotebookParagraphMissingProgramCode | No program code is found in the notebook paragraph. | Write program code in the notebook paragraph. |
Console.NotebookKernelNotStartup | The notebook kernel is not started. | Start the notebook kernel. |
Spark.NotebookKernelStarting | The notebook kernel is starting. | Try again later. |
Spark.NotebookKernelBusy | The notebook kernel contains a large amount of code that is pending execution. | Try again later. |
Spark.NotebookKernelExpired | The notebook kernel is expired. | Restart the notebook kernel. |
Spark.NotebookKernelInvalidStatus | The notebook kernel is invalid. | Restart the notebook kernel. |
Spark.GetNotebookKernelFailed | Failed to start the notebook kernel. | Contact technical support. |
Spark.GetNotebookKernelStateFailed | Failed to query the status of the notebook kernel. | Contact technical support. |
Spark.ExecuteNotebookStatementFailed | Failed to run the notebook code. | Contact technical support. |
Spark.CancelNotebookStatementFailed | Failed to pause the execution of the notebook code. | Contact technical support. |
Spark.GetNotebookStatementResultFailed | Failed to query the notebook code results. | Contact technical support. |
Spark.CloseNotebookKernelFailed | Failed to shut down the notebook kernel. | Contact technical support. |
Console.NotebookNotFound | The created notebook cannot be found. | Contact technical support. |
Console.NotebookCreateFailed | Failed to create the notebook. | Contact technical support. |
Console.NotebookParagraphNotFound | The notebook paragraph cannot be found. | Contact technical support. |
Console.NotebookParagraphCreateFailed | Failed to create the notebook paragraph. | Contact technical support. |
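The table can also be read as a lookup from error code to suggested action. The following minimal sketch encodes the rows that state a solution explicitly. The helper names `suggested_action` and `is_transient` are hypothetical, and falling back to "Contact technical support." for every other code is an assumption.

```python
# Solutions copied from the rows of the error code table that state one.
SOLUTIONS = {
    "Console.NotebookNamingDuplicate": "Specify another notebook name.",
    "Console.NotebookParagraphNotRunning": "Run the notebook code.",
    "Console.NotebookParagraphMissingProgramCode": "Write program code in the notebook paragraph.",
    "Console.NotebookKernelNotStartup": "Start the notebook kernel.",
    "Spark.NotebookKernelStarting": "Try again later.",
    "Spark.NotebookKernelBusy": "Try again later.",
    "Spark.NotebookKernelExpired": "Restart the notebook kernel.",
    "Spark.NotebookKernelInvalidStatus": "Restart the notebook kernel.",
    "Spark.GetNotebookKernelFailed": "Contact technical support.",
}

# Assumption: any code without an explicit entry is escalated to support.
FALLBACK = "Contact technical support."


def suggested_action(error_code):
    """Return the suggested action for an error code."""
    return SOLUTIONS.get(error_code, FALLBACK)


def is_transient(error_code):
    """Return True for codes whose suggested action is to try again later."""
    return SOLUTIONS.get(error_code) == "Try again later."


print(suggested_action("Spark.NotebookKernelBusy"))
```

Only `Spark.NotebookKernelStarting` and `Spark.NotebookKernelBusy` are treated as transient here. The other codes call for a change on your side, a kernel restart, or support.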