This topic describes how to create and run Spark applications in the AnalyticDB for MySQL console.
Overview
You can use the Spark editor to create and run Spark batch or streaming applications.
You can view the driver logs and submission details of the current Spark application.
You can view the execution logs of SQL statements.
Prerequisites
An AnalyticDB for MySQL Data Lakehouse Edition cluster is created.
A job resource group is created for the AnalyticDB for MySQL Data Lakehouse Edition cluster. For more information, see Create a resource group.
A Resource Access Management (RAM) user is granted the required permissions. For more information, see the "Grant permissions to a RAM user" section of the Manage RAM users and permissions topic.
A database account is created for the AnalyticDB for MySQL cluster.
If you use an Alibaba Cloud account, you only need to create a privileged account. For more information, see the "Create a privileged account" section of the Create a database account topic.
If you use a Resource Access Management (RAM) user, you must create a privileged account and a standard account and associate the standard account with the RAM user. For more information, see Create a database account and Associate or disassociate a database account with or from a RAM user.
AnalyticDB for MySQL is authorized to assume the AliyunADBSparkProcessingDataRole role to access other cloud resources. For more information, see Perform authorization.
The log storage path of Spark applications is configured.
Note: Log on to the AnalyticDB for MySQL console. Find the cluster that you want to manage and click the cluster ID. In the left-side navigation pane, choose . Click Log Settings. In the dialog box that appears, select the default path or specify a custom storage path. The custom storage path cannot be the root directory of an OSS bucket and must contain at least one level of folders, for example, oss://your-bucket/spark-logs/ (a hypothetical path).
Create and run a Spark application
Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition tab, find the cluster that you want to manage and click the cluster ID.
In the left-side navigation pane, choose .
On the Spark JAR Development page, click the icon to the right of Applications.
In the Create Application panel, configure the following parameters.
Name: The name of the application or directory. File names are not case-sensitive.
Type: If you select Application, a file-type template is created. If you select Directory, a folder-type template is created.
Parent Level: The parent directory of the file or folder.
Job Type: The type of the job. Valid values:
Batch: batch application.
Streaming: streaming application.
SQL Engine: Spark distributed SQL engine.
Click OK.
After you create a Spark template, configure a Spark application in the Spark editor. For more information, see Overview.
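For reference, a batch application typically references a main program, such as a JAR or Python file, that you upload to OSS. The following is a minimal sketch of such a PySpark main file; the OSS bucket and paths are hypothetical placeholders, not values from this topic.

```python
# Minimal PySpark main file for a batch job (a sketch, not a complete solution).
# The OSS bucket and object paths below are hypothetical placeholders.
from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession.builder.appName("adb-spark-batch-example").getOrCreate()

    # Read source data from OSS (replace with your own path).
    orders = spark.read.csv("oss://your-bucket/input/orders.csv", header=True)

    # A simple aggregation as a stand-in for real business logic.
    counts = orders.groupBy("status").count()

    # Write the result back to OSS (replace with your own path).
    counts.write.mode("overwrite").parquet("oss://your-bucket/output/order_counts/")

    spark.stop()
```

After you upload the main file to OSS, reference its path in the application that you configure in the Spark editor.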
After you configure the Spark application, perform the following operations:
Click Save to save the Spark application. Then, you can reuse the application.
Click Run Now to run the Spark application. The status of the Spark application is displayed on the Applications tab in real time.
Note: Before you run a Spark application, you must select a job resource group and an application type.
View information about a Spark application
On the Applications tab, search for an application by application ID and perform the following operations to view information about the Spark application:
Click Logs in the Actions column to view the driver logs of the current Spark application or the execution log of SQL statements.
Click UI in the Actions column to go to the Spark UI of the application. The Spark UI URL is valid only for a limited period of time. After the URL expires, you must click UI again to obtain a new one.
Click Details in the Actions column to view submission details of the current application, such as the log path, web UI URL, cluster ID, and resource group name.
Choose More > Stop in the Actions column to stop the current application.
Choose More > History in the Actions column to view the history of retry attempts on the current application.
On the Execution History tab, view the history of retry attempts on all applications.
Note: By default, no retry is performed after an application fails. To perform retry attempts, configure the spark.adb.maxAttempts and spark.adb.attemptFailuresValidityInterval parameters. For more information, see Spark application configuration parameters.
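The following sketch shows how the two retry parameters might appear in the conf section of an application. It is written as a Python dictionary purely for illustration, and the values are arbitrary examples; see Spark application configuration parameters for the supported values.

```python
# Retry-related settings for a Spark application (illustrative values only).
retry_conf = {
    # Maximum number of attempts for the application; by default, no retry is performed.
    "spark.adb.maxAttempts": "3",
    # Time window within which failed attempts are counted (assumed example value).
    "spark.adb.attemptFailuresValidityInterval": "1h",
}
```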