This topic describes how to create Spark jobs, view the details of Spark jobs, and terminate Spark jobs.
Prerequisites
Lindorm Distributed Processing System (LDPS) is activated for the Lindorm instance. For more information, see Activate LDPS and modify the configurations.
A job is developed. For more information, see Create a job in Java or Create a job in Python. A minimal Python example is sketched after this list.
The developed job is uploaded to HDFS or Object Storage Service (OSS). For more information, see Upload files in the Lindorm console.
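The following is a minimal sketch of a Python (PySpark) job that could be uploaded. It is for illustration only and does not reflect the official job templates; the application name and sample data are placeholders.

from pyspark.sql import SparkSession

# Minimal illustrative PySpark job; replace the body with your own logic.
spark = SparkSession.builder.appName("ldps-example").getOrCreate()

# Sample computation on a small in-memory DataFrame (placeholder data).
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
print(df.count())

spark.stop()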
Create a Spark job
Log on to the Lindorm console.
In the upper-left corner of the page, select the region where the instance is deployed.
On the Instances page, click the ID of the instance that you want to manage or click Manage in the Actions column corresponding to the instance.
In the left-side navigation pane, click Compute Engine.
Click the Job Management tab.
Click Create Job.
In the Create Job dialog box, specify the job name and select the job type.
In the job template, configure the parameters that are described in the following table. Keep the default values of the other parameters.
Parameter
Description
token
The token that is used for authentication when you submit the Spark job to LDPS. To obtain the token in the Lindorm console, click the ID of the Lindorm instance. In the left-side navigation pane, click Database Connections, and then click the Compute Engine tab.
mainResource
The path in which the JAR package or Python file of the job is stored in OSS or HDFS.
mainClass
The entry-point class of your program. This parameter is required only when the job is submitted as a JAR package.
args
The arguments that are passed to the main class or script of the job.
configs
The Spark configuration parameters of the job. If the job is uploaded to OSS, you must configure the following parameters in configs (an illustrative sketch of a full configuration follows this table):
spark.hadoop.fs.oss.endpoint: The endpoint of the OSS bucket in which the Spark job is stored.
spark.hadoop.fs.oss.accessKeyId: The AccessKey ID used to access OSS. You can obtain the AccessKey ID in the console. For more information, see Obtain an AccessKey pair.
spark.hadoop.fs.oss.accessKeySecret: The AccessKey secret used to access OSS. You can obtain the AccessKey secret in the console. For more information, see Obtain an AccessKey pair.
spark.hadoop.fs.oss.impl: The class used to access OSS. Set the value to org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.
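The following sketch shows how the parameters in the preceding table fit together for a job whose files are stored in OSS. It is written as a Python dictionary for readability only and is not the exact format of the console job template; all values are placeholders.

# Illustrative only: the values below are placeholders, not real credentials or paths.
job_template = {
    "token": "<token from the Compute Engine tab of the Database Connections page>",
    "mainResource": "oss://<your-bucket>/path/to/job.py",  # or the JAR package of a Java job
    "mainClass": "<entry-point class, required only for JAR jobs>",
    "args": ["<arguments passed to the main class or script>"],
    "configs": {
        "spark.hadoop.fs.oss.endpoint": "<endpoint of the OSS bucket>",
        "spark.hadoop.fs.oss.accessKeyId": "<your AccessKey ID>",
        "spark.hadoop.fs.oss.accessKeySecret": "<your AccessKey secret>",
        "spark.hadoop.fs.oss.impl": "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem",
    },
}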
Click Save in the upper-right corner of the page.
Click Run in the upper-right corner of the page.
View the details of a Spark job
Log on to the Lindorm console.
In the upper-left corner of the page, select the region where the instance is deployed.
On the Instances page, click the ID of the instance that you want to manage or click Manage in the Actions column corresponding to the instance.
In the left-side navigation pane, click Compute Engine.
Click the Jobs tab to view the details of a Spark job.
Spark job information
Description
JobId
The ID of the Spark job.
AppName
The name of the Spark job.
WebUI Address
The address of the Spark web UI. Copy the address, paste it into the address bar of your browser, and then log on by using the username and password of LindormTable to view the details of the Spark job. For more information about what is displayed on the Spark web UI, see View the information about Spark jobs.
Note: You can obtain the default username and password of LindormTable on the Wide Table Engine tab of the Database Connections page.
Status
The status of the Spark job.
Details
The details of the Spark job status.
Actions
Operations that can be performed on the Spark job.
Terminate a Spark job
Only Spark jobs that are in the Starting or Running state can be terminated.
Terminating a Spark job does not affect the Lindorm instance.
Log on to the Lindorm console.
In the upper-left corner of the page, select the region where the instance is deployed.
On the Instances page, click the ID of the instance that you want to manage or click Manage in the Actions column corresponding to the instance.
In the left-side navigation pane, click Compute Engine.
Click the Jobs tab.
Find the Spark job that you want to terminate and then click Stop Job in the Actions column corresponding to the job.