
E-MapReduce:SPARK

Last Updated: Aug 14, 2023

SPARK nodes are used to run Spark applications. This topic describes the parameters that are involved when you create a SPARK node.

Parameters

Parameter

Description

Node Name

The name of the node. The node name must be unique within a workflow.

Run flag

The run flag of the node. Default value: Normal. Valid values:

  • Normal: The node is run when the workflow is run.

  • Prohibition execution: The node is not run when the workflow is run.

Description

The feature description of the node.

Task priority

The priority of the node in the workflow. Default value: MEDIUM. Valid values:

  • HIGHEST

  • HIGH

  • MEDIUM

  • LOW

  • LOWEST

Number of failed retries

The maximum number of times that the system automatically retries the node if the node fails to run.

Failed retry interval

The interval between two consecutive retries. Unit: minutes.

Delay execution time

The amount of time by which the execution of the node is delayed. Unit: minutes. The default value is 0, which indicates that the node is run immediately. If you specify a value greater than 0, the node is run only after the specified delay elapses.

Timeout alarm

Specifies whether to enable alerting on node execution timeout. By default, the Timeout alarm switch is turned off. You can turn on the Timeout alarm switch and select Timeout alarm, Timeout failure, or both as the Timeout strategy. If the execution of the node exceeds the timeout period, an alert message is sent to your mailbox, and the node fails.

Program Type

The type of the programming language. Valid values: JAVA, SCALA, PYTHON, SQL, and CUSTOM_SCRIPT.

Note

The parameters that are displayed vary based on the type of the programming language that you select. The parameters that are displayed in the console take precedence.

Main Class

The full path of the class that contains the main method for the Spark application.

Main Package

The JAR package that is used to run the Spark application. The JAR package is uploaded by using the resource management feature. For more information, see Resource management.

Deploy Mode

The deployment mode. Set this parameter to cluster.

Script

The script of the node.

  • If you set the Program Type parameter to SQL, you must enter SQL statements in the Script field.

  • If you set the Program Type parameter to CUSTOM_SCRIPT, you must enter the complete spark-submit or spark-sql command in the Script field.
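For example, if you set the Program Type parameter to CUSTOM_SCRIPT, the Script field could contain a complete command such as the following sketch. The class name, JAR file, and paths are hypothetical placeholders, not real resources:

```shell
# Hypothetical CUSTOM_SCRIPT content: a complete spark-submit command.
# com.example.WordCount and wordcount-1.0.jar are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.WordCount \
  wordcount-1.0.jar \
  /input/words.txt /output/counts
```

A complete spark-sql command, such as `spark-sql -f query.sql` with a hypothetical SQL file, is also accepted according to the description above.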

App Name

The name of the Spark application.

Driver Cores

The number of driver cores. Configure this parameter based on the actual production environment.

Driver Memory

The size of the driver memory. Configure this parameter based on the actual production environment.

Executor Number

The number of executors. Configure this parameter based on the actual production environment.

Executor Memory

The size of the executor memory. Configure this parameter based on the actual production environment.

Executor Cores

The number of executor cores. Configure this parameter based on the actual production environment.
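The Driver and Executor parameters above correspond to standard spark-submit resource options. As an illustrative sketch (the values are examples, not recommendations, and the class and JAR names are placeholders for the Main Class and Main Package), a node configured with 1 driver core, 512 MB of driver memory, 2 executors, 2 GB of executor memory, and 2 executor cores roughly maps to:

```shell
# Illustrative mapping of the node's resource parameters to spark-submit flags.
# com.example.Main and main-package.jar are placeholders.
spark-submit \
  --deploy-mode cluster \
  --driver-cores 1 \
  --driver-memory 512M \
  --num-executors 2 \
  --executor-memory 2G \
  --executor-cores 2 \
  --class com.example.Main \
  main-package.jar
```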

Main Arguments

The input parameters of the Spark application, which are passed to the main method. Custom variables that are defined in the input parameters are replaced when the node is run.

Option Parameters

The optional parameters of the node. Supported options: --jars, --files, --archives, and --conf.
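For illustration, the Option Parameters field could contain entries such as the following. The file names and configuration value are hypothetical:

```shell
# Hypothetical Option Parameters value; each option is a standard spark-submit flag.
--jars udfs.jar --files lookup.csv --conf spark.sql.shuffle.partitions=200
```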

Resources

The resource files that are required to run the node. If a resource file is referenced by another parameter, make sure that you have created or uploaded the file on the File Manage page of the Resources tab, and then select the file in the Resources field.

Custom Parameters

The custom parameters of the node. The custom parameters are used to replace ${Variable} in the script.
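The replacement behaves like plain text substitution. As a minimal sketch (the variable name bizdate and the SQL statement are made up for illustration), replacing ${bizdate} in a script with a custom parameter value could look like:

```shell
# Minimal sketch of ${Variable} replacement in a script (illustrative only).
script='SELECT * FROM orders WHERE dt = ${bizdate};'
bizdate='2023-08-14'
# Substitute the ${bizdate} placeholder with the custom parameter value.
echo "$script" | sed "s/\${bizdate}/$bizdate/"
# → SELECT * FROM orders WHERE dt = 2023-08-14;
```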

Pre tasks

The ancestor nodes of the current node in the workflow.