
E-MapReduce:SPARK

Last Updated: Aug 14, 2023

SPARK nodes are used to run Spark applications. This topic describes the parameters that are involved when you create a SPARK node.

Parameters

Parameter

Description

Node Name

The name of the node. The node name must be unique within a workflow.

Run flag

The run flag of the node. Default value: Normal. Valid values:

  • Normal: The node is run when the workflow is run.

  • Prohibition execution: The node is not run when the workflow is run.

Description

The feature description of the node.

Task priority

The priority of the node in the workflow. Default value: MEDIUM. Valid values:

  • HIGHEST

  • HIGH

  • MEDIUM

  • LOW

  • LOWEST

Number of failed retries

The maximum number of times that the system automatically retries the node if the node fails to run.

Failed retry interval

The interval between two consecutive retries. Unit: minutes.

Delay execution time

The amount of time by which the execution of the node is delayed. Unit: minutes. The default value is 0, which indicates that the node is run immediately. If you specify a value greater than 0, the node is run only after the specified delay elapses.

Timeout alarm

Specifies whether to enable alerting on node execution timeout. By default, the Timeout alarm switch is turned off. You can turn on the Timeout alarm switch and select Timeout alarm, Timeout failure, or both as the Timeout strategy. If the execution of the node exceeds the timeout period, an alert message is sent to your mailbox, and the node fails.

Program Type

The type of the programming language. Valid values: JAVA, SCALA, PYTHON, SQL, and CUSTOM_SCRIPT.

Note

The parameters that are displayed vary based on the type of the programming language that you select. The parameters that are displayed in the console take precedence.

Main Class

The full path of the class that contains the main method for the Spark application.

Main Package

The JAR package that is used to run the Spark application. The JAR package is uploaded by using the resource management feature. For more information, see Resource management.

Deploy Mode

The deployment mode. Set this parameter to cluster.

Script

The script of the node.

  • If you set the Program Type parameter to SQL, you must enter SQL statements in the Script field.

  • If you set the Program Type parameter to CUSTOM_SCRIPT, you must enter the complete spark-submit or spark-sql command in the Script field.
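For example, if you set the Program Type parameter to CUSTOM_SCRIPT, the Script field could contain a complete command such as the following sketch. The class name, JAR file, and paths are hypothetical placeholders, not real resources:

```shell
# Hypothetical CUSTOM_SCRIPT content: a complete spark-submit command.
# com.example.WordCount and wordcount-1.0.jar are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.WordCount \
  wordcount-1.0.jar \
  /input/words.txt /output/counts
```

A complete spark-sql command, such as `spark-sql -f query.sql` with a hypothetical SQL file, is also accepted according to the description above.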

App Name

The name of the Spark application.

Driver Cores

The number of driver cores. Configure this parameter based on the actual production environment.

Driver Memory

The size of the driver memory. Configure this parameter based on the actual production environment.

Executor Number

The number of executors. Configure this parameter based on the actual production environment.

Executor Memory

The size of the executor memory. Configure this parameter based on the actual production environment.

Executor Cores

The number of executor cores. Configure this parameter based on the actual production environment.
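The Driver and Executor parameters above correspond to standard spark-submit resource options. As an illustrative sketch (the values are examples, not recommendations, and the class and JAR names are placeholders for the Main Class and Main Package), a node configured with 1 driver core, 512 MB of driver memory, 2 executors, 2 GB of executor memory, and 2 executor cores roughly maps to:

```shell
# Illustrative mapping of the node's resource parameters to spark-submit flags.
# com.example.Main and main-package.jar are placeholders.
spark-submit \
  --deploy-mode cluster \
  --driver-cores 1 \
  --driver-memory 512M \
  --num-executors 2 \
  --executor-memory 2G \
  --executor-cores 2 \
  --class com.example.Main \
  main-package.jar
```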

Main Arguments

The input parameters of the Spark application, which are passed to the main method. Custom variables that are defined in the input parameters are replaced when the node is run.

Option Parameters

The optional parameters of the node. Supported options: --jars, --files, --archives, and --conf.
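For illustration, the Option Parameters field could contain entries such as the following. The file names and configuration value are hypothetical:

```shell
# Hypothetical Option Parameters value; each option is a standard spark-submit flag.
--jars udfs.jar --files lookup.csv --conf spark.sql.shuffle.partitions=200
```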

Resources

The resource files that are required to run the node. If a resource file is referenced by another parameter, make sure that you have created or uploaded the file on the File Manage page of the Resources tab, and then select the file in the Resources field.

Custom Parameters

The custom parameters of the node. The custom parameters are used to replace ${Variable} in the script.
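The replacement behaves like plain text substitution. As a minimal sketch (the variable name bizdate and the SQL statement are made up for illustration), replacing ${bizdate} in a script with a custom parameter value could look like:

```shell
# Minimal sketch of ${Variable} replacement in a script (illustrative only).
script='SELECT * FROM orders WHERE dt = ${bizdate};'
bizdate='2023-08-14'
# Substitute the ${bizdate} placeholder with the custom parameter value.
echo "$script" | sed "s/\${bizdate}/$bizdate/"
# → SELECT * FROM orders WHERE dt = 2023-08-14;
```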

Pre tasks

The ancestor nodes of the current node in the workflow.