E-MapReduce (EMR) Serverless Spark provides Application Defaults and SQL Compute Defaults to meet different requirements for job running and management. Application Defaults are suitable for batch jobs that require fixed runtime parameters and resources. SQL Compute Defaults are suitable for the development and fast iteration of SQL jobs. This topic describes how to use these configurations to simplify job submission and management.
Prerequisites
A workspace is created. For more information, see Manage workspaces.
Overview
| Default type | Description |
| --- | --- |
| Application Defaults | Application Defaults predefine a set of configurations for batch Spark jobs. The defaults contain all configurations that are required to run a job. If you develop jobs based on the same defaults, the configurations and running environments of the jobs are consistent when you submit the jobs. |
| SQL Compute Defaults | SQL Compute Defaults predefine a set of configurations for Spark interactive sessions, including the resource quotas and other settings of the interactive environment. This allows you to execute code snippets in a persistent Spark environment. SQL Compute Defaults are suitable for scenarios that require real-time interaction or frequent iteration, such as data analysis, development, and testing. In a persistent session environment, you can flexibly submit jobs, view results, and dynamically modify parameters and resource configurations. **Important:** If you want to modify the configurations of an SQL compute during development, you must go to the Compute page. For more information, see Manage SQL computes. |
Template parameters
In the left-side navigation pane of the EMR Serverless Spark page, click Defaults to view or modify the template parameters.
| Parameter | Description |
| --- | --- |
| Engine Version | The version of the engine that is used by the compute. For more information about engine versions, see Engine versions. |
| spark.driver.cores | The number of CPU cores that are used by the driver of the Spark application. |
| spark.driver.memory | The size of memory that is available to the driver of the Spark application. |
| spark.executor.cores | The number of CPU cores that can be used by each executor of the Spark application. |
| spark.executor.memory | The size of memory that is available to each executor of the Spark application. |
| spark.executor.instances | The number of executors that are allocated to the Spark application. |
| Dynamic Resource Allocation | Specifies whether to dynamically adjust the number of executors based on the workload. By default, this feature is disabled. After you enable this feature, you must configure the additional parameters that appear. |
| More Memory Configurations | Click to show additional memory-related configurations. |
| Spark Configuration | The custom Spark configurations. Separate multiple configurations with spaces. |
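As an illustration of the Spark Configuration field, the entries below use standard open source Spark property names. This is a sketch only: which properties you set depends on your job, and the exact input format accepted by the field is an assumption based on the description above (each entry is a property name followed by its value, with entries separated by spaces).

```
spark.sql.shuffle.partitions 200
spark.dynamicAllocation.enabled true
spark.serializer org.apache.spark.serializer.KryoSerializer
```

Properties set here are applied to the Spark application at submission time, in addition to the resource parameters configured above.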