E-MapReduce: Manage templates

Last Updated: Jul 16, 2024

E-MapReduce (EMR) Serverless Spark provides application defaults and SQL compute defaults to meet different requirements for running and managing jobs. Application defaults are suitable for batch processing jobs that require fixed runtime parameters and resources. SQL compute defaults are suitable for the development and fast iteration of SQL jobs. This topic describes how to use these configurations to simplify job submission and management.

Prerequisites

A workspace is created. For more information, see Manage workspaces.

Overview

Application Defaults

Application Defaults predefine a set of configurations for Spark jobs. A job template contains all the configurations that are required to run a specific job. If you develop jobs from a job template, you can ensure that the configurations and running environments of the jobs are consistent when you submit them.

SQL Compute Defaults

SQL Compute Defaults predefine a set of configurations for interactive Spark sessions. An SQL compute template specifies the resource quotas and other settings of the interactive environment, which allows you to execute code snippets in a persistent Spark session.

SQL compute templates are suitable for scenarios that require real-time interaction or frequent iteration, such as data analysis, development, and testing. They allow you to flexibly submit jobs, view results, and dynamically modify parameters and resource configurations in a persistent session.

Important

If you want to modify the configurations of an SQL compute during development, you must go to the Compute page. For more information, see Manage SQL computes.
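
The following PySpark sketch illustrates the two usage patterns in plain open-source Spark terms rather than the EMR Serverless Spark console workflow: a batch job that runs with a fixed, predefined set of configurations, and a persistent session in which SQL snippets are executed and refined iteratively. The OSS paths, the sales table, and all configuration values are illustrative placeholders.

```python
from pyspark.sql import SparkSession

# Batch pattern (application defaults): the job is submitted with a fixed,
# predefined set of configurations and runs to completion.
batch_spark = (
    SparkSession.builder
    .appName("daily-batch-job")
    .config("spark.executor.instances", "4")   # illustrative values
    .config("spark.executor.memory", "7g")
    .getOrCreate()
)
(
    batch_spark.read.parquet("oss://my-bucket/input/")   # placeholder path
    .groupBy("category")
    .count()
    .write.mode("overwrite")
    .parquet("oss://my-bucket/output/")                  # placeholder path
)
batch_spark.stop()

# Interactive pattern (SQL compute defaults): a long-lived session in which
# SQL snippets are executed one after another against the same resources.
session = SparkSession.builder.appName("adhoc-analysis").getOrCreate()
session.sql("SELECT category, COUNT(*) AS cnt FROM sales GROUP BY category").show()      # 'sales' is a placeholder table
session.sql("SELECT category, SUM(amount) AS total FROM sales GROUP BY category").show()
# The session stays alive between statements, so each query reuses the same environment.
```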

Template parameters

In the left-side navigation pane of the EMR Serverless Spark page, click Defaults to view or modify the template parameters.

Engine Version

The version of the engine that is used by the compute. For more information about engine versions, see Engine versions.

spark.driver.cores

The number of CPU cores that are used by the driver of the Spark application.

spark.driver.memory

The size of memory that is available to the driver of the Spark application.

spark.executor.cores

The number of CPU cores that can be used by each executor of the Spark application.

spark.executor.memory

The size of memory that is available to each executor of the Spark application.

spark.executor.instances

The number of executors that are allocated to the Spark application.

Dynamic Resource Allocation

By default, this feature is disabled. After you enable this feature, you must configure the following parameters:

  • Min executors: the minimum number of executors when dynamic allocation is enabled. Default value: 2.

  • Max executors: the maximum number of executors when dynamic allocation is enabled. If you do not configure spark.executor.instances, the default value is 10.

More Memory Configurations

  • spark.driver.memoryOverhead: the size of non-heap memory that is available to each driver. Default value: 1 GB.

  • spark.executor.memoryOverhead: the size of non-heap memory that is available to each executor. Default value: 1 GB.

  • spark.memory.offHeap.size: the size of off-heap memory that is available to the Spark application. Default value: 1 GB.

    This parameter takes effect only if spark.memory.offHeap.enabled is set to true. By default, off-heap memory is enabled and set to 1 GB only if the Fusion engine is used.

Spark Configuration

The Spark configurations. Separate a configuration key and its value with a space, such as spark.sql.catalog.paimon.metastore dlf.
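
For reference, most of the parameters above correspond to standard open-source Spark properties. The following is a minimal PySpark sketch of an equivalent configuration. It is not the EMR Serverless Spark API, the values are illustrative rather than the template defaults, and the mapping of Min executors and Max executors to spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors is an assumption based on open-source Spark naming.

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Illustrative values only; in EMR Serverless Spark these come from the template.
conf = SparkConf()
conf.set("spark.driver.cores", "1")
conf.set("spark.driver.memory", "3g")
conf.set("spark.executor.cores", "1")
conf.set("spark.executor.memory", "7g")
conf.set("spark.executor.instances", "2")

# Dynamic Resource Allocation: Min executors / Max executors presumably map to
# the standard dynamic allocation properties.
conf.set("spark.dynamicAllocation.enabled", "true")
conf.set("spark.dynamicAllocation.minExecutors", "2")
conf.set("spark.dynamicAllocation.maxExecutors", "10")

# More Memory Configurations.
conf.set("spark.driver.memoryOverhead", "1g")
conf.set("spark.executor.memoryOverhead", "1g")
conf.set("spark.memory.offHeap.enabled", "true")  # required for offHeap.size to take effect
conf.set("spark.memory.offHeap.size", "1g")

# Spark Configuration: each entry is a key and a value separated by a space,
# for example "spark.sql.catalog.paimon.metastore dlf".
extra = "spark.sql.catalog.paimon.metastore dlf"
key, value = extra.split(" ", 1)
conf.set(key, value)

spark = SparkSession.builder.config(conf=conf).getOrCreate()
```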