E-MapReduce:Terms

Last Updated: Oct 28, 2024

This topic describes the terms of E-MapReduce (EMR) Serverless Spark to help you better understand the service.

Term

Description

workspace

Workspaces are the basic unit for business development. A workspace is a collection of jobs, computing resources, and permissions that are isolated from those in other workspaces.

resource queue

EMR Serverless Spark uses compute units (CUs) as the basic unit to measure computing resources. For more information about CUs, see Billing.

You can allocate one or more CUs to each Spark compute node, whether it is a driver or an executor, based on the vCore and memory configuration of the node. EMR Serverless Spark provides each compute node with 20 GiB to 160 GiB of local storage space. The number of CUs that a job consumes depends on the computational complexity of the job and the distribution of its data. You can view the number of CUs consumed by each job run in the job list.
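The relationship between a node's vCore and memory configuration and its CU count can be sketched as follows. This is a minimal illustration, not the service's billing formula: the per-CU ratios (`ASSUMED_VCORES_PER_CU`, `ASSUMED_MEMORY_GIB_PER_CU`), the rounding rule, and the function names are all assumptions made for this example. See Billing for the actual definition of a CU.

```python
import math

# ASSUMPTION: illustrative placeholder ratios, not values taken from this topic.
# Check the Billing topic for how a CU is actually defined.
ASSUMED_VCORES_PER_CU = 1
ASSUMED_MEMORY_GIB_PER_CU = 4


def cus_for_node(vcores, memory_gib):
    """Estimate the CUs needed to cover a node's vCores and memory (hypothetical rule)."""
    by_cpu = math.ceil(vcores / ASSUMED_VCORES_PER_CU)
    by_memory = math.ceil(memory_gib / ASSUMED_MEMORY_GIB_PER_CU)
    # A node gets at least one CU; the larger of the two requirements wins.
    return max(1, by_cpu, by_memory)


def cus_for_job(driver, executor, executor_count):
    """Sum the estimate over one driver and a number of identical executors."""
    return cus_for_node(*driver) + executor_count * cus_for_node(*executor)


# Hypothetical job: a 2 vCore / 8 GiB driver and three 4 vCore / 16 GiB executors.
total = cus_for_job(driver=(2, 8), executor=(4, 16), executor_count=3)
print(total)
```

This estimates only the configured footprint of a job. The CUs a job run actually consumes also depend on the job's complexity and data distribution, so the job list remains the place to read the real figure.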

session resource

Session resources are Spark sessions that are available in an EMR Serverless Spark workspace. A session can be deployed in a queue to provide the basic resources that are required to run SQL statements and the notebook environment. In a session, you can change the associated engine version and queue and modify Spark parameters based on your business requirements.

publish

To keep in-progress changes to a draft file from affecting job scheduling, you must publish the draft file after you finish modifying it. Publishing draft files helps isolate the development and production environments.

job run

In the job orchestration system, a job run ID is generated each time a workflow runs.

workflow

A workflow is an orderly process that consists of a series of jobs. Jobs in a workflow depend on each other and are run in a specific order.
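The idea that jobs in a workflow depend on each other and run in a specific order can be illustrated with a topological sort. This is a minimal sketch using Python's standard `graphlib` module (Python 3.9 or later). The job names and the `execution_order` helper are hypothetical, and the sketch does not represent how EMR Serverless Spark schedules workflows internally.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate"},
}


def execution_order(dependencies):
    """Return a run order in which every job comes after the jobs it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(workflow)
print(order)
```

If the dependencies form a cycle, `TopologicalSorter` raises `graphlib.CycleError`, which matches the requirement that a workflow be an orderly process.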

user

User is a term that is used in access control. You can add a RAM user as a member of a workspace and then grant the RAM user the required permissions to manage jobs and resources in the workspace.

role

Role is a term that is used in access control. One user can assume multiple roles. Multiple users can assume the same role. After you grant permissions to a role, all users who assume this role have the same permissions.
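The many-to-many relationship between users and roles described above can be modeled in a few lines. This is a minimal sketch of the concept only: the `AccessControl` class, the role names, and the permission strings are hypothetical and do not correspond to the RAM or EMR Serverless Spark APIs.

```python
from collections import defaultdict


class AccessControl:
    """Toy model: users assume roles, and permissions are granted to roles."""

    def __init__(self):
        self.role_permissions = defaultdict(set)  # role -> permissions
        self.user_roles = defaultdict(set)        # user -> roles

    def grant(self, role, permission):
        """Grant a permission to a role."""
        self.role_permissions[role].add(permission)

    def assign(self, user, role):
        """Let a user assume a role."""
        self.user_roles[user].add(role)

    def permissions_of(self, user):
        """Return the union of permissions over all roles the user assumes."""
        permissions = set()
        for role in self.user_roles.get(user, ()):
            permissions |= self.role_permissions.get(role, set())
        return permissions


# Hypothetical roles, permissions, and users.
acl = AccessControl()
acl.grant("developer", "job:edit")
acl.grant("operator", "job:run")
acl.assign("alice", "developer")
acl.assign("alice", "operator")  # one user, multiple roles
acl.assign("bob", "operator")    # multiple users, same role

print(sorted(acl.permissions_of("alice")))
print(sorted(acl.permissions_of("bob")))
```

Because permissions attach to roles rather than to users, granting a new permission to `operator` in this sketch would take effect for both `alice` and `bob` at once.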