A workspace is the basic unit of E-MapReduce (EMR) Serverless Spark. You can manage jobs, members, roles, and permissions based on workspaces. Jobs must be developed in a workspace. Therefore, you must create a workspace before you develop jobs. This topic describes how to create a workspace on the EMR Serverless Spark page.
Prerequisites
An Alibaba Cloud account is created and real-name verification is complete for the account.
The account that you want to use to create a workspace is prepared and the required permissions are granted to the account.
If you want to create a workspace by using an Alibaba Cloud account, prepare an Alibaba Cloud account and assign roles to the Alibaba Cloud account. For more information, see Assign roles to an Alibaba Cloud account.
If you want to create a workspace as a RAM user or a RAM role, prepare the RAM user or RAM role and attach the AliyunEMRServerlessSparkFullAccess, AliyunOSSFullAccess, and AliyunDLFFullAccess policies to it. Then, add the RAM user or RAM role on the Access Control page and assign the administrator role to it. For more information, see Grant permissions to a RAM user and Manage users and roles. A sketch that attaches these policies programmatically is provided after this list.
Data Lake Formation (DLF) is activated. For more information, see Getting Started. For information about the regions in which DLF is supported, see Supported regions and endpoints.
Object Storage Service (OSS) is activated and a bucket is created. For more information, see Activate OSS and Create a bucket.
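If you prefer to grant the policies from the prerequisites programmatically rather than in the console, the following Python sketch shows one way to do it with the RAM SDK. It is a minimal sketch, assuming the aliyun-python-sdk-core and aliyun-python-sdk-ram packages; the credentials and the RAM user name spark-dev are placeholders.
```python
# A minimal sketch, assuming the aliyun-python-sdk-core and
# aliyun-python-sdk-ram packages. Credentials and the RAM user
# name "spark-dev" are placeholders.
from aliyunsdkcore.client import AcsClient
from aliyunsdkram.request.v20150501.AttachPolicyToUserRequest import (
    AttachPolicyToUserRequest,
)

client = AcsClient("<access_key_id>", "<access_key_secret>", "cn-hangzhou")

# The three system policies listed in the prerequisites.
for policy_name in [
    "AliyunEMRServerlessSparkFullAccess",
    "AliyunOSSFullAccess",
    "AliyunDLFFullAccess",
]:
    request = AttachPolicyToUserRequest()
    request.set_PolicyType("System")   # system-defined policy
    request.set_PolicyName(policy_name)
    request.set_UserName("spark-dev")  # placeholder RAM user
    client.do_action_with_exception(request)
```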
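Likewise, if the OSS bucket does not exist yet, you can create one with the oss2 SDK. This is a sketch only: the endpoint, bucket name, and credentials are placeholders, and it creates a standard bucket; enabling OSS-HDFS requires additional steps that are not shown here.
```python
# A minimal sketch using the oss2 package. The endpoint, bucket
# name, and credentials are placeholders.
import oss2

auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "emr-spark-demo-bucket")

# Creates a standard bucket with a private ACL. Enabling OSS-HDFS
# is a separate step that this sketch does not cover.
bucket.create_bucket(oss2.BUCKET_ACL_PRIVATE)
```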
Precautions
The runtime environment of the code is managed and configured by the owner of the environment.
Procedure
Go to the Spark page.
Log on to the EMR console.
In the left-side navigation pane, choose EMR Serverless > Spark.
In the top navigation bar, select a region based on your business requirements.
Important: After you create a workspace, you cannot change the region of the workspace.
Click Create Workspace.
On the E-MapReduce Serverless Spark page, configure the parameters that are described in the following list.
Region
The region in which the workspace resides. We recommend that you select the region where your data is stored.
Example: China (Hangzhou)
Billing Method
The billing method of the workspace. Only the pay-as-you-go billing method is supported.
Example: Pay-as-you-go
Workspace Name
The name of the workspace. The name must be 1 to 60 characters in length, can contain only letters, digits, and hyphens (-), and must start with a letter. A validation sketch for this rule is provided after the parameter list.
Note: Workspace names within the same Alibaba Cloud account must be unique. If you enter the name of an existing workspace, the system prompts you to enter a different name.
Example: emr-serverless-spark
Maximum Quota
The maximum number of compute units (CUs) that can be concurrently used to process jobs in the workspace.
Example: 1000
Workspace Directory
The path of the OSS bucket that is used to store data files, such as job logs, running events, and resources.
We recommend that you select a bucket for which OSS-HDFS is enabled to ensure compatibility with native Hadoop Distributed File System (HDFS) interfaces. If HDFS is not involved in your business scenario, you can select a standard OSS bucket. A sketch that checks whether a bucket is accessible is provided after the parameter list.
Example: emr-oss-hdfs
DLF for Metadata Storage
Specifies whether to use DLF to store and manage the metadata of the workspace.
If you turn on the switch, the DLF catalog whose name is the same as the UID of your Alibaba Cloud account is selected by default. If you want different workspaces to be associated with different DLF catalogs, perform the following operations to create a DLF catalog:
Click Create catalog. In the dialog box that appears, configure the Catalog ID parameter and click OK.
Select the DLF catalog that you created from the drop-down list.
Example: emr-dlf
Advanced Settings
You can configure the Execution Role parameter in this section. The execution role is the role that EMR Serverless Spark assumes to run jobs; select AliyunEMRSparkJobRunDefaultRole. EMR Serverless Spark uses this role to access your resources in other cloud services, such as OSS and DLF. A sketch that verifies that the role exists is provided after the parameter list.
Example: AliyunEMRSparkJobRunDefaultRole
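The naming rule for Workspace Name maps to a simple pattern. The following Python sketch is a client-side illustration of that rule, not an API of EMR Serverless Spark; it assumes that both uppercase and lowercase letters are allowed.
```python
import re

# 1 to 60 characters, letters, digits, and hyphens only,
# starting with a letter (both cases assumed to be allowed).
WORKSPACE_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,59}")

def is_valid_workspace_name(name: str) -> bool:
    return WORKSPACE_NAME_PATTERN.fullmatch(name) is not None

assert is_valid_workspace_name("emr-serverless-spark")
assert not is_valid_workspace_name("1-starts-with-a-digit")
assert not is_valid_workspace_name("a" * 61)  # too long
```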
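Before you select a Workspace Directory, you can confirm that the target bucket exists and note its region. A minimal sketch with the oss2 package; the endpoint, bucket name, and credentials are placeholders.
```python
# A minimal sketch using the oss2 package; placeholders as above.
import oss2

auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "emr-oss-hdfs")

# get_bucket_info raises oss2.exceptions.NoSuchBucket if the
# bucket does not exist.
info = bucket.get_bucket_info()
print(info.name, info.location)  # for example, the bucket name and "oss-cn-hangzhou"
```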
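To verify that the execution role is available in your account before you run jobs, you can query it with the RAM GetRole operation. The sketch below makes the same SDK and credential assumptions as the earlier RAM example.
```python
import json

from aliyunsdkcore.client import AcsClient
from aliyunsdkram.request.v20150501.GetRoleRequest import GetRoleRequest

client = AcsClient("<access_key_id>", "<access_key_secret>", "cn-hangzhou")

request = GetRoleRequest()
request.set_RoleName("AliyunEMRSparkJobRunDefaultRole")

# Raises a ServerException (EntityNotExist.Role) if the role is absent.
response = json.loads(client.do_action_with_exception(request))
print(response["Role"]["Arn"])
```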
Click Create Workspace.
References
After you create a workspace, you can develop jobs, such as Spark SQL jobs, in the workspace. For more information, see Get started with SQL jobs.