A gateway cluster provides load balancing and security isolation. You can also use a gateway cluster to submit jobs to an E-MapReduce (EMR) cluster. This topic describes how to create a gateway cluster in EMR.
Prerequisites
You must have an existing Hadoop or Kafka cluster in EMR. For more information, see Create a cluster.
You can create Hadoop or Kafka clusters only if your Alibaba Cloud account created a cluster of that type before 17:00 (UTC+8) on December 19, 2022. If your account did not create these cluster types before that time, you cannot create them.
Limitations
The method described in this topic applies only to Hadoop and Kafka clusters. For information about gateway environment deployment for DataLake, OLAP, DataFlow, and Custom clusters, see Gateway deployment modes and selection guide.
Procedure
Log on to the EMR console.
On the EMR on ECS page, click the name of the target cluster.
In the upper-right corner of the Basic Information page, choose All Operations > Create Gateway.
On the Create Gateway page, configure the parameters.
Section
Parameter
Description
Associated Settings
Region
The physical location of the gateway cluster.
Resource Group
Select the resource group for the gateway cluster.
To create a new resource group, click Create Resource Group. For more information, see Create a resource group.
Associated Cluster
The compute cluster to associate with the gateway cluster, filtered by the selected region. The cluster to be associated must meet the following requirements:
The cluster status must be Running.
Only Hadoop or Kafka clusters can be associated.
Note: After you select an associated cluster, the VPC of the gateway cluster defaults to the VPC of the associated cluster. You can associate clusters created in both the new and old versions of the console.
Basic Settings
Billing Method
Subscription: You pay before you use the service.
Pay-as-you-go: You pay after you use the service and are charged hourly based on actual usage. This method is suitable for short-term tests or flexible, dynamic tasks.
Zone
The zone where the associated cluster is located.
vSwitch
Select the vSwitch in the corresponding VPC and zone.
Default Security Group
The security group of the associated cluster.
Assign Public Network IP
Specifies whether to attach an Elastic IP Address (EIP) to the gateway.
Node Group
Instance Type: The ECS instance types available in the selected region. For more information, see Instance families.
System Disk: The type of system disk for the gateway nodes. The available types are ultra disk, enterprise SSD (ESSD), and standard SSD. The displayed disk types vary based on the instance type and region. By default, the system disk is released when the cluster is released.
Adjust the system disk size as needed. The value range is 60 GiB to 500 GiB.
Data Disk: The type of data disk for the gateway nodes. The available types are ultra disk, ESSD, and standard SSD. The displayed disk types vary based on the instance type and region. By default, the data disk is released when the cluster is released.
Adjust the data disk size as needed. The value range is 40 GiB to 32,768 GiB.
Instances: The number of instances. The default value is 1. Adjust the number as needed.
Cluster Name
The name of the gateway cluster. The name must be 1 to 64 characters long and can contain Chinese characters, letters, digits, hyphens (-), and underscores (_).
Identity Credentials
The credentials used to log on to all nodes in the gateway cluster.
Password: Enter the password to log on to the gateway.
The length must be 8 to 30 characters.
It must contain uppercase letters and lowercase letters.
It must contain digits and special characters. The supported special characters are: !@#$%^&*
Key Pair: Select the key pair that is used to log on to the gateway. If you have not created a key pair, click Create Key Pair to go to the ECS console and create one.
Keep the private key file (.pem file) secure. After the gateway is created, the public key of the key pair is automatically attached to the ECS instance of the gateway. When you log on to the gateway using Secure Shell (SSH), you must provide the private key from the private key file.
Advanced Settings
ECS Application Role
The Resource Access Management (RAM) role that grants applications running on the cluster the permissions required to call other Alibaba Cloud services. You can keep the default value, AliyunECSInstanceForEMRRole.
Bootstrap Actions
This is an optional parameter. You can run custom scripts before the cluster starts. For more information, see Run scripts using bootstrap actions and Manually run scripts.
Tags
This is an optional parameter. You can attach tags when you create the cluster or after the cluster is created. For more information, see Manage tags.
Data Disk Encryption
This is an optional parameter. You can enable this feature only when you create the cluster. For more information, see Enable data disk encryption.
After you configure the parameters, click Create and Pay.
After the cluster is created, its Status changes from Creating to Running.
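The naming, password, and disk-size rules in the parameter table can be checked before you fill in the Create Gateway page. The following Python sketch is illustrative only: the function and constant names are hypothetical and are not part of any EMR API or SDK. It assumes that "Chinese characters" means the basic CJK Unified Ideographs block, and that a password may contain only letters, digits, and the listed special characters. The console remains the authority on what is accepted.

```python
import re

# Limits copied from the parameter table; all names here are hypothetical.
SYSTEM_DISK_GIB = (60, 500)
DATA_DISK_GIB = (40, 32768)


def valid_cluster_name(name):
    """1 to 64 characters: Chinese characters, letters, digits, hyphens, underscores.

    Assumption: "Chinese characters" is approximated by U+4E00 to U+9FFF.
    """
    return re.fullmatch(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}", name) is not None


def valid_password(password):
    """8 to 30 characters with uppercase, lowercase, digit, and special character.

    Assumption: no characters other than letters, digits, and !@#$%^&* are allowed.
    """
    if not 8 <= len(password) <= 30:
        return False
    if re.fullmatch(r"[A-Za-z0-9!@#$%^&*]+", password) is None:
        return False
    return (
        any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in "!@#$%^&*" for c in password)
    )


def valid_disk_size(size_gib, limits):
    """Return True if size_gib lies within the inclusive (low, high) range."""
    low, high = limits
    return low <= size_gib <= high
```

For example, valid_disk_size(80, SYSTEM_DISK_GIB) checks a planned 80 GiB system disk against the 60 GiB to 500 GiB range, and valid_password can be run on a candidate password before it is entered in the Identity Credentials section.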