E-MapReduce: Gateway nodes

Last Updated: Dec 03, 2024

Gateway nodes in Alibaba Cloud E-MapReduce (EMR) are associated with an existing EMR cluster and serve as separate job submission points. This topic describes gateway clusters and gateway node groups and points to the references for creating either one based on an existing EMR cluster.

A gateway cluster or a gateway node group is an independent cluster or node group that consists of multiple gateway nodes with the same configurations. Clients for components such as Hadoop Distributed File System (HDFS), YARN, Hive, Spark 2, Spark 3, JindoSDK, Flink, Sqoop, Impala, Presto, Hudi, Iceberg, Tez, and Delta Lake are deployed on the gateway nodes.

If no gateway cluster or gateway node group is created, jobs of an EMR cluster, such as a Hadoop cluster, are submitted from the master node or a core node of that cluster, which consumes the resources of the cluster. After a gateway cluster is created, you can submit jobs for the associated cluster from the gateway cluster instead. Job submission then does not occupy the resources of the associated cluster, which improves the stability of its core nodes and especially its master node.
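To make the idea of a separate submission point concrete, the following is a minimal Python sketch of assembling a `spark-submit` command that would be run on a gateway node instead of on the master node. It assumes the Spark client on the gateway node is already configured for the associated cluster and that `spark-submit` is on the `PATH`. The helper name `build_spark_submit`, the script path, and the input path are hypothetical and serve only as illustration.

```python
import shlex


def build_spark_submit(app_path, app_args=(), deploy_mode="client", queue=None):
    """Assemble a spark-submit argument list that targets YARN.

    Hypothetical helper for illustration; it only builds the command and
    does not run it.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    # In client mode the Spark driver runs on the node where spark-submit
    # is invoked, which here would be the gateway node.
    cmd = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    if queue:
        cmd += ["--queue", queue]
    cmd.append(app_path)
    cmd.extend(app_args)
    return cmd


# Hypothetical application script and input path.
cmd = build_spark_submit("/home/hadoop/jobs/wordcount.py", ["hdfs:///tmp/input"])
print(shlex.join(cmd))
```

On a gateway node, the resulting list could be passed to `subprocess.run(cmd, check=True)` to start the job. Building the command as a list rather than a single string avoids shell quoting problems when arguments contain spaces.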

Each gateway cluster or gateway node group can have an independent configuration environment. For example, if one EMR cluster is shared by multiple departments, you can create a separate gateway cluster or gateway node group for each department to meet its business requirements. Whether you create a gateway cluster or a gateway node group depends on the cluster type and version. For more information about how to create a gateway cluster and a gateway node group, see the following references.
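Before the reference list, the dependency on cluster type and version can be written down as a small decision rule. The following Python sketch is only an illustration of that rule for the Hadoop, DataLake, and Dataflow cluster types. The function names are hypothetical, and comparing versions as plain number tuples is an assumption about how "earlier" and "later" minor versions are ordered.

```python
def parse_version(text):
    """Turn 'EMR-5.10.1', 'EMR V5.10.1', or '5.10.1' into (5, 10, 1)."""
    digits = text.upper().replace("EMR", "").lstrip(" -V")
    return tuple(int(part) for part in digits.split("."))


def gateway_option(cluster_type, emr_version=None):
    """Return the gateway deployment option for a cluster (hypothetical helper).

    Returns None for cluster types that this sketch does not cover.
    """
    kind = cluster_type.strip().lower()
    if kind == "hadoop":
        return "gateway cluster"
    if kind in ("datalake", "dataflow"):
        if emr_version is None:
            raise ValueError("emr_version is required for DataLake and Dataflow")
        # Assumption: versions are ordered by comparing their numeric parts.
        if parse_version(emr_version) >= (5, 10, 1):
            return "gateway node group"
        return "EMR-CLI"
    return None


print(gateway_option("Hadoop"))
print(gateway_option("DataLake", "EMR-5.10.1"))
print(gateway_option("Dataflow", "EMR-5.9.0"))
```

Versions are compared as tuples of integers rather than as strings because a string comparison would wrongly rank 5.9.0 after 5.10.1.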

| Cluster type | References |
| --- | --- |
| Hadoop | Create a gateway cluster. |
| DataLake and Dataflow | EMR V5.10.1 or a later minor version: You can create a gateway node group. For more information, see Manage node groups. |
| DataLake and Dataflow | A minor version earlier than EMR V5.10.1: Use EMR-CLI to deploy a gateway. |

OLAP