HBase is a real-time database that provides high write performance in the Hadoop ecosystem. OSS-HDFS (JindoFS service) is a storage service released by Alibaba Cloud that is compatible with the HDFS API. E-MapReduce (EMR) allows HBase to use OSS-HDFS as the underlying storage, including for write-ahead log (WAL) files. This decouples storage from computing.
Prerequisites
A cluster of EMR V3.42.0 or later, or EMR V5.8.0 or later is created. For more information, see Create a cluster.
OSS-HDFS is enabled for a bucket and permissions are granted to access OSS-HDFS. For more information about how to enable OSS-HDFS, see Enable OSS-HDFS and grant access permissions.
Procedure
Connect to the EMR cluster.
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
Click the EMR cluster that you created.
Click the Nodes tab, and then click the plus icon on the left side of the node group.
Click the ID of the ECS instance. On the Instances page, click Connect next to the instance ID.
For more information about how to log on to a cluster in Windows or Linux by using an SSH key pair or SSH password, see Log on to a cluster.
Specify a storage path for HBase.
To specify a path in OSS as the storage path for data and WAL files in HBase, you must change the value of the hbase.rootdir parameter in the hbase-site configuration file to the OSS path. The path is in the oss://bucket.endpoint/hbase-root-dir format.
Important To release clusters, you must disable tables and make sure that all update operations performed on WAL files are synchronized to the HFiles.
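The path format described above can be assembled and checked before it is entered as the hbase.rootdir value. The following Python snippet is a minimal sketch, not an official tool: the helper names build_hbase_rootdir and render_property are hypothetical, the bucket name and endpoint in the usage section are illustrative placeholders, and the bucket-name check assumes the usual OSS naming rules (3 to 63 lowercase letters, digits, or hyphens, starting and ending with a letter or digit). The rendered property block uses the standard Hadoop-style XML layout; on EMR the value is typically changed on the hbase-site configuration page instead.

```python
import re
from xml.sax.saxutils import escape


def build_hbase_rootdir(bucket: str, endpoint: str, root_dir: str) -> str:
    """Build a path in the oss://bucket.endpoint/hbase-root-dir format."""
    # Assumed OSS bucket naming rule: 3-63 chars, lowercase letters,
    # digits, or hyphens, starting and ending with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]", bucket):
        raise ValueError(f"invalid bucket name: {bucket!r}")
    # The endpoint is a host name, so it must not be empty or contain a slash.
    if not endpoint or "/" in endpoint:
        raise ValueError(f"invalid endpoint: {endpoint!r}")
    # Normalize the directory so the result has exactly one separator.
    root_dir = root_dir.strip("/")
    if not root_dir:
        raise ValueError("root directory must not be empty")
    return f"oss://{bucket}.{endpoint}/{root_dir}"


def render_property(name: str, value: str) -> str:
    """Render one Hadoop-style <property> element for hbase-site."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Placeholder values; replace them with your own bucket and endpoint.
    rootdir = build_hbase_rootdir(
        "examplebucket",
        "cn-hangzhou.oss-dls.aliyuncs.com",
        "hbase-root-dir",
    )
    print(render_property("hbase.rootdir", rootdir))
```

Validating the pieces separately makes a malformed path fail early, before HBase is restarted with a root directory that it cannot reach.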