To perform multi-table association analysis on data stored in Alibaba Cloud ApsaraDB for HBase, you can use Hive. This topic describes how to associate Hive in your E-MapReduce (EMR) cluster with Alibaba Cloud ApsaraDB for HBase tables.
Prerequisites
A DataLake cluster is created. For more information, see Create a cluster.
An ApsaraDB for HBase cluster is created in the virtual private cloud (VPC) where your EMR cluster resides. For more information, see Purchase a cluster.
In this example, the ApsaraDB for HBase Standard Edition V2.0 is used. The ApsaraDB for HBase Performance-enhanced Edition (Lindorm) is not supported.
Step 1: Add Hive configurations
Go to the Configure tab.
Log on to the EMR console.
In the top navigation bar, select a region and a resource group based on your business requirements.
On the EMR on ECS page, find the desired cluster and click Services in the Actions column.
On the Services tab, find the Hive service and click Configure.
Click the hbase-site.xml tab.
Click Add Configuration Item. In the Add Configuration Item dialog box, add the configuration item described below and click OK.
For more information, see the Add configuration items section of the "Manage configuration items" topic.
Configuration item: hbase.zookeeper.quorum
Description: Enter the ZooKeeper address of the ApsaraDB for HBase cluster in the VPC. To obtain the address, log on to the ApsaraDB for HBase console, open the cluster details page, and go to the Database Connection page.
Example (three addresses joined with commas):
    hb-xxxx-master1-001.hbase.rds.aliyuncs.com:2181,hb-xxxx-master2-001.hbase.rds.aliyuncs.com:2181,hb-xxxx-master3-001.hbase.rds.aliyuncs.com:2181
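The EMR console stores this item for you, so no code is required. As an illustration only, the following Python sketch shows how the three addresses combine into one comma-separated value and how such an entry would typically appear as a Hadoop-style property in hbase-site.xml. The helper names build_quorum and render_property are hypothetical, and the addresses are the placeholders from the example above.

```python
import xml.etree.ElementTree as ET


def build_quorum(addresses):
    """Join host:port entries into one comma-separated value without spaces."""
    cleaned = [a.strip() for a in addresses if a.strip()]
    if not cleaned:
        raise ValueError("at least one ZooKeeper address is required")
    return ",".join(cleaned)


def render_property(name, value):
    """Render a single Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# Placeholder addresses; replace them with the values shown on the
# Database Connection page of your own ApsaraDB for HBase cluster.
zk_addresses = [
    "hb-xxxx-master1-001.hbase.rds.aliyuncs.com:2181",
    "hb-xxxx-master2-001.hbase.rds.aliyuncs.com:2181",
    "hb-xxxx-master3-001.hbase.rds.aliyuncs.com:2181",
]

quorum = build_quorum(zk_addresses)
property_xml = render_property("hbase.zookeeper.quorum", quorum)
print(property_xml)
```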
Step 2: View HBase tables
Access the ApsaraDB for HBase cluster. For more information, see Use HBase Shell to access an ApsaraDB for HBase Standard Edition cluster.
Run the list command to check whether the HBase table hive_hbase_table or hbase_table exists.
If hive_hbase_table or hbase_table does not exist, go to Step 4: Create an internal table in ApsaraDB for HBase.
If hive_hbase_table or hbase_table exists, go to Step 3: Create an external table in Hive and map the external table to the existing table in ApsaraDB for HBase.
Step 3: Create an external table in Hive and map the external table to the existing table in ApsaraDB for HBase
Run the following command to create a table in ApsaraDB for HBase:
    create 'hbase_table','f'
Insert data into the table.
    put 'hbase_table','1122','f:col1','hello'
    put 'hbase_table','1122','f:col2','hbase'
Create an external table in Hive, map the external table to the table in ApsaraDB for HBase, and query data from the external table.
Create an external table in Hive, and map the external table to the table in ApsaraDB for HBase.
    create external table hbase_table(key int,col1 string,col2 string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f:col1,f:col2")
    TBLPROPERTIES ("hbase.table.name" = "hbase_table", "hbase.mapred.output.outputtable" = "hbase_table");
The :key entry maps the first Hive column (key) to the HBase row key, so the mapping lists one entry per Hive column.
Query data from the external table.
    select * from hbase_table;
The following information is returned:
    1122 hello hbase
Delete the hbase_table table from Hive.
    drop table hbase_table;
Run the list command in HBase Shell to check whether the hbase_table table exists.
If the returned information shows that the table hbase_table exists, deleting the table in Hive does not affect the existing table in ApsaraDB for HBase.
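When an HBase table has more columns, it is easy for the Hive column list and the hbase.columns.mapping value to drift apart. The following Python sketch, which is an illustration and not part of EMR or Hive, generates the external table statement from a single list of columns so that the two always match. The function name build_external_table_ddl is hypothetical, and the sketch assumes a single row key column that is written explicitly as :key.

```python
def build_external_table_ddl(hive_table, hbase_table, key_column, columns):
    """Build a Hive DDL string for an external table mapped to an HBase table.

    key_column: (name, hive_type) of the Hive column mapped to the row key.
    columns:    list of (name, hive_type, "family:qualifier") tuples.
    """
    for name, _hive_type, mapping in columns:
        # Every non-key column needs a family:qualifier mapping.
        if ":" not in mapping or mapping.startswith(":"):
            raise ValueError(f"column {name!r} needs a 'family:qualifier' mapping")

    col_defs = [f"{key_column[0]} {key_column[1]}"]
    col_defs += [f"{name} {hive_type}" for name, hive_type, _ in columns]
    mappings = [":key"] + [mapping for _, _, mapping in columns]

    column_list = ", ".join(col_defs)
    mapping_list = ",".join(mappings)  # no spaces inside the mapping value
    return (
        f"CREATE EXTERNAL TABLE {hive_table}({column_list})\n"
        f"STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping_list}")\n'
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}", '
        f'"hbase.mapred.output.outputtable" = "{hbase_table}");'
    )


# Reproduces the table used in this step.
ddl = build_external_table_ddl(
    "hbase_table",
    "hbase_table",
    ("key", "int"),
    [("col1", "string", "f:col1"), ("col2", "string", "f:col2")],
)
print(ddl)
```

The printed statement can be pasted into the Hive CLI. Generating it from one list means that adding a column updates both the column definitions and the mapping.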
Step 4: Create an internal table in ApsaraDB for HBase
Run the hive command to enter the Hive CLI.
In the Hive CLI, run the following command to create a table in ApsaraDB for HBase:
    CREATE TABLE hive_hbase_table(key int, value string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
    TBLPROPERTIES ("hbase.table.name" = "hive_hbase_table", "hbase.mapred.output.outputtable" = "hive_hbase_table");
Run the following command to insert data into the hive_hbase_table table:
    insert into hive_hbase_table values(212,'bab');
Run the following command to view the table data in Hive:
    select * from hive_hbase_table;
Write data to the hive_hbase_table table in ApsaraDB for HBase and view the data in Hive.
In HBase Shell, run the following command to write data to the hive_hbase_table table:
    put 'hive_hbase_table','132','cf1:val','acb'
In the Hive CLI, run the following command to view the data written to the table:
    select * from hive_hbase_table;
The following information is returned:
    132 acb
    212 bab
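Row 132 is returned before row 212 even though it was written later. One likely explanation is that HBase keeps rows sorted by the bytes of the row key, and, assuming the default string storage of the Hive HBase storage handler, the int key is stored as its text form. The following Python sketch mimics that ordering. The helper as_row_key is hypothetical and only models the assumed encoding.

```python
def as_row_key(hive_key):
    """Model the assumed default encoding: the integer key stored as UTF-8 text."""
    return str(hive_key).encode("utf-8")


# Keys in the order they were written in this step.
written = [212, 132]

# HBase scans return rows in ascending byte order of the row key.
scan_order = sorted(written, key=as_row_key)
print(scan_order)  # [132, 212]

# Byte order is not numeric order: b"9" sorts after b"212".
mixed_order = sorted([212, 132, 9], key=as_row_key)
print(mixed_order)  # [132, 212, 9]
```

If numeric ordering matters for a query, add an ORDER BY clause in Hive instead of relying on the scan order.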
Delete the hive_hbase_table table and view the hive_hbase_table table in ApsaraDB for HBase.
Run the following command to delete the hive_hbase_table table from Hive:
    drop table hive_hbase_table;
In HBase Shell, run the following command to view the hive_hbase_table table in ApsaraDB for HBase:
    scan 'hive_hbase_table'
The returned information indicates that the hive_hbase_table table does not exist.
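Steps 3 and 4 differ only in what happens to the HBase table when the Hive table is dropped. The following Python sketch is a toy model of that behavior, not an API of Hive or HBase. The class ToyCatalog and its methods are hypothetical names used only to summarize the two outcomes observed above.

```python
class ToyCatalog:
    """Toy model of Hive tables backed by HBase tables."""

    def __init__(self):
        self.hbase_tables = set()
        self.hive_tables = {}  # Hive table name -> (HBase table name, is_external)

    def create_hbase_table(self, name):
        self.hbase_tables.add(name)

    def create_hive_table(self, name, hbase_name, external):
        if external:
            # An external table maps to an HBase table that already exists.
            if hbase_name not in self.hbase_tables:
                raise ValueError(f"HBase table {hbase_name!r} does not exist")
        else:
            # An internal (managed) table creates the HBase table itself.
            self.hbase_tables.add(hbase_name)
        self.hive_tables[name] = (hbase_name, external)

    def drop_hive_table(self, name):
        hbase_name, external = self.hive_tables.pop(name)
        if not external:
            # Dropping an internal table also removes the HBase table.
            self.hbase_tables.discard(hbase_name)


catalog = ToyCatalog()

# Step 3: external table over an existing HBase table.
catalog.create_hbase_table("hbase_table")
catalog.create_hive_table("hbase_table", "hbase_table", external=True)
catalog.drop_hive_table("hbase_table")
print("hbase_table" in catalog.hbase_tables)  # True

# Step 4: internal table created from Hive.
catalog.create_hive_table("hive_hbase_table", "hive_hbase_table", external=False)
catalog.drop_hive_table("hive_hbase_table")
print("hive_hbase_table" in catalog.hbase_tables)  # False
```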