This topic describes how to use Hive to perform basic operations, such as create databases and tables, in an E-MapReduce (EMR) cluster.
Prerequisites
An EMR cluster is created. For more information, see Create a cluster.
Open the Hive CLI
- Log on to the EMR cluster in SSH mode. For more information, see Log on to a cluster.
- Run the following command to switch to the hadoop user:
su hadoop
- Run the following command to open the Hive CLI:
hive
Manage databases
In this example, a database named testdb is used.
- Run the following command to create a database:
create database if not exists testdb;
If the returned information contains OK, the testdb database is created.
- Run the following command to query the information about a database:
desc database testdb;
- Run the following command to use a database:
use testdb;
- Run the following command to delete a database:
drop database if exists testdb;
If the returned information contains OK, the database is deleted.
Manage tables
In this example, a table named t is used.
- Run the following command to create a table:
create table if not exists t (id bigint, value string);
If the returned information contains OK, the t table is created.
- Run the following command to query the information about a table:
desc formatted t;
- Run the following command to query all existing tables:
show tables;
The following information is returned:OK t
- Run the following command to delete a table:
drop table if exists t;
If the returned information contains OK, the table is deleted.
Execute SQL statements
- Execute the following statement to insert data:
insert into table t select 1, 'value-1';
If the returned information contains OK, the data is inserted.OK Time taken: 14.73 seconds
- Execute the following statement to query the first 10 records in a table:
select * from t limit 10;
The following information is returned:OK 1 value-1 Time taken: 11.48 seconds, Fetched: 1 row(s)
- Execute the following statement to aggregate data:
select value, count(id) from t group by value;
The following information is returned:OK value-1 1 Time taken: 20.11 seconds, Fetched: 1 row(s)