This topic explains how to analyze data with temporary tables. Building temporary tables allows you to take advantage of the analytic power of Data Lake Analytics (DLA) without the need to create permanent tables.
Examples
You can query data in a temporary table.
SELECT * FROM
TABLE temp_1
(
id INT,
name STRING,
location STRING,
create_date DATE,
create_timestamp TIMESTAMP,
longitude DOUBLE,
latitude DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE LOCATION 'oss://mybucket102/file.txt'
The preceding is equivalent to the following:
CREATE EXTERNAL TABLE temp_1 (
id INT,
name STRING,
location STRING,
create_date DATE,
create_timestamp TIMESTAMP,
longitude DOUBLE,
latitude DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE LOCATION 'oss://mybucket102/';
==Continue to run the following syntax===
select * from temp_1
You can also query the joining of multiple temporary tables.
SELECT temp_1.col1, temp_2.smallint_col
FROM
TABLE temp_1
(
col1 int,
col2 int
)
LOCATION 'oss://mybucket102/file1.txt';
JOIN
TABLE temp_2
(
id INT COMMENT 'default',
bool_col BOOLEAN COMMENT 'default',
tinyint_col TINYINT COMMENT 'default',
smallint_col SMALLINT COMMENT 'default',
int_col INT COMMENT 'default',
bigint_col BIGINT COMMENT 'default',
float_col FLOAT COMMENT 'default',
double_col DOUBLE COMMENT 'default',
date_string_col STRING COMMENT 'default',
string_col STRING COMMENT 'default',
timestamp_col TIMESTAMP COMMENT 'default'
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'oss://mybucket102/file2.txt'
TBLPROPERTIES ('recursive.directories'='false')
ON temp_1.col1 = temp_2.id
WHERE temp_2.bool_col = true;
You can directly access OSS data through the preceding SQL queries.