This topic uses orders_tbl
as an example to describe how to convert a text file into an Optimized Row Columnar (ORC) file.
Procedure
- Create an Object Storage Service (OSS) schema.
CREATE SCHEMA dla_oss_db with DBPROPERTIES( catalog='oss', location 'oss://dlaossfile1/' )
- Create a table named orders_txt in Data Lake Analytics (DLA). The LOCATION parameter specifies the path of the orders.tbl file in OSS.
CREATE EXTERNAL TABLE orders_txt ( O_ORDERKEY INT, O_CUSTKEY INT, O_ORDERSTATUS STRING, O_TOTALPRICE DOUBLE, O_ORDERDATE DATE, O_ORDERPRIORITY STRING, O_CLERK STRING, O_SHIPPRIORITY INT, O_COMMENT STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE LOCATION 'oss://dlaossfile1/orders/orders.tbl'
- Create the destination table orders_orc in DLA and set LOCATION to the required location in OSS.
CREATE EXTERNAL TABLE orders_orc ( O_ORDERKEY INT, O_CUSTKEY INT, O_ORDERSTATUS STRING, O_TOTALPRICE DOUBLE, O_ORDERDATE DATE, O_ORDERPRIORITY STRING, O_CLERK STRING, O_SHIPPRIORITY INT, O_COMMENT STRING ) STORED AS ORC LOCATION 'oss://dlaossfile1/orders_orc/'
NoteSTORED AS ORC
: specifies that the destination table is stored as an ORC file.- You must set the LOCATION parameter to an existing OSS path and the path must end with a forward slash (
/
).
- Execute the
INSERT...SELECT
statement to insert data from the orders_txt table into the orders_orc table.INSERT INTO orders_orc SELECT * FROM orders_txt;
- After the
INSERT...SELECT
statement is executed, you can view the generated ORC file in OSS.