This topic uses customer.tbl as an example to describe how to convert text files to Parquet files.
- Create an OSS schema.
  CREATE SCHEMA dla_oss_db WITH DBPROPERTIES (
    catalog = 'oss',
    location = 'oss://dlaossfile1/TPC-H/'
  );
- Create a table named customer_txt in DLA and set LOCATION to the path of customer.tbl in OSS.
  CREATE EXTERNAL TABLE customer_txt (
    c_custkey int,
    c_name string,
    c_address string,
    c_nationkey int,
    c_phone string,
    c_acctbal double,
    c_mktsegment string,
    c_comment string
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
  STORED AS TEXTFILE
  LOCATION 'oss://dlaossfile1/TPC-H/customer/customer.tbl';
- Create the target table customer_parquet in DLA and set LOCATION to the required path in OSS.
  Note: LOCATION must be an existing directory in OSS and must end with a forward slash (/).
  CREATE EXTERNAL TABLE customer_parquet (
    c_custkey int,
    c_name string,
    c_address string,
    c_nationkey int,
    c_phone string,
    c_acctbal double,
    c_mktsegment string,
    c_comment string
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
  STORED AS PARQUET
  LOCATION 'oss://dlaossfile1/TPC-H/customer_parquet/';
  STORED AS PARQUET: indicates that the table is stored in Parquet format.
- Run the INSERT...SELECT statement to insert data from the customer_txt table into the customer_parquet table.
  INSERT INTO customer_parquet SELECT * FROM customer_txt;
- View the data in the customer_parquet table. After the INSERT...SELECT statement is executed, view the Parquet file created in OSS.
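
To illustrate how the customer_txt definition (FIELDS TERMINATED BY '|') maps one line of customer.tbl onto the eight declared columns, here is a minimal Python sketch. It is not how DLA parses the file internally. The `Customer` type, the `parse_customer_line` helper, and the sample row are hypothetical and exist only for illustration. The sketch assumes the file follows the usual TPC-H layout, where every line ends with a trailing `|`.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    """Column order and types mirror the customer_txt table definition."""
    c_custkey: int
    c_name: str
    c_address: str
    c_nationkey: int
    c_phone: str
    c_acctbal: float
    c_mktsegment: str
    c_comment: str


def parse_customer_line(line: str) -> Customer:
    """Split one '|'-delimited line into the eight declared columns."""
    fields = line.rstrip("\n").split("|")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    # Anything after the eighth field (for example the empty string produced
    # by a trailing '|') is ignored, because the table declares 8 columns.
    f = fields[:8]
    return Customer(int(f[0]), f[1], f[2], int(f[3]), f[4], float(f[5]), f[6], f[7])


# Hypothetical sample row, made up for this example (not taken from real data).
sample = "1|Customer#000000001|Example Street 1|15|25-989-741-2988|711.56|BUILDING|illustrative comment|\n"
row = parse_customer_line(sample)
print(row)
```

A line that fails to parse here, for example because a numeric field contains text, is a hint that the column types declared in customer_txt may not match the file.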
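
As an optional sanity check on the conversion result, you can download one of the objects written under the customer_parquet directory and confirm that it has the Parquet magic bytes. The sketch below is a minimal check that relies only on the Parquet file layout (the file starts and ends with the 4-byte marker `PAR1`). The helper name `looks_like_parquet` is hypothetical, and the sketch assumes that the file is already on local disk and does not use Parquet footer encryption.

```python
import os

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path: str) -> bool:
    """Return True if the file starts and ends with the Parquet magic bytes."""
    # Smallest plausible file: leading magic (4 bytes) + footer length (4 bytes)
    # + trailing magic (4 bytes).
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

This check only confirms the container format. It does not validate the schema or the row count, so comparing query results between customer_txt and customer_parquet is still worthwhile.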