This topic describes how to use SQL to access data in LindormTable from Lindorm Distributed Processing System (LDPS).
Preparations
Before you use SQL to access data in LindormTable, we recommend that you understand the precautions described in Precautions.
The required resources are initialized based on the type of your jobs. For more information, see the related documentation.
Access data in a wide table
You can access data in Lindorm wide tables through the lindorm_table catalog. When you use LDPS to access these tables, you can perform data manipulation language (DML) operations. Data definition language (DDL) operations and partitioning are not supported. The following examples show how to access data in a wide table:
Use the lindorm_table catalog.
USE lindorm_table;
Query the schema of the table named test.
SHOW CREATE TABLE test;
The following result is returned:
+----------------------------------------------------+
| CREATE TABLE default.test (                        |
|   `id` INT,                                        |
|   `name` STRING)                                   |
|                                                    |
+----------------------------------------------------+
Insert data into a Lindorm wide table.
INSERT INTO test VALUES (0, 'Jay');
Query data in a Lindorm wide table.
SELECT * FROM test;
For more information about the supported SQL syntax, see DML Statements.
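Other DML statements follow the same pattern. The following sketch copies rows between wide tables with INSERT ... SELECT; the table name test_backup is a hypothetical example, not part of the statements above, and assumes a table with the same schema as test already exists:

```sql
-- Hypothetical example: test_backup is an assumed table with the same
-- schema as test (`id` INT, `name` STRING).
INSERT INTO test_backup SELECT id, name FROM test WHERE id >= 0;
```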
Use the bulkload feature to import data to a Lindorm wide table (in public preview)
The SQL syntax used to import data in batches is the same as the INSERT syntax used to insert regular data. To write data to LindormTable in batches by using SQL, configure the parameters described in the following table.
Parameter | Description
spark.sql.catalog.lindorm_table.bulkLoad.enabled | Specifies whether to enable the bulkload feature. Valid values: true and false.
spark.sql.catalog.lindorm_table.bulkLoad.parallelFactor | The number of concurrent sessions used to write data to a single region of a wide table.
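As a sketch, the parameters above might be applied at the session level with the Spark SQL SET command before running the batch INSERT; whether SET takes effect for these catalog options depends on your Spark client and deployment, and the parallelFactor value of 8 is an illustrative assumption:

```sql
-- Sketch: enable bulkload for the current session, then run a batch INSERT.
-- Whether SET applies these catalog options per session depends on your client.
SET spark.sql.catalog.lindorm_table.bulkLoad.enabled=true;
SET spark.sql.catalog.lindorm_table.bulkLoad.parallelFactor=8;

USE lindorm_table;
INSERT INTO test VALUES (1, 'Bob'), (2, 'Alice');
```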
After you enable the bulkload feature, wide table files are automatically generated from the imported data and registered with the table, which increases write throughput.
Secondary and search indexes are not automatically created for data imported by bulkload jobs.