Tablestore provides the BulkExport operation to batch read offline data from a data table in big data scenarios. After data is written to a data table, you can read the data based on specific conditions.
Prerequisites
An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
A data table is created, and data is written to the data table.
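In the Tablestore SDK for Java, the client that is used in the following example is a SyncClient. The following is a minimal initialization sketch for reference only; the endpoint, AccessKey pair, and instance name are placeholders that you must replace with your own values. For details, see Initialize an OTSClient instance.

// A minimal initialization sketch. All values in angle brackets are placeholders.
SyncClient client = new SyncClient("<ENDPOINT>", "<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>", "<INSTANCE_NAME>");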
Parameters
Parameter | Description |
tableName | The name of the data table. |
inclusiveStartPrimaryKey / exclusiveEndPrimaryKey | The start and end primary keys of the range that you want to read. The start and end primary keys must be valid primary keys or virtual points that consist of values of the INF_MIN and INF_MAX types. The number of columns in a virtual point must be the same as the number of primary key columns. INF_MIN indicates an infinitely small value. All values of other types are greater than a value of the INF_MIN type. INF_MAX indicates an infinitely great value. All values of other types are smaller than a value of the INF_MAX type. The rows in a data table are sorted in ascending order by primary key value. The range to read is a left-closed, right-open interval: the rows whose primary key values are greater than or equal to the start primary key value and smaller than the end primary key value are returned. A sketch that uses the virtual points follows this table. |
columnsToGet | The columns that you want to read. You can specify the names of primary key columns or attribute columns. |
filter | The filter that is used to filter the query results on the server side. Only rows that meet the filter conditions are returned. For more information, see Configure a filter. Note: If you configure both the columnsToGet and filter parameters, Tablestore first queries the columns that are specified by columnsToGet and then returns the rows that meet the filter conditions. See the sketch after this table. |
dataBlockType | The format of the data returned by this read request. Valid values: PlainBuffer and SimpleRowMatrix. |
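The virtual points and the filter parameter can be combined. The following is a minimal sketch that reads the entire primary key range by using INF_MIN and INF_MAX and keeps only the rows whose DC1 column equals "a". The table name, column name, and value are hypothetical, and the setFilter call on BulkExportQueryCriteria is an assumption based on the filter parameter described above.

// A minimal sketch: read the full primary key range and filter rows on the server side.
PrimaryKeyBuilder startBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
startBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.INF_MIN);
PrimaryKeyBuilder endBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
endBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.INF_MAX);

BulkExportQueryCriteria criteria = new BulkExportQueryCriteria("<TABLE_NAME>");
criteria.setInclusiveStartPrimaryKey(startBuilder.build());
criteria.setExclusiveEndPrimaryKey(endBuilder.build());
criteria.setDataBlockType(DataBlockType.DBT_PLAIN_BUFFER);
// "DC1" and "a" are hypothetical; setFilter is assumed to be available on BulkExportQueryCriteria.
criteria.setFilter(new SingleColumnValueFilter("DC1",
        SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromString("a")));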
Examples
The following sample code provides an example on how to batch read the rows whose primary key values are within a specific range:
private static void bulkExport(SyncClient client, String start, String end){
    // Specify the start primary key.
    PrimaryKeyBuilder startPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    startPrimaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(String.valueOf(start)));
    PrimaryKey startPrimaryKey = startPrimaryKeyBuilder.build();

    // Specify the end primary key.
    PrimaryKeyBuilder endPrimaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    endPrimaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(String.valueOf(end)));
    PrimaryKey endPrimaryKey = endPrimaryKeyBuilder.build();

    // Create a BulkExportRequest.
    BulkExportRequest bulkExportRequest = new BulkExportRequest();
    // Create a BulkExportQueryCriteria.
    BulkExportQueryCriteria bulkExportQueryCriteria = new BulkExportQueryCriteria("<TABLE_NAME>");

    bulkExportQueryCriteria.setInclusiveStartPrimaryKey(startPrimaryKey);
    bulkExportQueryCriteria.setExclusiveEndPrimaryKey(endPrimaryKey);
    // Use the DBT_PLAIN_BUFFER encoding method.
    bulkExportQueryCriteria.setDataBlockType(DataBlockType.DBT_PLAIN_BUFFER);
    // If you want to use the DBT_SIMPLE_ROW_MATRIX encoding method, use the following code instead.
    // bulkExportQueryCriteria.setDataBlockType(DataBlockType.DBT_SIMPLE_ROW_MATRIX);
    bulkExportQueryCriteria.addColumnsToGet("pk");
    bulkExportQueryCriteria.addColumnsToGet("DC1");
    bulkExportQueryCriteria.addColumnsToGet("DC2");

    bulkExportRequest.setBulkExportQueryCriteria(bulkExportQueryCriteria);

    // Obtain the bulkExportResponse.
    BulkExportResponse bulkExportResponse = client.bulkExport(bulkExportRequest);

    // If you set DataBlockType to DBT_SIMPLE_ROW_MATRIX, use the following code to parse and print the result.
    //{
    //    SimpleRowMatrixBlockParser parser = new SimpleRowMatrixBlockParser(bulkExportResponse.getRows());
    //    List<Row> rows = parser.getRows();
    //    for (int i = 0; i < rows.size(); i++){
    //        System.out.println(rows.get(i));
    //    }
    //}

    // Because DataBlockType is set to DBT_PLAIN_BUFFER, parse and print the result as follows.
    {
        PlainBufferBlockParser parser = new PlainBufferBlockParser(bulkExportResponse.getRows());
        List<Row> rows = parser.getRows();
        for (int i = 0; i < rows.size(); i++){
            System.out.println(rows.get(i));
        }
    }
}
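For example, assuming that the pk column stores string values, the preceding method can be called as follows; the key values are hypothetical:

bulkExport(client, "00001", "00100");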
References
For more information about the API operation, see BulkExport.
For more information about how to call the operation to batch read offline data, see BulkExportRequest.java and BulkExportResponse.java.
If you want to accelerate data queries, you can use secondary indexes or search indexes. For more information, see Secondary Index or Search indexes.
If you want to visualize the data of a table, you can connect Tablestore to DataV or Grafana. For more information, see Data visualization tools.
If you want to download data from a table to a local file, you can use DataX or the Tablestore CLI. For more information, see Download data in Tablestore to a local file.
If you want to perform computing or analytics on data in Tablestore, you can use the SQL query feature of Tablestore. For more information, see Overview.
Note: You can also use compute engines such as MaxCompute, Spark, Hive, Hadoop MapReduce, Function Compute, and Realtime Compute for Apache Flink to compute and analyze data in a table. For more information, see Overview.