Tablestore: TableStoreWriter parameters

Last Updated: Dec 05, 2024

When you initialize TableStoreWriter, you can specify TableStoreWriter parameters and callback function logic based on your business requirements. This topic describes TableStoreWriter parameters that you can specify and provides callback examples.

Parameters

When you initialize TableStoreWriter, you can modify WriterConfig to specify TableStoreWriter parameters based on your business requirements.

Examples

The following sample code provides an example of the parameters that you can specify by using WriterConfig.

WriterConfig config = new WriterConfig();

// Buffering and dispatch
config.setBucketCount(3);
config.setBufferSize(1024);                    // rows; must be a power of 2
config.setEnableSchemaCheck(true);
config.setDispatchMode(DispatchMode.HASH_PARTITION_KEY);

// Request construction and write behavior
config.setBatchRequestType(BatchRequestType.BATCH_WRITE_ROW);
config.setConcurrency(10);
config.setWriteMode(WriteMode.PARALLEL);
config.setAllowDuplicatedRowInBatchRequest(true);
config.setMaxBatchSize(4 * 1024 * 1024);       // bytes (4 MB)
config.setMaxBatchRowsCount(200);              // maximum value: 200

// Callback thread pool (the default thread count is the number of processors)
config.setCallbackThreadCount(16);
config.setCallbackThreadPoolQueueSize(1024);

// Per-row limits that apply when data is flushed to the buffer
config.setMaxColumnsCount(128);
config.setMaxAttrColumnSize(2 * 1024 * 1024);  // bytes (2 MB)
config.setMaxPKColumnSize(1024);               // bytes (1 KB)

// Timers
config.setFlushInterval(10000);                // milliseconds
config.setLogInterval(10000);                  // milliseconds

// Client that TableStoreWriter builds internally
config.setClientMaxConnections(300);
config.setWriterRetryStrategy(WriterRetryStrategy.CERTAIN_ERROR_CODE_NOT_RETRY);

Parameters

Parameter

Type

Description

bucketCount

Integer

The number of buckets in TableStoreWriter. Default value: 3. A bucket is equivalent to a buffer that is used to cache data.

You can increase the value of this parameter to raise the number of parallel sequential write requests. Until the machine itself becomes the bottleneck, the write rate increases with the number of buckets.

If writeMode is set to PARALLEL (concurrent writes from each bucket), retain the default value.

bufferSize

Integer

The size of the buffer queue in memory. Unit: row. Default value: 1024. The value of this parameter must be a power of 2.
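The constraint on bufferSize can be checked before the value is passed to setBufferSize. The following is a minimal sketch of such a check. The class and method names are illustrative and are not part of the Tablestore SDK.

```java
// Illustrative helper; not part of the Tablestore SDK.
final class BufferSizes {
    private BufferSizes() {}

    /** Returns true if n is a positive power of 2 (1, 2, 4, 8, ...). */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Rounds n up to the nearest power of 2. n must be in [1, 2^30]. */
    static int roundUpToPowerOfTwo(int n) {
        if (n < 1 || n > (1 << 30)) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        int highest = Integer.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    public static void main(String[] args) {
        System.out.println(isPowerOfTwo(1024));        // true
        System.out.println(isPowerOfTwo(1000));        // false
        System.out.println(roundUpToPowerOfTwo(1000)); // 1024
    }
}
```

A caller could round a desired row capacity up with roundUpToPowerOfTwo and pass the result to config.setBufferSize.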

enableSchemaCheck

Boolean

Specifies whether to check the schema when data is flushed to the buffer. Valid values:

  • true: checks the schema when data is flushed to the buffer. This is the default value. Before the row data is flushed to the buffer, TableStoreWriter performs the following checks on the row data:

    • Check whether the schema of the primary key of the row is the same as the schema defined for the table.

    • Check whether the value size of each primary key column or attribute column of the row exceeds the limit.

    • Check whether the number of attribute columns in the row exceeds the limit.

    • Check whether the name of an attribute column is the same as the name of a primary key column of the row.

    • Check whether the size of the row exceeds the maximum amount of data that can be imported at a time by using a request.

    If the row data fails the preceding checks, TableStoreWriter determines that the row data is dirty data. Dirty data is not flushed to the buffer.

  • false: does not check the schema when data is flushed to the buffer.

    If specific row data in the buffer is dirty data, the row data fails to be written to Tablestore when TableStoreWriter writes the row data in the buffer to Tablestore.
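To make the checks above concrete, the following sketch applies a subset of them to a simplified row. It is an assumption-laden illustration, not the SDK's implementation: rows are modeled as maps from column name to raw bytes, the limits are the default values listed in this topic, and the total row size check and type checks are omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the row model and class name are hypothetical.
final class RowCheckSketch {
    static final int MAX_COLUMNS_COUNT = 128;                // default maxColumnsCount
    static final int MAX_ATTR_COLUMN_SIZE = 2 * 1024 * 1024; // default maxAttrColumnSize
    static final int MAX_PK_COLUMN_SIZE = 1024;              // default maxPKColumnSize

    /** Returns true if the row would be treated as dirty data by this sketch. */
    static boolean isDirty(List<String> tablePkNames,
                           Map<String, byte[]> pk,
                           Map<String, byte[]> attrs) {
        // Primary key column names must match the table definition, in order.
        if (!tablePkNames.equals(new ArrayList<>(pk.keySet()))) return true;
        // Each primary key value must stay within the size limit.
        for (byte[] value : pk.values()) {
            if (value.length > MAX_PK_COLUMN_SIZE) return true;
        }
        // The number of attribute columns must stay within the limit.
        if (attrs.size() > MAX_COLUMNS_COUNT) return true;
        for (Map.Entry<String, byte[]> e : attrs.entrySet()) {
            // An attribute column must not reuse a primary key column name.
            if (pk.containsKey(e.getKey())) return true;
            // Each attribute value must stay within the size limit.
            if (e.getValue().length > MAX_ATTR_COLUMN_SIZE) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, byte[]> pk = new LinkedHashMap<>(); // keeps column order
        pk.put("id", new byte[8]);
        Map<String, byte[]> attrs = new LinkedHashMap<>();
        attrs.put("name", new byte[16]);
        System.out.println(isDirty(List.of("id"), pk, attrs)); // false
    }
}
```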

dispatchMode

DispatchMode

The mode in which data is dispatched to buckets when data is flushed to the buffer. This parameter takes effect only if the number of buckets is greater than or equal to 2. Valid values:

  • HASH_PARTITION_KEY: dispatches data to buckets based on the hash value of the partition key. Data whose partition key shares the same hash value is sequentially written to the same bucket. This is the default value.

  • HASH_PRIMARY_KEY: dispatches data to buckets based on the hash value of the primary key. Data whose primary key shares the same hash value is sequentially written to the same bucket.

  • ROUND_ROBIN: cycles through the buckets and dispatches data to each bucket in turn. Rows are spread across the buckets regardless of their key.
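The difference between the hash-based modes and ROUND_ROBIN can be sketched as follows. This is an illustration of the general technique, not the SDK's internal code: the class name is hypothetical, keys are modeled as strings, and String.hashCode() stands in for whatever hash function the SDK actually uses. For HASH_PARTITION_KEY the key would be the partition key, and for HASH_PRIMARY_KEY it would be the whole primary key.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of bucket selection; not the SDK's implementation.
final class BucketDispatcher {
    private final int bucketCount;
    private final AtomicLong counter = new AtomicLong();

    BucketDispatcher(int bucketCount) {
        if (bucketCount < 1) {
            throw new IllegalArgumentException("bucketCount must be >= 1");
        }
        this.bucketCount = bucketCount;
    }

    /** Hash-based dispatch: equal keys always map to the same bucket. */
    int byHash(String key) {
        // floorMod keeps the index non-negative for negative hash codes.
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    /** Round-robin dispatch: consecutive rows go to consecutive buckets. */
    int roundRobin() {
        return (int) (counter.getAndIncrement() % bucketCount);
    }

    public static void main(String[] args) {
        BucketDispatcher d = new BucketDispatcher(3);
        System.out.println(d.byHash("user-1") == d.byHash("user-1")); // true
        System.out.println(d.roundRobin() + " " + d.roundRobin()
                + " " + d.roundRobin() + " " + d.roundRobin());       // 0 1 2 0
    }
}
```

Because equal keys always reach the same bucket under hash-based dispatch, rows for one key can keep their relative order when each bucket writes sequentially. Round-robin gives up that property in exchange for an even spread.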

batchRequestType

BatchRequestType

The type of the request that TableStoreWriter uses to write data in the buffer to Tablestore. Valid values:

  • BATCH_WRITE_ROW: BatchWriteRowRequest. This is the default value.

  • BULK_IMPORT: BulkImportRequest.

concurrency

Integer

The maximum number of parallel requests that TableStoreWriter uses to write data in the buffer to Tablestore. Default value: 10.

writeMode

WriteMode

The mode in which data in buckets is written to Tablestore when TableStoreWriter writes data in the buffer to Tablestore. Valid values:

  • PARALLEL: buckets write data to Tablestore in parallel, and the requests from each bucket are also sent concurrently. This is the default value.

  • SEQUENTIAL: buckets write data to Tablestore in parallel, but the requests from each bucket are sent sequentially.

allowDuplicatedRowInBatchRequest

Boolean

Specifies whether rows that have the same primary key value are allowed when TableStoreWriter creates a batch write request. Default value: true.

If a secondary index is created for the data table, Tablestore ignores the setting of this parameter and does not allow rows that have the same primary key value in one request. In this case, TableStoreWriter places rows that have the same primary key value in different requests when it creates the requests.
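One way to place rows that share a primary key in different requests is to start a new batch whenever a key repeats. The following sketch shows that idea under simplifying assumptions: primary keys are modeled as strings, and the class name is hypothetical. This topic does not describe the algorithm that TableStoreWriter actually uses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch; not the SDK's implementation.
final class DuplicateKeySplitter {
    /** Splits keys into batches so that no batch contains the same key twice. */
    static List<List<String>> split(List<String> primaryKeys) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String pk : primaryKeys) {
            if (!seen.add(pk)) {
                // The key is already in the current batch: close it and start a new one.
                batches.add(current);
                current = new ArrayList<>();
                seen.clear();
                seen.add(pk);
            }
            current.add(pk);
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(split(List.of("a", "b", "a", "c", "c")));
        // [[a, b], [a, c], [c]]
    }
}
```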

maxBatchSize

Integer

The maximum amount of data that can be written to Tablestore in a batch write request. Unit: bytes. By default, up to 4 MB of data can be written to Tablestore in a batch write request.

maxBatchRowsCount

Integer

The maximum number of rows that can be written to Tablestore in a batch write request. Default value: 200. Maximum value: 200.
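maxBatchSize and maxBatchRowsCount together bound each batch write request: a batch is closed when either limit would be exceeded. The following sketch illustrates that interaction. It assumes rows are modeled as byte arrays whose length is the row size, and the class name is hypothetical. It is not the SDK's batching code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; not the SDK's implementation.
final class BatchSplitter {
    /** Groups rows into batches of at most maxRows rows and maxBytes bytes. */
    static List<List<byte[]>> split(List<byte[]> rows, int maxRows, int maxBytes) {
        if (maxRows < 1 || maxBytes < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] row : rows) {
            if (row.length > maxBytes) {
                throw new IllegalArgumentException("row exceeds maxBytes");
            }
            boolean full = current.size() == maxRows
                    || currentBytes + row.length > maxBytes;
            if (full) {
                // Close the current batch before adding this row.
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(row);
            currentBytes += row.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<byte[]> rows = new ArrayList<>();
        for (int i = 0; i < 450; i++) {
            rows.add(new byte[10]);
        }
        // With the default limits, 450 small rows form batches of 200, 200, and 50.
        System.out.println(split(rows, 200, 4 * 1024 * 1024).size()); // 3
    }
}
```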

callbackThreadCount

Integer

The number of threads in the thread pool that runs callbacks within TableStoreWriter. The default value is the number of processors.

callbackThreadPoolQueueSize

Integer

The queue size of the thread pool that runs callbacks within TableStoreWriter. Default value: 1024.

maxColumnsCount

Integer

The maximum number of columns in a row when data is flushed to the buffer. Default value: 128.

maxAttrColumnSize

Integer

The maximum size of the value of a single attribute column when data is flushed to the buffer. By default, up to 2 MB of data is allowed for the value of each attribute column. Unit: bytes.

maxPKColumnSize

Integer

The maximum size of the value of a single primary key column when data is flushed to the buffer. By default, up to 1 KB of data is allowed for the value of each primary key column. Unit: bytes.

flushInterval

Integer

The interval at which TableStoreWriter automatically writes data in the buffer to Tablestore. Default value: 10000. Unit: milliseconds.

logInterval

Integer

The interval at which the task status is automatically displayed when TableStoreWriter writes data in the buffer to Tablestore. Default value: 10000. Unit: milliseconds.

clientMaxConnections

Integer

The maximum number of connections that are used when the client is built internally. Default value: 300.

writerRetryStrategy

WriterRetryStrategy

The retry policy that is used when the client is built internally. Valid values:

  • CERTAIN_ERROR_CODE_NOT_RETRY: does not perform retry attempts for specific error codes and performs retry attempts for other error codes. This is the default value.

    Error codes for which no retry attempts are performed: OTSParameterInvalid, OTSConditionCheckFail, OTSRequestBodyTooLarge, OTSInvalidPK, OTSOutOfColumnCountLimit, and OTSOutOfRowSizeLimit

  • CERTAIN_ERROR_CODE_RETRY: performs retry attempts for specific error codes and does not perform retry attempts for other error codes.

    Error codes for which retry attempts are performed: OTSInternalServerError, OTSRequestTimeout, OTSPartitionUnavailable, OTSTableNotReady, OTSRowOperationConflict, OTSTimeout, OTSServerUnavailable, and OTSServerBusy
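The two strategies differ in how they treat an error code that appears in neither list. The following sketch models only that decision, using the error codes listed above. The enum and class are local stand-ins defined for this example, not the SDK types, and limits on retry attempts or backoff are not modeled.

```java
import java.util.Set;

// Illustrative sketch of the retry decision; not the SDK's implementation.
final class RetryDecision {
    /** Local stand-in that mirrors the strategy names in this topic. */
    enum Strategy { CERTAIN_ERROR_CODE_NOT_RETRY, CERTAIN_ERROR_CODE_RETRY }

    static final Set<String> NO_RETRY_CODES = Set.of(
            "OTSParameterInvalid", "OTSConditionCheckFail", "OTSRequestBodyTooLarge",
            "OTSInvalidPK", "OTSOutOfColumnCountLimit", "OTSOutOfRowSizeLimit");

    static final Set<String> RETRY_CODES = Set.of(
            "OTSInternalServerError", "OTSRequestTimeout", "OTSPartitionUnavailable",
            "OTSTableNotReady", "OTSRowOperationConflict", "OTSTimeout",
            "OTSServerUnavailable", "OTSServerBusy");

    static boolean shouldRetry(Strategy strategy, String errorCode) {
        switch (strategy) {
            case CERTAIN_ERROR_CODE_NOT_RETRY:
                // Deny list: retry everything except the listed codes.
                return !NO_RETRY_CODES.contains(errorCode);
            case CERTAIN_ERROR_CODE_RETRY:
                // Allow list: retry only the listed codes.
                return RETRY_CODES.contains(errorCode);
            default:
                throw new IllegalArgumentException("unknown strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        // A code in neither list is retried only under the deny-list strategy.
        System.out.println(shouldRetry(Strategy.CERTAIN_ERROR_CODE_NOT_RETRY, "SomeOtherError")); // true
        System.out.println(shouldRetry(Strategy.CERTAIN_ERROR_CODE_RETRY, "SomeOtherError"));     // false
    }
}
```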

Callback

TableStoreWriter uses callbacks to report write successes and failures. If a row of data is successfully written to Tablestore, TableStoreWriter invokes the onCompleted() function. If a row of data fails to be written to Tablestore, TableStoreWriter invokes the onFailed() function and passes the exception that describes the failure.

The following sample code provides an example of how to use callbacks to count the rows that are written to Tablestore and the rows that fail to be written to Tablestore.

import java.util.concurrent.atomic.AtomicLong;

import com.alicloud.openservices.tablestore.TableStoreCallback;
import com.alicloud.openservices.tablestore.model.RowChange;
import com.alicloud.openservices.tablestore.writer.RowWriteResult;

// Thread-safe counters, because callbacks run on the callback thread pool.
private static final AtomicLong succeedRows = new AtomicLong();
private static final AtomicLong failedRows = new AtomicLong();

private static final TableStoreCallback<RowChange, RowWriteResult> resultCallback =
        new TableStoreCallback<RowChange, RowWriteResult>() {
    @Override
    public void onCompleted(RowChange rowChange, RowWriteResult result) {
        // Count the rows that are written to Tablestore.
        succeedRows.incrementAndGet();
    }

    @Override
    public void onFailed(RowChange rowChange, Exception ex) {
        // Count the rows that fail to be written to Tablestore.
        failedRows.incrementAndGet();
    }
};
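The counters are AtomicLong instances because callbacks run on a thread pool and may fire concurrently. The following self-contained sketch simulates that setup without the SDK. The WriteCallback interface is a local stand-in for the SDK callback type, the pool mirrors the callbackThreadCount and callbackThreadPoolQueueSize defaults described in this topic, and every tenth row is made to fail purely for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative simulation; does not use the Tablestore SDK.
final class CallbackCountingSketch {
    /** Local stand-in for the SDK callback interface (hypothetical). */
    interface WriteCallback<REQ> {
        void onCompleted(REQ request);
        void onFailed(REQ request, Exception ex);
    }

    /** Runs simulated callbacks and returns {succeeded, failed}. */
    static long[] simulate(int totalRows) {
        final AtomicLong succeedRows = new AtomicLong();
        final AtomicLong failedRows = new AtomicLong();
        WriteCallback<Integer> callback = new WriteCallback<Integer>() {
            @Override
            public void onCompleted(Integer row) {
                succeedRows.incrementAndGet();
            }

            @Override
            public void onFailed(Integer row, Exception ex) {
                failedRows.incrementAndGet();
            }
        };

        // Pool sized like the defaults: one thread per processor, queue of 1024.
        int threads = Runtime.getRuntime().availableProcessors();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1024),
                new ThreadPoolExecutor.CallerRunsPolicy()); // run inline if the queue is full
        for (int i = 0; i < totalRows; i++) {
            final int row = i;
            pool.execute(() -> {
                if (row % 10 == 0) {
                    callback.onFailed(row, new RuntimeException("simulated failure"));
                } else {
                    callback.onCompleted(row);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callbacks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { succeedRows.get(), failedRows.get() };
    }

    public static void main(String[] args) {
        long[] counts = simulate(1000);
        System.out.println("succeeded=" + counts[0] + ", failed=" + counts[1]);
        // succeeded=900, failed=100
    }
}
```

With plain long fields instead of AtomicLong, concurrent increments could be lost and the two counts would no longer be guaranteed to add up to the number of rows.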