The faster-bulk plug-in is developed by the Alibaba Cloud Elasticsearch team. It aggregates bulk write requests into batches based on a specified maximum request size and aggregation interval. This prevents small bulk requests from blocking the write queue, improves write throughput, and reduces write request rejections. This topic describes a test scenario for the plug-in and how to use the plug-in.
Scenarios
- Test environment
  - Node configuration: 3 data nodes and 2 independent client nodes. Each node offers 16 vCPUs and 64 GiB of memory.
  - Dataset: nyc_taxis provided by Rally in open source Elasticsearch. The size of a single document is 650 bytes.
  - Parameter setting: The apack.fasterbulk.combine.interval parameter is set to 200ms.
  - Translog status: Tests are performed in both the synchronous and asynchronous states. If the index.translog.durability parameter is set to request, translogs are in the synchronous state. If the parameter is set to async, translogs are in the asynchronous state.
- Test results
| Translog status | Write performance of an open source Elasticsearch cluster without faster-bulk (documents/s) | Write performance of an Alibaba Cloud Elasticsearch cluster with faster-bulk (documents/s) | Performance improvement (percentage) |
| --- | --- | --- | --- |
| Synchronous | 182,314 | 226,242 | 23% |
| Asynchronous | 218,732 | 241,060 | 10% |
- Test conclusion
The write performance is improved in both the synchronous state (default state) and the asynchronous state after the faster-bulk plug-in is used. In the synchronous state, the write performance is improved by 23%.
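The two translog states in the test correspond to values of the index.translog.durability index setting. As a minimal sketch, the setting could be switched on an index with the standard update index settings API. The index name my_index is a placeholder for illustration and is not part of the original test setup.

```console
PUT my_index/_settings
{
  "index.translog.durability" : "async"
}
```

Setting the value back to request restores the synchronous (default) state. In the asynchronous state, write operations that were acknowledged after the last translog sync can be lost if a node fails, so the throughput gain comes with a durability trade-off.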
Prerequisites
- An Alibaba Cloud Elasticsearch V6.7.0 or V7.10.0 cluster is created.
For more information, see Create an Alibaba Cloud Elasticsearch cluster.
Note: Only Alibaba Cloud Elasticsearch V6.7.0 and V7.10.0 clusters of the Standard or Advanced Edition support the faster-bulk plug-in.
- The faster-bulk plug-in is installed.
For more information, see Install and remove a built-in plug-in. After the plug-in is installed, the bulk request aggregation feature is disabled by default. Before you use this plug-in, you must enable this feature.
Enable the bulk request aggregation feature
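The request below is a sketch that mirrors the disable request at the end of this topic. It assumes that the same apack.fasterbulk.combine.enabled parameter turns the feature on when it is set to true.

```console
PUT _cluster/settings
{
  "transient" : {
    "apack.fasterbulk.combine.enabled" : "true"
  }
}
```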
Configure the maximum request size and aggregation interval
PUT _cluster/settings
{
  "transient" : {
    "apack.fasterbulk.combine.flush_threshold_size" : "1mb",
    "apack.fasterbulk.combine.interval" : "50"
  }
}
- apack.fasterbulk.combine.flush_threshold_size: the maximum size of bulk requests. Default value: 1mb.
- apack.fasterbulk.combine.interval: the maximum interval at which bulk requests are aggregated. Default value: 50. Unit: ms.
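To confirm which values are currently set, the cluster settings can be read back with the standard cluster settings API. This is a sketch; the flat_settings parameter is optional and only changes the response layout so that each setting appears as a single dotted key.

```console
GET _cluster/settings?flat_settings=true
```

Because the configuration request uses transient, the values are expected to be lost after a full cluster restart, as is the case for any transient cluster setting.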
Enable directed routing
- Enable directed routing for the Elasticsearch cluster
PUT _cluster/settings
{
  "persistent" : {
    "index.direct_routing.global.enable" : "true"
  }
}
- Enable directed routing for a specified index
PUT index/_settings
{
  "index.direct_routing.enable" : "true"
}
Disable the bulk request aggregation feature
PUT _cluster/settings
{
  "transient" : {
    "apack.fasterbulk.combine.enabled" : "false"
  }
}
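In standard Elasticsearch, assigning null to a cluster setting removes the explicit value and restores the default. Assuming the plug-in's parameters behave like other dynamic cluster settings, which this topic does not confirm, the request size and interval configured earlier could be cleared as follows.

```console
PUT _cluster/settings
{
  "transient" : {
    "apack.fasterbulk.combine.flush_threshold_size" : null,
    "apack.fasterbulk.combine.interval" : null
  }
}
```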