This topic describes how to configure split points in the script to fix the slow synchronization issue that occurs when Tablestore Reader is used to synchronize full data.
Problem description
When Tablestore Reader is used to synchronize full data, data is synchronized at a low speed. The following sample code shows how to configure full data synchronization:
"reader": {
  "plugin": "ots",
  "parameter": {
    "datasource": "",
    "table": "",
    "column": [],
    "range": {
      "begin": [
        {
          "type": "INF_MIN"
        }
      ],
      "end": [
        {
          "type": "INF_MAX"
        }
      ]
    }
  }
}
Cause
The data volume is large, but no split points are configured in the script, so only one thread is created to read the data. As a result, synchronization is slow.
Solution
If you want to synchronize a large amount of data by using Tablestore Reader, configure split points in the script by performing the following steps:
Obtain the information about the required split points by using one of the following methods:
Use Tablestore SDK for Java to call the ComputeSplitPointsBySize operation. For more information, see Split data into shards of a specific size.
Sample response:
LowerBound:pkname1:INF_MIN, pkname2:INF_MIN
UpperBound:pkname1:cbcf23c8cdf831261f5b3c052db3479e, pkname2:INF_MIN
LowerBound:pkname1:cbcf23c8cdf831261f5b3c052db3479e, pkname2:INF_MIN
UpperBound:pkname1:INF_MAX, pkname2:INF_MAX
Download the Tablestore CLI. Then, run the points -s splitSize -t tablename command. For more information, see Start the Tablestore CLI and configure access information.
Note: The unit of the splitSize value is 100 MB. If the amount of data that you want to synchronize is small, you do not need to configure split points. If the amount of data that you want to synchronize is large, we recommend that you specify a value for the splitSize parameter based on the maximum number of concurrent threads supported in your environment.
Sample response:
[
  {
    "LowerBound": {
      "PrimaryKeys": [
        { "ColumnName": "pkname1", "Value": null, "PrimaryKeyOption": 2 },
        { "ColumnName": "pkname2", "Value": null, "PrimaryKeyOption": 2 }
      ]
    },
    "UpperBound": {
      "PrimaryKeys": [
        { "ColumnName": "pkname1", "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000", "PrimaryKeyOption": 0 },
        { "ColumnName": "pkname2", "Value": null, "PrimaryKeyOption": 2 }
      ]
    },
    "Location": "80310717938EDF503FB1E26F70710391"
  },
  {
    "LowerBound": {
      "PrimaryKeys": [
        { "ColumnName": "pkname1", "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000", "PrimaryKeyOption": 0 },
        { "ColumnName": "pkname2", "Value": null, "PrimaryKeyOption": 2 }
      ]
    },
    "UpperBound": {
      "PrimaryKeys": [
        { "ColumnName": "pkname1", "Value": null, "PrimaryKeyOption": 3 },
        { "ColumnName": "pkname2", "Value": null, "PrimaryKeyOption": 3 }
      ]
    },
    "Location": "80310717938EDF503FB1E26F70710391"
  }
]
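The note above ties splitSize to the number of concurrent threads but gives no formula. The following is a minimal sketch of one possible heuristic, not an official rule: divide the total data size by the thread count to get a target shard size, then round up to whole 100 MB units. The function name suggest_split_size and both of its inputs are hypothetical, and the total data size must be estimated by you.

```python
import math

SPLIT_SIZE_UNIT_MB = 100  # per the note: one splitSize unit is 100 MB


def suggest_split_size(total_data_mb, max_concurrent_threads):
    """Suggest a splitSize so the table yields roughly one shard per thread.

    Heuristic only (an assumption, not documented behavior): target shard size
    is total size / thread count, rounded up to whole 100 MB units.
    """
    if max_concurrent_threads < 1:
        raise ValueError("max_concurrent_threads must be at least 1")
    target_shard_mb = total_data_mb / max_concurrent_threads
    return max(1, math.ceil(target_shard_mb / SPLIT_SIZE_UNIT_MB))


# Example: about 50 GB of data (51200 MB) and 8 concurrent threads.
print(suggest_split_size(51200, 8))
```

The result would be passed as the splitSize argument of the points command. Treat it as a starting point and adjust it after you see how many split points the command actually returns.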
Find the values of the first primary key column. For example, the value of pkname1 in the first LowerBound is null, the value of pkname1 in the first UpperBound is "cbcf23c8cdf831261f5b3c052db3479e\u0000", the value of pkname1 in the second LowerBound is "cbcf23c8cdf831261f5b3c052db3479e\u0000", and the value of pkname1 in the second UpperBound is null. To synchronize full data, configure the following settings in the script:
"split": [
  {
    "type": "STRING",
    "value": "cbcf23c8cdf831261f5b3c052db3479e\u0000"
  }
]
When you run the preceding script, Tablestore Reader splits the full data into two parts and reads them concurrently based on the (INF_MIN,cbcf23c8cdf831261f5b3c052db3479e\u0000) and [cbcf23c8cdf831261f5b3c052db3479e\u0000,INF_MAX) ranges. This way, data synchronization is accelerated.
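For a response with many shards, picking the values by hand is error-prone. The following is a minimal sketch of how the split array could be derived from the CLI response. It assumes, based only on the sample response above, that PrimaryKeyOption 0 marks a concrete value while 2 and 3 mark INF_MIN and INF_MAX, and that the first primary key column is a string. The function name split_points_from_cli_response is hypothetical, and the embedded sample is shortened to the first primary key column.

```python
import json

# Assumption inferred from the sample response, not from a documented enum:
# PrimaryKeyOption 0 carries a concrete value, 2 means INF_MIN, 3 means INF_MAX.
CONCRETE_VALUE = 0

# Shortened copy of the CLI sample response. The raw string keeps \u0000 as a
# JSON escape so that json.loads decodes it.
SAMPLE_RESPONSE = r'''
[
  {"LowerBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": null, "PrimaryKeyOption": 2}]},
   "UpperBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000", "PrimaryKeyOption": 0}]}},
  {"LowerBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": "cbcf23c8cdf831261f5b3c052db3479e\u0000", "PrimaryKeyOption": 0}]},
   "UpperBound": {"PrimaryKeys": [{"ColumnName": "pkname1", "Value": null, "PrimaryKeyOption": 3}]}}
]
'''


def split_points_from_cli_response(response_text):
    """Collect the first primary key value of every UpperBound that is concrete."""
    split = []
    for shard in json.loads(response_text):
        first_pk = shard["UpperBound"]["PrimaryKeys"][0]
        # The last shard ends at INF_MAX (Value is null), so it adds no split point.
        if first_pk["PrimaryKeyOption"] == CONCRETE_VALUE and first_pk["Value"] is not None:
            split.append({"type": "STRING", "value": first_pk["Value"]})
    return split


# Print the fragment in the same shape as the "split" setting shown above.
print(json.dumps({"split": split_points_from_cli_response(SAMPLE_RESPONSE)}, indent=2))
```

Only UpperBound values are read because each shard's UpperBound equals the next shard's LowerBound, so reading both would duplicate every split point.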
Configure split points in the script used to synchronize data. The following sample code shows how to configure split points:
"range": {
  "begin": [
    {
      "type": "INF_MIN"
    }
  ],
  "end": [
    {
      "type": "INF_MAX"
    }
  ],
  "split": [
    {
      "type": "STRING",
      "value": "splitPoint1"
    },
    {
      "type": "STRING",
      "value": "splitPoint2"
    },
    {
      "type": "STRING",
      "value": "splitPoint3"
    }
  ]
}
If synchronization remains slow after you configure split points, submit a ticket to contact technical support.
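Before you escalate, it can help to check how many ranges a split configuration actually produces. The following sketch builds a range block like the one in the last step and lists the resulting key ranges. It assumes that each pair of neighboring bounds forms one range that is read by its own task, which generalizes the two-range example earlier in this topic. The helper names build_range and key_ranges are hypothetical.

```python
def build_range(split_values):
    """Return a "range" block with INF_MIN/INF_MAX bounds and STRING split points."""
    return {
        "begin": [{"type": "INF_MIN"}],
        "end": [{"type": "INF_MAX"}],
        "split": [{"type": "STRING", "value": v} for v in split_values],
    }


def key_ranges(range_config):
    """List the consecutive (begin, end) pairs that the split points produce."""
    bounds = ["INF_MIN"] + [p["value"] for p in range_config["split"]] + ["INF_MAX"]
    return list(zip(bounds[:-1], bounds[1:]))


# Three placeholder split points, as in the sample configuration.
config = build_range(["splitPoint1", "splitPoint2", "splitPoint3"])
for begin, end in key_ranges(config):
    print(f"{begin} .. {end}")
```

Under this assumption, three split points yield four ranges. If the number of ranges is well below the concurrency available in your environment, a smaller splitSize would produce more split points.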