DataWorks Data Integration provides TSDB Writer for you to write data points to Lindorm Time Series Database (TSDB) provided by Alibaba Cloud ApsaraDB for Lindorm. This topic describes the capabilities of synchronizing data to TSDB data sources.
Supported TSDB versions
TSDB Writer supports all versions of ApsaraDB for Lindorm and HiTSDB V2.4.X or later.
How it works
TSDB Writer connects to a TSDB instance by using the TSDB client hitsdb-client and writes data points by using the HTTP API endpoint. For more information, see TSDB SDK documentation.
Data type mappings
If the sourceDbType parameter is set to TSDB, source data is read by using TSDB Reader or OpenTSDB Reader. In this case, TSDB Writer writes the source data to Lindorm TSDB in the format of JSON strings. If the sourceDbType parameter is set to RDB, the source is a relational database. In this case, TSDB Writer parses the source data based on the records of the relational database. The following table lists the valid values of the columnType parameter and the data types that match the column types when the sourceDbType parameter is set to RDB.
Data model | Valid value of columnType | Data type |
Tag | tag | A string data type. A tag describes the features of the data source. In most cases, a tag does not change over time. |
Timestamp | timestamp | The TIMESTAMP data type. A timestamp specifies the point in time at which data is generated. The timestamp can be manually specified when data is written or automatically generated by the system. |
Field | field_string | A string data type. A field describes the measurement metrics of the data source. In most cases, a field changes over time. |
Field | field_double | A numeric data type. A field describes the measurement metrics of the data source. In most cases, a field changes over time. |
Field | field_boolean | A Boolean data type. A field describes the measurement metrics of the data source. In most cases, a field changes over time. |
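The mapping in the preceding table can be illustrated with a minimal Python sketch. The function classify_row and the shape of the returned dictionary are hypothetical and are not part of TSDB Writer. The sketch only shows how each columnType value could assign one positional value of a relational row to a tag, the timestamp, or a field.

```python
from typing import Any, Dict, List


def classify_row(column: List[str], column_type: List[str], row: List[Any]) -> Dict[str, Any]:
    """Hypothetical helper: group the values of one source row by data model."""
    # Assumption: the three lists are positional and have the same length.
    if not (len(column) == len(column_type) == len(row)):
        raise ValueError("column, columnType, and the row must have the same length")

    point: Dict[str, Any] = {"tags": {}, "timestamp": None, "fields": {}}
    for name, kind, value in zip(column, column_type, row):
        if kind == "tag":
            point["tags"][name] = str(value)        # tags are strings
        elif kind == "timestamp":
            point["timestamp"] = value              # time at which the data is generated
        elif kind == "field_string":
            point["fields"][name] = str(value)
        elif kind == "field_double":
            point["fields"][name] = float(value)    # numeric field
        elif kind == "field_boolean":
            point["fields"][name] = bool(value)
        else:
            raise ValueError(f"unsupported columnType: {kind}")
    return point


example = classify_row(
    ["tag1", "tag2", "field1", "field2", "timestamp", "field3"],
    ["tag", "tag", "field_string", "field_double", "timestamp", "field_boolean"],
    ["hz", "web01", "ok", "0.5", 1546300800000, True],
)
```

In this sketch, the sample values such as hz and web01 are made up for illustration. The real conversion rules that TSDB Writer applies to source values may differ.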
Develop a data synchronization task
Appendix: Code and parameters
Appendix: Configure a batch synchronization task by using the code editor
If you use the code editor to configure a batch synchronization task, you must configure parameters for the reader and writer of the related data source based on the format requirements in the code editor. For more information about the format requirements, see Configure a batch synchronization task by using the code editor. The following information describes the configuration details of parameters for the reader and writer in the code editor.
Code for TSDB Writer
Write data from RDB to TSDB by using the following default configurations (recommended)
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "stream",// You can replace the stream plug-in with the specific RDB plug-in. RDB databases include MySQL, Oracle, PostgreSQL, and DRDS databases.
"parameter": {},
"name": "Reader",
"category": "reader"
},
{
"stepType": "tsdb",
"parameter": {
"endpoint": "http://localhost:8242",
"username": "xxx",
"password": "xxx",
"sourceDbType": "RDB",
"batchSize": 256,
"columnType": [
"tag",
"tag",
"field_string",
"field_double",
"timestamp",
"field_boolean"
],
"column": [
"tag1",
"tag2",
"field1",
"field2",
"timestamp",
"field3"
],
"multiField": "true",
"table": "testmetric",
"ignoreWriteError": "false",
"database": "default"
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
"concurrent":1, // The maximum number of parallel threads.
"mbps":"12"// The maximum transmission rate. Unit: MB/s.
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Write data from a database that supports the OpenTSDB protocol to TSDB
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "opentsdb",
"parameter": {
"endpoint": "http://localhost:4242",
"column": [
"m1",
"m2",
"m3",
"m4",
"m5",
"m6"
],
"startTime": "2019-01-01 00:00:00",
"endTime": "2019-01-01 03:00:00"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "tsdb",
"parameter": {
"endpoint": "http://localhost:8242"
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
"concurrent":1, // The maximum number of parallel threads.
"mbps":"12"// The maximum transmission rate. Unit: MB/s.
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Use the OpenTSDB protocol to write a univariate data point to TSDB (not recommended)
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "stream",// You can replace the stream plug-in with the specific RDB plug-in. RDB databases include MySQL, Oracle, PostgreSQL, and DRDS databases.
"parameter": {},
"name": "Reader",
"category": "reader"
},
{
"stepType": "tsdb",
"parameter": {
"endpoint": "http://localhost:8242",
"username": "xxx",
"password": "xxx",
"sourceDbType": "RDB",
"batchSize": 256,
"columnType": [
"tag",
"tag",
"field_string",
"field_double",
"timestamp",
"field_boolean"
],
"column": [
"tag1",
"tag2",
"field_metric_1",
"field_metric_2",
"timestamp",
"field_metric_3"
],
"ignoreWriteError": "false"
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
"concurrent":1, // The maximum number of parallel threads.
"mbps":"12"// The maximum transmission rate. Unit: MB/s.
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Note
The TSDB metric names are taken from the names of the field columns that are specified for the column parameter. In the preceding code, each row of data in the relational database is written to three metrics: field_metric_1, field_metric_2, and field_metric_3.
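The following Python sketch illustrates the behavior described in the note: one relational row is split into one univariate data point per field column. The function expand_to_metrics is hypothetical, and the output shape (metric, timestamp, tags, value) follows the general OpenTSDB data point convention as an assumption. It is not the exact payload that TSDB Writer sends.

```python
from typing import Any, Dict, List

FIELD_TYPES = {"field_string", "field_double", "field_boolean"}


def expand_to_metrics(column: List[str], column_type: List[str], row: List[Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn one row into one data point per field column."""
    tags: Dict[str, str] = {}
    timestamp = None
    fields = []
    for name, kind, value in zip(column, column_type, row):
        if kind == "tag":
            tags[name] = str(value)
        elif kind == "timestamp":
            timestamp = value
        elif kind in FIELD_TYPES:
            fields.append((name, value))
    # Each field column name becomes the metric name of its own data point.
    return [
        {"metric": name, "timestamp": timestamp, "tags": dict(tags), "value": value}
        for name, value in fields
    ]


points = expand_to_metrics(
    ["tag1", "tag2", "field_metric_1", "field_metric_2", "timestamp", "field_metric_3"],
    ["tag", "tag", "field_string", "field_double", "timestamp", "field_boolean"],
    ["hz", "web01", "ok", 0.5, 1546300800000, True],
)
```

All three resulting points share the tags and the timestamp of the source row, which is why a single row produces three metrics in the univariate mode.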
Parameters in code for TSDB Writer
Parameter type | Parameter | Description | Required | Default value |
Common parameters | sourceDbType | The type of the source database. | No | TSDB Note Valid values: TSDB and RDB. The value TSDB indicates that the source database is an OpenTSDB, Prometheus, or Timescale database. The value RDB indicates that the source database is a relational database, such as a MySQL, Oracle, PostgreSQL, or DRDS database. |
Common parameters | endpoint | The HTTP URL of the destination TSDB database. Specify the endpoint in the format of http://IP address:Port number. You can obtain the HTTP endpoint in the ApsaraDB for Lindorm console. | Yes | No default value |
Common parameters | database | The name of the TSDB database to which data is written. | No | default Note You must create a database first. |
Common parameters | username | The username of the TSDB database. You must specify a value for this parameter if you configure authentication for the TSDB database. | No | No default value |
Common parameters | batchSize | The number of data records to write at a time. The value of this parameter is of the INT type and must be greater than 0. If you want to configure a large value for the batchSize parameter, you must reserve more memory space. | No | 100 |
Parameters for TSDB | maxRetryTime | The maximum number of retries allowed after a failure. The value of this parameter is of the INT type and must be greater than 1. | No | 3 |
Parameters for TSDB | ignoreWriteError | Specifies whether to ignore write errors. The value of this parameter is of the BOOLEAN type. If you set this parameter to true, TSDB Writer ignores write errors and continues to perform the write operation. If you set this parameter to false and the write operation still fails after the specified number of retries, the synchronization task is terminated. | No | false |
Parameters for RDB | table | The names of the metrics that you want to import to TSDB. If the multiField parameter is set to false, you can leave this parameter empty. In this case, you need to specify the names of the metrics for the column parameter. If the multiField parameter is set to true, you must configure this parameter. | No | No default value |
Parameters for RDB | multiField | Specifies whether to write a multivariate data point to TSDB by using the HTTP API endpoint. Note If you want to use the native SQL capabilities of Lindorm TSDB to access data that is written by using the HTTP API endpoint, you must create a table in TSDB. Otherwise, you can query a multivariate data point only by using the TSDB HTTP API endpoint. For more information, see Query a multivariate data point. | Yes | false Note To write a multivariate data point to TSDB, you must set the value to true. |
Parameters for RDB | column | The names of the columns whose data you want to write to the TSDB database. | Yes | No default value Note You must specify the columns in the same order as the columns specified for a reader. |
Parameters for RDB | columnType | The data types of the columns in the relational database. The following types are supported: timestamp: a timestamp column. tag: a tag column. field_string: a metric column whose value is of a string data type. field_double: a metric column whose value is of a numeric data type. field_boolean: a metric column whose value is of a Boolean data type. | Yes | No default value Note You must specify the columns in the same order as the columns specified for a reader. |
Parameters for RDB | batchSize | The number of data records to write at a time. The value of this parameter is of the INT type and must be greater than 0. | No | 100 |
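The constraints in the preceding table can be checked before a task is submitted. The following Python sketch is a hypothetical pre-check, not a DataWorks feature: the function validate_writer_parameter and its messages are made up for illustration. The rule that column and columnType have the same number of entries is an assumption derived from the requirement that both follow the column order of the reader.

```python
from typing import Any, Dict, List

VALID_COLUMN_TYPES = {"tag", "timestamp", "field_string", "field_double", "field_boolean"}


def validate_writer_parameter(parameter: Dict[str, Any]) -> List[str]:
    """Hypothetical pre-check of the "parameter" object of a tsdb writer step."""
    problems: List[str] = []
    if not parameter.get("endpoint"):
        problems.append("endpoint is required")

    batch_size = parameter.get("batchSize", 100)  # documented default: 100
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size <= 0:
        problems.append("batchSize must be an integer greater than 0")

    if parameter.get("sourceDbType", "TSDB") == "RDB":
        column = parameter.get("column")
        column_type = parameter.get("columnType")
        if not column:
            problems.append("column is required when sourceDbType is RDB")
        if not column_type:
            problems.append("columnType is required when sourceDbType is RDB")
        if column and column_type:
            # Assumption: one columnType entry per column entry.
            if len(column) != len(column_type):
                problems.append("column and columnType must have the same length")
            for unknown in sorted(set(column_type) - VALID_COLUMN_TYPES):
                problems.append(f"unsupported columnType: {unknown}")
        # The sample code passes multiField as the string "true".
        multi_field = str(parameter.get("multiField", "false")).lower() == "true"
        if multi_field and not parameter.get("table"):
            problems.append("table is required when multiField is true")
    return problems


good = {
    "endpoint": "http://localhost:8242",
    "sourceDbType": "RDB",
    "batchSize": 256,
    "columnType": ["tag", "tag", "field_string", "field_double", "timestamp", "field_boolean"],
    "column": ["tag1", "tag2", "field1", "field2", "timestamp", "field3"],
    "multiField": "true",
    "table": "testmetric",
}
bad = dict(good, columnType=["tag", "tag", "field_string", "field_double", "timestamp", "field_bool"])
```

In this sketch, validate_writer_parameter(good) returns an empty list, and validate_writer_parameter(bad) reports the misspelled field_bool type.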