DataWorks Data Integration provides MetaQ Reader for you to read data from Message Queue. This topic describes the capabilities of synchronizing data from MetaQ data sources.
Supported MetaQ versions
<dependency>
  <groupId>com.taobao.metaq.final</groupId>
  <artifactId>metaq-client</artifactId>
  <version>4.0.1</version>
</dependency>
<dependency>
  <groupId>com.aliyun.openservices</groupId>
  <artifactId>ons-sdk</artifactId>
  <version>1.3.1</version>
</dependency>
Limits
- You can configure MetaQ Reader only by using the code editor to read data from Message Queue.
- MetaQ Reader supports only exclusive resource groups for Data Integration.
Data types
Data type | MetaQ Reader for batch data read |
---|---|
STRING | Supported |
Data Integration data type | Message Queue data type |
---|---|
STRING | STRING |
Develop a data synchronization node
- For more information about the configuration procedure, see Configure a batch synchronization node by using the code editor.
- For information about all parameters that are configured and the code that is run when you use the code editor to configure a batch synchronization node, see Appendix: Code and parameters.
Appendix: Code and parameters
Appendix: Configure a batch synchronization node by using the code editor
If you use the code editor to configure a batch synchronization node, you must configure parameters for the reader and writer of the related data source based on the format requirements in the code editor. For more information about the format requirements, see Configure a batch synchronization node by using the code editor. The following information describes the configuration details of parameters for the reader and writer in the code editor.
Code for MetaQ Reader
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "metaqreader",
          "parameter": {
            "accessId": "<yourAccessKeyId>",
            "accessKey": "<yourAccessKeySecret>",
            "consumerId": "Test01",
            "topicName": "test",
            "subExpression": "*",
            "onsChannel": "ALIYUN",
            "domainName": "***.aliyun.com",
            "contentType": "singlestringcolumn",
            "beginOffset": "lastRead",
            "nullCurrentOffset": "begin",
            "fieldDelimiter": ",",
            "column": [
              "col0"
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": false
          }
        }
      }
    ]
  }
}
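Because the script is plain JSON, a key that is accidentally repeated inside one object is easy to miss: most parsers silently keep only the last value. The following is a minimal sketch, written in Python and independent of DataWorks, of a local check that rejects repeated keys before a script is pasted into the code editor. The function name `load_job_strict` is hypothetical and is not part of any DataWorks tool.

```python
import json


def load_job_strict(text):
    """Parse a job script and raise ValueError if any object repeats a key."""

    def reject_duplicates(pairs):
        # The json module calls this hook once per JSON object, passing the
        # (key, value) pairs in document order, so repeated keys are visible.
        seen = {}
        for key, value in pairs:
            if key in seen:
                raise ValueError(f"duplicate key: {key!r}")
            seen[key] = value
        return seen

    return json.loads(text, object_pairs_hook=reject_duplicates)


# Hypothetical fragments of a reader "parameter" object.
ok = load_job_strict('{"topicName": "test", "fieldDelimiter": ","}')
print(ok["topicName"])

try:
    load_job_strict(
        '{"fieldDelimiter": ",", "column": ["col0"], "fieldDelimiter": ","}'
    )
except ValueError as err:
    print(err)
```

The hook runs for nested objects as well, so a repeated key anywhere in the reader or writer configuration is reported.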
Parameters in code for MetaQ Reader
Parameter | Description | Required |
---|---|---|
accessId | The AccessKey ID that you use to access Message Queue. | Yes |
accessKey | The AccessKey secret that you use to access Message Queue. | Yes |
consumerId | The consumer ID. A consumer is also known as a message subscriber, which receives and consumes messages. The consumer ID is the identifier of a type of consumer. In most cases, the consumers that have the same consumer ID receive and consume the same type of message and use the same consumption logic. | Yes |
topicName | The topic of the messages that you want to consume. A topic is used to classify messages. It is the primary classifier. | Yes |
subExpression | The subtopic of the messages. | Yes |
onsChannel | The channel that is used for authentication when MetaQ Reader connects to Message Queue. | Yes |
unitName | The destination unit that receives messages. | No |
instanceName | The name of the consumer instance. | No |
domainName | The endpoint that you use to connect to Message Queue. | Yes |
contentType | The type of the messages. Valid values: singlestringcolumn, text, and json. | Yes |
beginOffset | The offset from which MetaQ Reader starts to read data. Valid values: begin and lastRead. | No |
nullCurrentOffset | The offset from which MetaQ Reader starts to read data if the last offset is null. Valid values: begin and current. | Yes |
fieldDelimiter | The column delimiter that is used to separate message strings, such as commas (,). Control characters are supported. Example: \u0001. | Yes |
column | The names of the fields from which you want to read data in the messages. | Yes |
beginDateTime | The start time of data consumption. This parameter specifies the left boundary of a left-closed, right-open interval. The value is a time string in the yyyyMMddHHmmss format. This parameter can be used together with the scheduling parameters in DataWorks. Note: The beginDateTime and endDateTime parameters must be used in pairs. | No |
endDateTime | The end time of data consumption. This parameter specifies the right boundary of a left-closed, right-open interval. The value is a time string in the yyyyMMddHHmmss format. This parameter can be used together with the scheduling parameters in DataWorks. Note: The beginDateTime and endDateTime parameters must be used in pairs. | No |
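The Required column, the listed valid values, and the pairing rule for beginDateTime and endDateTime can be checked mechanically before a node is submitted. The following is a minimal sketch in Python, not DataWorks code. The function name `check_reader_parameters` is hypothetical, and the required-parameter list is copied from the table above. The check that beginDateTime is earlier than endDateTime is an assumption inferred from the left-closed, right-open interval. The sketch also assumes that any scheduling parameters have already been replaced with concrete values.

```python
from datetime import datetime

# Parameters marked "Yes" in the Required column of the table.
REQUIRED = (
    "accessId", "accessKey", "consumerId", "topicName", "subExpression",
    "onsChannel", "domainName", "contentType", "nullCurrentOffset",
    "fieldDelimiter", "column",
)
# Valid values listed in the table.
VALID_VALUES = {
    "contentType": {"singlestringcolumn", "text", "json"},
    "beginOffset": {"begin", "lastRead"},
    "nullCurrentOffset": {"begin", "current"},
}
TIME_FORMAT = "%Y%m%d%H%M%S"  # Python equivalent of yyyyMMddHHmmss


def parse_time(value):
    """Parse a 14-digit yyyyMMddHHmmss string, raising ValueError otherwise."""
    if len(value) != 14 or not value.isdigit():
        raise ValueError(f"not in yyyyMMddHHmmss format: {value!r}")
    return datetime.strptime(value, TIME_FORMAT)


def check_reader_parameters(params):
    """Return a list of problems found in a MetaQ Reader "parameter" object."""
    problems = [f"missing required parameter: {name}"
                for name in REQUIRED if name not in params]

    for name, allowed in VALID_VALUES.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")

    has_begin = "beginDateTime" in params
    has_end = "endDateTime" in params
    if has_begin != has_end:
        problems.append("beginDateTime and endDateTime must be used in pairs")
    elif has_begin:
        try:
            begin = parse_time(params["beginDateTime"])
            end = parse_time(params["endDateTime"])
        except ValueError:
            problems.append("beginDateTime and endDateTime must use yyyyMMddHHmmss")
        else:
            # Assumption: the interval [begin, end) is empty unless begin < end.
            if begin >= end:
                problems.append("beginDateTime must be earlier than endDateTime")
    return problems


# Parameter values taken from the sample script in this topic.
sample = {
    "accessId": "<yourAccessKeyId>", "accessKey": "<yourAccessKeySecret>",
    "consumerId": "Test01", "topicName": "test", "subExpression": "*",
    "onsChannel": "ALIYUN", "domainName": "***.aliyun.com",
    "contentType": "singlestringcolumn", "beginOffset": "lastRead",
    "nullCurrentOffset": "begin", "fieldDelimiter": ",", "column": ["col0"],
}
print(check_reader_parameters(sample))
print(check_reader_parameters({**sample, "beginDateTime": "20240101000000"}))
```

Returning a list of problems instead of raising on the first one lets all issues in a script be reported in a single pass.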