DataWorks provides RestAPI Reader and RestAPI Writer for you to read data from and write data to RestAPI data sources. This topic describes the capabilities of synchronizing data from or to RestAPI data sources.
Limits
RestAPI data sources support only exclusive resource groups for Data Integration.
DataWorks does not allow you to configure a timeout period when you use this type of data source. The built-in timeout period for a request in DataWorks is 60 seconds. If the time required to return the result of your API call exceeds 60 seconds, your task may fail.
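Because the 60-second limit cannot be changed, it can be useful to time the API yourself before you configure a task. The following Python snippet is a minimal sketch of such a check. The helper name fits_within_timeout and the call_api callable are hypothetical and are not part of DataWorks; replace the stand-in call with your own HTTP request.

```python
import time
from typing import Callable, Tuple

# Built-in request timeout stated in the limits above.
REQUEST_TIMEOUT_SECONDS = 60.0


def fits_within_timeout(call_api: Callable[[], object],
                        limit: float = REQUEST_TIMEOUT_SECONDS) -> Tuple[bool, float]:
    """Time one API call and report whether it finished under the limit."""
    start = time.monotonic()
    call_api()
    elapsed = time.monotonic() - start
    return elapsed < limit, elapsed


if __name__ == "__main__":
    # Stand-in for a real request (assumption): sleeps briefly instead of calling an API.
    ok, seconds = fits_within_timeout(lambda: time.sleep(0.1))
    print(f"within limit: {ok}, elapsed: {seconds:.2f}s")
```

If the measured time regularly approaches 60 seconds, consider reducing the amount of data returned by each request.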
Data type mappings
Category | RestAPI data type |
Integer | LONG and INT |
String | STRING |
Floating point | DOUBLE and FLOAT |
Boolean value | BOOLEAN |
Date and time | DATE |
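The mappings in the preceding table can be expressed as a simple lookup, for example to check the column types in a task script before you run it. The following Python snippet is only an illustration; the names TYPE_CATEGORY and category_of are hypothetical and are not part of DataWorks.

```python
from typing import Optional

# Category for each RestAPI data type, copied from the table above.
TYPE_CATEGORY = {
    "LONG": "Integer",
    "INT": "Integer",
    "STRING": "String",
    "DOUBLE": "Floating point",
    "FLOAT": "Floating point",
    "BOOLEAN": "Boolean value",
    "DATE": "Date and time",
}


def category_of(type_name: str) -> Optional[str]:
    """Return the category for a type name, or None if the table does not list it."""
    return TYPE_CATEGORY.get(type_name.strip().upper())
```

A return value of None indicates a type name that does not appear in the table.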
Add a data source
Before you develop a synchronization task in DataWorks, you must add the required data source to DataWorks by following the instructions in Add and manage data sources. You can view the infotips of parameters in the DataWorks console to understand the meanings of the parameters when you add a data source.
Develop a data synchronization task
For information about where to configure a synchronization task and the configuration procedure, see the following configuration guides.
Configure a batch synchronization task to synchronize data of a single table
For more information about the configuration procedure, see Configure a batch synchronization task by using the codeless UI and Configure a batch synchronization task by using the code editor.
For information about all parameters that are configured and the code that is run when you use the code editor to configure a batch synchronization task, see Appendix: Code and parameters.
FAQ
Is specifying the number of page flips the only way to paginate a response?
Yes. You can specify only the number of page flips for a response.
Can I configure automatic page flipping for a response?
No. Automatic page flipping is not supported. With automatic page flipping, paging would stop only when the required data is returned, so the number of pages would not be known in advance and the data could not be sharded.
The number of page flips that I specified for a response is greater than the actual number of pages, so the additional pages contain no data. How does the system handle this?
If a request returns no result, the corresponding page contains no data. In this case, the system continues with the next request.
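The behavior described in the preceding answers can be pictured with a short Python sketch: a fixed number of pages is requested, and a page without data contributes no records. This is a simplified illustration and not the DataWorks implementation. The fetch_page callable, the read_pages helper, and page numbering that starts at 1 are assumptions.

```python
from typing import Callable, Iterator, List


def read_pages(fetch_page: Callable[[int], List[dict]],
               page_count: int) -> Iterator[dict]:
    """Request exactly page_count pages; pages with no data contribute nothing."""
    for page_number in range(1, page_count + 1):
        records = fetch_page(page_number)
        if not records:
            continue  # empty page: move on to the next request
        yield from records


# Hypothetical API with only two pages of data.
FAKE_PAGES = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}


def fake_fetch(page_number: int) -> List[dict]:
    return FAKE_PAGES.get(page_number, [])


# Four pages are requested, but only the first two contain data.
print(list(read_pages(fake_fetch, 4)))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

Because the page count is fixed up front, the page numbers could in principle be divided among parallel workers, which is not possible when paging stops only after the data runs out.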
Can RestAPI Reader parse only one level of data in the JSON-formatted response?
Yes, RestAPI Reader can parse only one level of data in the JSON-formatted response.
How do I configure RestAPI Reader to read data of a non-array type?
Make sure that the dataPath parameter is set to a path that points to data of a non-array type when you configure the parameter field for RestAPI Reader. This can help RestAPI Reader correctly locate the fields from which you want to read data. For example, you can configure dataPath:"data.list". In addition, set the dataMode parameter to multiData. This way, DataWorks processes the data of a non-array type as multiple separate data records.
Note: If you set the dataMode parameter to multiData, the column parameter does not take effect. You must directly specify the path of data that you want to read in the dataPath parameter.
The following code provides a configuration example:
    reader: {
        name: "restapi",
        parameter: {
            dataPath: "data.list",
            dataMode: "multiData",
            // Other parameters
        }
    }
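To see what a dot-separated path such as data.list points to, the following Python sketch walks the path through a sample JSON response. It only illustrates the idea of a data path and does not show how RestAPI Reader is implemented. The resolve_data_path helper and the sample response are hypothetical.

```python
from typing import Any


def resolve_data_path(response: dict, data_path: str) -> Any:
    """Walk a dot-separated path such as "data.list" through nested objects."""
    current: Any = response
    for key in data_path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"path segment {key!r} not found in response")
        current = current[key]
    return current


# Sample response (assumption): data.list holds a single object, not an array.
sample_response = {"data": {"list": {"id": 1, "name": "alice"}}}
print(resolve_data_path(sample_response, "data.list"))  # {'id': 1, 'name': 'alice'}
```

Running a check like this against a real response from your API can confirm that the path you plan to use in dataPath exists before you run the task.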
Appendix: Code and parameters
Configure a batch synchronization task by using the code editor
If you want to configure a batch synchronization task by using the code editor, you must configure the related parameters in the script based on the unified script format requirements. For more information, see Configure a batch synchronization task by using the code editor. The following information describes the parameters that you must configure for data sources when you configure a batch synchronization task by using the code editor.