If your business requires database capabilities such as high concurrent read and write performance, high scalability, high availability, complex retrieval, and big data analysis, but your existing database architecture cannot meet these requirements or the cost of transforming it is high, you can use DataWorks Data Integration to migrate data from your existing databases to Tablestore tables. You can also use DataWorks Data Integration to migrate Tablestore data across instances or Alibaba Cloud accounts, or to migrate Tablestore data to Object Storage Service (OSS) or MaxCompute. This way, you can back up Tablestore data and use Tablestore data in other services.
Scenarios
DataWorks Data Integration is a stable, efficient, and scalable data synchronization platform. It is suitable for migrating and synchronizing data between disparate data sources, such as MySQL, Oracle, MaxCompute, and Tablestore.
Tablestore allows you to use DataWorks Data Integration to migrate database data to Tablestore, migrate Tablestore data across instances or Alibaba Cloud accounts, and migrate Tablestore data to OSS or MaxCompute.
Migrate database data to Tablestore
DataWorks provides stable and efficient data synchronization between disparate data sources, which allows you to migrate data from various databases to Tablestore. The following figure shows the synchronization between Tablestore and various data sources.
For information about the data sources and Reader and Writer plug-ins supported by DataWorks, see Supported data source types, Reader plug-ins, and Writer plug-ins.
Migrate or synchronize Tablestore data across instances or Alibaba Cloud accounts
You can configure Tablestore-related Reader and Writer plug-ins in DataWorks to synchronize data in Tablestore data tables or time series tables. The following figure shows the synchronization process. The following table describes Tablestore-related Reader and Writer plug-ins.
Plug-in | Description |
OTSReader | Reads data from Tablestore tables. You can specify the range of data that you want to extract, which enables incremental extraction. |
OTSStreamReader | Exports data from Tablestore tables in incremental mode. |
OTSWriter | Writes data to Tablestore. |
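To make the plug-in roles concrete, the following script-mode sketch outlines a batch synchronization task that uses OTSReader to read a full data table and OTSWriter to write the rows to another table. The data source names (src_tablestore, dst_tablestore), table names, columns, and types are placeholder assumptions for this example, and parameter details can vary across DataWorks versions, so treat this as an illustrative outline rather than a definitive configuration:

```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "ots",
      "category": "reader",
      "parameter": {
        "datasource": "src_tablestore",
        "table": "source_table",
        "column": [{"name": "pk1"}, {"name": "col1"}],
        "range": {
          "begin": [{"type": "INF_MIN"}],
          "end": [{"type": "INF_MAX"}]
        }
      }
    },
    {
      "stepType": "ots",
      "category": "writer",
      "parameter": {
        "datasource": "dst_tablestore",
        "table": "destination_table",
        "primaryKey": [{"name": "pk1", "type": "string"}],
        "column": [{"name": "col1", "type": "string"}],
        "writeMode": "PutRow"
      }
    }
  ],
  "setting": {
    "errorLimit": {"record": "0"},
    "speed": {"concurrent": 2, "throttle": false}
  }
}
```

The full range from INF_MIN to INF_MAX reads the entire table; narrowing the range is one way to extract only a slice of the data.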
Migrate Tablestore data to OSS or MaxCompute
You can migrate Tablestore data to OSS or MaxCompute based on your business scenario.
MaxCompute is a fully managed data warehouse service that can process terabytes or petabytes of data at high speeds. You can use MaxCompute to back up Tablestore data or migrate Tablestore data to MaxCompute and use Tablestore data in MaxCompute.
OSS is a secure, cost-effective, and highly reliable service that can store large amounts of data. You can use OSS to back up Tablestore data or synchronize Tablestore data to OSS and download objects from OSS to your local devices.
Migration solutions
You can use DataWorks Data Integration to migrate data between Tablestore and various data sources.
You can use a data import solution to synchronize data from sources such as MySQL, Oracle, Kafka, HBase, MaxCompute, and PolarDB-X 2.0 to Tablestore. You can also synchronize data between Tablestore data tables or between Tablestore time series tables.
You can use a data export solution to synchronize data from Tablestore to MaxCompute or OSS.
Import data
The following table describes data import solutions.
Solution | Description |
Synchronize MySQL data to Tablestore | You can migrate MySQL data only to Tablestore data tables. During migration, the Reader script configurations of MySQL and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see MySQL data source and Tablestore data source. |
Synchronize Oracle data to Tablestore | You can migrate Oracle data only to Tablestore data tables. During migration, the Reader script configurations of Oracle and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see Oracle data source and Tablestore data source. |
Synchronize Kafka data to Tablestore | You can migrate Kafka data to Tablestore data tables or time series tables. Important: Before you migrate Kafka data, you must select a Tablestore data model to store the data based on your business scenario. During migration, the Reader script configurations of Kafka and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see Kafka data source and Tablestore data source. |
Synchronize HBase data to Tablestore | You can migrate HBase data only to Tablestore data tables. During migration, the Reader script configurations of HBase and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see HBase data source and Tablestore data source. |
Synchronize MaxCompute data to Tablestore | You can migrate MaxCompute data only to Tablestore data tables. During migration, the Reader script configurations of MaxCompute and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see MaxCompute data source and Tablestore data source. |
Synchronize PolarDB-X 2.0 data to Tablestore | You can migrate data from PolarDB-X 2.0 only to Tablestore data tables. During migration, the Reader script configurations of PolarDB-X 2.0 and the Writer script configurations of Tablestore are used. |
Synchronize data between Tablestore data tables | You can migrate data from a Tablestore data table only to another Tablestore data table. During migration, the Reader script configurations and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see Tablestore data source. When you specify the Reader script configurations and the Writer script configurations of Tablestore, refer to the configurations that are used to read and write data in tables in the Wide Column model. |
Synchronize data between Tablestore time series tables | You can migrate data from a Tablestore time series table only to another Tablestore time series table. During migration, the Reader script configurations and the Writer script configurations of Tablestore are used. For information about the source and destination configurations, see Tablestore data source. When you specify the Reader script configurations and the Writer script configurations of Tablestore, refer to the configurations that are used to read and write data in tables in the TimeSeries model. |
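As an illustration of how an import solution is wired together, the following sketch outlines a script-mode task that reads rows from a MySQL table and writes them to a Tablestore data table. The data source names (mysql_source, tablestore_target), the table name, and the column names and types are assumptions for this example; exact Reader and Writer parameters vary by DataWorks version, so refer to MySQL data source and Tablestore data source for the authoritative settings:

```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "mysql",
      "category": "reader",
      "parameter": {
        "datasource": "mysql_source",
        "table": "user",
        "column": ["id", "name", "score"],
        "splitPk": "id",
        "where": ""
      }
    },
    {
      "stepType": "ots",
      "category": "writer",
      "parameter": {
        "datasource": "tablestore_target",
        "table": "user",
        "primaryKey": [{"name": "id", "type": "int"}],
        "column": [
          {"name": "name", "type": "string"},
          {"name": "score", "type": "double"}
        ],
        "writeMode": "UpdateRow"
      }
    }
  ],
  "setting": {
    "errorLimit": {"record": "0"},
    "speed": {"concurrent": 2, "throttle": false}
  }
}
```

In this sketch, splitPk lets the task shard the MySQL read across concurrent threads, and writeMode UpdateRow makes repeated runs overwrite attribute columns instead of failing on existing rows.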
Export data
The following table describes data export solutions.
Solution | Description |
Synchronize Tablestore data to MaxCompute | You can use MaxCompute to back up Tablestore data or migrate Tablestore data to MaxCompute. During migration, the Reader script configurations of Tablestore and the Writer script configurations of MaxCompute are used. For information about the source and destination configurations, see Tablestore data source and MaxCompute data source. |
Synchronize Tablestore data to OSS | You can synchronize Tablestore data to OSS as a backup and download the objects from OSS to your local devices. During migration, the Reader script configurations of Tablestore and the Writer script configurations of OSS are used. For information about the source and destination configurations, see Tablestore data source and OSS data source. |
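The following sketch outlines a script-mode export task that reads a full Tablestore table and writes the rows to OSS as a CSV object. The data source names (tablestore_source, oss_target), the object prefix, and the column list are placeholders, and the fileFormat and writeMode values shown here are examples rather than the only options; verify the parameters against your DataWorks version:

```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "ots",
      "category": "reader",
      "parameter": {
        "datasource": "tablestore_source",
        "table": "user",
        "column": [{"name": "id"}, {"name": "name"}, {"name": "score"}],
        "range": {
          "begin": [{"type": "INF_MIN"}],
          "end": [{"type": "INF_MAX"}]
        }
      }
    },
    {
      "stepType": "oss",
      "category": "writer",
      "parameter": {
        "datasource": "oss_target",
        "object": "backup/user_data",
        "fileFormat": "csv",
        "fieldDelimiter": ",",
        "writeMode": "truncate"
      }
    }
  ],
  "setting": {
    "errorLimit": {"record": "0"},
    "speed": {"concurrent": 2, "throttle": false}
  }
}
```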
Prerequisites
After you determine a migration solution, make sure that the following preparations are made:
Network connections are established between DataWorks and the source and between DataWorks and the destination.
The required operations are performed on the source service: confirming the version, preparing the account, granting the required permissions, and completing service-specific configurations. For more information, see the configuration requirements in the documentation of the source.
The destination service is activated, and the required resources are created. For more information, see the configuration requirements in the documentation of the destination.
Usage notes
If you require technical support when you migrate data, submit a ticket.
Make sure that DataWorks Data Integration supports data migration of the specific product version.
The data type of the destination must match the data type of the source. Otherwise, dirty data may be generated during migration.
After you determine the migration solution, make sure to read the limits and usage notes in the documentation of the source and destination.
Before you migrate Kafka data, you must select a Tablestore data model to store the data based on your business scenario.
Configuration process
After you determine your migration solution, use the following process to configure data migration in DataWorks Data Integration.
The following table describes the configuration steps.
No. | Step | Description |
1 | Add data sources | Create the required data sources based on the migration solution. |
2 | Configure a batch synchronization task by using the codeless UI | DataWorks Data Integration provides the codeless UI and step-by-step instructions to help you configure a batch synchronization task. The codeless UI is easy to use but provides only limited features. |
3 | Verify migration results | View the imported data in the destination based on the migration solution. |
Examples
Import data
DataWorks Data Integration allows you to import data from sources, such as MySQL, Oracle, and MaxCompute, to Tablestore data tables. In this example, data is imported from MaxCompute to a Tablestore data table.
Prerequisites
Step 1: Add a Tablestore data source and a MaxCompute data source
Step 2: Configure a batch synchronization task by using the codeless UI
Step 3: View the data imported to Tablestore
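Step 2 in this example uses the codeless UI, which generates an equivalent script-mode configuration behind the scenes. For reference, a minimal sketch of such a configuration might look like the following; the data source names (maxcompute_source, tablestore_target), the table names, the example partition, and all column names and types are assumptions for illustration:

```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "odps",
      "category": "reader",
      "parameter": {
        "datasource": "maxcompute_source",
        "table": "user_snapshot",
        "partition": "pt=20240101",
        "column": ["id", "name", "score"]
      }
    },
    {
      "stepType": "ots",
      "category": "writer",
      "parameter": {
        "datasource": "tablestore_target",
        "table": "user",
        "primaryKey": [{"name": "id", "type": "int"}],
        "column": [
          {"name": "name", "type": "string"},
          {"name": "score", "type": "double"}
        ],
        "writeMode": "UpdateRow"
      }
    }
  ]
}
```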
Export data
You can use DataWorks Data Integration to export Tablestore data to MaxCompute or OSS.
Export full Tablestore data to MaxCompute. For more information, see Export full data from Tablestore to MaxCompute.
Synchronize Tablestore data to OSS.
Full export
You can export full data from Tablestore to OSS. For more information, see Export full data from Tablestore to OSS.
Incremental synchronization
You can synchronize incremental data from Tablestore to OSS. For more information, see Synchronize incremental data to OSS.
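A minimal sketch of such an incremental task, assuming a source data table named user and an OSS object prefix incremental/user_data, is shown below. OTSStreamReader tracks its progress in a status table, and the time window is typically supplied through scheduling parameters such as ${startTime} and ${endTime}; parameter names vary by DataWorks version, so verify them against the Tablestore data source documentation:

```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "otsstream",
      "category": "reader",
      "parameter": {
        "datasource": "tablestore_source",
        "dataTable": "user",
        "statusTable": "TableStoreStreamReaderStatusTable",
        "startTimeString": "${startTime}",
        "endTimeString": "${endTime}",
        "maxRetries": 30,
        "isExportSequenceInfo": false
      }
    },
    {
      "stepType": "oss",
      "category": "writer",
      "parameter": {
        "datasource": "oss_target",
        "object": "incremental/user_data",
        "fileFormat": "csv",
        "fieldDelimiter": ",",
        "writeMode": "append"
      }
    }
  ]
}
```

Because each run exports only the changes within its time window, this task is usually scheduled to run periodically, with the scheduling parameters advancing the window on each run.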
Billing
After you import data to Tablestore, you are charged by Tablestore for storage usage based on the amount of stored data.
When you use a migration tool to access Tablestore, you are charged by Tablestore for read and write throughput based on your read and write requests. Metered read and write CUs and reserved read and write CUs are billed separately. Whether metered or reserved CUs are consumed depends on the type of the instance that you access.
Note: For more information about instance types and CUs, see Instance and Read and write throughput.
When you use DataWorks tools, you are charged for specific features and resources. For more information, see Purchase guide.
Other solutions
You can download Tablestore data to a local file based on your business requirements. For more information, see Download data in Tablestore to a local file.
You can also use other migration tools, such as Tunnel Service, to import data.
Migration tool | Description | Migration solution |
DataX | DataX abstracts the synchronization between different data sources into a Reader plug-in that reads data from the source and a Writer plug-in that writes data to the destination. | Synchronize data from one table to another table in Tablestore |
Tunnel Service | Tunnel Service is an integrated service that consumes full and incremental data based on the Tablestore API. It is suitable for scenarios in which the source of the migration or synchronization is Tablestore. Tunnel Service provides tunnels for exporting and consuming data in full, incremental, and differential modes. After you create tunnels, you can consume the historical and incremental data that is exported from a specific table. | Synchronize data from one table to another table in Tablestore |
Appendix: Field type mappings
This section describes the field type mappings between common services and Tablestore. When you configure a synchronization task, set the field mappings based on these type mappings.
Field type mapping between MaxCompute and Tablestore
Field type in MaxCompute | Field type in Tablestore |
STRING | STRING |
BIGINT | INTEGER |
DOUBLE | DOUBLE |
BOOLEAN | BOOLEAN |
BINARY | BINARY |
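As a hedged illustration of how such a mapping surfaces in practice, the following OTSWriter column fragment declares a Tablestore type for each destination column. The lowercase type keywords (for example, int for INTEGER) and all column names are assumptions for this example; confirm the accepted type keywords in the Tablestore data source documentation:

```json
{
  "primaryKey": [
    {"name": "user_id", "type": "string"}
  ],
  "column": [
    {"name": "order_count", "type": "int"},
    {"name": "total_amount", "type": "double"},
    {"name": "is_active", "type": "bool"},
    {"name": "avatar", "type": "binary"}
  ]
}
```

If a declared destination type is incompatible with the source type, the affected rows are treated as dirty data, which is why the mappings in these tables should guide the column configuration.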
Field type mapping between MySQL and Tablestore
Field type in MySQL | Field type in Tablestore |
STRING | STRING |
INT or INTEGER | INTEGER |
DOUBLE, FLOAT, or DECIMAL | DOUBLE |
BOOL or BOOLEAN | BOOLEAN |
BINARY | BINARY |
Field type mapping between Kafka and Tablestore
Schema type in Kafka | Field type in Tablestore |
STRING | STRING |
INT8, INT16, INT32, or INT64 | INTEGER |
FLOAT32 or FLOAT64 | DOUBLE |
BOOLEAN | BOOLEAN |
BYTES | BINARY |