If you encounter slow queries when you use an ApsaraDB RDS database, you can synchronize data from the database to an Alibaba Cloud Elasticsearch cluster and run your queries and analytics there. Alibaba Cloud Elasticsearch is a Lucene-based, distributed search and analytics engine that allows you to store, query, and analyze large volumes of data in near real time. You can use Data Transmission Service (DTS), Logstash, DataWorks, or Canal to synchronize data from an ApsaraDB RDS database to an Alibaba Cloud Elasticsearch cluster. This topic describes each method and its use scenarios so that you can select a method based on your business requirements.
| Method | Description | Use scenario | Usage note | References |
| --- | --- | --- | --- | --- |
| Use DTS to synchronize data in real time | DTS synchronizes data by subscribing to binary logs. Synchronization by using DTS has only millisecond-level latency and does not negatively affect the source database. | You require data to be synchronized in real time. | | Use DTS to synchronize MySQL data to an Alibaba Cloud Elasticsearch cluster in real time |
| Use the logstash-input-jdbc plug-in to synchronize data | You can use the logstash-input-jdbc plug-in to query the data in an ApsaraDB RDS database and migrate the data to an Elasticsearch cluster. During synchronization, the plug-in polls the database at regular intervals to identify the most recently inserted or updated data. It then queries all identified data in a single batch and migrates the data to the Elasticsearch cluster. Synchronization by using the logstash-input-jdbc plug-in has second-level latency, so its real-time performance is lower than that of DTS. | | | Use Logstash to synchronize data from ApsaraDB RDS for MySQL to Elasticsearch |
| Use DataWorks to synchronize offline data | DataWorks is a comprehensive service that provides modules such as Data Integration, DataStudio, and Data Quality. You can use DataWorks to import and store structured data, transform and develop the data, and then synchronize the processed data to Elasticsearch clusters or other data systems. | | | Use DataWorks to synchronize data from a MySQL database to an Alibaba Cloud Elasticsearch cluster |
| Use Canal to synchronize MySQL data | Canal synchronizes data in real time by subscribing to binary logs. Synchronization by using Canal has only millisecond-level latency and does not negatively affect the source database. | You require data to be synchronized in real time. | | Use Canal to synchronize MySQL data to Alibaba Cloud Elasticsearch |
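
The polling behavior described for the logstash-input-jdbc plug-in can be illustrated with a minimal sketch. This is not the plug-in's implementation. It assumes an in-memory SQLite database as a stand-in for ApsaraDB RDS, a Python dict as a stand-in for an Elasticsearch index, and a hypothetical `products` table with an integer `updated_at` column that increases on every insert or update. The names `poll_changes` and `sync_once` are illustrative only.

```python
import sqlite3


def poll_changes(conn, last_seen):
    """Return all rows changed after the checkpoint, oldest first."""
    cur = conn.execute(
        "SELECT id, name, updated_at FROM products "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    return cur.fetchall()


def sync_once(conn, index, last_seen):
    """Run one polling cycle and return the new checkpoint."""
    for row_id, name, updated_at in poll_changes(conn, last_seen):
        # Upsert by primary key so that updates overwrite earlier versions.
        index[row_id] = {"name": name, "updated_at": updated_at}
        last_seen = max(last_seen, updated_at)
    return last_seen


# Stand-in source database (assumption: not a real RDS instance).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "keyboard", 100), (2, "mouse", 105)],
)

index = {}  # stand-in for the Elasticsearch index
checkpoint = sync_once(conn, index, 0)  # first cycle picks up both rows

conn.execute(
    "UPDATE products SET name = 'wireless mouse', updated_at = 110 WHERE id = 2"
)
checkpoint = sync_once(conn, index, checkpoint)  # second cycle picks up only the update
```

In a real deployment each cycle would run on a schedule, which is why this approach has second-level rather than millisecond-level latency. A query of this kind also cannot see rows that were physically deleted from the source table, because deleted rows no longer match the filter.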