Performing full-text searches, multi-dimensional queries, and statistical analysis on data stored in MySQL can be a cumbersome task. Thankfully, Alibaba Cloud DataWorks offers a streamlined solution for synchronizing data from MySQL to Alibaba Cloud Elasticsearch unlocking powerful search and analytical capabilities. In this article, we'll walk you through the entire process, complete with code examples and best practices.
Before we dive in, ensure you have the following prerequisites in place:
1)An ApsaraDB RDS for MySQL instance (follow this guide to create one).
2)An Alibaba Cloud Elasticsearch cluster with the Auto Indexing feature enabled (learn how to create a cluster and configure the YML file).
3)A DataWorks workspace .
Note: The MySQL instance, Elasticsearch cluster, and DataWorks workspace must reside in the same region and time zone.
Let's start by creating a database and table in your ApsaraDB RDS for MySQL instance. Run the following SQL statements:
1-- Create table
2CREATE TABLE `es_test` (
3 `id` bigint(32) NOT NULL,
4 `name` varchar(32) NULL,
5 `age` bigint(32) NULL,
6 `hobby` varchar(32) NULL,
7 PRIMARY KEY (`id`)
8) ENGINE=InnoDB
9DEFAULT CHARACTER SET=utf8;
10
11-- Insert data
12INSERT INTO `es_test` (`id`,`name`,`age`,`hobby`) VALUES (1,'user1',22,'music');
13INSERT INTO `es_test` (`id`,`name`,`age`,`hobby`) VALUES (2,'user2',23,'sport');
14INSERT INTO `es_test` (`id`,`name`,`age`,`hobby`) VALUES (3,'user3',43,'game');
15INSERT INTO `es_test` (`id`,`name`,`age`,`hobby`) VALUES (4,'user4',24,'run');
16INSERT INTO `es_test` (`id`,`name`,`age`,`hobby`) VALUES (5,'user5',42,'basketball');
Next, create an exclusive resource group for Data Integration and associate it with a Virtual Private Cloud (VPC) and your DataWorks workspace. This ensures fast and stable data transmission.
1)Log in to the DataWorks console
2)Select your region and navigate to Resource Groups > Exclusive Resource Groups.
3)Click Create Resource Group for Data Integration, configure the resource group name, and complete the purchase.
4)Find the newly created resource group, click Network Settings, and associate it with the VPC where your MySQL instance and Elasticsearch cluster reside.
5)Add the CIDR blocks of the vSwitches (for MySQL and Elasticsearch) to their respective IP address whitelists.
6)Return to the Resource Groups page and click Change Workspace to associate the resource group with your DataWorks workspace.
It's time to add your MySQL instance and Elasticsearch cluster as data sources in Data Integration.
1)In the DataWorks console, navigate to Workspaces > Shortcuts > Data Integration.
2)In the left-side navigation pane, click Data Source > Add Data Source.
3)Add your MySQL data source by selecting MySQL and configuring the basic information (follow this guide).
4)Repeat the process to add your Elasticsearch data source (follow this guide).
Now, it's time to configure and run the batch synchronization task using the exclusive resource group.
1)In the DataWorks console, navigate to Workspaces > Shortcuts > Data Integration.
2)Create a new workflow and add an Offline Synchronization node.
3)Configure the network and resources:
4)Click Next and configure the task:
5)(Optional) Configure scheduling properties for the task.
6)Save and submit the task. You can run it immediately or set it to run on a schedule.
To verify that the data has been successfully synchronized to Elasticsearch, follow these steps:
1)Log in to the Kibana console of your Elasticsearch cluster.
2)Click the icon in the top-left corner and select Dev Tools.
3)On the Console tab, run the following command (replace es_test with your index name):
POST /es_test/_search?pretty
{
"query": { "match_all": {}}
}
If the data synchronization was successful, you should see the synchronized data in the response.
Congratulations! You've successfully synchronized data from MySQL to Alibaba Cloud Elasticsearch using DataWorks. Now, you can leverage Elasticsearch's powerful search and analytical capabilities on your MySQL data.
Ready to start your journey with Elasticsearch on Alibaba Cloud? Explore our tailored Cloud solutions and services to take the first step towards transforming your data into a visual masterpiece. Click here to embark on your 30-Day Free Trial
Sync Your MySQL Data to Elasticsearch with logstash-input-jdbc on Alibaba Cloud
Real-Time Synchronizing Data from MySQL to Alibaba Cloud Elasticsearch with DTS
Alibaba Cloud Community - May 10, 2024
Alibaba Clouder - September 26, 2019
Data Geek - May 10, 2024
JDP - December 30, 2021
Haemi Kim - September 15, 2021
Alibaba Clouder - September 3, 2019
Alibaba Cloud Elasticsearch helps users easy to build AI-powered search applications seamlessly integrated with large language models, and featuring for the enterprise: robust access control, security monitoring, and automatic updates.
Learn MoreA real-time data warehouse for serving and analytics which is compatible with PostgreSQL.
Learn MoreSecure and easy solutions for moving you workloads to the cloud
Learn MoreThis solution helps you easily build a robust data security framework to safeguard your data assets throughout the data security lifecycle with ensured confidentiality, integrity, and availability of your data.
Learn MoreMore Posts by Data Geek