All Products
Search
Document Center

AnalyticDB:Use Kettle to import data to Data Warehouse Edition

Last Updated:Dec 30, 2023

This topic describes how to use Kettle to import local data to AnalyticDB for MySQL Data Warehouse Edition (V3.0). Excel data is used in the example.

Background information

Kettle is a popular open source extract-transform-load (ETL) tool that is used for data collection, conversion, and migration. Kettle supports not only various relational databases and NoSQL databases such as HBase and MongoDB, but also niche data sources such as Microsoft Office Excel and Access. Kettle can support more data sources by using extensions and plug-ins.

Preparations

  • Install Kettle.

  • Create a database and a table in your AnalyticDB for MySQL cluster.

  • Add the IP address of the device on which Kettle is installed to a whitelist of the AnalyticDB for MySQL cluster. For more information, see Configure a whitelist.

  • If you want to connect to the AnalyticDB for MySQL cluster over the Internet, apply for a public endpoint for the cluster. For more information, see Apply for or release a public endpoint.

Procedure

  1. Start Kettle, and choose File > New > Conversion to create a conversion task.

  2. Choose File > New > Database Connection to create a database connection for the conversion task.

    Parameter

    Description

    Connection Name

    The name of the connection. We recommend that you specify a descriptive name that is easy to identify.

    Connection Type

    The engine of the source database. Select MySQL.

    Access

    The access mode of the connection. Select Native (JDBC).

    Host Name

    The endpoint that is used to connect to the AnalyticDB for MySQL cluster. You can view the connection information about the cluster on the Cluster Information page of the AnalyticDB for MySQL console. For more information, see Connect to an AnalyticDB for MySQL cluster.

    Database Name

    The name of the AnalyticDB for MySQL database.

    Port Number

    The port number that is used to connect to the AnalyticDB for MySQL cluster. The default port number is 3306.

    User Name

    The name of the database account that is used to connect to the AnalyticDB for MySQL cluster. You can use one of the following account types:

    • Privileged account

    • Standard account

    Password

    The password of the account that is used to connect to the AnalyticDB for MySQL cluster.

    Note

    Do not select Use Result Streaming Cursor when you configure the parameters.

  3. After you configure the preceding parameters, click Test. In the Database Connection Test dialog box, follow the prompts to verify if the connection to the AnalyticDB for MySQL database is successful. After the connection passes the test, click OK.

  4. In the left-side navigation pane of Kettle, click the Core objects tab and choose Input > Excel Input. Then, drag and drop Excel Input to the workspace.

  5. Double-click Excel Input in the workspace. In the Excel Input dialog box, click Browse and Add to add an Excel file to Selected Files.

    Configure parameters on tabs such as Worksheet, Content, and Field, and click Preview to check whether the specified values meet your requirements.

  6. In the left-side navigation pane of Kettle, click the Core objects tab and choose Output > Table Output. Then, drag and drop Table Output to the workspace.

  7. Add a connection line from Excel Input to Table Output.

  8. Double-click Table Output in the workspace. In the Table Output dialog box, configure the parameters.

    • Target Schema: Enter the name of the AnalyticDB for MySQL database.

    • Target Table: Enter the name of the table in the AnalyticDB for MySQL database.

    • Select Specify database fields.

    • Select Use batch update for inserts.

    On the Database fields tab of the Table Output dialog box, click Get fields and Enter field mapping to map columns in the Excel file to those in the AnalyticDB for MySQL table.

  9. Click the white arrow to perform the conversion. During this period, you can check the operation logs and operating status.

    After the data in the Excel file is imported to the AnalyticDB for MySQL database, you can use AnalyticDB for MySQL to analyze the data.