AnalyticDB: Access ApsaraDB RDS for MySQL

Last Updated: Nov 20, 2024

AnalyticDB for MySQL allows you to access ApsaraDB RDS for MySQL within an Alibaba Cloud account or across Alibaba Cloud accounts. You can access ApsaraDB RDS for MySQL over an elastic network interface (ENI) or an SSL connection. Compared with ENI-based access, SSL-based access encrypts the network connection and provides higher data security. This topic describes how to access ApsaraDB RDS for MySQL over an ENI and over an SSL connection.

Prerequisites

  • An AnalyticDB for MySQL Data Lakehouse Edition cluster is created.

  • An ApsaraDB RDS for MySQL instance is created in the same region as the AnalyticDB for MySQL cluster. For more information, see Create a cluster and Create an ApsaraDB RDS for MySQL instance.

  • A job resource group is created for the AnalyticDB for MySQL cluster. For more information, see Create a resource group.

  • A database account is created for the AnalyticDB for MySQL cluster.

  • The ApsaraDB RDS for MySQL instance is added to a security group. The inbound and outbound rules of the security group allow requests from the port of the ApsaraDB RDS for MySQL instance. For more information, see Configure a security group for an ApsaraDB RDS for MySQL instance and Add a security group rule.

  • Object Storage Service (OSS) is activated and a bucket is created in the same region as the AnalyticDB for MySQL cluster. For more information, see Activate OSS and Create buckets.

  • Authorization is complete. For more information, see Perform authorization.

    Important

    To access data within an Alibaba Cloud account, you must have the AliyunADBSparkProcessingDataRole permission. To access data across Alibaba Cloud accounts, you must perform authorization for other Alibaba Cloud accounts.

Prepare data

Create a database in the ApsaraDB RDS for MySQL instance, create a table in the database, and then write data to this table. Sample statements:

CREATE DATABASE `test`;

CREATE TABLE `test`.`persons` (
  `id` int(11) DEFAULT NULL,
  `first_name` varchar(32) DEFAULT NULL,
  `last_name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
;

INSERT INTO `test`.`persons` VALUES(1,'a','b',5);
INSERT INTO `test`.`persons` VALUES(2,'c','d',6);
INSERT INTO `test`.`persons` VALUES(3,'e','f',7);

Access ApsaraDB RDS for MySQL over an ENI

Upload the driver and the JAR package that is required for the Spark job

  1. Write a program that is used to access a table in the ApsaraDB RDS for MySQL instance. Then, compile the program into a JAR package that is required for the Spark job. In this example, the JAR package is named rds_test.jar. Sample code:

    package com.aliyun.spark
    import org.apache.spark.sql.SparkSession
    object SparkRDS {
      def main(args: Array[String]): Unit = {
        val sparkSession = SparkSession.builder()
          .appName("rds mysql test")
          .getOrCreate()
        // The internal endpoint of the ApsaraDB RDS for MySQL instance. For more information, see View and change the endpoints and port numbers. 
        val url = "jdbc:mysql://rm-bp11mpql1e01****.mysql.rds.aliyuncs.com"
        // The name of the table in the ApsaraDB RDS for MySQL instance. Specify the name in the db_name.table_name format. 
        val dbtable = "test.persons"
        // The database account that is used to connect to the ApsaraDB RDS for MySQL database. 
        val user = "mysql_username"
        // The password of the database account. 
        val password = "mysql_password"
        val jdbcDF = sparkSession.read
          .format("jdbc")
          .option("url", url)
          // The driver class of MySQL Connector/J 8.x. 
          .option("driver", "com.mysql.cj.jdbc.Driver")
          .option("dbtable", dbtable)
          .option("user", user)
          .option("password", password)
          .load()
        jdbcDF.show()
      }
    }
  2. Download the driver that is compatible with the engine version of the ApsaraDB RDS for MySQL instance from https://dev.mysql.com/downloads/connector/j/.

    In this example, mysql-connector-java-8.0.11.jar is downloaded.

  3. Upload the JAR package and the driver to OSS. For more information, see Upload objects.

Access ApsaraDB RDS for MySQL within an Alibaba Cloud account

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
        "name": "rds-mysql-example",
        "jars": [
            "oss://testBucketName/mysql-connector-java-8.0.11.jar"
        ],
        "file": "oss://testBucketName/rds_test.jar",
        "className": "com.aliyun.spark.SparkRDS",
        "conf": {
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small"
        }
    }

    The following table describes the parameters.

    Parameter

    Description

    name

    The name of the Spark job.

    jars

    The OSS path of the ApsaraDB RDS for MySQL driver.

    In this example, the OSS path of the mysql-connector-java-8.0.11.jar package is used.

    file

    The OSS path of the JAR package that is required for the Spark job.

    className

    The entry class of the Java or Scala application.

    In this example, com.aliyun.spark.SparkRDS is used.

    spark.adb.eni.enabled

    Specifies whether to enable ENI.

    spark.adb.eni.vswitchId

    The vSwitch ID of the ApsaraDB RDS for MySQL instance. To view the vSwitch ID, move the pointer over VPC on the Database Connection page.

    spark.adb.eni.securityGroupId

    The ID of the security group to which the ApsaraDB RDS for MySQL instance is added. For more information, see Configure a security group for an ApsaraDB RDS for MySQL instance.

    conf

    The configuration parameters that are required for the Spark job, which are similar to those of Apache Spark. The parameters must be in the key:value format. Separate multiple parameters with commas (,). For more information, see Spark application configuration parameters.

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.

Access ApsaraDB RDS for MySQL across Alibaba Cloud accounts

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
        "name": "rds-mysql-example",
        "jars": [
            "oss://testBucketName/mysql-connector-java-8.0.11.jar"
        ],
        "file": "oss://testBucketName/rds_test.jar",
        "className": "com.aliyun.spark.SparkRDS",
        "conf": {
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.roleArn": "acs:ram::testAccountID:role/testUserName"
        }
    }

    The following table describes the parameters.

    Parameter

    Description

    spark.adb.eni.roleArn

    The RAM role that is used to access ApsaraDB RDS for MySQL across Alibaba Cloud accounts. Separate multiple roles with commas (,). Specify the parameter in the acs:ram::testAccountID:role/testUserName format.

    • testAccountID: the ID of the Alibaba Cloud account that owns the ApsaraDB RDS for MySQL data source.

    • testUserName: the name of the RAM role that is created when you perform authorization across Alibaba Cloud accounts. For more information, see the "Perform authorization across Alibaba Cloud accounts" section of the Perform authorization topic.

    For more information about the parameters, see the preceding parameter table.
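
    The acs:ram::testAccountID:role/testUserName format described above can be sketched as a small helper. This is an illustrative sketch, not part of the product SDK; the object and method names are hypothetical, and testAccountID and testUserName are the placeholders from this topic:

    ```scala
    // Hypothetical helper that assembles a spark.adb.eni.roleArn value in the
    // documented acs:ram::<accountID>:role/<roleName> format.
    object RoleArn {
      def build(accountId: String, roleName: String): String =
        s"acs:ram::$accountId:role/$roleName"

      // Multiple roles are separated with commas (,), as noted above.
      def buildMany(roles: Seq[(String, String)]): String =
        roles.map { case (id, name) => build(id, name) }.mkString(",")
    }
    ```

    For example, RoleArn.build("testAccountID", "testUserName") produces the value shown in the sample job configuration.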

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.

Access ApsaraDB RDS for MySQL over an SSL connection

Before you access ApsaraDB RDS for MySQL over an SSL connection, make sure that SSL encryption is enabled for the ApsaraDB RDS for MySQL instance and the connection to the internal endpoint is encrypted. For more information, see Configure the SSL encryption feature.

Download the CA certificate from the ApsaraDB RDS console and upload the certificate to OSS

  1. Log on to the ApsaraDB RDS console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Instances. On the page that appears, click the ID of the instance that you want to view.

  2. In the left-side navigation pane, click Data Security.

  3. Click Download CA Certificate.

    Important

    The default validity period of a CA certificate is one year. If the CA certificate expires, you must generate another certificate. You cannot use an expired CA certificate to access ApsaraDB RDS for MySQL over an SSL connection.

  4. Decompress the CA certificate package and upload the extracted JKS certificate file to OSS. For more information, see Upload objects.

Upload the driver and the JAR package that is required for the Spark job

  1. Write a program that is used to access a table in the ApsaraDB RDS for MySQL instance. Then, compile the program into a JAR package that is required for the Spark job. In this example, the JAR package is named test.jar. Sample code:

    package org.example
    import org.apache.spark.sql.SparkSession
    
    object Test {
      def main(args: Array[String]): Unit = {
        // The OSS path of the JKS certificate file, which is in the oss://testBucketName/folder/ApsaraDB-CA-Chain.jks format. 
        val JKS_FILE_PATH = args(0)
        
        // The database account that is used to connect to the ApsaraDB RDS for MySQL database. 
        val USERNAME = args(1)
    
        // The password of the database account. 
        val PASSWORD = args(2)
    
        // The name of the ApsaraDB RDS for MySQL database. 
        val DATABASE_NAME = args(3)
    
        // The name of the table in the ApsaraDB RDS for MySQL instance. 
        val TABLE_NAME = args(4)
        // The internal endpoint of the ApsaraDB RDS for MySQL instance. 
        val mysqlUrl = "jdbc:mysql://rm-bp11mpql1e01****.mysql.rds.aliyuncs.com:3306/?" +
          "useSSL=true" +
          // Derive the local path of the JKS certificate file from its OSS path. 
          s"&trustCertificateKeyStoreUrl=file:///tmp/${JKS_FILE_PATH.stripPrefix("oss://")}" +
          "&trustCertificateKeyStorePassword=apsaradb" +
          "&trustCertificateKeyStoreType=JKS"

        val spark = SparkSession.builder().getOrCreate()

        spark.read.format("jdbc")
          .option("driver", "com.mysql.cj.jdbc.Driver")
          .option("url", mysqlUrl)
          .option("user", USERNAME)
          .option("password", PASSWORD)
          .option("dbtable", s"${DATABASE_NAME}.${TABLE_NAME}")
          .load()
          .show()
      }
    }

    The following table describes the parameters.

    Parameter

    Description

    useSSL

    Specifies whether to enable SSL encryption. Valid values:

    • true

    • false (default)

    In this example, true is selected.

    trustCertificateKeyStoreUrl

    The on-premises path of the JKS certificate file, which is in the file:///tmp/<JKS_FILE_PATH> format. JKS_FILE_PATH specifies the OSS path of the JKS certificate file.

    For example, if the OSS path of the JKS certificate file is oss://testBucketName/folder/ApsaraDB-CA-Chain.jks, the on-premises path of the JKS certificate file is file:///tmp/testBucketName/folder/ApsaraDB-CA-Chain.jks.

    trustCertificateKeyStorePassword

    The password of the JKS certificate file. The value of this parameter is apsaradb.

    trustCertificateKeyStoreType

    The storage format of the certificate file. The value of this parameter is JKS.
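
    The oss:// to file:///tmp/ mapping described for trustCertificateKeyStoreUrl can be sketched as a small helper. This is an illustrative sketch only; the object name is hypothetical:

    ```scala
    // Hypothetical helper illustrating the documented path mapping:
    // a JKS certificate file at oss://<bucket>/<object> is downloaded to
    // /tmp/<bucket>/<object>, so the JDBC URL references it as
    // file:///tmp/<bucket>/<object>.
    object JksPath {
      def localUrl(ossPath: String): String = {
        require(ossPath.startsWith("oss://"), "expected an oss:// path")
        "file:///tmp/" + ossPath.stripPrefix("oss://")
      }
    }
    ```

    For example, JksPath.localUrl("oss://testBucketName/folder/ApsaraDB-CA-Chain.jks") returns "file:///tmp/testBucketName/folder/ApsaraDB-CA-Chain.jks", which matches the path used in the sample code.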

  2. Upload the test.jar package to OSS. For more information, see Upload objects.

Access ApsaraDB RDS for MySQL within an Alibaba Cloud account

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
        "file": "oss://testBucketName/test.jar",
        "className": "org.example.Test",
        "name": "MYSQL PEM Test",
        "conf": {
            "spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****"
        }
    }

    The following table describes the parameters.

    Parameter

    Description

    name

    The name of the Spark job.

    file

    The OSS path of the JAR package that is required for the Spark job.

    className

    The entry class of the Java or Scala application.

    In this example, org.example.Test is used.

    spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES

    The Spark driver parameter, which specifies the OSS path of the JKS certificate file. Separate multiple JKS certificate files with commas (,). Example: spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES: "oss://testBucketName/a.jks,oss://testBucketName/b.jks".

    spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES

    The Spark executor parameter, which specifies the OSS path of the JKS certificate file. Separate multiple JKS certificate files with commas (,). Example: spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES: "oss://testBucketName/a.jks,oss://testBucketName/b.jks".

    spark.adb.eni.enabled

    Specifies whether to enable ENI.

    spark.adb.eni.vswitchId

    The vSwitch ID of the ApsaraDB RDS for MySQL instance. To view the vSwitch ID, move the pointer over VPC on the Database Connection page.

    spark.adb.eni.securityGroupId

    The ID of the security group to which the ApsaraDB RDS for MySQL instance is added. For more information, see Configure a security group for an ApsaraDB RDS for MySQL instance.

    conf

    The configuration parameters that are required for the Spark job, which are similar to those of Apache Spark. The parameters must be in the key:value format. Separate multiple parameters with commas (,). For more information, see Spark application configuration parameters.

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.

Access ApsaraDB RDS for MySQL across Alibaba Cloud accounts

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Job Development > Spark JAR Development.

  3. Select a job resource group and a job type for the Spark job. In this example, the Batch type is selected.

  4. Run the following code in the Spark editor:

    {
      "file": "oss://testBucketName/test.jar",
      "className": "org.example.Test",
      "name": "MYSQL PEM Test",
      "conf": {
            "spark.kubernetes.driverEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.executorEnv.ADB_SPARK_DOWNLOAD_FILES": "oss://testBucketName/folder/ApsaraDB-CA-Chain.jks",
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp17jqw3lrrobn6y****",
            "spark.adb.eni.securityGroupId": "sg-bp163uxgt4zandx****",
            "spark.adb.eni.roleArn": "acs:ram::testAccountID:role/testUserName"
        }
    }

    The following table describes the parameters.

    Parameter

    Description

    spark.adb.eni.roleArn

    The RAM role that is used to access ApsaraDB RDS for MySQL across Alibaba Cloud accounts. Separate multiple roles with commas (,). Specify the parameter in the acs:ram::testAccountID:role/testUserName format.

    • testAccountID: the ID of the Alibaba Cloud account that owns the ApsaraDB RDS for MySQL data source.

    • testUserName: the name of the RAM role that is created when you perform authorization across Alibaba Cloud accounts. For more information, see the "Perform authorization across Alibaba Cloud accounts" section of the Perform authorization topic.

    For more information about the parameters, see the preceding parameter table.

  5. Click Run Now.

  6. After the Spark job is run, view the data of the ApsaraDB RDS for MySQL table in the logs of the Spark job. For more information, see the "View information about a Spark application" section of the Spark editor topic.