AnalyticDB: Public network configuration for Spark application access

Last Updated: Nov 13, 2023

This topic describes how to configure public network environments that are required for Spark applications in AnalyticDB for MySQL Data Lakehouse Edition (V3.0) to access self-managed databases or third-party cloud services.

Background information

Internet NAT gateways provide the public address translation feature. You can create an Internet NAT gateway for a virtual private cloud (VPC) to allow all instances or clusters in the VPC to access the Internet and provide Internet-facing services. For more information, see What is an Internet NAT gateway?

Usage notes

If the self-managed database or third-party cloud service enforces network security settings such as a firewall or an IP address whitelist, add either the public IP addresses that you specified when you created the SNAT entry or the CIDR block of the vSwitch to those settings. This allows Spark applications to access the data source.
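
As a rough illustration of the whitelist requirement above, the following Python sketch checks whether a source address would be covered by a set of whitelist entries. The EIP and vSwitch CIDR block are hypothetical placeholder values, and `is_whitelisted` is an illustrative helper, not part of any AnalyticDB or Spark API; replace the values with the addresses from your own SNAT entry and vSwitch.

```python
import ipaddress
from typing import Iterable


def is_whitelisted(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if source_ip matches any entry (a single IP or a CIDR block)."""
    addr = ipaddress.ip_address(source_ip)
    # strict=False tolerates entries written with host bits set, e.g. "192.168.0.10/24".
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist)


# Hypothetical values for illustration only.
EIP = "203.0.113.10"             # assumed public IP address used by the SNAT entry
VSWITCH_CIDR = "192.168.0.0/24"  # assumed CIDR block of the vSwitch
whitelist = [EIP, VSWITCH_CIDR]

print(is_whitelisted("203.0.113.10", whitelist))  # True: matches the EIP
print(is_whitelisted("192.168.0.57", whitelist))  # True: inside the vSwitch CIDR block
print(is_whitelisted("198.51.100.7", whitelist))  # False: not covered by any entry
```

A check of this kind can help confirm that the entries you plan to add to a firewall or whitelist actually cover the addresses your Spark application will use.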

Procedure

  1. Create an Internet NAT gateway. For more information, see Create an Internet NAT gateway.

    The Internet NAT gateway must reside in the same region as the AnalyticDB for MySQL cluster.

  2. Associate an elastic IP address (EIP) with the Internet NAT gateway. For more information, see Associate an EIP with the Internet NAT gateway.

  3. Create an SNAT entry. For more information, see Create an SNAT entry.

    We recommend that you select Specify vSwitch for the SNAT entry.

  4. Configure the following key parameters in your Spark application. Example:

    {
        "comments": ["-- Here is just an example of SparkPi. Modify the content and run your spark program."],
        "args": ["1000"],
        "file": "local:///tmp/spark-examples.jar",
        "name": "SparkPi",
        "className": "org.apache.spark.examples.SparkPi",
        "conf": {
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": "vsw-bp1ghmwrkeaw3xvnd****",
            "spark.adb.eni.securityGroupId": "sg-bp1airvjxl5vpr2****"
        }
    }

    Parameters:

    | Key parameter | Description |
    | --- | --- |
    | spark.adb.eni.enabled | Specifies whether to enable Elastic Network Interface (ENI) when you use external tables to access external data sources. Set this parameter to true. |
    | spark.adb.eni.vswitchId | The vSwitch ID of the ENI. Set this parameter to the vSwitch ID that is specified when you create an SNAT entry. |
    | spark.adb.eni.securityGroupId | The security group ID of the ENI. Set this parameter to the ID of the security group that belongs to the VPC of the Internet NAT gateway. |

    For more information about Spark applications, see Overview of Spark application development.
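
The three ENI parameters in step 4 are easy to omit or mistype when a job configuration is assembled by hand. The following Python sketch shows one way to check a job definition before you submit it. `check_eni_conf` and `REQUIRED_ENI_KEYS` are hypothetical names introduced for this example, not part of AnalyticDB or Spark, and the masked IDs are the placeholders from the sample above.

```python
import json

# The three key parameters described in the table above.
REQUIRED_ENI_KEYS = (
    "spark.adb.eni.enabled",
    "spark.adb.eni.vswitchId",
    "spark.adb.eni.securityGroupId",
)


def check_eni_conf(job: dict) -> list:
    """Return a list of problems found in the job's ENI settings (empty if none)."""
    conf = job.get("conf", {})
    problems = []
    for key in REQUIRED_ENI_KEYS:
        if conf.get(key) in (None, ""):
            problems.append(f"missing {key}")
    enabled = conf.get("spark.adb.eni.enabled")
    # The parameter must be set to true for the ENI to be used.
    if enabled not in (None, "") and str(enabled).lower() != "true":
        problems.append("spark.adb.eni.enabled must be true")
    return problems


# Job definition mirroring the sample; the IDs are masked placeholders.
job = {
    "name": "SparkPi",
    "file": "local:///tmp/spark-examples.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "args": ["1000"],
    "conf": {
        "spark.driver.resourceSpec": "small",
        "spark.executor.instances": 1,
        "spark.executor.resourceSpec": "small",
        "spark.adb.eni.enabled": "true",
        "spark.adb.eni.vswitchId": "vsw-bp1ghmwrkeaw3xvnd****",
        "spark.adb.eni.securityGroupId": "sg-bp1airvjxl5vpr2****",
    },
}

problems = check_eni_conf(job)
if problems:
    print("\n".join(problems))
else:
    print(json.dumps(job, indent=4))
```

The check only confirms that the parameters are present and that ENI is enabled. It does not verify that the vSwitch matches the SNAT entry or that the security group belongs to the VPC of the Internet NAT gateway.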