This topic describes how to configure the public network environment that Spark applications in AnalyticDB for MySQL Data Lakehouse Edition (V3.0) require to access self-managed databases or third-party cloud services.
Background information
Internet NAT gateways provide the public address translation feature. You can create an Internet NAT gateway for a virtual private cloud (VPC) to allow all instances or clusters in the VPC to access the Internet and provide Internet-facing services. For more information, see What is an Internet NAT gateway?
Usage notes
If a self-managed database or third-party cloud service enforces network security settings such as firewalls or IP address whitelists, you must add either the public IP addresses that are specified in the SNAT entry or the CIDR block of the vSwitch to those settings so that Spark applications can access the data source.
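To illustrate what such a whitelist has to cover, the following Python sketch checks whether a source address would match any whitelist entry. It is a minimal sketch, not part of AnalyticDB for MySQL: the function name `is_allowed`, the sample EIP, and the sample CIDR block are hypothetical placeholders chosen for illustration.

```python
import ipaddress


def is_allowed(source_ip, allowlist):
    """Return True if source_ip falls inside any allowlist entry.

    Entries may be single addresses ("203.0.113.10") or CIDR blocks
    ("192.168.0.0/24").
    """
    addr = ipaddress.ip_address(source_ip)
    for entry in allowlist:
        # strict=False also accepts entries whose host bits are set,
        # for example "192.168.0.5/24".
        network = ipaddress.ip_network(entry, strict=False)
        if addr in network:
            return True
    return False


# Hypothetical values: the EIP used by the SNAT entry and the CIDR block
# of the vSwitch. Replace them with the values from your own environment.
ALLOWLIST = ["203.0.113.10", "192.168.0.0/24"]
```

Running a check like this against the whitelist of the data source before you submit a Spark application can help rule out a missing entry as the cause of a connection failure.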
Procedure
Create an Internet NAT gateway. For more information, see Create an Internet NAT gateway.
The Internet NAT gateway must reside in the same region as the AnalyticDB for MySQL cluster.
Associate an elastic IP address (EIP) with the Internet NAT gateway. For more information, see Associate an EIP with the Internet NAT gateway.
Create an SNAT entry. For more information, see Create an SNAT entry.
We recommend that you select Specify vSwitch for the SNAT entry.
Configure the following key parameters in your Spark application. Example:
{
  "comments": [
    "-- Here is just an example of SparkPi. Modify the content and run your spark program."
  ],
  "args": ["1000"],
  "file": "local:///tmp/spark-examples.jar",
  "name": "SparkPi",
  "className": "org.apache.spark.examples.SparkPi",
  "conf": {
    "spark.driver.resourceSpec": "small",
    "spark.executor.instances": 1,
    "spark.executor.resourceSpec": "small",
    "spark.adb.eni.enabled": "true",
    "spark.adb.eni.vswitchId": "vsw-bp1ghmwrkeaw3xvnd****",
    "spark.adb.eni.securityGroupId": "sg-bp1airvjxl5vpr2****"
  }
}
Parameters:
Key parameter | Description
spark.adb.eni.enabled | Specifies whether to enable Elastic Network Interface (ENI) when you use external tables to access external data sources. Set this parameter to true.
spark.adb.eni.vswitchId | The vSwitch ID of the ENI. Set this parameter to the vSwitch ID that is specified when you create an SNAT entry.
spark.adb.eni.securityGroupId | The security group ID of the ENI. Set this parameter to the ID of the security group that belongs to the VPC of the Internet NAT gateway.
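If you generate the job configuration programmatically, a small helper can confirm that the three ENI parameters are present before you submit the application. The following Python sketch assumes the SparkPi example shown above. The function names `build_job_config` and `missing_eni_keys` are hypothetical, and the sketch only builds and checks the JSON; it does not submit anything to AnalyticDB for MySQL.

```python
import json

# The three parameters described in the table above.
ENI_KEYS = (
    "spark.adb.eni.enabled",
    "spark.adb.eni.vswitchId",
    "spark.adb.eni.securityGroupId",
)


def build_job_config(vswitch_id, security_group_id):
    """Build the SparkPi example configuration with the ENI parameters set."""
    return {
        "args": ["1000"],
        "file": "local:///tmp/spark-examples.jar",
        "name": "SparkPi",
        "className": "org.apache.spark.examples.SparkPi",
        "conf": {
            "spark.driver.resourceSpec": "small",
            "spark.executor.instances": 1,
            "spark.executor.resourceSpec": "small",
            "spark.adb.eni.enabled": "true",
            "spark.adb.eni.vswitchId": vswitch_id,
            "spark.adb.eni.securityGroupId": security_group_id,
        },
    }


def missing_eni_keys(config):
    """Return the ENI keys that are absent or empty in config["conf"]."""
    conf = config.get("conf", {})
    return [key for key in ENI_KEYS if not conf.get(key)]


if __name__ == "__main__":
    # The IDs are the masked placeholders from the example above.
    config = build_job_config("vsw-bp1ghmwrkeaw3xvnd****", "sg-bp1airvjxl5vpr2****")
    problems = missing_eni_keys(config)
    if problems:
        raise SystemExit("Missing ENI parameters: " + ", ".join(problems))
    print(json.dumps(config, indent=2))
```

The check is deliberately limited to presence. Whether the vSwitch ID matches the one in the SNAT entry, and whether the security group belongs to the VPC of the Internet NAT gateway, still has to be verified against your own environment.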
For more information about Spark applications, see Overview of Spark application development.