Add a Spark data source

You can add a Spark data source for a Lindorm instance to quickly import data to the instance in batches. This topic describes how to add a Spark data source.

Prerequisites

A Lindorm Tunnel Service (LTS) instance is purchased.
A Lindorm instance with Lindorm Distributed Processing System (LDPS) activated is created. For more information, see Create an instance.

Procedure

Use the Lindorm console to add a Spark data source

Log on to the Lindorm console.
On the Instances page, click the ID of an instance whose engine type is LTS.
In the left-side navigation pane, click Data Sources.
Click the Compute Engine Data Source tab, and then click Add Data Source.

In the Add Data Source dialog box, configure the parameters described in the following table.

Parameter	Description
Instance type	Select Lindorm.
Region	Select the region in which your Lindorm instance is deployed.
Instance ID	Select the ID of your Lindorm instance. Note: Make sure that LDPS is activated for the Lindorm instance. For more information, see Activate LDPS and modify the configurations. Also make sure that the Lindorm instance and the LTS instance are in the same virtual private cloud (VPC). For more information about how to associate instances in different VPCs, see Overview of VPC connections.

Click OK. If the state of the Spark data source is Associated, the data source is added.

Use LTS to add a Spark data source

Log on to the web UI of the LTS instance. For more information, see Activate and log on to LTS.
In the left-side navigation pane of the LTS console, choose Data Source Manage > Add Data Source.

On the Add data source page, configure the parameters described in the following table.

Parameter	Description
Name	Enter lts_bulkload_spark.
Data Source Type	Select Spark.
Parameters	Configure the parameters for the Spark data source as a JSON object: `{ "virtualClusterName":"token", "hdfsUri":"hdfs://nn1:8020,nn2:8020", "sparkEndpoint":"http://192.168.XX.XX:10099" }`
virtualClusterName: The token of the JAR address of LDPS. To obtain the token, go to the instance details page of the Lindorm instance, click Database Connections in the left-side navigation pane, and then click the Compute Engine tab.
hdfsUri: The HDFS endpoint of the Lindorm instance, in the format `hdfs://nn1:8020,nn2:8020`. Note: To obtain the values of `nn1` and `nn2` in the endpoint, submit a ticket.
sparkEndpoint: The JAR VPC address of LDPS. To obtain the address, go to the instance details page of the Lindorm instance, click Database Connections in the left-side navigation pane, and then click the Compute Engine tab.

Click Add.
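
Because the Parameters field expects a JSON object, it can help to generate the value rather than type it by hand. The following is a minimal Python sketch that assembles the three keys described above and runs a few basic sanity checks before you paste the result into the field. The function name `build_spark_source_params` and the prefix checks are illustrative assumptions and are not part of LTS or Lindorm. The sample values are the placeholders used in this topic, so replace them with the values of your own instance.

```python
import json


def build_spark_source_params(token, hdfs_uri, spark_endpoint):
    """Return JSON text for the Parameters field of the Spark data source.

    Hypothetical helper: the checks below are assumptions based on the
    example format shown in this topic, not rules enforced by LTS.
    """
    params = {
        "virtualClusterName": token,
        "hdfsUri": hdfs_uri,
        "sparkEndpoint": spark_endpoint,
    }

    # Every key must have a non-empty value.
    missing = [key for key, value in params.items() if not value]
    if missing:
        raise ValueError("missing value for: " + ", ".join(missing))

    # Assumed format checks, derived from the example values in this topic.
    if not hdfs_uri.startswith("hdfs://"):
        raise ValueError("hdfsUri is expected to start with hdfs://")
    if not spark_endpoint.startswith(("http://", "https://")):
        raise ValueError("sparkEndpoint is expected to be an HTTP address")

    return json.dumps(params, indent=2)


if __name__ == "__main__":
    # Placeholder values copied from this topic; replace them with your own.
    print(build_spark_source_params(
        "token",
        "hdfs://nn1:8020,nn2:8020",
        "http://192.168.XX.XX:10099",
    ))
```

Building the value with `json.dumps` avoids quoting and comma mistakes that are easy to make when you edit JSON in a text field.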