You can add a Spark data source for a Lindorm instance to quickly import data to the instance in batches. This topic describes how to add a Spark data source.
Prerequisites
A Lindorm Tunnel Service (LTS) instance is purchased.
A Lindorm instance with Lindorm Distributed Processing System (LDPS) activated is created. For more information, see Create an instance.
Procedure
Use the Lindorm console to add a Spark data source
Log on to the Lindorm console.
On the Instances page, click the ID of an instance whose engine type is LTS.
In the left-side navigation pane, click Data Sources.
Click the Compute Engine Data Source tab, and then click Add Data Source.
In the Add Data Source dialog box, configure the parameters described in the following table.
Parameter | Description
Instance type | Select Lindorm.
Region | Select the region in which your Lindorm instance is deployed.
Instance ID | Select the ID of your Lindorm instance.
Note: Make sure that LDPS is activated for the Lindorm instance. For more information, see Activate LDPS and modify the configurations.
Make sure that the Lindorm instance and the LTS instance are in the same virtual private cloud (VPC). For more information about how to associate instances across different VPCs, see Overview of VPC connections.
Click OK. If the state of the Spark data source is Associated, the data source is added.
Use LTS to add a Spark data source
Log on to the web UI of the LTS instance. For more information, see Activate and log on to LTS.
In the left-side navigation pane of the LTS console, choose .
On the Add data source page, configure the parameters described in the following table.
Parameter | Description
Name | Enter lts_bulkload_spark.
Data Source Type | Select Spark.
Parameters | Configure parameters for the Spark data source.
{
  "virtualClusterName": "token",
  "hdfsUri": "hdfs://nn1:8020,nn2:8020",
  "sparkEndpoint": "http://192.168.XX.XX:10099"
}
virtualClusterName: The token of the JAR address of LDPS. You can obtain the token of the Lindorm instance by selecting Database Connections in the left-side navigation pane on the instance details page and then clicking the Compute Engine tab, as shown in the following figure.
hdfsUri: The HDFS endpoint of the Lindorm instance, in the following format: hdfs://nn1:8020,nn2:8020.
Note: To obtain the values of nn1 and nn2 in the endpoint, submit a ticket.
sparkEndpoint: The JAR VPC address of LDPS. You can obtain the address by selecting Database Connections in the left-side navigation pane on the instance details page and then clicking the Compute Engine tab, as shown in the following figure.
Click Add.
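Before you paste the JSON into the Parameters field, it can help to generate and check it with a script. The following Python code is a minimal sketch, not part of LTS or LDPS: the function name build_spark_source_params is hypothetical, and the token, NameNode addresses, and endpoint are the placeholder values from the table above. The code only assembles the three documented keys and checks that each value has the expected form.

```python
import json
from urllib.parse import urlparse


def build_spark_source_params(token, namenodes, spark_endpoint):
    """Return JSON text for the Parameters field of a Spark data source.

    Hypothetical helper for illustration only; it does not call any
    Lindorm, LTS, or LDPS API.
    """
    if not token:
        raise ValueError("virtualClusterName (token) must not be empty")
    if not namenodes:
        raise ValueError("at least one NameNode address is required")

    # Each NameNode address must look like host:port, for example nn1:8020.
    for node in namenodes:
        host, sep, port = node.partition(":")
        if not host or not sep or not port.isdigit():
            raise ValueError(f"expected host:port, got {node!r}")

    # The endpoint must be a URL with a scheme and an address.
    parsed = urlparse(spark_endpoint)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"expected a URL such as http://host:port, got {spark_endpoint!r}")

    params = {
        "virtualClusterName": token,
        # hdfsUri format from the table: hdfs://nn1:8020,nn2:8020
        "hdfsUri": "hdfs://" + ",".join(namenodes),
        "sparkEndpoint": spark_endpoint,
    }
    return json.dumps(params, indent=2)


if __name__ == "__main__":
    # Placeholder values; replace them with the values of your instance.
    print(build_spark_source_params(
        "token",
        ["nn1:8020", "nn2:8020"],
        "http://192.168.XX.XX:10099",
    ))
```

Generating the text with json.dumps avoids quoting and comma mistakes that are easy to make when you edit the JSON by hand.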