Lindorm: Add a Spark data source

Last Updated: Dec 19, 2024

You can add a Spark data source for a Lindorm instance to quickly import data to the instance in batches. This topic describes how to add a Spark data source.

Prerequisites

  • A Lindorm Tunnel Service (LTS) instance is purchased.

  • A Lindorm instance with Lindorm Distributed Processing System (LDPS) activated is created. For more information, see Create an instance.

Procedure

Use the Lindorm console to add a Spark data source

  1. Log on to the Lindorm console.

  2. On the Instances page, click the ID of an instance whose engine type is LTS.

  3. In the left-side navigation pane, click Data Sources.

  4. Click the Compute Engine Data Source tab, and then click Add Data Source.

  5. In the Add Data Source dialog box, configure the parameters described in the following table.

    Parameter        Description
    Instance type    Select Lindorm.
    Region           Select the region in which your Lindorm instance is deployed.
    Instance ID      Select the ID of your Lindorm instance.

  6. Click OK. If the state of the Spark data source is Associated, the data source is added.

Use LTS to add a Spark data source

  1. Log on to the web UI of the LTS instance. For more information, see Activate and log on to LTS.

  2. In the left-side navigation pane of the LTS console, choose Data Source Manage > Add Data Source.

  3. On the Add data source page, configure the parameters described in the following table.

    Parameter           Description
    Name                Enter lts_bulkload_spark.
    Data Source Type    Select Spark.
    Parameters          Configure parameters for the Spark data source, as in the following example:

    {
        "virtualClusterName":"token",
        "hdfsUri":"hdfs://nn1:8020,nn2:8020",
        "sparkEndpoint":"http://192.168.XX.XX:10099"
    }
    • virtualClusterName: The token of the JAR address of LDPS. You can obtain the token of the Lindorm instance by selecting Database Connections in the left-side navigation pane on the instance details page and then clicking the Compute Engine tab, as shown in the following figure.

    • hdfsUri: The HDFS endpoint of the Lindorm instance in the following format: hdfs://nn1:8020,nn2:8020.

      Note

      To obtain the value of nn1 and nn2 in the endpoint, submit a ticket.

    • sparkEndpoint: The JAR VPC address of LDPS. You can obtain the address by selecting Database Connections in the left-side navigation pane on the instance details page and then clicking the Compute Engine tab, as shown in the following figure.

  4. Click Add.
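
The Parameters value in step 3 is hand-written JSON, so a small check before pasting it into the Add data source page can catch typos. The following Python sketch is one possible way to assemble and sanity-check that JSON. The function name build_spark_source_params and the validation rules are assumptions made for illustration (they are not part of LTS or Lindorm), and the token and IP address in the usage lines are placeholders. The sketch only checks that hdfsUri follows the hdfs://nn1:8020,nn2:8020 format and that sparkEndpoint looks like http://host:port, as in the example above.

```python
import json
import re
from urllib.parse import urlparse

# One "host:port" entry inside hdfsUri (hypothetical validation rule).
_HOST_PORT = re.compile(r"^[A-Za-z0-9.\-]+:\d{1,5}$")


def build_spark_source_params(token: str, hdfs_uri: str, spark_endpoint: str) -> str:
    """Return JSON text for the Parameters field, or raise ValueError."""
    if not token:
        raise ValueError("virtualClusterName (token) must not be empty")

    # hdfsUri is expected to look like hdfs://nn1:8020,nn2:8020
    prefix = "hdfs://"
    if not hdfs_uri.startswith(prefix):
        raise ValueError("hdfsUri must start with hdfs://")
    for node in hdfs_uri[len(prefix):].split(","):
        if not _HOST_PORT.match(node):
            raise ValueError(f"invalid host:port entry in hdfsUri: {node!r}")

    # sparkEndpoint is assumed to use plain http with an explicit port,
    # matching the example in this topic.
    parsed = urlparse(spark_endpoint)
    if parsed.scheme != "http" or not parsed.hostname or parsed.port is None:
        raise ValueError("sparkEndpoint must look like http://host:port")

    params = {
        "virtualClusterName": token,
        "hdfsUri": hdfs_uri,
        "sparkEndpoint": spark_endpoint,
    }
    return json.dumps(params, indent=4)


if __name__ == "__main__":
    # Placeholder values; replace them with the values of your own instance.
    print(build_spark_source_params(
        "my-token",
        "hdfs://nn1:8020,nn2:8020",
        "http://192.168.0.10:10099",
    ))
```

The printed text can be pasted into the Parameters field. A ValueError indicates that one of the three values does not match the expected shape and should be checked against the Compute Engine tab of the instance.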