DataWorks: Configure the Kyuubi connection information

Last Updated: Nov 13, 2024

After you register an E-MapReduce (EMR) cluster to DataWorks, you can configure the Kyuubi connection information for the EMR cluster based on your business requirements. For example, you can use a custom username and password to log on to Kyuubi and run tasks. This topic describes how to configure the Kyuubi connection information for an EMR cluster in DataWorks.

Background information

Apache Kyuubi is a distributed, multi-tenant gateway that provides query services, such as SQL queries, for data lake query engines including Spark, Flink, and Trino. For more information, see Overview.

Prerequisites

An EMR cluster is registered to DataWorks. For more information, see Register an EMR cluster to DataWorks.

Configure the Kyuubi connection information

  1. Go to the Kyuubi configuration page.

    1. Go to the Management Center page.

      Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

    2. In the left-side navigation pane of the SettingCenter page, click Cluster Management. The Cluster Management page appears.

    3. Find the desired EMR cluster and choose Kyuubi Configuration > Edit Kyuubi Configuration. The Kyuubi configuration page appears.

  2. Configure the Kyuubi connection information.

    Follow the on-screen instructions to set the Connection Mode parameter based on your business requirements.

    • Connection Information of Alibaba Cloud EMR Cluster: If you select this connection mode, the default access identity that is specified when you register the EMR cluster is used to log on to Kyuubi. This mode is selected by default.

    • Custom Configuration Information: If you select this connection mode, a custom username and password are used to log on to Kyuubi. The value of the JDBC URL parameter must be in the jdbc:hive2://host:port/;user=<Username for logon>;password=<Password for logon> format.

      Note
      • The first time you select Custom Configuration Information, the value of the JDBC URL parameter is automatically filled based on the account information that you configure when you register the EMR cluster. You can modify the JDBC URL based on your business requirements.

      • If you select Pass Proxy User Information when you register the EMR cluster, the hive.server2.proxy.user configuration is added to the JDBC URL when an EMR task is run in DataWorks. The following rules apply:

        • If the placeholder DATAWORKS_PROXY_USER is not specified in the JDBC URL for the custom configuration information, the platform appends the configuration information of hive.server2.proxy.user to the end of the JDBC URL by default when the EMR task is executed.

        • If the placeholder DATAWORKS_PROXY_USER is specified in the JDBC URL for the custom configuration information, the platform dynamically replaces the placeholder with the configuration information of hive.server2.proxy.user when the EMR task is executed.
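
The JDBC URL format and the two proxy-user rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the platform's actual implementation: the function names, the example host and port, the ";" separator used when appending, and the assumption that the bare token DATAWORKS_PROXY_USER stands in for the proxy user name are all hypothetical.

```python
# Placeholder name taken from the text; treating the bare token as the
# exact placeholder syntax is an assumption.
PLACEHOLDER = "DATAWORKS_PROXY_USER"
PROXY_KEY = "hive.server2.proxy.user"


def build_jdbc_url(host: str, port: int, user: str, password: str) -> str:
    """Build a URL in the jdbc:hive2://host:port/;user=...;password=... format."""
    return f"jdbc:hive2://{host}:{port}/;user={user};password={password}"


def apply_proxy_user(jdbc_url: str, proxy_user: str) -> str:
    """Apply the two concatenation rules (hypothetical helper).

    - Placeholder present: replace it with the proxy user.
    - Placeholder absent: append hive.server2.proxy.user=<proxy_user>
      to the end of the URL (the ';' separator is an assumption).
    """
    if PLACEHOLDER in jdbc_url:
        return jdbc_url.replace(PLACEHOLDER, proxy_user)
    return f"{jdbc_url};{PROXY_KEY}={proxy_user}"


# Example values below are placeholders, not real endpoints or credentials.
plain_url = build_jdbc_url("kyuubi.example.internal", 10009, "alice", "secret")
templated_url = (
    "jdbc:hive2://kyuubi.example.internal:10009/"
    ";hive.server2.proxy.user=DATAWORKS_PROXY_USER;user=alice;password=secret"
)

print(apply_proxy_user(plain_url, "bob"))      # proxy user appended at the end
print(apply_proxy_user(templated_url, "bob"))  # placeholder replaced in place
```

Specifying the placeholder explicitly is useful when the proxy-user setting must appear at a particular position in the URL rather than at the end.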

What to do next

For information about how to configure relevant component environments and perform data development operations in DataWorks, see General development process.