E-MapReduce: Use the CLI to connect to Trino

Last Updated: Dec 10, 2024

After you create a cluster that contains the Trino service, you can use the CLI to connect to Trino and query data quickly.

Prerequisites

An E-MapReduce (EMR) cluster that contains the Trino service is created. For more information about how to create a cluster, see Create a cluster.

Limits

You cannot enable Ranger authentication and Kerberos authentication for a cluster at the same time.

DataLake clusters and custom clusters

Note

If you turn on Kerberos Authentication when you create a cluster, the cluster is a high-security cluster. Otherwise, the cluster is a common cluster.

Common clusters

  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.

  2. Run the following command to connect to the Trino CLI:

    Note

    If your cluster is of EMR V3.44.0 or a later minor version, or of EMR V5.10.0 or a later minor version, the service name is Trino. If your cluster is of a minor version earlier than EMR V3.44.0 or EMR V5.10.0, the service name is Presto.

    In EMR V3.44.0, EMR V5.10.0, or a minor version later than EMR V3.44.0 or EMR V5.10.0

    trino --server master-1-1:9090

    In a minor version earlier than EMR V3.44.0 or EMR V5.10.0

    presto --server master-1-1:9090
  3. Execute the following statement to view the table data:

    select * from <catalog>.<schema>.<table>;

    Parameters in the preceding statement:

    • <catalog>: specifies the name of the data source that you want to connect to.

    • <schema>: specifies the name of the database that you want to use.

    • <table>: specifies the name of the table from which you want to query data.

      For example, if you want to query data from the test table in the default database of Hive, you can execute the select * from hive.default.test; statement.

  4. Optional. Run the quit; command to exit the Trino CLI.
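
The version rule in the note of step 2 can be sketched as a small shell helper. This is only an illustration: the function name emr_cli_name is hypothetical, and the sketch assumes a version string such as 3.44.0 or V5.10.2 and handles only the 3.x and 5.x release lines named in the note.

```shell
# Hypothetical helper (not part of EMR): print the CLI name for an EMR version.
# The body runs in a subshell "( ... )" so the variables stay local.
emr_cli_name() (
  version="${1#[Vv]}"        # drop an optional leading "V"
  major="${version%%.*}"     # text before the first dot
  rest="${version#*.}"       # text after the first dot
  minor="${rest%%.*}"        # second version component
  if { [ "$major" -eq 3 ] && [ "$minor" -ge 44 ]; } ||
     { [ "$major" -eq 5 ] && [ "$minor" -ge 10 ]; }; then
    echo trino               # EMR V3.44.0+ or EMR V5.10.0+
  else
    echo presto              # earlier minor versions
  fi
)
```

On a common cluster, the result could then be combined with the --execute option of the CLI to run a single statement without opening the interactive prompt, for example "$(emr_cli_name 3.44.0)" --server master-1-1:9090 --execute 'select * from hive.default.test;'.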

High-security clusters

  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.

  2. Run the following command to connect to the Trino CLI:

    In EMR V3.44.0, EMR V5.10.0, or a minor version later than EMR V3.44.0 or EMR V5.10.0

    trino --server https://${FQDN}:7778 \
           --krb5-config-path /etc/krb5.conf \
           --keystore-path /etc/emr/trino-conf/keystore \
           --keystore-password ${pwd} \
           --krb5-keytab-path /etc/emr/trino-conf/trino.keytab \
           --krb5-principal trino/${FQDN}@${REALM} \
           --krb5-remote-service-name trino \
           --user trino/${FQDN} \
           --catalog ${CATALOG}

    In a minor version earlier than EMR V3.44.0 or EMR V5.10.0

    Note

    If your cluster is of EMR V3.44.0 or a later minor version, or of EMR V5.10.0 or a later minor version, the service name is Trino. If your cluster is of a minor version earlier than EMR V3.44.0 or EMR V5.10.0, the service name is Presto.

    presto --server https://${FQDN}:7778 \
           --krb5-config-path /etc/krb5.conf \
           --keystore-path /etc/emr/trino-conf/keystore \
           --keystore-password ${pwd} \
           --krb5-keytab-path /etc/emr/trino-conf/trino.keytab \
           --krb5-principal trino/${FQDN}@${REALM} \
           --krb5-remote-service-name trino \
           --user trino/${FQDN}

    Parameters in the preceding commands:

    • ${FQDN}: the fully qualified domain name (FQDN) of the master-1-1 node. The FQDN must be in the master-1-1.c-xxxxxxx.cn-xxxxxx.emr.aliyuncs.com format. You can run the hostname -f command to obtain the FQDN.

    • --krb5-config-path: the value of the http.authentication.krb5.config parameter in the config.properties file. The value is fixed as /etc/krb5.conf.

    • --keystore-path: the value of the http-server.https.keystore.path parameter in the config.properties file. The value is fixed as /etc/emr/trino-conf/keystore.

    • --keystore-password: the value of the http-server.https.keystore.key parameter in the config.properties file. In this example, the value is ${pwd}. You can run the awk -F= '/http-server.https.keystore.key/{print $2}' ${TRINO_CONF_DIR}/config.properties command on the master-1-1 node to obtain the value.

    • --krb5-keytab-path: the value of the http-server.authentication.krb5.keytab parameter in the config.properties file. The value is fixed as /etc/emr/trino-conf/trino.keytab.

    • ${REALM}: the value of the http-server.authentication.krb5.user-mapping.pattern parameter in the config.properties file. You need to obtain the value on your own. The Kerberos realm is in the EMR.C-XXXXXX.COM format.

    • --krb5-remote-service-name: the value of the http-server.authentication.krb5.service-name parameter in the config.properties file. The value is fixed as trino.

    • ${CATALOG}: the name of the data source to which you want to connect. For example, --catalog hive indicates the Hive data source.

  3. Run the following command to view the schemas contained in the existing catalog:

    show schemas;
  4. Optional. Run the quit; command to exit the Trino CLI.
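
The placeholder values used in step 2 can be collected with a few lines of shell. The following is a minimal sketch, not an official script: the function names keystore_password and trino_principal are hypothetical, and the only commands used are the awk command from the parameter descriptions and printf.

```shell
# Hypothetical helper: read the keystore password from a config.properties
# file, using the awk command given in the parameter descriptions.
keystore_password() {
  awk -F= '/http-server.https.keystore.key/{print $2}' "$1"
}

# Hypothetical helper: print the principal in the trino/${FQDN}@${REALM} form.
# $1 is the FQDN of the master-1-1 node, $2 is the Kerberos realm.
trino_principal() {
  printf 'trino/%s@%s\n' "$1" "$2"
}
```

On the master-1-1 node, the values might then be set with FQDN="$(hostname -f)" and pwd="$(keystore_password "${TRINO_CONF_DIR}/config.properties")" before you run the connection command. The realm still has to be obtained separately, as described for ${REALM}.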

Hadoop clusters

Note

For Hadoop clusters, the service name is Presto.

Common clusters

  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.

  2. Run the following command to connect to the Presto CLI:

    presto --server emr-header-1:9090 --catalog hive --schema default --user hadoop

    Parameters in the preceding command:

    • --server emr-header-1:9090: specifies the address and port number of the Presto service.

    • --catalog hive: specifies the name of the data source that you want to connect to. In this example, Hive is used. If you want to connect to other data sources, you can modify this parameter based on your business requirements.

    • --schema default: specifies the name of the database or schema that you want to use. In this example, the default database is used. You can modify this parameter based on your business requirements.

    • --user hadoop: specifies the username for authentication.

  3. Run the following command to view the schemas contained in the existing catalog:

    show schemas;
  4. Optional. Run the quit; command to exit the Presto CLI.
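
Because the catalog and schema in step 2 are meant to be adjusted, the command can be wrapped in a small function. This is a sketch under stated assumptions: the function name presto_hive and the environment variables PRESTO_BIN, PRESTO_CATALOG, and PRESTO_SCHEMA are hypothetical names introduced here for illustration, and the defaults mirror the example command.

```shell
# Hypothetical wrapper around the command in step 2.
# PRESTO_CATALOG and PRESTO_SCHEMA (assumed names) override the defaults;
# PRESTO_BIN (assumed name) lets you point at a different executable.
# The body runs in a subshell "( ... )" so the variables stay local.
presto_hive() (
  catalog="${PRESTO_CATALOG:-hive}"
  schema="${PRESTO_SCHEMA:-default}"
  "${PRESTO_BIN:-presto}" --server emr-header-1:9090 \
    --catalog "$catalog" --schema "$schema" --user hadoop "$@"
)
```

Extra arguments are passed through unchanged, so a call such as presto_hive --execute 'show schemas;' would run one statement and return to the shell instead of opening the interactive prompt.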

High-security clusters

  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.

  2. Add a principal and export the keytab file.

    1. Run the following command to start the Kerberos administration tool:

      • In EMR V3.30.0, EMR V4.5.1, or a minor version later than EMR V3.30.0 or EMR V4.5.1

        sh /usr/lib/has-current/bin/admin-local.sh /etc/ecm/has-conf -k /etc/ecm/has-conf/admin.keytab
      • In a minor version earlier than EMR V3.30.0 or EMR V4.5.1

        sh /usr/lib/has-current/bin/hadmin-local.sh /etc/ecm/has-conf -k /etc/ecm/has-conf/admin.keytab
    2. Run the following command to add a principal with a randomly generated key:

      addprinc -randkey test
      Note

      In this example, a principal named test is added.

    3. Run the following command to export the keytab file:

      xst -k /home/test.keytab test

      In this example, the keytab file is exported to the /home/ directory.

  3. Run the following command to open the Presto CLI:

    presto --server https://<hostname>:7778 \
           --catalog hive \
           --schema default \
           --keystore-path /etc/ecm/presto-conf/keystore \
           --keystore-password <passwd> \
           --krb5-keytab-path <keytab_file> \
           --krb5-principal <username>@EMR.<cluster_id>.COM \
           --krb5-remote-service-name presto \
           --user <username>

    Parameters in the preceding command:

    • <hostname>: the hostname. You can run the hostname command on the emr-header-1 node to obtain the value. The value is in the emr-header-1.cluster-xxx format.

    • <passwd>: the keystore password. You can run the sed -n 's/http-server.https.keystore.key=\([^;]*\)/\1/p' /etc/ecm/presto-conf/config.properties command on the emr-header-1 node to obtain the value.

    • <keytab_file>: the path of the exported keytab file. In this example, the path is /home/test.keytab.

    • <username>: the name of the principal that you added. In this example, the principal is test.

    • <cluster_id>: the cluster ID. You can run the hostname | grep -Eo '[0-9]+$' command on the emr-header-1 node to obtain the value.

  4. Run the following command to view the schemas contained in the existing catalog:

    show schemas;
  5. Optional. Run the quit; command to exit the Presto CLI.
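
The <passwd> and <cluster_id> lookups described in step 3 can be wrapped in helper functions so that the values do not have to be copied by hand. The sketch below is illustrative only: the function names realm_from_hostname and presto_keystore_password are hypothetical, and the sketch assumes that the hostname ends with the numeric cluster ID, as the grep command in step 3 implies.

```shell
# Hypothetical helper: build the EMR.<cluster_id>.COM realm from a hostname
# in the emr-header-1.cluster-xxx format, using the grep command from step 3.
realm_from_hostname() {
  _cluster_id="$(printf '%s\n' "$1" | grep -Eo '[0-9]+$')" || return 1
  printf 'EMR.%s.COM\n' "$_cluster_id"
}

# Hypothetical helper: read the keystore password from a config.properties
# file, using the sed command from step 3.
presto_keystore_password() {
  sed -n 's/http-server.https.keystore.key=\([^;]*\)/\1/p' "$1"
}
```

On the emr-header-1 node, the principal for the test user could then be written as "test@$(realm_from_hostname "$(hostname)")", and the password as "$(presto_keystore_password /etc/ecm/presto-conf/config.properties)".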

References

You can use JDBC to connect to Trino to query, analyze, and process complex data, or integrate query results into Java applications. For more information, see Use JDBC.