
E-MapReduce:Use Log Service to collect the logs of Spark jobs

Last Updated: Nov 14, 2024

This topic describes how to use Log Service to collect the logs of Spark jobs.

Prerequisites

Procedure

  1. Install Logtail. For more information, see Step 1: Install Logtail.

    Note

    If Logtail is already installed, skip this step and go to Step 2.

  2. Access the project that you want to manage in the Log Service console.

    1. Log on to the ACK console.

    2. On the Clusters page, find the cluster that you want to manage and click the name of the cluster in the Cluster Name/ID column or click Details in the Actions column.

    3. In the Cluster Resources area on the Basic Information tab, click the link to the right of Log Service Project.

      The details page of the project appears.

  3. On the Logstores tab, create two Logstores.

    In this example, the two Logstores are named spark-driver-log and spark-executor-log. For more information about how to create a Logstore, see Step 1: Create a project and a Logstore.

  4. In the spark-driver-log Logstore, perform the following operations:

    1. Click the arrow to the left of the spark-driver-log Logstore. In the left-side navigation pane, choose Data Import > Logtail Configurations.

    2. Configure Logtail: select Kubernetes - Standard Output for Data Import and select an existing Kubernetes machine group.

    3. In the Plug-in Config field, enter the following code:

      {
          "inputs": [
              {
                  "detail": {
                      "IncludeEnv": {
                          "SPARKLOGENV": "spark-driver"
                      },
                      "Stderr": true,
                      "Stdout": true,
                      "BeginLineCheckLength": 10,
                      "BeginLineRegex": "\\d+/\\d+/\\d+.*"
                  },
                  "type": "service_docker_stdout"
              }
          ]
      }
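    The BeginLineCheckLength and BeginLineRegex settings control how Logtail aggregates multi-line records: a line whose beginning (the first 10 characters are checked) matches the date-prefix pattern starts a new log record, and non-matching lines, such as stack-trace continuations, are appended to the previous record. The following Python sketch illustrates the grouping behavior of that regex; the sample log lines are hypothetical:

      import re

      # The regex from the Plug-in Config; a matching line begins a new record.
      begin_line = re.compile(r"\d+/\d+/\d+.*")

      lines = [
          "24/11/14 10:05:01 INFO SparkContext: Running Spark 3.2.1",  # new record
          "java.lang.RuntimeException: boom",                          # continuation
          "    at org.example.Job.main(Job.java:10)",                  # continuation
          "24/11/14 10:05:02 ERROR Executor: Exception in task 0.0",   # new record
      ]

      # Group lines into records the way the line-start check would.
      records = []
      for line in lines:
          if begin_line.match(line) or not records:
              records.append([line])
          else:
              records[-1].append(line)

      print(len(records))  # 2 multi-line records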
  5. Configure Logtail for the spark-executor-log Logstore by repeating the operations in Step 4, but enter the following code in the Plug-in Config field:

    {
        "inputs": [
            {
                "detail": {
                    "IncludeEnv": {
                        "SPARKLOGENV": "spark-executor"
                    },
                    "Stderr": true,
                    "Stdout": true,
                    "BeginLineCheckLength": 10,
                    "BeginLineRegex": "\\d+/\\d+/\\d+.*"
                },
                "type": "service_docker_stdout"
            }
        ]
    }
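    Both configurations match containers by the SPARKLOGENV environment variable, so that variable must be set to spark-driver on the driver pod and spark-executor on the executor pods. If your environment does not already set it, you can pass it when you submit the job. The following spark-submit invocation is a hypothetical sketch (the API server address, class, and JAR path are placeholders) that uses the standard Spark on Kubernetes properties for driver and executor environment variables:

      spark-submit \
        --master k8s://https://<kubernetes-api-server>:6443 \
        --deploy-mode cluster \
        --class org.example.SparkApp \
        --conf spark.kubernetes.driverEnv.SPARKLOGENV=spark-driver \
        --conf spark.executorEnv.SPARKLOGENV=spark-executor \
        local:///opt/app/spark-app.jar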
  6. Enable the indexing feature for the Logstores. For more information, see Create indexes.

    After you complete the preceding steps, you can use Log Service to query the logs of Spark jobs.