Platform for AI: Configure log collection for a resource group

Last Updated: Jun 13, 2024

You can configure log collection for a resource group in Platform for AI (PAI). After the configuration is complete, the system collects the logs generated by Elastic Algorithm Service (EAS) services that are deployed in the resource group and stores the logs in your Simple Log Service Logstore. The collected logs can be container standard output or custom log files. This topic describes how to configure log collection for a public or dedicated resource group.

Prerequisites

Step 1: Create a machine group for a resource group

Create a machine group for a public resource group

You need to manually create a machine group for a public resource group.

  1. On the page for creating machine groups in the Simple Log Service console, create a custom identifier-based machine group. For more information, see Create a custom identifier-based machine group.

    Figure: Create a machine group

    Note

    The custom identifier reserved for EAS is eas-log-group-{region_id}. For example, the custom identifier of the machine group for EAS in the China (Zhangjiakou) region is eas-log-group-cn-zhangjiakou.

  2. After you deploy a service, go to the Machine Group Settings page of the machine group. In the Machine Group Status section, view the heartbeat status of service instances. A value of OK indicates that the machine group works as expected.

    Note

    If no services are deployed, the instance list is empty.

    Figure: Machine group status
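The custom identifier in the note above follows a fixed pattern, so it can be derived from the region ID. The following is a minimal Python sketch; the function name is hypothetical, and only the eas-log-group-{region_id} format and the cn-zhangjiakou example come from this topic.

```python
def eas_machine_group_identifier(region_id: str) -> str:
    """Return the custom identifier for the EAS machine group in a region.

    The format eas-log-group-{region_id} is taken from this topic; the
    function itself is only an illustrative helper.
    """
    region_id = region_id.strip()
    if not region_id:
        raise ValueError("region_id must not be empty")
    return f"eas-log-group-{region_id}"


print(eas_machine_group_identifier("cn-zhangjiakou"))  # eas-log-group-cn-zhangjiakou
```

You can paste the returned value into the custom identifier field when you create the machine group in the Simple Log Service console.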

Create a machine group for a dedicated resource group

To create a machine group for a dedicated resource group, you need to turn on Configure SLS on the group details page in the PAI console.

  1. Go to the Elastic Algorithm Service (EAS) page.

    1. Log on to the PAI console.

    2. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace that you want to manage.

    3. In the left-side navigation pane, choose Model Deployment > Elastic Algorithm Service (EAS). The Elastic Algorithm Service (EAS) page appears.

  2. On the Resource Group tab, click the name of the dedicated resource group that you want to manage. The details page of the dedicated resource group appears.

  3. On the details page of your dedicated resource group, click Configure SLS.

    Figure: Turn on the switch

  4. In the Configure Log Service Settings dialog box, configure SLSProject and LogStore and click OK. The following table describes the parameters.

    | Parameter | Description |
    | --- | --- |
    | SLSProject | Simple Log Service projects are used to isolate and manage resources in Simple Log Service. If no project is available in the drop-down list, you can click Create SLSProject to create one. For more information, see Create a project. |
    | LogStore | Simple Log Service Logstores are used to collect, store, and query log data. If no Logstore is available in the drop-down list, you can click Create LogStore to create one. For more information, see Create a Logstore. |

    After you configure the preceding parameters, a Simple Log Service machine group whose name is in the eas-sls-{resource-id} format is automatically created. Sample name: eas-sls-eas-r-9u2lq6ij1pk5yvvh****. For information about how to obtain a value for resource-id, see Manage dedicated resource groups.

    EAS automatically creates a Logtail configuration in the specified Logstore to collect logs generated by EAS services.

Step 2: Create a custom Logtail configuration

You can create a custom Logtail configuration to collect the logs that you need. After you create the configuration, apply it to the machine group that you created in Step 1 for your resource group type. The following sections describe two commonly used configurations. For information about other configuration items, see Collect text logs from servers.

Collect container standard outputs

  1. Log on to the Simple Log Service console.

  2. In the Import Data section, click Kubernetes - Standard Output.

    Figure: Standard output

  3. Select a project and a Logstore. Then, click Next.

  4. Click Use Existing Machine Groups, select the machine group that you created in Step 1, and then click Next.

  5. On the Specify Data Source page, specify the plug-in configuration and click Next.

    In the Plug-in Config editor, enter the following content:

    {
        "inputs": [
            {
                "detail": {
                    "Stderr": true,
                    "IncludeLabel": {
                        "io.kubernetes.container.name": "^(easworker|worker[0-9])$"
                    },
                    "Stdout": true
                },
                "type": "service_docker_stdout"
            }
        ]
    }

    If you use a custom image for service deployment and do not want EAS logs to be collected, enter the following content to allow only standard output logs of custom containers to be collected:

    {
        "inputs": [
            {
                "detail": {
                    "Stderr": true,
                    "IncludeLabel": {
                        "io.kubernetes.container.name": "^(worker[0-9])$"
                    },
                    "Stdout": true
                },
                "type": "service_docker_stdout"
            }
        ]
    }

  6. Click Next on this page and on the subsequent wizard page to complete the configuration.
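The two plug-in configurations in step 5 differ only in the IncludeLabel pattern for io.kubernetes.container.name. The following Python sketch builds either configuration and checks which container names each pattern accepts. It is an illustration under stated assumptions: the helper names are hypothetical, the sample names worker0, worker10, and sidecar are made up to exercise the patterns, and Python's re module stands in for the matching that Logtail performs.

```python
import json
import re

# Container-name patterns copied from the two configurations in step 5.
ALL_CONTAINERS = "^(easworker|worker[0-9])$"
CUSTOM_ONLY = "^(worker[0-9])$"


def build_stdout_config(include_eas_logs: bool) -> dict:
    """Build the plug-in configuration for collecting stdout and stderr."""
    pattern = ALL_CONTAINERS if include_eas_logs else CUSTOM_ONLY
    return {
        "inputs": [
            {
                "type": "service_docker_stdout",
                "detail": {
                    "Stdout": True,
                    "Stderr": True,
                    "IncludeLabel": {"io.kubernetes.container.name": pattern},
                },
            }
        ]
    }


def is_collected(container_name: str, pattern: str) -> bool:
    """Check a container name against a pattern (Python re, for illustration only)."""
    return re.search(pattern, container_name) is not None


if __name__ == "__main__":
    # JSON that can be pasted into the Plug-in Config editor.
    print(json.dumps(build_stdout_config(include_eas_logs=False), indent=4))
    # Sample names are hypothetical and only exercise the two patterns.
    for name in ("easworker", "worker0", "worker10", "sidecar"):
        print(name, is_collected(name, ALL_CONTAINERS), is_collected(name, CUSTOM_ONLY))
```

Because [0-9] matches a single digit and the pattern is anchored with ^ and $, a name such as worker10 is not matched by either pattern in this sketch.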

Collect container objects

  1. In the Import Data section, click Kubernetes - Object.

    Figure: Kubernetes - File

  2. Select a project and a Logstore. Then, click Next.

  3. Click Use Existing Machine Groups, select the machine group that you created in Step 1, and then click Next.

  4. On the Logtail Config page, configure the parameters and click Next.

    You need to set the Log Path parameter to the path of the log files that you want to collect. For information about how to configure other parameters on this page, see Use the Simple Log Service console to collect container text logs in DaemonSet mode.

  5. Click Next on this page and on the subsequent wizard page to complete the configuration.
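File collection only works if the service actually writes its logs under the directory that you specify for Log Path. The following Python sketch shows one way a service could write log files to a known directory. The function name, the logger name my_service, and the default file name service.log are all hypothetical and not part of EAS or Simple Log Service.

```python
import logging
from pathlib import Path


def configure_file_logging(log_dir: str, file_name: str = "service.log") -> logging.Logger:
    """Return a logger that appends to <log_dir>/<file_name>.

    Point the Log Path parameter at log_dir so that Logtail can find the file.
    """
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("my_service")  # hypothetical logger name
    logger.setLevel(logging.INFO)

    # Remove and close handlers from earlier calls to avoid duplicate lines.
    for old_handler in list(logger.handlers):
        logger.removeHandler(old_handler)
        old_handler.close()

    handler = logging.FileHandler(directory / file_name, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

For example, if the service calls configure_file_logging with a directory of your choice, set Log Path to that same directory in the Logtail configuration.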

References

  • After you configure log collection for the resource group, you can view the logs in the specified Logstore. For more information, see Query and analyze logs.

  • For information about how to access the Internet and other cloud services that allow access only from specific IP addresses, see Configure Internet access and a whitelist.