Collect text logs from Kubernetes containers in DaemonSet mode - Container Service for Kubernetes

You can collect text logs from Kubernetes containers in a Container Service for Kubernetes (ACK) cluster in DaemonSet mode. In DaemonSet mode, each node runs a logging agent to improve O&M efficiency. You can install Logtail, the logging agent of Simple Log Service, on each node in an ACK cluster. This way, Logtail can collect the logs of all containers on each node. You can analyze container status and manage containers based on the collected logs.

Implementation

DaemonSet mode

In DaemonSet mode, only one Logtail container runs on a node in a Kubernetes cluster. You can use Logtail to collect logs from all containers on a node.
When a node is added to a Kubernetes cluster, the cluster automatically creates a Logtail container on the new node. When a node is removed from a Kubernetes cluster, the cluster automatically destroys the Logtail container on the node. DaemonSets and custom identifier-based machine groups eliminate the need to manually manage Logtail processes.

Container discovery

Before a Logtail container can collect logs from other containers, the Logtail container must identify and determine which containers are running. This process is called container discovery. In the container discovery phase, the Logtail container does not communicate with the kube-apiserver component of the Kubernetes cluster. Instead, the Logtail container communicates with the container runtime daemon of the node on which the Logtail container runs to obtain information about all containers on the node. This prevents the container discovery process from generating pressure on the kube-apiserver component.
When you use Logtail to collect logs, you can specify conditions such as namespaces, pod names, pod labels, and container environment variables to determine containers from which Logtail collects or does not collect logs.
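
The label and environment variable conditions follow the matching rules described for the console parameters later in this topic: a configured value is compared as a plain string unless it starts with a caret (^) and ends with a dollar sign ($), an empty value matches on the name alone, and multiple key-value pairs are ORed. The following Python sketch only illustrates those rules. The function names and sample labels are hypothetical, and this is not how Logtail itself is implemented.

```python
import re


def value_matches(expected, actual):
    """Compare one configured value against a container's actual value."""
    if expected == "":
        # No value configured: the presence of the name is enough.
        return True
    if expected.startswith("^") and expected.endswith("$"):
        # Values wrapped in ^...$ are treated as regular expressions.
        return re.search(expected, actual) is not None
    # Otherwise the two values must be identical strings.
    return expected == actual


def in_whitelist(whitelist, labels):
    """Return True if any whitelist entry matches (entries are ORed)."""
    return any(
        name in labels and value_matches(expected, labels[name])
        for name, expected in whitelist.items()
    )


# Hypothetical pod labels used only for illustration.
pod_labels = {"environment": "dev", "team": "payments"}
print(in_whitelist({"environment": "^(dev|pre)$"}, pod_labels))  # True
print(in_whitelist({"environment": "prod"}, pod_labels))         # False
print(in_whitelist({"team": ""}, pod_labels))                    # True
```

A blacklist can be sketched with the same helper by excluding a container when `in_whitelist` returns True for the blacklist entries.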

File path mapping for containers

Pods in a Kubernetes cluster are isolated. As a result, the Logtail container in a pod cannot directly access the files of containers in a different pod. The file system of a container is created by mounting the file system of the container host to the container. A Logtail container can access any file on the container host only after the file system that includes the root directory of the container host is mounted to the Logtail container. This way, the Logtail container can collect logs from the files in the file system. The relationship between file paths in a container and file paths on the container host is called file path mapping.

For example, a file path in a container is /log/app.log. After file path mapping, the file path on the container host is /var/lib/docker/containers/<container-id>/log/app.log. By default, the file system that includes the root directory of the container host is mounted to the /logtail_host directory of the Logtail container. Therefore, the Logtail container collects logs from /logtail_host/var/lib/docker/containers/<container-id>/log/app.log.
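
The mapping in the preceding example can be sketched in a few lines of Python. This is only an illustration of the path arithmetic. The function names are hypothetical, and `<container-id>` is kept as a placeholder exactly as in the example.

```python
import posixpath


def host_path_for(container_root, container_path):
    """Map a path inside a container to its location on the container host."""
    return posixpath.join(container_root, container_path.lstrip("/"))


def logtail_visible_path(host_path, mount_point="/logtail_host"):
    """Prefix a host path with the directory where the host root is mounted."""
    return posixpath.join(mount_point, host_path.lstrip("/"))


# Directory on the host that backs the container file system in the example.
container_root = "/var/lib/docker/containers/<container-id>"

host_path = host_path_for(container_root, "/log/app.log")
print(host_path)
# /var/lib/docker/containers/<container-id>/log/app.log
print(logtail_visible_path(host_path))
# /logtail_host/var/lib/docker/containers/<container-id>/log/app.log
```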

Step 1: Install Logtail

Logtail is the logging agent of Simple Log Service. For more information, see What is Logtail? Logtail collects logs from containers in ACK clusters without intruding into your applications: you do not need to modify application code, and log collection does not affect the applications that are running. After you install Logtail in an ACK cluster, a DaemonSet named logtail-ds is deployed in the cluster.

Important

We recommend that you use only one logging tool to collect container logs and send the logs to Simple Log Service. If you use two logging tools to collect logs at the same time, duplicate logs may be collected. This may incur additional fees and cause resource waste.
For more information about how to install Logtail, view the version and IP address of Logtail, and view the operational logs of Logtail, see Install Logtail components in an ACK cluster.

Install Logtail when you create a cluster

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click Create Kubernetes Cluster.
In this example, only the steps to enable Simple Log Service are described. For more information about how to create an ACK cluster, see Create an ACK managed cluster.
On the Component Configurations wizard page, select Enable Log Service to install Logtail in the cluster.
After you select Enable Log Service, the console prompts you to create a Simple Log Service project. For more information about the structure of the logs managed by Simple Log Service, see Projects. You can use one of the following methods to create a Simple Log Service project:
- Click Select Project and select an existing project to manage the collected logs.
- Click Create Project. Then, a project named k8s-log-{ClusterID} is automatically created to manage the collected logs. ClusterID indicates the unique ID of the cluster to be created.
After you set the parameters, click Create Cluster in the lower-right corner. In the message that appears, click OK.
After the cluster is created with Logtail installed, you can view the cluster on the Clusters page.

Install Logtail in an existing cluster

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Operations > Add-ons.
In the Logs and Monitoring section of the Add-ons page, find logtail-ds.
Click Install in the lower-right corner of the logtail-ds card. In the Install logtail-ds dialog box, click OK.

If an earlier version of logtail-ds is already installed, you can click Upgrade in the lower-right corner of the logtail-ds card to update the component.

Important

After you update logtail-ds, the parameters of logtail-ds are reset. The settings and environment variables of logtail-ds or alibaba-log-controller will be overwritten. If you have customized the settings and environment variables, you need to reconfigure them later. For more information, see Manual upgrade.

Step 2: Configure Logtail

Select one of the following methods to configure Logtail:
- Method 1: Use Kubernetes CustomResourceDefinitions (CRDs). CRDs allow you to create Logtail configurations in batches and to version them. This method is suitable when you need to manage many different Logtail configurations. The Logtail configurations created by using CRDs are not synchronized to the Simple Log Service console. To modify a Logtail configuration created by using a CRD, you must modify the corresponding custom resource (CR). If you modify the Logtail configuration in the Simple Log Service console instead, the configuration becomes inconsistent with the CR.
- Method 2: Use the Simple Log Service console to configure Logtail. This method is suitable for creating and configuring a few Logtail configurations. This method allows you to configure Logtail through a few steps without the need to log on to the cluster. However, you cannot batch create Logtail configurations. Method 1 takes precedence over Method 2.
- Method 3: Use environment variables to configure Logtail. This method supports only single-line text logs. If you want to collect multi-line text logs or logs of other formats, use Method 1 or 2.
When you configure Logtail, you can configure container filtering options, specify the directories or files that are ignored during log collection (collection blacklists), and allow a file to be collected multiple times.

(Recommended) CRD - AliyunPipelineConfig

Create a Logtail configuration

Important

Only the Logtail components V0.5.1 or later support AliyunPipelineConfig.

To create a Logtail configuration, you only need to create a CR from the AliyunPipelineConfig CRD. After the Logtail configuration is created, it is automatically applied. If you want to modify a Logtail configuration that is created based on a CR, you must modify the CR.

Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Run the following command to create a YAML file.
In the following command, cube.yaml is a sample file name. You can specify a different file name based on your business requirements.
```
vim cube.yaml
```

Enter the following script in the YAML file and configure the parameters based on your business requirements.

Important

The value of the configName parameter must be unique in the Simple Log Service project that you use to install the Logtail components.
You must configure a CR for each Logtail configuration. If multiple CRs are associated with the same Logtail configuration, the CRs other than the first CR do not take effect.
For more information about the parameters related to the AliyunPipelineConfig CRD, see (Recommended) Use AliyunPipelineConfig to manage a Logtail configuration. In this example, the Logtail configuration includes settings for text log collection. For more information, see CreateLogtailPipelineConfig.
Make sure that the Logstore specified by the config.flushers.Logstore parameter exists. You can configure the spec.logstores parameter to automatically create a Logstore.

Collect single-line text logs from specific containers

In this example, a Logtail configuration named example-k8s-file is created to collect single-line text logs from the containers whose names contain app in the default namespace of a cluster. The file is test.LOG, and the path is /data/logs/app_1.

The collected logs are stored in a Logstore named k8s-file, which belongs to a project named k8s-log-test.

```
apiVersion: telemetry.alibabacloud.com/v1alpha1
# Create a CR from the ClusterAliyunPipelineConfig CRD.
kind: ClusterAliyunPipelineConfig
metadata:
  # Specify the name of the resource. The name must be unique in the current Kubernetes cluster. The name is the same as the name of the Logtail configuration that is created.
  name: example-k8s-file
spec:
  # Specify the project to which logs are collected.
  project:
    name: k8s-log-test
  # Create a Logstore to store logs.
  logstores:
    - name: k8s-file
  # Configure the parameters for the Logtail configuration.
  config:
    # Configure the Logtail input plug-ins.
    inputs:
      # Use the input_file plug-in to collect text logs from containers.
      - Type: input_file
        # Specify the file path in the containers.
        FilePaths:
          - /data/logs/app_1/**/test.LOG
        # Enable the container discovery feature.
        EnableContainerDiscovery: true
        # Add conditions to filter containers. Multiple conditions are evaluated by using a logical AND.
        ContainerFilters:
          # Specify the namespace of the pod to which the required containers belong. Regular expression matching is supported.
          K8sNamespaceRegex: default
          # Specify the name of the required containers. Regular expression matching is supported.
          K8sContainerRegex: ^(.*app.*)$
    # Configure the Logtail output plug-ins.
    flushers:
      # Use the flusher_sls plug-in to send logs to a specific Logstore.
      - Type: flusher_sls
        # Make sure that the Logstore exists.
        Logstore: k8s-file
        # Make sure that the endpoint is valid.
        Endpoint: cn-hangzhou.log.aliyuncs.com
        Region: cn-hangzhou
        TelemetryType: logs
```
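
The way the two ContainerFilters conditions combine can be sketched in Python. This is a minimal illustration, assuming that each regular expression must match the whole name. The function name and the sample namespaces and container names are hypothetical.

```python
import re


def container_selected(namespace, container_name,
                       namespace_regex=r"default",
                       container_regex=r"^(.*app.*)$"):
    """Return True only if every filter condition matches (logical AND)."""
    # Assumption: each pattern is matched against the entire string.
    namespace_ok = re.fullmatch(namespace_regex, namespace) is not None
    container_ok = re.fullmatch(container_regex, container_name) is not None
    return namespace_ok and container_ok


# Hypothetical containers used only for illustration.
print(container_selected("default", "my-app-1"))     # True
print(container_selected("kube-system", "my-app-1")) # False: namespace differs
print(container_selected("default", "nginx"))        # False: name lacks "app"
```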

Collect multi-line text logs from all containers and use regular expressions to parse the logs

In this example, a Logtail configuration named example-k8s-file is created to collect multi-line text logs from all containers in a cluster. The file is test.LOG, and the path is /data/logs/app_1. The collected logs are parsed based on a regular expression and stored in a Logstore named k8s-file, which belongs to a project named k8s-log-test.

The sample log provided in the following example is read by the input_file plug-in in the {"content": "2024-06-19 16:35:00 INFO test log\nline-1\nline-2\nend"} format. Then, the log is parsed based on a regular expression into {"time": "2024-06-19 16:35:00", "level": "INFO", "msg": "test log\nline-1\nline-2\nend"}.

```
apiVersion: telemetry.alibabacloud.com/v1alpha1
# Create a CR from the ClusterAliyunPipelineConfig CRD.
kind: ClusterAliyunPipelineConfig
metadata:
  # Specify the name of the resource. The name must be unique in the current Kubernetes cluster. The name is the same as the name of the Logtail configuration that is created.
  name: example-k8s-file
spec:
  # Specify the project to which logs are collected.
  project:
    name: k8s-log-test
  # Create a Logstore to store logs.
  logstores:
    - name: k8s-file
  # Configure the parameters for the Logtail configuration.
  config:
    # Specify the sample log. You can leave this parameter empty.
    sample: |
      2024-06-19 16:35:00 INFO test log
      line-1
      line-2
      end
    # Configure the Logtail input plug-ins.
    inputs:
      # Use the input_file plug-in to collect multi-line text logs from containers.
      - Type: input_file
        # Specify the file path in the containers.
        FilePaths:
          - /data/logs/app_1/**/test.LOG
        # Enable the container discovery feature.
        EnableContainerDiscovery: true
        # Enable multi-line log collection.
        Multiline:
          # Specify the custom mode to match the beginning of the first line of a log based on a regular expression.
          Mode: custom
          # Specify the regular expression that is used to match the beginning of the first line of a log.
          StartPattern: \d+-\d+-\d+.*
    # Specify the Logtail processing plug-ins.
    processors:
      # Use the processor_parse_regex_native plug-in to parse logs based on the specified regular expression.
      - Type: processor_parse_regex_native
        # Specify the name of the input field.
        SourceKey: content
        # Specify the regular expression that is used for the parsing. Use capturing groups to extract fields.
        Regex: (\d+-\d+-\d+\s*\d+:\d+:\d+)\s*(\S+)\s*(.*)
        # Specify the fields that you want to extract.
        Keys: ["time", "level", "msg"]
    # Configure the Logtail output plug-ins.
    flushers:
      # Use the flusher_sls plug-in to send logs to a specific Logstore.
      - Type: flusher_sls
        # Make sure that the Logstore exists.
        Logstore: k8s-file
        # Make sure that the endpoint is valid.
        Endpoint: cn-hangzhou.log.aliyuncs.com
        Region: cn-hangzhou
        TelemetryType: logs
```
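
To check the StartPattern and Regex values before you apply the configuration, you can reproduce the two stages locally. The following Python sketch is only an approximation of the behavior described above. It assumes that StartPattern must match a whole line and that the dot in Regex also matches line breaks (hence `re.DOTALL`). The helper names, and the choice to keep the raw content when parsing fails, are hypothetical.

```python
import re

# Patterns copied from the configuration above.
START_PATTERN = re.compile(r"\d+-\d+-\d+.*")
PARSE_REGEX = re.compile(r"(\d+-\d+-\d+\s*\d+:\d+:\d+)\s*(\S+)\s*(.*)", re.DOTALL)
KEYS = ["time", "level", "msg"]


def group_multiline(lines):
    """Merge continuation lines into the log entry that precedes them."""
    entries = []
    for line in lines:
        if START_PATTERN.fullmatch(line) or not entries:
            # A line that matches the start pattern opens a new entry.
            entries.append(line)
        else:
            # Any other line is appended to the current entry.
            entries[-1] += "\n" + line
    return entries


def parse_entry(content):
    """Split one entry into the configured keys by using capturing groups."""
    match = PARSE_REGEX.fullmatch(content)
    if match is None:
        # Choice made for this sketch: keep the raw content on failure.
        return {"content": content}
    return dict(zip(KEYS, match.groups()))


raw = "2024-06-19 16:35:00 INFO test log\nline-1\nline-2\nend"
entries = group_multiline(raw.split("\n"))
print(parse_entry(entries[0]))
# {'time': '2024-06-19 16:35:00', 'level': 'INFO', 'msg': 'test log\nline-1\nline-2\nend'}
```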

Run the following command to apply the Logtail configuration. After the Logtail configuration is applied, Logtail starts to collect text logs from the specified containers and send the logs to Simple Log Service.
In the following command, cube.yaml is a sample file name. You can specify a different file name based on your business requirements.
```
kubectl apply -f cube.yaml
```
Important
After logs are collected, you must create indexes. Then, you can query and analyze the logs in the Logstore. For more information, see Create indexes.

CRD - AliyunLogConfig

To create a Logtail configuration, you only need to create a CR from the AliyunLogConfig CRD. After the Logtail configuration is created, it is automatically applied. If you want to modify a Logtail configuration that is created based on a CR, you must modify the CR.

Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Run the following command to create a YAML file.
In the following command, cube.yaml is a sample file name. You can specify a different file name based on your business requirements.
```
vim cube.yaml
```

Enter the following script in the YAML file and configure the parameters based on your business requirements.

Important

The value of the configName parameter must be unique in the Simple Log Service project that you use to install the Logtail components.
If multiple CRs are associated with the same Logtail configuration, the Logtail configuration is affected when you delete or modify one of the CRs. After a CR is deleted or modified, the status of other associated CRs becomes inconsistent with the status of the Logtail configuration in Simple Log Service.
For more information about CR parameters, see Use AliyunLogConfig to manage a Logtail configuration. In this example, the Logtail configuration includes settings for text log collection. For more information, see CreateConfig.

Collect single-line text logs from specific containers

In this example, a Logtail configuration named example-k8s-file is created to collect single-line text logs from the containers of all the pods whose names begin with app in the cluster. The file is test.LOG, and the path is /data/logs/app_1. The collected logs are stored in a Logstore named k8s-file, which belongs to a project named k8s-log-test.

```
apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  # Specify the name of the resource. The name must be unique in the current Kubernetes cluster.
  name: example-k8s-file
  namespace: kube-system
spec:
  # Specify the name of the project. If you leave this parameter empty, the project named k8s-log-<your_cluster_id> is used.
  project: k8s-log-test
  # Specify the name of the Logstore. If the specified Logstore does not exist, Simple Log Service automatically creates a Logstore.
  logstore: k8s-file
  # Configure the parameters for the Logtail configuration.
  logtailConfig:
    # Specify the type of the data source. If you want to collect text logs, set the value to file.
    inputType: file
    # Specify the name of the Logtail configuration.
    configName: example-k8s-file
    inputDetail:
      # Specify the simple mode to collect text logs.
      logType: common_reg_log
      # Specify the log file path.
      logPath: /data/logs/app_1
      # Specify the log file name. You can use wildcard characters (* and ?) when you specify the log file name. Example: log_*.log.
      filePattern: test.LOG
      # Set the value to true if you want to collect text logs from containers.
      dockerFile: true
      # Specify conditions to filter containers.
      advanced:
        k8s:
          K8sPodRegex: '^(app.*)$'
```
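
The pod filter in this example, `^(app.*)$`, selects names that begin with app, whereas the container filter `^(.*app.*)$` used in the AliyunPipelineConfig example selects names that contain app anywhere. The following Python sketch shows the difference. The pod names are hypothetical and serve only as an illustration.

```python
import re

BEGINS_WITH_APP = re.compile(r"^(app.*)$")   # pattern used for K8sPodRegex here
CONTAINS_APP = re.compile(r"^(.*app.*)$")    # pattern used for K8sContainerRegex earlier

# Hypothetical names used only for illustration.
for name in ["app-7d9f", "web-app-1", "nginx"]:
    begins = BEGINS_WITH_APP.match(name) is not None
    contains = CONTAINS_APP.match(name) is not None
    print(name, begins, contains)
# app-7d9f True True
# web-app-1 False True
# nginx False False
```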

Run the following command to apply the Logtail configuration. After the Logtail configuration is applied, Logtail starts to collect text logs from the specified containers and send the logs to Simple Log Service.
In the following command, cube.yaml is a sample file name. You can specify a different file name based on your business requirements.
```
kubectl apply -f cube.yaml
```
Important
After logs are collected, you must create indexes. Then, you can query and analyze the logs in the Logstore. For more information, see Create indexes.

Console

Log on to the Simple Log Service console.
In the Quick Data Import section, click Import Data. In the Import Data dialog box, click the Kubernetes - File card.
Select the required project and Logstore. Then, click Next. In this example, select the project that you use to install the Logtail components and the Logstore that you create.
In the Machine Group Configurations step, perform the following operations. For more information, see Introduction to machine groups.
1. Use one of the following settings based on your business requirements:
  - Kubernetes Clusters > ACK Daemonset
  - Kubernetes Clusters > Self-managed Cluster in DaemonSet Mode
    Important
    Subsequent settings vary based on the preceding settings.
2. Confirm that the required machine groups are added to the Applied Server Groups section. Then, click Next. After you install Logtail components in a Container Service for Kubernetes (ACK) cluster, Simple Log Service automatically creates a machine group named k8s-group-${your_k8s_cluster_id}. You can directly use this machine group.
  Important
  - If you want to create a machine group, click Create Machine Group. In the panel that appears, configure the parameters to create a machine group. For more information, see Use the ACK console.
  - If the heartbeat status of a machine group is FAIL, click Automatic Retry. If the issue persists, see How do I troubleshoot an error that is related to a Logtail machine group in a host environment?

Create a Logtail configuration and click Next. Simple Log Service starts to collect logs after the Logtail configuration is created.

Note

A Logtail configuration requires up to 3 minutes to take effect.

Global Configurations

Parameter	Description
Configuration Name	Enter a name for the Logtail configuration. The name must be unique in a project. After you create the Logtail configuration, you cannot change the name of the Logtail configuration.
Log Topic Type	Select a method to generate log topics. For more information, see Log topics.
- Machine Group Topic: The topics of the machine groups are used as log topics. If you want to distinguish the logs from different machine groups, select this option.
- File Path Extraction: You must specify a custom regular expression. A part of the file path that matches the regular expression is used as the log topic. If you want to distinguish the logs from different sources, select this option.
- Custom: You must specify a custom log topic.
Advanced Parameters	Optional. Configure the advanced parameters that are related to global configurations. For more information, see CreateLogtailPipelineConfig.
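
The File Path Extraction option can be illustrated with a short Python sketch. It assumes that the part of the path captured by a group in the regular expression becomes the topic. The regular expression, the function name, and the behavior when nothing matches are hypothetical choices made for this illustration, not values taken from Simple Log Service.

```python
import re


def topic_from_path(pattern, file_path):
    """Return the captured part of the file path, or None if nothing matches."""
    match = re.fullmatch(pattern, file_path)
    if match is None:
        # Choice made for this sketch: no topic when the path does not match.
        return None
    return match.group(1)


# Hypothetical expression: use the directory below /data/logs as the topic.
pattern = r"/data/logs/([^/]+)/.*"
print(topic_from_path(pattern, "/data/logs/app_1/test.LOG"))  # app_1
print(topic_from_path(pattern, "/var/log/messages"))          # None
```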

Input Configurations

Parameter	Description
Logtail Deployment Mode	Select the deployment mode of Logtail. In this example, Daemonset is selected.
File Path Type	Select the type of the file path that you want to use to collect logs. Valid values: Path in Container and Host Path. If a hostPath volume is mounted to a container and you want to collect logs from files based on the mapped file path on the container host, set this parameter to Host Path. In other scenarios, set this parameter to Path in Container.
File Path	If the required container runs on a Linux host, specify a path that starts with a forward slash (/). Example: `/apsara/nuwa/**/app.Log`. If the required container runs on a Windows host, specify a path that starts with a drive letter. Example: `C:\Program Files\Intel\**\*.Log`. You can specify an exact directory and an exact name. You can also use wildcard characters to specify the directory and name. For more information, see Wildcard matching. When you configure this parameter, you can use only asterisks (*) or question marks (?) as wildcard characters. Simple Log Service scans all levels of the specified directory for the log files that match specified conditions. Examples:
- If you specify `/apsara/nuwa/**/*.log`, Simple Log Service collects logs from the log files whose names are suffixed by .log in the `/apsara/nuwa` directory and the recursive subdirectories of the directory.
- If you specify `/var/logs/app_*/**/*.log`, Simple Log Service collects logs from the log files that meet the following conditions: The file name is suffixed by `.log`. The file is stored in a subdirectory under the `/var/logs` directory or in a recursive subdirectory of the subdirectory. The name of the subdirectory matches the `app_*` pattern.
- If you specify `/var/log/nginx/**/access*`, Simple Log Service collects logs from the log files whose names start with `access` in the `/var/log/nginx` directory and the recursive subdirectories of the directory.
Maximum Directory Monitoring Depth	Specify the maximum number of levels of subdirectories that you want to monitor. The subdirectories are in the log file directory that you specify. This parameter specifies the levels of subdirectories that can be matched for the wildcard characters `**` included in the value of File Path. A value of 0 specifies that only the log file directory that you specify is monitored. Warning: We recommend that you configure this parameter based on the minimum requirement. If you specify a large value, Logtail may consume more monitoring resources and cause collection latency.
Enable Container Metadata Preview	If you turn on Enable Container Metadata Preview, you can view the container metadata after you create the Logtail configuration, including the matched container information and full container information.
Container Filtering	Logtail version If the version of Logtail is earlier than 1.0.34, you can use only environment variables and container labels to filter containers. If the version of Logtail is 1.0.34 or later, we recommend that you use different levels of Kubernetes information to filter containers. The information includes K8s Pod Name Regular Matching, K8s Namespace Regular Matching, K8s Container Name Regular Matching, and Kubernetes Pod Label Whitelist. Filter conditions Important Container labels are retrieved by running the docker inspect command. Container labels are different from Kubernetes labels. For more information, see Obtain labels. Environment variables are the same as the environment variables that are configured to start containers. For more information, see Obtain environment variables. Kubernetes namespaces and container names can be mapped to container labels. The label for a namespace is `io.kubernetes.pod.namespace`. The label for a container name is `io.kubernetes.container.name`. We recommend that you use the two labels to filter containers. For example, the namespace of a pod is `backend-prod`, and the name of a container in the pod is `worker-server`. If you want to collect the logs of the worker-server container, you can specify `io.kubernetes.pod.namespace : backend-prod` or `io.kubernetes.container.name : worker-server` in the container label whitelist. If the two labels do not meet your business requirements, you can use the environment variable whitelist or the environment variable blacklist to filter containers. K8s Pod Name Regular Matching Enter the pod name. The pod name specifies the containers from which text logs are collected. Regular expression matching is supported. For example, if you specify `^(nginx-log-demo.)$`, all containers in the pod whose name starts with nginx-log-demo are matched. K8s Namespace Regular Matching* Enter the namespace name. 
The namespace name specifies the containers from which text logs are collected. Regular expression matching is supported. For example, if you specify `^(default\|nginx)$`, all containers in the nginx and default namespaces are matched. K8s Container Name Regular Matching Enter the container name. The container name specifies the containers from which text logs are collected. Regular expression matching is supported. Kubernetes container names are defined in spec.containers. For example, if you specify `^(container-test)$`, all containers whose name is container-test are matched. Container Label Whitelist Configure a container label whitelist. The whitelist specifies the containers from which text logs are collected. Note Do not specify duplicate values for the Label Name parameter. If you specify duplicate values, only one value takes effect. If you specify a value for the Label Name parameter but do not specify a value for the Label Value parameter, containers whose container labels contain the specified label name are matched. If you specify a value for the Label Name and Label Value parameters, containers whose container labels contain the specified `Label Name:Label Value` are matched. By default, string matching is performed for the values of the Label Value parameter. Containers are matched only if the values of the container labels are the same as the values of the Label Value parameter. If you specify a value that starts with a caret `(^)` and ends with a dollar sign `($)` for the Label Value parameter, regular expression matching is performed. For example, if you set the Label Name parameter to `app` and set the Label Value parameter to `^(test1\|test2)$`, containers whose container labels contain `app:test1` or `app:test2` are matched. Key-value pairs are evaluated by using the OR operator. If a container has a container label that consists of one of the specified key-value pairs, the container is matched. 
Container Label Blacklist Configure a container label blacklist. The blacklist specifies the containers from which text logs are not collected. Note Do not specify duplicate values for the Label Name parameter. If you specify duplicate values, only one value takes effect. If you specify a value for the Label Name parameter but do not specify a value for the Label Value parameter, containers whose container labels contain the specified label name are filtered out. If you specify a value for the Label Name and Label Value parameters, containers whose container labels contain the specified `Label Name:Label Value` are filtered out. By default, string matching is performed for the values of the Label Value parameter. Containers are filtered out only if the values of the container labels are the same as the values of the Label Value parameter. If you specify a value that starts with a caret `(^)` and ends with a dollar sign `($)` for the Label Value parameter, regular expression matching is performed. For example, if you set the Label Name parameter to `app` and set the Label Value parameter to `^(test1\|test2)$`, containers whose container labels contain app:test1 or app:test2 are filtered out. Key-value pairs are evaluated by using the OR operator. If a container has a container label that consists of one of the specified key-value pairs, the container is filtered out. Environment Variable Whitelist Configure an environment variable whitelist. The whitelist specifies the containers from which text logs are collected. If you specify a value for the Environment Variable Name parameter but do not specify a value for the Environment Variable Value parameter, containers whose environment variables contain the specified environment variable name are matched. 
If you specify a value for the Environment Variable Name and Environment Variable Value parameters, containers whose environment variables contain the specified Environment Variable Name:Environment Variable Value are matched. By default, string matching is performed for the values of the Environment Variable Value parameter. Containers are matched only if the values of the environment variables are the same as the values of the Environment Variable Value parameter. If you specify a value that starts with a caret `(^)` and ends with a dollar sign `($)` for the Environment Variable Value parameter, regular expression matching is performed. For example, if you set the Environment Variable Name parameter to `NGINX_SERVICE_PORT` and set the Environment Variable Value parameter to `^(80\|6379)$`, containers whose port number is 80 or 6379 are matched. Key-value pairs are evaluated by using the OR operator. If a container has an environment variable that consists of one of the specified key-value pairs, the container is matched. Environment Variable Blacklist Configure an environment variable blacklist. The blacklist specifies the containers from which text logs are not collected. If you specify a value for the Environment Variable Name parameter but do not specify a value for the Environment Variable Value parameter, containers whose environment variables contain the specified environment variable name are filtered out. If you specify a value for the Environment Variable Name and Environment Variable Value parameters, containers whose environment variables contain the specified Environment Variable Name:Environment Variable Value are filtered out. By default, string matching is performed for the values of the Environment Variable Value parameter. Containers are filtered out only if the values of the environment variables are the same as the values of the Environment Variable Value parameter. 
If you specify a value that starts with a caret `(^)` and ends with a dollar sign `($)` for the Environment Variable Value parameter, regular expression matching is performed. For example, if you set the Environment Variable Name parameter to `NGINX_SERVICE_PORT` and set the Environment Variable Value parameter to `^(80\|6379)$`, containers whose port number is 80 or 6379 are filtered out. Key-value pairs are evaluated by using the OR operator. If a container has an environment variable that consists of one of the specified key-value pairs, the container is filtered out. Kubernetes Pod Label Whitelist Configure a Kubernetes pod label whitelist. The whitelist specifies the containers from which text logs are collected. If you specify a value for the Label Name parameter but do not specify a value for the Label Value parameter, containers whose pod labels contain the specified label name are matched. If you specify a value for the Label Name and Label Value parameters, containers whose pod labels contain the specified `Label Name:Label Value` are matched. By default, string matching is performed for the values of the Label Value parameter. Containers are matched only if the values of the pod labels are the same as the values of the Label Value parameter. If you specify a value that starts with a caret `(^)` and ends with a dollar sign `($)`, regular expression matching is performed. For example, if you set the Label Name parameter to `environment` and set the Label Value parameter to `^(dev\|pre)$`, containers whose pod labels contain `environment:dev` or `environment:pre` are matched. Key-value pairs are evaluated by using the OR operator. If a container has a pod label that consists of one of the specified key-value pairs, the container is matched. Kubernetes Pod Label Blacklist Configure a Kubernetes pod label blacklist. The blacklist specifies the containers from which text logs are not collected. 
If you specify a value for the Label Name parameter but do not specify a value for the Label Value parameter, containers whose pod labels contain the specified label name are filtered out. If you specify a value for both the Label Name and Label Value parameters, containers whose pod labels contain the specified `Label Name:Label Value` pair are filtered out. By default, exact string matching is performed for the Label Value parameter: a container is filtered out only if the value of its pod label is identical to the value of the Label Value parameter. If you specify a value that starts with a caret (`^`) and ends with a dollar sign (`$`) for the Label Value parameter, regular expression matching is performed. For example, if you set the Label Name parameter to `environment` and set the Label Value parameter to `^(dev|pre)$`, containers whose pod labels contain `environment:dev` or `environment:pre` are filtered out. Key-value pairs are evaluated by using the OR operator. If a container has a pod label that matches any one of the specified key-value pairs, the container is filtered out.
Log Tag Enrichment	Specify log tags by using environment variables and pod labels.
File Encoding	Select the encoding format of log files.
First Collection Size	Specify the size of data that Logtail can collect from a log file the first time Logtail collects logs from the file. The default value of First Collection Size is 1024. Unit: KB. If the file size is less than 1,024 KB, Logtail collects data from the beginning of the file. If the file size is greater than 1,024 KB, Logtail collects the last 1,024 KB of data in the file. You can specify First Collection Size based on your business requirements. Valid values: 0 to 10485760. Unit: KB.
Collection Blacklist	If you turn on Collection Blacklist, you must configure a blacklist to specify the directories or files that you want Simple Log Service to skip when it collects logs. You can specify exact directories and file names. You can also use wildcard characters to specify directories and file names. When you configure this parameter, you can use only asterisks (*) or question marks (?) as wildcard characters. Important If you use wildcard characters to configure File Path and you want to skip some directories in the specified directory, you must configure Collection Blacklist and enter a complete directory. For example, if you set File Path to `/home/admin/app*/log/*.log` and you want to skip all subdirectories in the `/home/admin/app1*` directory, you must select Directory Blacklist and enter `/home/admin/app1*/**` in the Directory Name field. If you enter `/home/admin/app1*`, the blacklist does not take effect. When a blacklist is in use, computational overhead is generated. We recommend that you add up to 10 entries to the blacklist. You cannot specify a directory path that ends with a forward slash (/). For example, if you set the path to `/home/admin/dir1/`, the directory blacklist does not take effect. The following types of blacklists are supported: File Path Blacklist, File Blacklist, and Directory Blacklist. File Path Blacklist If you select File Path Blacklist and enter `/home/admin/private*.log` in the File Path Name field, all files whose names are prefixed by private and suffixed by .log in the `/home/admin/` directory are skipped. If you select File Path Blacklist and enter `/home/admin/private*/*_inner.log` in the File Path Name field, all files whose names are suffixed by _inner.log in the subdirectories whose names are prefixed by private in the `/home/admin/` directory are skipped. For example, the `/home/admin/private/app_inner.log` file is skipped, but the `/home/admin/private/app.log` file is not skipped. 
File Blacklist If you select File Blacklist and enter `app_inner.log` in the File Name field, all files whose names are `app_inner.log` are skipped. Directory Blacklist If you select Directory Blacklist and enter `/home/admin/dir1` in the Directory Name field, all files in the `/home/admin/dir1` directory are skipped. If you select Directory Blacklist and enter `/home/admin/dir*` in the Directory Name field, the files in all subdirectories whose names are prefixed by dir in the `/home/admin/` directory are skipped. If you select Directory Blacklist and enter `/home/admin/*/dir` in the Directory Name field, all files in the dir subdirectory in each second-level subdirectory of the `/home/admin/` directory are skipped. For example, the files in the `/home/admin/a/dir` directory are skipped, but the files in the `/home/admin/a/b/dir` directory are not skipped.
Allow File to Be Collected for Multiple Times	By default, you can use only one Logtail configuration to collect logs from a log file. To use multiple Logtail configurations to collect logs from a log file, turn on Allow File to Be Collected for Multiple Times.
Advanced Parameters	You must manually configure specific parameters of a Logtail configuration. For more information, see Create a Logtail pipeline configuration.
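
The whitelist and blacklist rules in the preceding table can be modeled in a few lines. The following Python sketch is only an illustration of the documented matching rules, not Logtail's implementation. The function names are hypothetical, and two behaviors are assumptions rather than statements from this topic: an empty whitelist is treated as "collect from all containers", and the blacklist is applied after the whitelist.

```python
import re
from typing import Mapping, Optional


def value_matches(expected: str, actual: str) -> bool:
    """Compare one configured value with a container's actual value."""
    # A configured value written as ^...$ is treated as a regular expression.
    if expected.startswith("^") and expected.endswith("$"):
        return re.fullmatch(expected, actual) is not None
    # Otherwise the two strings must be identical.
    return expected == actual


def any_rule_matches(rules: Mapping[str, str], attrs: Mapping[str, str]) -> bool:
    """Return True if at least one name/value rule matches (OR semantics)."""
    for name, expected in rules.items():
        if name not in attrs:
            continue
        # An empty configured value means "match on the name alone".
        if expected == "" or value_matches(expected, attrs[name]):
            return True
    return False


def should_collect(
    attrs: Mapping[str, str],
    whitelist: Optional[Mapping[str, str]] = None,
    blacklist: Optional[Mapping[str, str]] = None,
) -> bool:
    """Decide whether logs are collected from a container (assumed ordering)."""
    if whitelist and not any_rule_matches(whitelist, attrs):
        return False
    if blacklist and any_rule_matches(blacklist, attrs):
        return False
    return True


labels = {"environment": "pre", "app": "nginx"}
print(should_collect(labels, whitelist={"environment": "^(dev|pre)$"}))  # True
print(should_collect(labels, blacklist={"app": ""}))                      # False
```

The same helper applies to environment variables: pass the container's environment variables as `attrs` instead of its pod labels.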

Processor Configurations

Parameter	Description
Log Sample	Add a sample log that is collected from an actual scenario. You can use the sample log to configure parameters that are related to log processing with ease. You can add multiple sample logs. Make sure that the total length of the logs does not exceed 1,500 characters. `[2023-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened at TestPrintStackTrace.f(TestPrintStackTrace.java:3) at TestPrintStackTrace.g(TestPrintStackTrace.java:7) at TestPrintStackTrace.main(TestPrintStackTrace.java:16)`
Multi-line Mode	Specify the type of multi-line logs. A multi-line log spans multiple consecutive lines. You can configure this parameter to identify each multi-line log in a log file. Custom: A multi-line log is identified based on the value of Regex to Match First Line. Multi-line JSON: Each JSON object is expanded into multiple lines. Example: `{ "name": "John Doe", "age": 30, "address": { "city": "New York", "country": "USA" } }` Configure Processing Method If Splitting Fails. `Exception in thread "main" java.lang.NullPointerException at com.example.MyClass.methodA(MyClass.java:12) at com.example.MyClass.methodB(MyClass.java:34) at com.example.MyClass.main(MyClass.java:50)` For the preceding sample log, Simple Log Service can discard the log or retain each single line as a log when it fails to split the log. Discard: The log is discarded. Retain Single Line: Each line of log text is retained as a log. A total of four logs are retained.
Processing Method	Select Processors. You can add native plug-ins and extended plug-ins for data processing. For more information about Logtail plug-ins for data processing, see Logtail plug-ins overview. Important You are subject to the limits of Logtail plug-ins for data processing. For more information, see the on-screen instructions in the Simple Log Service console. Logtail V2.0 You can arbitrarily combine native plug-ins for data processing. You can combine native plug-ins and extended plug-ins. Make sure that extended plug-ins are added after native plug-ins. Logtail earlier than V2.0 You cannot add native plug-ins and extended plug-ins at the same time. You can use native plug-ins only to collect text logs. When you add native plug-ins, take note of the following items: You must add one of the following Logtail plug-ins for data processing as the first plug-in: Data Parsing (Regex Mode), Data Parsing (Delimiter Mode), Data Parsing (JSON Mode), Data Parsing (NGINX Mode), Data Parsing (Apache Mode), and Data Parsing (IIS Mode). After you add the first plug-in, you can add a Time Parsing plug-in, a Data Filtering plug-in, and multiple Data Masking plug-ins. When you configure the Retain Original Field if Parsing Fails and Retain Original Field if Parsing Succeeds parameters, you can use only the following parameter combinations. For other parameter combinations, Simple Log Service does not ensure configuration effects. Upload logs that are parsed. Upload logs that are obtained after parsing if the parsing is successful, and upload raw logs if the parsing fails. Upload logs that are obtained after parsing and add a raw log field to the logs if the parsing is successful, and upload raw logs if the parsing fails. For example, if a raw log is `"content": "{"request_method":"GET", "request_time":"200"}"` and the raw log is successfully parsed, the system adds a raw log field to the log that is obtained after parsing. 
The raw log field is specified by the New Name of Original Field parameter. If you do not configure the parameter, the original field name is used. The field value is `{"request_method":"GET", "request_time":"200"}`.
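
The Custom multi-line mode and the Processing Method If Splitting Fails options can be sketched as follows. This Python example is a simplified model, not Logtail's implementation. It assumes that the first-line regular expression must match an entire line, and that "splitting fails" refers to lines that appear before any line matching that expression. The function name, the regular expression, and the placement of line breaks in the sample logs are illustrative assumptions.

```python
import re
from typing import List


def split_multiline(text: str, first_line_regex: str, on_failure: str = "retain") -> List[str]:
    """Group physical lines into logs; a new log starts at each first-line match."""
    pattern = re.compile(first_line_regex)
    logs: List[str] = []
    current: List[str] = []  # lines of the log that is being assembled
    for line in text.splitlines():
        if pattern.fullmatch(line):
            if current:
                logs.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)  # continuation of the current log
        elif on_failure == "retain":
            logs.append(line)  # Retain Single Line: keep the orphan line as its own log
        # on_failure == "discard": the orphan line is dropped
    if current:
        logs.append("\n".join(current))
    return logs


# Hypothetical first-line pattern for logs that start with "[timestamp] [LEVEL] ".
FIRST_LINE_REGEX = r"\[\d+-\d+-\w+:\d+:\d+,\d+\]\s\[\w+\]\s.*"

good = (
    "[2023-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened\n"
    "    at TestPrintStackTrace.f(TestPrintStackTrace.java:3)\n"
    "    at TestPrintStackTrace.g(TestPrintStackTrace.java:7)\n"
    "    at TestPrintStackTrace.main(TestPrintStackTrace.java:16)"
)
bad = (
    'Exception in thread "main" java.lang.NullPointerException\n'
    "    at com.example.MyClass.methodA(MyClass.java:12)\n"
    "    at com.example.MyClass.methodB(MyClass.java:34)\n"
    "    at com.example.MyClass.main(MyClass.java:50)"
)

print(len(split_multiline(good, FIRST_LINE_REGEX)))            # 1
print(len(split_multiline(bad, FIRST_LINE_REGEX, "retain")))   # 4
print(len(split_multiline(bad, FIRST_LINE_REGEX, "discard")))  # 0
```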

Create indexes and preview data. Then, click Next. By default, full-text indexing is enabled in Simple Log Service. You can also configure field indexes based on collected logs in manual mode or automatic mode. To configure field indexes in automatic mode, click Automatic Index Generation. This way, Simple Log Service automatically creates field indexes. For more information, see Create indexes.
Important
If you want to query all fields in logs, we recommend that you use full-text indexes. If you want to query only specific fields, we recommend that you use field indexes. This helps reduce index traffic. If you want to analyze fields, you must create field indexes. You must include a SELECT statement in your query statement for analysis.
Click Query Log. Then, you are redirected to the query and analysis page of your Logstore.
You must wait approximately 1 minute for the indexes to take effect. Then, you can view the collected logs on the Raw Logs tab. For more information, see Guide to log query and analysis.

Environment variables

1. Configure Simple Log Service when you create an application

Use the ACK console

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > Deployments.
On the Deployments page, select a namespace from the Namespace drop-down list. Then, click Create from Image in the upper-right corner of the page.
On the Basic Information wizard page, specify Name, Replicas, and Type. Then, click Next to go to the Container wizard page.
Only parameters related to Simple Log Service are described in the following section. For more information about other application parameters, see Create a Deployment.
In the Log section, configure log collection parameters.
1. Configure Collection Configuration.
  Click the plus sign (+) to add a configuration entry. Each configuration entry consists of the Logstore and Log Path in Container (Can be set to stdout) parameters.
  - Logstore: Specify the name of the Logstore that is used to store the collected log data. If the Logstore does not exist, ACK automatically creates a Logstore in the Simple Log Service project that is associated with your ACK cluster.
    Note
    The default log retention period of Logstores is 90 days.
  - Log Path in Container (Can be set to stdout): the path from which you want to collect log data. A value of /usr/local/tomcat/logs/catalina.*.log indicates that the log files of a Tomcat application are collected.
    Note
    When you set the value to stdout, stdout and stderr are collected.
    All settings are added as configuration entries to the corresponding Logstore. By default, logs are collected in simple mode (by row). If you want to use other methods to collect log data, see Collect text logs from Kubernetes containers in DaemonSet mode and Collect stdout and stderr from Kubernetes containers in DaemonSet mode (old version).
2. Set Custom Tag.
  Click the plus sign (+) to add custom tags. Each tag is a key-value pair that is appended to the collected log data. You can use custom tags to mark log data. For example, you can use a tag to denote the application version.
After you configure the parameters, click Next to configure advanced settings.
For more information about the subsequent steps, see Create a Deployment.

Use a YAML template

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > Deployments.
On the Deployments page, select a namespace from the Namespace drop-down list. Then, click Create from YAML in the upper-right corner of the page.
Configure a YAML template.
YAML templates comply with the Kubernetes syntax. You can use env to define log collection configurations and custom tags. You must also set the volumeMounts and volumes parameters. The following sample code provides an example of pod configurations:
```
apiVersion: v1
kind: Pod
metadata:
  name: my-demo
spec:
  containers:
  - name: my-demo-app
    image: 'registry.cn-hangzhou.aliyuncs.com/log-service/docker-log-test:latest'
    env:
    # Specify environment variables.
    - name: aliyun_logs_log-stdout
      value: stdout
    - name: aliyun_logs_log-varlog
      value: /var/log/*.log
    - name: aliyun_logs_mytag1_tags
      value: tag1=v1
    # Configure volume mounting.
    volumeMounts:
    - name: volumn-sls-mydemo
      mountPath: /var/log
    # If the pod restarts repeatedly, you can add a sleep command to the startup parameters of the pod.
    command: ["sh", "-c"]  # Run commands in the shell.
    args: ["sleep 3600"]   # Make the pod sleep 3,600 seconds (1 hour).
  volumes:
  - name: volumn-sls-mydemo
    emptyDir: {}
```
Perform the following steps in sequence based on your business requirements:
Note
If you have other log collection requirements, see 2. Use environment variables to configure advanced settings.
1. Add log collection configurations and custom tags by using environment variables. All environment variables related to log collection must use aliyun_logs_ as the prefix.
  - Add environment variables in the following format:
```
- name: aliyun_logs_log-stdout
  value: stdout
- name: aliyun_logs_log-varlog
  value: /var/log/*.log
```
    In the preceding example, two environment variables in the aliyun_logs_{key} format are added to the log collection configuration. The {key} values of the environment variables are log-stdout and log-varlog.
    - The aliyun_logs_log-stdout environment variable indicates that a Logstore named log-stdout is created to store the stdout collected from containers. The name of the collection configuration is log-stdout. This way, the stdout of containers is collected to the Logstore named log-stdout.
    - The aliyun_logs_log-varlog environment variable indicates that a Logstore named log-varlog is created to store the /var/log/*.log files collected from containers. The name of the collection configuration is log-varlog. This way, the /var/log/*.log files are collected to the Logstore named log-varlog.
  - Add custom tags in the following format:
```
- name: aliyun_logs_mytag1_tags
  value: tag1=v1
```
    After a tag is added, the tag is automatically appended to the log data that is collected from the container. mytag1 can be any name that does not contain underscores (_).
2. If you specify a log path to collect log files other than stdout, you must set the volumeMounts parameter.
  In the preceding YAML template, the mountPath field in volumeMounts is set to /var/log. This allows Logtail to collect log data from the /var/log/*.log files.
After you modify the YAML template, click Create to submit the configurations.
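The naming convention used by the environment variables in the template can be illustrated with a short parser. The following Python sketch is a reading aid under stated assumptions, not a description of how Logtail processes the variables. The function names `parse_env_name` and `summarize` are hypothetical, and the sketch assumes that the first underscore after the `aliyun_logs_` prefix separates {key} from an optional suffix such as `tags`, which is consistent with the rule that names such as mytag1 must not contain underscores.

```python
from typing import Dict, List, Optional, Tuple

PREFIX = "aliyun_logs_"


def parse_env_name(name: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (key, option) for an aliyun_logs_ variable, or None for other variables."""
    if not name.startswith(PREFIX):
        return None
    rest = name[len(PREFIX):]
    # The first underscore, if any, separates {key} from the option suffix.
    key, sep, option = rest.partition("_")
    return key, (option if sep else None)


def summarize(env: List[Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Group log-related variables by {key}; the base variable is stored as 'source'."""
    configs: Dict[str, Dict[str, str]] = {}
    for item in env:
        parsed = parse_env_name(item["name"])
        if parsed is None:
            continue
        key, option = parsed
        configs.setdefault(key, {})[option or "source"] = item["value"]
    return configs


# The env entries from the YAML template above.
env = [
    {"name": "aliyun_logs_log-stdout", "value": "stdout"},
    {"name": "aliyun_logs_log-varlog", "value": "/var/log/*.log"},
    {"name": "aliyun_logs_mytag1_tags", "value": "tag1=v1"},
]
for key, settings in summarize(env).items():
    print(key, settings)
```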

2. Use environment variables to configure advanced settings

You can configure container environment variables to customize log collection and to apply advanced settings that meet your log collection requirements.

Important

You cannot use environment variables to configure log collection in edge computing scenarios.

Variable	Description	Example	Remarks
aliyun_logs_{key}	This variable is required. {key} can contain only lowercase letters, digits, and hyphens (-). If the specified aliyun_logs_{key}_logstore does not exist, a Logstore named {key} is created. To collect the stdout of a container, set the value to stdout. You can also set the value to a path inside the container to collect the log files.	`- name: aliyun_logs_catalina value: stdout` `- name: aliyun_logs_access-log value: /var/log/nginx/access.log`	The default log collection mode is simple mode. If you want to parse log data, we recommend that you use the Simple Log Service console and configure the parameters based on the steps in Collect text logs from Kubernetes containers in DaemonSet mode or Collect stdout and stderr from Kubernetes containers in DaemonSet mode (old version). {key} specifies the name of the Logtail configuration. The configuration name must be unique in the Kubernetes cluster.
aliyun_logs_{key}_tags	This variable is optional. This variable is used to add tags to log data. The value must be in the following format: {tag-key}={tag-value}.	`- name: aliyun_logs_catalina_tags value: app=catalina`	N/A
aliyun_logs_{key}_project	This variable is optional. The variable specifies a project in Simple Log Service. The default project is the one that you specified when you created the cluster.	`- name: aliyun_logs_catalina_project value: my-k8s-project`	The project must be deployed in the same region as Logtail.
aliyun_logs_{key}_logstore	This variable is optional. The variable specifies a Logstore in Simple Log Service. By default, the Logstore is named {key}.	`- name: aliyun_logs_catalina_logstore value: my-logstore`	N/A
aliyun_logs_{key}_shard	This variable is optional. The variable specifies the number of shards of the Logstore. Valid values: 1 to 10. Default value: 2. Note If the Logstore that you specify already exists, this variable does not take effect.	`- name: aliyun_logs_catalina_shard value: '4'`	N/A
aliyun_logs_{key}_ttl	This variable is optional. The variable specifies the log retention period. Valid values: 1 to 3650. To retain log data permanently, set the value to 3650. The default retention period is 90 days. Note If the Logstore that you specify already exists, this variable does not take effect.	`- name: aliyun_logs_catalina_ttl value: '3650'`	N/A
aliyun_logs_{key}_machinegroup	This variable is optional. This variable specifies the machine group in which the application is deployed. The default machine group is the one in which Logtail is deployed. For more information about how to use the variable, see Scenario 2: Collect log data from different applications and store the data in different projects.	`- name: aliyun_logs_catalina_machinegroup value: my-machine-group`	N/A
aliyun_logs_{key}_logstoremode	This variable is optional. This variable specifies the type of Logstore. Default value: standard. Valid values: standard: standard Logstore. This type of Logstore supports the log analysis feature and is suitable for scenarios such as real-time monitoring and interactive analysis. You can also use this type of Logstore to build a comprehensive observability system. query: query Logstore. This type of Logstore supports high-performance queries. The index traffic fee of a query Logstore is approximately half that of a standard Logstore. Query Logstores do not support SQL analysis. Query Logstores are suitable for scenarios in which the amount of data is large, the log retention period is long, or log analysis is not required. If logs are stored for weeks or months, the log retention period is considered long. Note If the Logstore that you specify already exists, this variable does not take effect.	`- name: aliyun_logs_catalina_logstoremode value: standard` `- name: aliyun_logs_catalina_logstoremode value: query`	To use this variable, make sure that the logtail-ds image version is 1.3.1 or later.
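
The constraints in the preceding table can be checked before a manifest is applied. The following Python sketch builds the `env` entries for one {key} and validates the documented ranges. The function `build_env` is a hypothetical helper for illustration and is not part of any Alibaba Cloud SDK. Values are converted to strings because Kubernetes expects environment variable values to be strings, which is why the table quotes numbers such as '4'.

```python
import re
from typing import Dict, List, Optional

# {key} can contain only lowercase letters, digits, and hyphens.
KEY_PATTERN = re.compile(r"[a-z0-9-]+")


def build_env(
    key: str,
    source: str,
    *,
    tags: Optional[str] = None,
    project: Optional[str] = None,
    logstore: Optional[str] = None,
    shard: Optional[int] = None,
    ttl: Optional[int] = None,
    machinegroup: Optional[str] = None,
    logstoremode: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Return env entries for one collection configuration, validating documented limits."""
    if not KEY_PATTERN.fullmatch(key):
        raise ValueError(f"invalid key: {key!r}")
    if shard is not None and not 1 <= shard <= 10:
        raise ValueError("shard must be in the range 1 to 10")
    if ttl is not None and not 1 <= ttl <= 3650:
        raise ValueError("ttl must be in the range 1 to 3650")
    if logstoremode is not None and logstoremode not in ("standard", "query"):
        raise ValueError("logstoremode must be 'standard' or 'query'")

    # The base variable is required; its value is stdout or a path in the container.
    env = [{"name": f"aliyun_logs_{key}", "value": source}]
    options = {
        "tags": tags,
        "project": project,
        "logstore": logstore,
        "shard": shard,
        "ttl": ttl,
        "machinegroup": machinegroup,
        "logstoremode": logstoremode,
    }
    for suffix, value in options.items():
        if value is not None:
            env.append({"name": f"aliyun_logs_{key}_{suffix}", "value": str(value)})
    return env


for entry in build_env("catalina", "stdout", tags="app=catalina", shard=4, ttl=3650):
    print(entry)
```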

Scenario 1: Collect log data from multiple applications and store the data in the same Logstore
In this scenario, set the aliyun_logs_{key}_logstore variable. The following example shows how to collect stdout from two applications and store the output in stdout-logstore.
The {key} of Application 1 is set to app1-stdout. The {key} of Application 2 is set to app2-stdout.
Configure the following environment variables for Application 1:
```
# Specify environment variables.
    - name: aliyun_logs_app1-stdout
      value: stdout
    - name: aliyun_logs_app1-stdout_logstore
      value: stdout-logstore
```
Configure the following environment variables for Application 2:
```
# Specify environment variables.
    - name: aliyun_logs_app2-stdout
      value: stdout
    - name: aliyun_logs_app2-stdout_logstore
      value: stdout-logstore
```
Scenario 2: Collect log data from different applications and store the data in different projects
In this scenario, perform the following steps:
1. Create a machine group in each project and set the custom identifier of the machine group in the following format: k8s-group-{cluster-id}, where {cluster-id} is the ID of the cluster. You can specify a custom machine group name.
2. Specify the project, Logstore, and machine group in the environment variables for each application. The name of the machine group is the same as the one you created in the previous step.
  In the following example, the {key} of Application 1 is set to app1-stdout. The {key} of Application 2 is set to app2-stdout. If the two applications are deployed in the same ACK cluster, you can use the same machine group for the applications.
  Configure the following environment variables for Application 1:
```
# Specify environment variables.
    - name: aliyun_logs_app1-stdout
      value: stdout
    - name: aliyun_logs_app1-stdout_project
      value: app1-project
    - name: aliyun_logs_app1-stdout_logstore
      value: app1-logstore
    - name: aliyun_logs_app1-stdout_machinegroup
      value: app1-machine-group
```
  Configure the following environment variables for Application 2:
```
# Specify environment variables for Application 2.
    - name: aliyun_logs_app2-stdout
      value: stdout
    - name: aliyun_logs_app2-stdout_project
      value: app2-project
    - name: aliyun_logs_app2-stdout_logstore
      value: app2-logstore
    - name: aliyun_logs_app2-stdout_machinegroup
      value: app1-machine-group
```
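Because {key} also names the Logtail configuration and must be unique in the Kubernetes cluster, it can be useful to check the environment variables of several applications for collisions before you deploy them. The following Python sketch shows one way to do this, together with a helper that formats the custom identifier described in step 1. The helper names and the cluster ID are hypothetical.

```python
from collections import Counter
from typing import Dict, List

PREFIX = "aliyun_logs_"


def machine_group_identifier(cluster_id: str) -> str:
    """Format the custom identifier of the machine group: k8s-group-{cluster-id}."""
    return f"k8s-group-{cluster_id}"


def config_keys(env: List[Dict[str, str]]) -> List[str]:
    """Return the {key} of each base variable (aliyun_logs_{key} without a suffix)."""
    keys = []
    for item in env:
        name = item["name"]
        # Base variables have no underscore after the prefix.
        if name.startswith(PREFIX) and "_" not in name[len(PREFIX):]:
            keys.append(name[len(PREFIX):])
    return keys


def duplicate_keys(apps: Dict[str, List[Dict[str, str]]]) -> List[str]:
    """Return every {key} that is declared more than once across the applications."""
    counts = Counter(key for env in apps.values() for key in config_keys(env))
    return sorted(key for key, count in counts.items() if count > 1)


apps = {
    "app1": [
        {"name": "aliyun_logs_app1-stdout", "value": "stdout"},
        {"name": "aliyun_logs_app1-stdout_project", "value": "app1-project"},
        {"name": "aliyun_logs_app1-stdout_logstore", "value": "app1-logstore"},
        {"name": "aliyun_logs_app1-stdout_machinegroup", "value": "app1-machine-group"},
    ],
    "app2": [
        {"name": "aliyun_logs_app2-stdout", "value": "stdout"},
        {"name": "aliyun_logs_app2-stdout_project", "value": "app2-project"},
        {"name": "aliyun_logs_app2-stdout_logstore", "value": "app2-logstore"},
        {"name": "aliyun_logs_app2-stdout_machinegroup", "value": "app1-machine-group"},
    ],
}
print(duplicate_keys(apps))                        # []
print(machine_group_identifier("c1234567890abc"))  # k8s-group-c1234567890abc
```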

Step 3: Query and analyze logs

Container logs collected by Logtail are stored in Logstores of Simple Log Service. You can view the logs in the Simple Log Service console or ACK console. For more information about the query syntax, see Overview of log query and analysis.

ACK console

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Operations > Log Center.
On the Log Center page, click the Application Logs tab and specify the filter conditions. Then, click Select Logstore to view the logs of containers.

Simple Log Service console

Log on to the Simple Log Service console.
In the Projects section, click the project that is associated with the Kubernetes cluster to go to the Logstores tab. By default, the project name is in the format of k8s-log-{Kubernetes cluster ID}.
In the Logstore list, find the Logstore that is specified when you configure log collection. Move the pointer over the Logstore name and click the icon. Then, click Search & Analysis.
In this example, you can view the stdout of the Tomcat application and text log files of containers. You can also find that custom tags are appended to the collected log data.

Default fields in container text logs

The following table describes the fields that are included by default in each container text log.

Field name	Description
__tag__:__hostname__	The name of the container host.
__tag__:__path__	The log file path in the container.
__tag__:_container_ip_	The IP address of the container.
__tag__:_image_name_	The name of the image that is used by the container.
__tag__:_pod_name_	The name of the pod.
__tag__:_namespace_	The namespace to which the pod belongs.
__tag__:_pod_uid_	The unique identifier (UID) of the pod.

References

For more information about how to use sidecar containers to collect container logs, see Collect text logs from Kubernetes containers in Sidecar mode.
For more information about how to collect stdout from containers, see Collect stdout and stderr from Kubernetes containers in DaemonSet mode (old version).
You can add alerting rules to the logs to effectively monitor the cluster status. For more information, see Manage an alert monitoring rule.
For more information about how to troubleshoot log collection errors, see What do I do if errors occur when I use Logtail to collect logs?
For more information about how to use Logtail to collect container logs across Alibaba Cloud accounts, see Use Logtail to collect container logs across Alibaba Cloud accounts.
If you have questions about Logstores, such as how to change the log retention period and disable log collection, see FAQ about Logstores.
