×
Community Blog OpenKruise × iLogtail: The Best Practice for Managing Sidecar Containers for Observable Data Collection

OpenKruise × iLogtail: The Best Practice for Managing Sidecar Containers for Observable Data Collection

This article analyzes the challenges in managing sidecar containers for data collection and provides solutions using the management capabilities offered by OpenKruise.

By Xunfei, from Alibaba Cloud Storage Team

Sidecar containers are widely used in Kubernetes clusters to collect observable data from application containers. However, their intrusive nature regarding service deployment and the complexity of lifecycle management make this deployment model costly and prone to errors. This article analyzes the challenges in managing sidecar containers for data collection and provides solutions using the management capabilities offered by OpenKruise. It uses iLogtail as an example to describe the best practice for managing sidecar containers that collect observable data based on OpenKruise..

Scenarios for Sidecar Deployment

Observable systems serve as the eyes of IT operations. The Kubernetes official documentation introduces various methods for collecting observable data. In summary, there are three main methods: the native method, the DaemonSet method, and the Sidecar method. Each method has its advantages and disadvantages, and no single method can address all the challenges perfectly. Therefore, the choice of method depends on the specific scenario. In the Sidecar mode, each pod is deployed with a separate agent for collection. This approach consumes more resources but offers high stability, flexibility, and multi-tenant isolation. We recommend using the Sidecar mode in job collection scenarios or when serving multiple services on a PaaS platform.

1

  • Stability: The Sidecar method utilizes the feature that containers in the same Kubernetes pod have shared storage and network, so no container discovery process is required. At the same time, as long as the sidecar container does not exit, the files on the shared storage volume will not be deleted. Therefore, even if the lifecycle of the application container is extremely short or the log collection is delayed, data integrity is ensured.
  • Flexibility: For every agent deployed separately on each pod, you can customize the resources to be allocated and the data to be collected based on your business requirements. You can also deploy different types and versions of agents based on your requirements.
  • Multi-tenant isolation: The agent deployed in each sidecar container can only observe the data in the pod it belongs to. It cannot obtain the information of other tenants. The agents run independently of each other. If one crashes or fails to collect data, the others on the same node still work.

Pain Points of Kubernetes Sidecar Mode

To use the Kubernetes Sidecar mode to collect observable data, the deployment declaration of the business needs to be modified. The sidecar container as an agent and the application container should be deployed in the same pod, and the volume and network should be shared to enable the agent to collect and upload data. To ensure that no data loss occurs during the data collection under any circumstances, the agent process must start up before the application process starts up and exit after the application process ends. For job workloads, how to enable the agent to proactively exit after the application container is completed is another problem to be considered. Taking iLogtail as an example, the following shows how to configure sidecar containers for data collection.

iLogtail is an open-source, lightweight, and high-performance agent for observable data collection provided by Alibaba Cloud. It allows you to collect multiple types of observable data including logs, traces, and metrics to Kafka, ElasicSearch, and ClickHouse. Its stability has been verified by Alibaba Cloud and Alibaba Cloud customers.

apiVersion: batch/v1
kind: Job
metadata:
  name: nginx-mock
  namespace: default
spec:
  template:
    metadata:
      name: nginx-mock
    spec:
      restartPolicy: Never
      containers:
        - name: nginx
          image: registry.cn-hangzhou.aliyuncs.com/log-service/docker-log-test:latest
          command: ["/bin/sh", "-c"]
          args:
            - until [[ -f /tasksite/cornerstone ]]; do sleep 1; done;
              /bin/mock_log --log-type=nginx --path=/var/log/nginx/access.log --total-count=100;
              retcode=$?;
              touch /tasksite/tombstone;
              exit $retcode
              # Notify the sidecar container that the collection is completed. Otherwise, the sidecar container will not exit volumeMounts with the application container.
          volumeMounts:
            # Share the log directory with the iLogtail sidecar container through volumeMounts.
            - mountPath: /var/log/nginx
              name: log
            # Use volumeMounts to communicate the process status with the iLogtail sidecar container.
            - mountPath: /tasksite
              name: tasksite
        # iLogtail Sidecar
        - name: ilogtail
          image: sls-opensource-registry.cn-shanghai.cr.aliyuncs.com/ilogtail-community-edition/ilogtail:latest
          command: ["/bin/sh", "-c"]
          # The purpose of the first `sleep 10` is to wait for iLogtail to start the collection. The iLogtail may start to collect data only after the configuration is pulled from the remote end.
          # The purpose of the second `sleep 10` is to wait for iLogtail to complete the collection. The collection is not completed until the iLogtail sends all collected data downstream.
          args:
            # Notify the application container that the sidecar container is ready.
            - /usr/local/ilogtail/ilogtail_control.sh start;
              sleep 10;
              touch /tasksite/cornerstone;
              until [[ -f /tasksite/tombstone ]]; do sleep 1; done;
              sleep 10;
              /usr/local/ilogtail/ilogtail_control.sh stop;
          volumeMounts:
            - name: log
              mountPath: /var/log/nginx
            - name: tasksite
              mountPath: /tasksite
            - mountPath: /usr/local/ilogtail/ilogtail_config.json
              name: ilogtail-config
              subPath: ilogtail_config.json
      volumes:
        - name: log
          emptyDir: {
   
   }
        - name: tasksite
          emptyDir:
            medium: Memory
        - name: ilogtail-config
          secret:
            defaultMode: 420
            secretName: ilogtail-secret

The volumeMounts section in the configuration declares the shared storage, of which the storage named log is for data sharing and the storage named tasksite is for process coordination. A large amount of code in the args section is used to control the startup and exit sequence of the sidecar container and the application container. The preceding configuration code shows the following disadvantages of Sidecar mode:

  • Application pods are coupled to multiple sidecar containers, which increases the learning costs of business developers.
  • The running dependencies between multiple sidecar containers and application containers in a pod need to be considered, which increases the complexity of configuration.
  • The upgrade of sidecar containers will lead to the rebuilding of application pods. Sidecar containers are generally handled by independent middleware teams, so upgrades may face resistance from the service.

OpenKruise: A Powerful Tool for Managing Sidecar Containers

SidecarSet

SidecarSet is an abstraction of sidecar container management in OpenKruise. SidecarSet is used to inject and upgrade sidecar containers in Kubernetes clusters. It is one of the core workloads of OpenKruise. For more information, see SidecarSet documentation.

  • Automatic injection into sidecar containers: The configurations of sidecar containers are decoupled from the configurations of business workloads, such as Deployments and CloneSets, which reduces the cost of users.
  • Separate upgrade of sidecar containers: Sidecar containers are upgraded separately without rebuilding pods or interrupting your service.

Container Launch Priority

Container Launch Priority provides a method to control the startup sequence of containers in a pod. For more information, see Container Launch Priority documentation.

  • Control of the startup sequence of the containers in a pod: Containers in a pod start up in the sequence specified in the declaration or according to a custom sequence, which simplifies the configuration.

Job Sidecar Terminator

For a job workload, the main application container completes the task and exits, and then notifies the sidecar containers such as those for log collection to exit. For more information, see Job Sidecar Terminator documentation.

  • Proactive exit of the job sidecar: After the main container exits, it notifies the sidecar container to exit. This allows the job controller to complete the task and simplifies the configuration.

The Practices of iLogtail Sidecar Deployment

There are many cases where iLogtail is used to deploy observable scenarios in the community. For more information, see Use of Kubernetes and Use Cases of iLogtail Community Edition. This article focuses on how to use the preceding capabilities of OpenKruise to simplify the management of iLogtail sidecar containers. If you have any questions about iLogtail, query at GitHub Discussions.

OpenKruise Feature Gate

If the Sidecar Terminator feature is enabled in the kruise-controller-manager Deployment, the other two features are enabled by default.

    spec:
      containers:
        - args:
            - '--enable-leader-election'
            - '--metrics-addr=:8080'
            - '--health-probe-addr=:8000'
            - '--logtostderr=true'
            - '--leader-election-namespace=kruise-system'
            - '--v=4'
            - '--feature-gates=SidecarTerminator=true'
            - '--sync-period=0'

iLogtail SidecarSet

Create an iLogtail collection configuration in ConfigServer Open Source Edition as follows:

enable: true
inputs:
  - Type: file_log
    LogPath: /var/log/nginx
    FilePattern: access.log
flushers:
  - Type: flusher_kafka
    Brokers:
      - <kafka_host>:<kafka_port>
      - <kafka_host>:<kafka_port>
    Topic: nginx-access-log

Associate to the machine group

2

Define the iLogtail SidecarSet configuration as follows:

apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
  name: ilogtail-sidecarset
spec:
  selector:
    # Pod labels to be injected into the sidecar container
    matchLabels:
      kruise.io/inject-ilogtail: "true"
  # By default, the SidecarSet takes effect on the entire cluster. You can use the namespace field to specify the effective scope.
  namespace: default
  containers:
    - command:
        - /bin/sh
        - '-c'
        - '/usr/local/ilogtail/ilogtail_control.sh start_and_block 10'
        # The purpose of parameter 10 is to wait 10 seconds for data to be sent and exit
      image: sls-opensource-registry.cn-shanghai.cr.aliyuncs.com/ilogtail-community-edition/ilogtail:edge
      livenessProbe:
        exec:
          command:
            - /usr/local/ilogtail/ilogtail_control.sh
            - status
      name: ilogtail
      env:
        - name: KRUISE_CONTAINER_PRIORITY
          value: '100'
        - name: KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT
          value: 'true'
        - name: ALIYUN_LOGTAIL_USER_DEFINED_ID
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: 'metadata.labels[''ilogtail-tag'']'
      volumeMounts:
        - mountPath: /var/log/nginx
          name: log
        - mountPath: /usr/local/ilogtail/checkpoint
          name: checkpoint
        - mountPath: /usr/local/ilogtail/ilogtail_config.json
          name: ilogtail-config
          subPath: ilogtail_config.json
  volumes:
    - name: log
      emptyDir: {
   
   }
    - name: checkpoint
      emptyDir: {
   
   }
    - name: ilogtail-config
      secret:
        defaultMode: 420
        secretName: ilogtail-secret

Note that only livenessProbe is specified in this example. Do not use readinessProbe. Otherwise, the pod status changes to Not Ready when the sidecar container is upgraded.

In scenarios where machine resources are insufficient, you can set Sidecar container request.cpu to 0 to reduce the number of requests for pod resources. In this case, the QoS of pods is Burstable.

Automatic Injection into iLogtail Sidecar Containers

Define the Nginx Mock Job to contain only nginx-related configurations as follows:

apiVersion: batch/v1
kind: Job
metadata:
  name: nginx-mock
  namespace: default
spec:
  template:
    metadata:
      name: nginx-mock
      labels:
        # The label injected into the iLogtail sidecar container.
        kruise.io/inject-ilogtail: "true"
    spec:
      restartPolicy: Never
      containers:
        - name: nginx
          image: registry.cn-hangzhou.aliyuncs.com/log-service/docker-log-test:latest
          command: ["/bin/sh", "-c"]
          args:
            - /bin/mock_log --log-type=nginx --path=/var/log/nginx/access.log --total-count=100
          volumeMounts:
          # Use volumeMounts to share the log directory with the iLogtail sidecar container.
          - mountPath: /var/log/nginx
            name: log
      volumes:
        - name: log
          emptyDir: {
   
   }
        - name: tasksite
          emptyDir:
            medium: Memory

After the nginx mock job is applied to the Kubernetes cluster, the created pods are injected into the iLogtail sidecar container. This indicates that kruise.io/inject-ilogtail: "true" has taken effect as follows:

3

Without this configuration, only the nginx container will be used instead of the iLogtail container.

According to the Kubernetes events, the iLogtail container takes precedence over the nginx container. This indicates that KRUISE_CONTAINER_PRIORITY has taken effect as follows:

4

Without this configuration, the iLogtail container may start up later than the nginx container, resulting in incomplete data collection.

According to the Kubernetes events, the iLogtail container exits after the nginx container completes the collection. This indicates that the KRUISE_TERMINATE_SIDECAR_WHEN_JOB_EXIT has taken effect as follows:

5

Without this configuration, pods will remain running because the sidecar container will not exit with the application container unless you manually delete the job.

The correct startup and exit sequence of the preceding two sidecar containers ensures the integrity of the data collected by iLogtail. Query the data collected to Kafka by offset. It is shown that the collected data is complete, with a total of 100 pieces (92 - 40 + 106 - 58 = 100). The results are as follows:

6
7

Upgrade an iLogtail Sidecar Container (edge -> latest)

We can update the configuration of the iLogtail SidecarSet to the latest version and then apply it.

image: sls-opensource-registry.cn-shanghai.cr.aliyuncs.com/ilogtail-community-edition/ilogtail:latest

The Kubernetes events show that the pod is not rebuilt. The iLogtail container is rebuilt due to the change in the declaration, while the nginx container is not interrupted. The results are as follows:

8

This feature depends on the in-place update of Kruise. For more information, see Kruise InPlace Update. However, when you upgrade a sidecar container, risks may arise. If a sidecar container fails to be upgraded, you may encounter the situation of Pod Not Ready, and the business is affected. Therefore, SidecarSet provides a variety of canary release capabilities to avoid risks. For more information, see Kruise SidecarSet.

apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
  name: sidecarset
spec:
  # ...
  updateStrategy:
    type: RollingUpdate
    # Maximum number of unavailability
    maxUnavailable: 20%
    # Batch release 
    partition: 90
    # Canary release, through pod label
    selector:
      matchLabels:
        # Some Pods contain canary labels,
        # or any other labels where a small number of pods can be selected
        deploy-env: canary

In addition, if the container is similar to a service mesh or an envoy mesh, it requires the hot upgrade featureof SidecarSet. For more information, see Hot Upgrade Sidecar.

Argo-cd Deploy SidecarSet (Optional)

If you want to release a Kruise SidecarSet by using an Argo-cd, you must configure the SidecarSet Custom CRD Health Checks. Based on this configuration, Argo-cd can check whether the SidecarSet is released and whether the pod is ready.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations.health.apps.kruise.io_SidecarSet: |
    hs = {}
    -- if paused
    if obj.spec.updateStrategy.paused then
      hs.status = "Suspended"
      hs.message = "SidecarSet is Suspended"
      return hs
    end

    -- check sidecarSet status
    if obj.status ~= nil then
      if obj.status.observedGeneration < obj.metadata.generation then
        hs.status = "Progressing"
        hs.message = "Waiting for rollout to finish: observed sidecarSet generation less then desired generation"
        return hs
      end

      if obj.status.updatedPods < obj.spec.matchedPods then
        hs.status = "Progressing"
        hs.message = "Waiting for rollout to finish: replicas hasn't finished updating..."
        return hs
      end

      if obj.status.updatedReadyPods < obj.status.updatedPods then
        hs.status = "Progressing"
        hs.message = "Waiting for rollout to finish: replicas hasn't finished updating..."
        return hs
      end

      hs.status = "Healthy"
      return hs
    end

    -- if status == nil
    hs.status = "Progressing"
    hs.message = "Waiting for sidecarSet"
    return hs

Summary

The approach where a pod harbors multiple containers is increasingly being embraced by developers. As a result, an effective method for managing sidecar containers is becoming an urgent requirement within the Kubernetes ecosystem. Innovations like Kruise SidecarSet, Container Launch Priority, and Job Sidecar Terminator represent some of the advancements in sidecar container management.

Using OpenKruise to oversee iLogtail log collection significantly eases the management of sidecar containers, decoupling their configuration from application containers. This clarity in container startup order allows for sidecar container upgrades without the need to recreate pods. Nonetheless, certain challenges remain unresolved, such as determining the appropriate resource allocation for sidecar containers, planning log path mounting strategies, and differentiating sidecar configurations across various application pods. Hence, we look forward to collaborating with more developers in the community to address these issues. Contributions of ideas to enrich the Kubernetes ecosystem are also warmly welcomed.

Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.

0 1 0
Share on

Alibaba Cloud Community

1,076 posts | 263 followers

You may also like

Comments