Resource Profiling Makes the Setting of Container Resource Specifications Easier

By Zhang Zuowei (Youyi)

Preface

Kubernetes abstracts resources such as CPU and memory, and users declare the resource specifications of containers based on actual requirements. This improves the efficiency of cluster resource management. However, deciding how to set those specifications has always been a key problem for application administrators: specifications that are too high waste large amounts of resources, while specifications that are too low put application stability at risk.

Alibaba Cloud Container Service for Kubernetes (ACK) provides a resource profiling capability for Kubernetes-native workloads. It recommends resource specifications at container granularity, which simplifies configuring requests and limits for pods.

Kubernetes Resource Management for Containers

Kubernetes describes the resources a container needs through the request. When a container specifies a request, the scheduler matches it against node capacity to decide which node the pod is assigned to. In practice, requests are set from experience: application administrators adjust them based on the container's online performance, historical resource utilization, and stress test results.
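To make the role of the request concrete, the following is a minimal sketch of request-based node filtering. It is not the Kubernetes scheduler's actual code: the `Node` class, its field names, and the sample numbers are all hypothetical. It only illustrates that placement is decided from declared requests, not from measured usage.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified view of a node for illustration only.
    name: str
    allocatable_cpu_m: int      # millicores the node can hand out
    allocatable_mem_mi: int     # MiB the node can hand out
    requested_cpu_m: int = 0    # sum of requests of pods already placed
    requested_mem_mi: int = 0


def fits(node, cpu_m, mem_mi):
    """A pod fits only if its request stays within the unrequested capacity."""
    return (node.requested_cpu_m + cpu_m <= node.allocatable_cpu_m
            and node.requested_mem_mi + mem_mi <= node.allocatable_mem_mi)


def feasible_nodes(nodes, cpu_m, mem_mi):
    """Names of the nodes on which a pod with this request could be placed."""
    return [n.name for n in nodes if fits(n, cpu_m, mem_mi)]


nodes = [
    Node("node-a", 4000, 8192, requested_cpu_m=3500, requested_mem_mi=2048),
    Node("node-b", 4000, 8192, requested_cpu_m=1000, requested_mem_mi=6144),
    Node("node-c", 8000, 16384, requested_cpu_m=2000, requested_mem_mi=4096),
]
print(feasible_nodes(nodes, cpu_m=1000, mem_mi=1024))  # ['node-b', 'node-c']
```

In this sketch node-a is rejected because its requested CPU is nearly exhausted, regardless of how much CPU its pods actually consume. This is why oversized requests reduce how many containers a cluster can hold.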

However, configuring resource specifications from human experience has the following limitations:

To keep online applications stable and to absorb load fluctuations from upstream and downstream services, administrators reserve a large amount of resources as a buffer. As a result, the requests specified for containers far exceed the resources the containers actually use, which wastes a large amount of resources. Relevant statistics show that in most production environments for online services, cluster resource utilization stays at a fairly low level.
When the cluster allocation rate is high, administrators proactively shrink container requests to improve cluster resource utilization and free up capacity. This raises deployment density in the short term but, over the long term, risks instability when application traffic fluctuates.
A management method that relies entirely on expert experience does not scale. As the number of applications grows, management efficiency declines.

Resource Profile

This is where the resource profile of an application can help administrators. A resource profile describes the characteristics of an application's resource consumption, covering both common physical resources (such as CPU and memory) and more abstract resources (such as network bandwidth and disk I/O). In day-to-day container resource management, O&M personnel pay the most attention to CPU and memory. If historical resource consumption data is collected and summarized, administrators can set container specifications flexibly.

Besides the specification itself, an application's resource profile also has a time dimension. Traffic to Internet services follows people's activities and shows clear peaks, valleys, periodicity, and predictability. For example, local life applications peak at midday and in the evening, office applications peak at clock-in and clock-out times, and e-commerce applications peak during promotions, with peak traffic expected to be several times the valley traffic. If the scheduling system can capture this information and allocate resources flexibly according to cluster and application status, it can apply peak shaving and valley filling to resource allocation.

In summary, a stable resource profiling system can improve the management efficiency of O&M personnel, ensure the stable operation of applications, and improve the utilization of cluster resources.

Technology Insider

Data Models

The profile results for container resource specifications can be used in several scenarios: guiding application administrators in setting the request and limit of containers, providing reference data for optimizing resource scheduling and rescheduling algorithms, and instructing peripheral elastic components (such as VPA) to scale applications dynamically. These scenarios share two general requirements for the algorithm model:

When the application load rises, the profile results must respond quickly so that each component can meet the containers' sudden resource requirements in time.
When the application load drops, changes to the profile results should be delayed to avoid stability problems caused by overly aggressive reductions in configuration.

A typical sliding window model meets these requirements. On this basis, to make the profile results smoother, the weight of each sample can be decayed according to its age, so that newer data has a greater impact on the results and older data a smaller one. The half-life sliding window is a typical model of this kind.

In this model, τ is the time point of a data sample and t1/2 is the half-life: each time an interval of t1/2 elapses, the weight of the samples in the preceding window is halved.
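The halving rule described here can be sketched as an exponential decay of sample weights. The exact formula used by the profiling algorithm is not reproduced in this text, so the function below is only one common form consistent with the description. The function names, the 24-hour half-life, and the sample values are assumptions for illustration.

```python
def half_life_weight(sample_time, now, half_life):
    """Relative weight of a sample taken at sample_time, evaluated at now.

    The weight is 1.0 for a fresh sample and halves each time another
    half_life elapses. All arguments use the same time unit.
    """
    age = now - sample_time
    return 0.5 ** (age / half_life)


def decayed_mean(samples, now, half_life):
    """Weighted mean of (time, value) samples using half-life weights."""
    weights = [half_life_weight(t, now, half_life) for t, _ in samples]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, samples))
    return weighted_sum / sum(weights)


# Hypothetical samples: (hour, CPU usage in millicores), half-life of 24 hours.
samples = [(0, 100.0), (24, 200.0)]
print(half_life_weight(0, now=24, half_life=24))   # 0.5
print(decayed_mean(samples, now=24, half_life=24))
```

The older sample carries half the weight of the newer one, so the decayed mean sits closer to 200 than a plain average would.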

Key Algorithms

Resource profiling depends on container resource usage data, which is usually collected at minute-level or even second-level granularity. A data model based on the half-life sliding window needs a large amount of accumulated historical data. If all of it were saved for regular iterative analysis, storage and computing costs would be too high, and serious performance problems would occur during the cold start phase.

A variety of statistical tools are available for the profile's data model. For storing historical data, a statistical histogram can be used to compress the data. As shown in the following figure, the horizontal axis is the amount of resources and the vertical axis is the count of samples at the corresponding point, with each statistical interval about 5% larger than the previous one. As a result, the container history of each workload can be completely saved in only about 200 counts. The histogram is also convenient to persist at regular intervals, which improves performance during the cold start phase.
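The histogram compression can be sketched as follows. This is a minimal illustration, not the ACK implementation: the class name, the first bucket size of 0.01 cores, and the method names are assumptions. Only the 5% growth and the roughly 200 buckets come from the text.

```python
import math


class ExponentialHistogram:
    """Histogram whose bucket widths grow by a fixed ratio (hypothetical sketch)."""

    def __init__(self, first_bucket_size, growth=1.05, num_buckets=200):
        self.first = first_bucket_size
        self.growth = growth
        self.counts = [0.0] * num_buckets

    def bucket_start(self, index):
        # Sum of a geometric series: first * (1 + g + ... + g**(index - 1)).
        return self.first * (self.growth ** index - 1) / (self.growth - 1)

    def bucket_index(self, value):
        # Invert bucket_start; values beyond the range go to the last bucket.
        if value <= 0:
            return 0
        ratio = value * (self.growth - 1) / self.first + 1
        return min(int(math.log(ratio, self.growth)), len(self.counts) - 1)

    def add(self, value, weight=1.0):
        self.counts[self.bucket_index(value)] += weight


# Hypothetical CPU samples in cores; any number of samples fits in 200 counts.
hist = ExponentialHistogram(first_bucket_size=0.01)
for cpu in [0.015, 0.015, 1.0, 2.5]:
    hist.add(cpu)
print(len(hist.counts), hist.bucket_index(1.0))  # 200 36
```

Because the bucket width grows geometrically, 200 buckets cover several orders of magnitude while keeping the relative error of each stored sample at about 5%.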

There are various ways to compute the recommended specification from a profile, such as peak sampling, weighted average, and quantile. Practical experience shows that the quantile algorithm applies to a wide range of scenarios and can accurately describe the resource specifications of different types of workloads. Note that the quantile here is not a simple statistic over sampling points but a quantile of the workload's computing power requirements. The following figure shows the difference. Nine sampling points have low memory usage, only 100 MB each, while the tenth has high usage, more than 900 MB. The 90% quantile taken over sampling points in time is only 100 MB, which cannot describe the application's resource requirements accurately, whereas the 90% quantile taken over computing power requirements is 900 MB, which is more accurate. In addition, the profiling algorithm takes other factors into account, such as the half-life mentioned above, the confidence coefficient of the sample data, and container OOM events.
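The ten-sample example can be reproduced with a small weighted-quantile sketch. The text does not give the exact formula, so weighting each sample by its own usage is an assumption here: it is one way to read "quantile of computing power requirements". The function name and the use of exactly 900 MB for the tenth sample are also illustrative.

```python
def weighted_quantile(values, weights, percent):
    """Smallest value whose cumulative weight reaches percent of the total."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        # Integer comparison avoids floating-point rounding at the boundary.
        if cumulative * 100 >= percent * total:
            return value
    return pairs[-1][0]


usage_mb = [100] * 9 + [900]

# Every sampling point counts equally: 9 of 10 points are at 100 MB.
by_time = weighted_quantile(usage_mb, [1] * len(usage_mb), 90)

# Each point is weighted by the resources it demanded (assumed weighting).
by_demand = weighted_quantile(usage_mb, usage_mb, 90)

print(by_time, by_demand)  # 100 900
```

With equal weights, the nine low samples already account for 90% of the points, so the quantile stays at 100 MB. With usage weights, they account for only half of the total demand, so the quantile moves to the 900 MB sample.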

Practice

We deployed a test application of the Deployment type to an ACK cluster, enabled resource profiling for it, and plotted the container specification (request), the actual resource usage, and the resource profile result (recommend) on a Prometheus monitoring page, as shown in the following figure:

The blue polyline is the actual CPU usage of the application, the orange polyline is the resource profile result, and the green polyline is the original request specification. By referring to the profile result, the administrator can lower the request specification of the container, which effectively reduces the consumption of cluster resources.

Future Plan

We are planning relevant product capabilities around resource profiling. Stay tuned!

Relevant Links

Koordinator Community:
https://github.com/koordinator-sh/koordinator

Join the Slack channel:
https://koordinatorgroup.slack.com/archives/C0392BCPFNK
