By Cloud Kernel SIG
Different applications often need different scheduling strategies to perform well. However, the kernel release cycle is long and upgrading a kernel is usually costly, so scheduler optimizations cannot be deployed quickly at scale. In addition, optimizing the scheduler for one application often causes performance regressions in other scenarios, and rolling back is difficult when problems occur. Traditional hotfix technology can update and optimize parts of a kernel without a full upgrade, which improves the performance of some applications. However, it cannot upgrade an entire subsystem, does not support large-scale scheduling features, and incurs long downtime. Scheduler hot upgrade solves these problems.
The scheduler hot upgrade SDK uses technologies such as modularization, data reconstruction, and hot replacement to make scheduler R&D, testing, rollout, and maintenance agile and customizable. Modularization automatically decouples the scheduler code from the kernel and gives kernel developers an SDK for agile development. Hot replacement lets administrators deploy a new scheduler with only milliseconds of downtime. Data reconstruction migrates the data state from the pre-upgrade scheduler to the post-upgrade scheduler. Together, these technologies make it possible to build a customized scheduler, which meets the need for different schedulers across applications and loads and is ready for production use. The related paper, Efficient Scheduler Live Update for Linux Kernel with Modularization, was published at ASPLOS '23, a top computer architecture conference. The following figure shows the architecture:
The scheme is compatible with multiple architectures and kernel versions. It has passed testing on AArch64 and x86-64 with Linux kernels 4.19 and 5.10, and it provides limited support for Linux kernel 3.10. The scheme also supports a variety of scheduler features. The following have been tested and verified: a mini scheduler, core scheduling, removal of CFS bandwidth control, the OpenAnolis CPU mixed deployment features, and various bug fixes from the upstream Linux community.
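The hot replacement and data reconstruction steps described above can be pictured with a toy user-space model. This is only a sketch under stated assumptions, not the SDK's interface: the class names (`FifoScheduler`, `FairScheduler`, `Kernel`) and the methods `export_state`, `import_state`, and `hot_upgrade` are hypothetical and chosen for illustration. The real mechanism operates on kernel data structures and is considerably more involved.

```python
import heapq
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    vruntime: int  # accumulated virtual runtime (illustrative field)


class FifoScheduler:
    """Pre-upgrade scheduler (hypothetical): runs tasks in arrival order."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, task):
        self.queue.append(task)

    def pick_next(self):
        return self.queue.popleft() if self.queue else None

    def export_state(self):
        # Expose the runnable tasks so the new scheduler can rebuild its data.
        return list(self.queue)


class FairScheduler:
    """Post-upgrade scheduler (hypothetical): runs the lowest vruntime first."""

    def __init__(self):
        self.heap = []

    def enqueue(self, task):
        # pid breaks ties so that Task objects are never compared directly.
        heapq.heappush(self.heap, (task.vruntime, task.pid, task))

    def pick_next(self):
        return heapq.heappop(self.heap)[2] if self.heap else None

    def import_state(self, tasks):
        # Data reconstruction: rebuild this scheduler's own structure
        # from the state handed over by the old scheduler.
        for task in tasks:
            self.enqueue(task)


class Kernel:
    """Stand-in for the part of the system that calls into the scheduler."""

    def __init__(self, scheduler):
        self.scheduler = scheduler

    def hot_upgrade(self, new_scheduler):
        # Nothing runs concurrently in this toy model, so the whole method
        # plays the role of the short pause during which scheduling stops.
        start = time.perf_counter()
        old = self.scheduler
        new_scheduler.import_state(old.export_state())  # migrate state
        self.scheduler = new_scheduler                  # swap the implementation
        downtime = time.perf_counter() - start
        return old, downtime  # the old scheduler is kept for a possible rollback


if __name__ == "__main__":
    kernel = Kernel(FifoScheduler())
    for pid, vruntime in [(1, 30), (2, 10), (3, 20)]:
        kernel.scheduler.enqueue(Task(pid, vruntime))
    _, downtime = kernel.hot_upgrade(FairScheduler())
    print(f"upgrade pause: {downtime * 1000:.3f} ms")
```

In this model the pause grows with the amount of state to migrate, which is why downtime is naturally reported alongside the number of threads in the environment.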
The scheduler hot upgrade SDK is suitable for the following scenarios, each of which has been verified:
A cloud serverless service used the scheduler hot upgrade SDK to install the upstream Linux core scheduling feature and its self-developed computing power stabilization technology on Linux kernel 4.19 of Anolis OS. With this optimization, its customers' instances saw P99 latency drop by about 10%, less performance jitter, and significantly shorter startup time. This case verifies that scheduler hot upgrade can support large-scale features and extended R&D.
An Internet financial services company used the scheduler hot upgrade SDK to quickly optimize its self-developed scheduler and install it for its core businesses. The work included optimizations to the Linux CFS scheduler and to the CPU resource isolation technology of the Anolis OS kernel. It reduced wasted CPU resources by 5% and lowered service response time (RT). During rollout, downtime was less than 12 ms in an environment with 40,000 threads. O&M personnel recognized the optimization effect, and R&D personnel recognized the ease of use of the SDK. The company hopes to continue using the scheduler hot upgrade SDK for system optimization.
Home Page of Cloud Kernel SIG: https://openanolis.cn/sig