Solution | Description | Scaling metric | Scenario | References |
HPA | HPA scales out pods during peak hours to handle traffic spikes and scales in pods during off-peak hours to reduce resource costs. HPA is suitable for most scenarios. | | HPA is ideal for online services that include a large number of pods and require frequent scaling to handle traffic fluctuations, such as e-commerce services, online education, and financial services. | Implement horizontal pod autoscaling |
CronHPA | CronHPA uses a Crontab-like strategy to scale pods based on a predefined schedule. You can specify the time zone and date on which scaling is performed in the schedule. You can also exclude dates, such as holidays, from the schedule. CronHPA can be used together with HPA. | Scheduled scaling | CronHPA is ideal for applications that have predictable traffic patterns and scenarios where you need to run tasks at a scheduled time. | |
VPA | VPA monitors the resource consumption patterns of pods and provides recommendations on CPU and memory allocation. VPA adjusts resource allocation but does not change the number of pod replicas. | VPA provides recommendations on the CPU request, CPU limit, memory request, and memory limit for pods. VPA can also automatically adjust these resource requests and limits. | VPA is ideal for scenarios where stable resource allocation is required, such as scaling stateful applications and deploying large monolithic applications. In most cases, VPA adjustments take effect when pods are recreated, for example during recovery from an anomaly. | Vertical Pod Autoscaler (VPA) |
KEDA | KEDA supports a wide variety of event sources and enables event-driven auto scaling for workloads. | Number of events, such as the queue length. | KEDA is ideal for scenarios where instant scaling is required, especially event-based offline jobs. For example, you can enable KEDA for offline video and audio transcoding jobs, event-driven jobs, and stream processing jobs. | ACK KEDA |
Advanced Horizontal Pod Autoscaler (AHPA) | AHPA can automatically learn the pattern of workload fluctuations and predict resource demand based on historical metric data to help you implement predictive scaling. | Resource metrics such as CPU, memory, and GPU utilization; traffic metrics such as queries per second (QPS) and response time (RT); other custom metrics | AHPA is ideal for scenarios where traffic periodically fluctuates, such as live streaming, online education, and gaming services. | Predictive scaling based on AHPA |
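
To make the HPA row more concrete, the following sketch shows how a metric reading can be turned into a replica count. It follows the replica formula documented for the upstream Kubernetes HorizontalPodAutoscaler (desired = ceil(current replicas × current metric / target metric)), which is assumed here to apply to the HPA described in the table. The function name, its parameters, and the sample numbers are hypothetical, and the 0.1 tolerance is the upstream default, which may be configured differently in a given cluster.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=None, tolerance=0.1):
    """Illustrative HPA-style replica calculation (hypothetical helper).

    current_metric and target_metric use the same unit, for example
    average CPU utilization in percent across all pods.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Skip scaling when the metric is close enough to the target
    # (assumed default tolerance of 10%).
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Round up so the workload is never left under-provisioned.
    replicas = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds.
    replicas = max(replicas, min_replicas)
    if max_replicas is not None:
        replicas = min(replicas, max_replicas)
    return replicas


if __name__ == "__main__":
    # Peak hours: 4 pods at 90% CPU with a 60% target.
    print(desired_replicas(4, 90, 60))
    # Off-peak hours: 10 pods at 30% CPU with a 60% target.
    print(desired_replicas(10, 30, 60))
```

In this sketch, a reading above the target scales the workload out and a reading below it scales the workload in, which matches the peak and off-peak behavior described for HPA. The other solutions in the table differ mainly in where the input comes from: a schedule for CronHPA, an event count for KEDA, and a predicted metric for AHPA.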