Container Service for Kubernetes (ACK) 2025 Release Notes

Last Updated: Feb 10, 2026

This topic describes the latest feature releases for Container Service for Kubernetes (ACK).

Background information

  • For supported Kubernetes (K8s) versions in Container Service for Kubernetes (ACK), see Version Guide.

  • Container Service for Kubernetes (ACK) supports operating systems including ContainerOS, Alibaba Cloud Linux 3 Container-Optimized Edition, Alibaba Cloud Linux 3, Alibaba Cloud Linux 3 Arm Edition, Alibaba Cloud Linux UEFI 3, Red Hat, Ubuntu, and Windows. For details, see Operating systems.

December 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Node pools support Security Center vulnerability fixes (pay-as-you-go)

Enable OS CVE vulnerability fixing to scan nodes for security vulnerabilities, get remediation suggestions and methods, and complete fixes quickly in the console. Before using this feature, activate Security Center Ultimate or purchase vulnerability fixes (pay-as-you-go).

All regions

Fix CVE vulnerabilities in node pool OS

Support for APIG Ingress

Cloud-native API Gateway (APIG Ingress) is an enterprise edition gateway built on the open-source Higress gateway. It is compatible with Nginx Ingress and suitable for API management and microservice scenarios.

All regions

Manage APIG Ingress

New support for Kagent

Kagent is a framework for building, deploying, and running AI applications on Kubernetes. After deploying Kagent, you can create agents and MCP Servers using declarative APIs and connect them to multiple large language models.

All

Kagent

Use CNFS to manage tags for NAS, OSS, and CPFS

CNFS lets you add tags to NAS, OSS, and CPFS cloud storage resources. This enables fine-grained classification and permission management to improve resource governance efficiency.

All

Manage NAS, OSS, and CPFS tags

Use Kyverno as policy governance engine

Kyverno is a Kubernetes-native policy engine that defines and enforces security, compliance, and automation policies using Policy-as-Code. Compared to OPA Gatekeeper, which is integrated by default in clusters, Kyverno uses YAML definitions (no Rego learning required) and supports mutating and generating resources at the admission stage. It suits use cases requiring highly customized policies, operations automation, or multi-cluster policy governance.

All regions

Use Kyverno as policy governance engine
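
The YAML-first approach described above can be illustrated with a minimal validation policy. This is a sketch for illustration only; the policy name and label key are hypothetical, not part of the release.

```yaml
# Hypothetical example: require an "app" label on all pods.
# Kyverno evaluates this at admission time; no Rego is involved.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-app-label
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-app-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label 'app' is required."
        pattern:
          metadata:
            labels:
              app: "?*"
```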

Use remote attestation to ensure trustworthiness of confidential containers

PeerPod remote attestation ensures confidential containers always run in genuine, unmodified confidential computing environments such as Intel TDX. It automatically validates nodes before container deployment and allows applications to fetch environment proofs on demand during runtime, providing end-to-end security for sensitive workloads.

All

Use remote attestation to ensure trustworthiness of confidential containers

Knative supports A2A protocol servers

Agent2Agent (A2A) is an open standard for seamless communication and collaboration between AI agents. By deploying A2A servers in Knative, you can leverage automatic scaling (including scale-to-zero) to use resources on demand and iterate versions quickly.

All

Deploy A2A in Knative
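
The scale-to-zero behavior mentioned above is controlled with Knative autoscaling annotations. A minimal sketch, assuming a containerized A2A server; the service name, image, and port are hypothetical placeholders.

```yaml
# Hypothetical A2A server deployed as a Knative Service.
# min-scale "0" lets the service scale to zero when idle.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: a2a-server
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"
        autoscaling.knative.dev/max-scale: "10"
    spec:
      containers:
        - image: registry.example.com/a2a-server:latest  # hypothetical image
          ports:
            - containerPort: 8080
```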

New ack-agent-gateway practice

  • Support Agent2Agent (A2A) traffic governance and authentication: To help AI agent applications quickly expose services externally, install the ack-agent-gateway extension based on the Gateway API to precisely manage A2A protocol traffic.

  • Support MCP service gateway: To expose MCP services in ACK clusters to external LLMs, install the ack-agent-gateway extension based on the Gateway API for fast and secure routing of MCP traffic.

All

Distributed Cloud Container Platform for Kubernetes

New multi-cluster HPA practice using vLLM custom metrics

Large language model (LLM) online services commonly use multi-cluster architectures due to traffic fluctuations. The multi-cluster solution provided by ACK One adapts to this scenario. This practice shows how to deploy vLLM inference services in cloud environments using ACK One fleets and use federated HPA (FederatedHPA) for cross-cluster elastic scaling.

All

Multi-cluster HPA practice using vLLM custom metrics
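
The custom-metric scaling in this practice can be sketched with the upstream autoscaling/v2 API, whose spec the ACK One FederatedHPA resembles. This is illustrative only: the Deployment name, replica bounds, and the assumption that the vLLM metric is exposed through a custom metrics adapter are all hypothetical.

```yaml
# Sketch using the upstream autoscaling/v2 API for illustration.
# Assumes a metrics adapter exposes the vLLM metric to the cluster.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vllm-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vllm-inference   # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: Pods
      pods:
        metric:
          name: vllm_num_requests_running  # assumed exposed metric name
        target:
          type: AverageValue
          averageValue: "5"
```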

November 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Intelligent managed mode supports GPU workload deployment and operation

After enabling intelligent managed mode for a cluster, dynamically scale GPU resources using intelligent managed node pools. This significantly reduces costs for GPU workloads with clear peak and off-peak patterns, such as online inference.

All

Deploy and run GPU workloads

New servicemesh-operator component

The servicemesh-operator component simplifies the deployment, upgrade, and configuration management of service mesh (ASM) in ACK clusters. This helps you quickly enable powerful ASM features such as traffic management, security, and observability.

All regions

servicemesh-operator

New built-in FinOps rule library

You can configure security policies for pods to validate whether pod deployment and update requests are secure. ACK cluster policy management provides multiple built-in rule libraries, including Compliance, Infra, K8s-general, PSP, and FinOps.

All

Container security policy rule library overview

Support for MCP Server deployment in Knative

After hosting MCP Server in Knative, leverage its serverless architecture advantages to achieve on-demand scaling and event-driven AI services.

All

Deploy MCP Server in Knative

Rolling update and graceful shutdown configuration practice

To ensure zero-downtime application updates in ACK clusters, configure readiness probes, readiness gates, preStop hooks, and SLB graceful termination for deployments. This achieves smooth traffic migration and ensures high availability.

All

Achieve zero-downtime rolling deployments
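
The settings named above fit together in a single Deployment spec: the readiness probe gates traffic, the preStop sleep delays SIGTERM so in-flight requests drain, and the grace period covers the hook. A minimal sketch; the names, image, path, and timings are examples, not recommendations.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 60   # must exceed the preStop delay
      containers:
        - name: web
          image: registry.example.com/web:latest  # hypothetical image
          readinessProbe:                 # gates traffic until ready
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
          lifecycle:
            preStop:                      # drain window before SIGTERM
              exec:
                command: ["sleep", "15"]
```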

Distributed Cloud Container Platform for Kubernetes

Cluster-level multi-cluster priority elastic scheduling practice

ACK One fleets support AI inference services. In cross-region multi-ACK cluster and hybrid cloud multi-cluster scenarios, define cluster priorities to preferentially use IDC or primary region resources while supplementing compute power with Alibaba Cloud or backup region resources. Combine this with inventory-aware scheduling to ensure business continuity.

All

Cluster-level multi-cluster priority elastic scheduling

October 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Support for DRA-based GPU scheduling

In AI training and inference scenarios where multiple applications share GPU resources, deploy NVIDIA DRA drivers in ACK clusters to overcome traditional device plugin scheduling limits. Use Kubernetes DRA APIs to dynamically allocate GPUs across pods and control resources at a fine granularity, improving GPU utilization and reducing costs.

All

Use DRA to schedule GPUs
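
The DRA allocation flow above pairs a ResourceClaimTemplate with a pod that references it. A sketch assuming the resource.k8s.io/v1beta1 API (Kubernetes 1.32+) and the `gpu.nvidia.com` device class published by the NVIDIA DRA driver; verify both against your cluster version before use.

```yaml
# Assumption: resource.k8s.io/v1beta1 API and NVIDIA DRA driver installed.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
        - name: gpu
          deviceClassName: gpu.nvidia.com
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer
spec:
  resourceClaims:
    - name: gpu
      resourceClaimTemplateName: single-gpu
  containers:
    - name: main
      image: registry.example.com/cuda-app:latest  # hypothetical image
      resources:
        claims:
          - name: gpu   # binds the container to the allocated device
```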

Distributed Cloud Container Platform for Kubernetes

Registered clusters support ACS GPU-HPN capacity reservation

By registering on-premises Kubernetes clusters to the cloud and combining them with GPU-HPN capacity reservation, enterprises can centrally manage and intelligently schedule GPU resources across on-premises and cloud environments. This provides stable, high-performance computing for critical workloads such as AI training and inference.

All

Example: Use ACS GPU HPN compute power in ACK One registered clusters

Support for collecting control plane component metrics using self-managed Prometheus

For hybrid cloud environments using self-managed Prometheus monitoring systems, install the Metrics Aggregator component and configure ServiceMonitor to collect core component metrics from ACK One registered clusters. This integrates metrics into your existing monitoring system for unified alerting and observability.

All

Collect control plane component metrics using self-managed Prometheus

Cloud-native AI Suite

Submit eRDMA-accelerated PyTorch distributed training jobs using Arena

In multi-node GPU training, network communication latency often slows overall performance. To shorten model training time, submit PyTorch distributed jobs using Arena and configure eRDMA network acceleration. This delivers low-latency, high-throughput inter-node communication to improve training efficiency and cluster utilization.

All

Submit eRDMA-accelerated PyTorch distributed training jobs using Arena

September 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Support for Kubernetes 1.34

Support for Kubernetes 1.34. You can create a 1.34 cluster directly when creating a cluster or upgrade lower-version clusters to version 1.34.

All

Kubernetes 1.34

Support for hybrid cloud node pools

To include on-premises server resources in ACK cluster management, create hybrid cloud node pools using ACK managed clusters Pro Edition. Add existing hybrid cloud nodes to the cluster to enable elastic scheduling and cost optimization across cloud and on-premises resources while maintaining unified orchestration and leveraging existing IT assets.

All

Create and manage hybrid cloud node pools

Support for configuring DNS resolution for hybrid cloud node pools

If hybrid cloud node pools use cloud CoreDNS for domain name resolution, frequent access may increase leased line load and cause resolution failures due to leased line instability. Configure NodeLocal DNSCache to reduce these issues.

All

Configure NodeLocal DNSCache for hybrid cloud node pools

Support for Terway Hybrid CNI plugin

Hybrid cloud node pools connected to on-premises IDCs have complex network topologies and cross-domain routing requirements beyond the capabilities of standard container network plugins. The Terway Hybrid CNI plugin is designed specifically for hybrid cloud node pools and ensures network connectivity between pods in both IDC and cloud environments.

All

Use the Terway Hybrid CNI plugin

ossfs 2.0 supports RRSA authentication

For applications requiring persistent storage or sharing data across multiple pods, mount OSS buckets as ossfs 2.0 dynamic PVs. We recommend using RRSA authentication, which offers higher security, automatic rotation of temporary credentials, and pod-level permission isolation. This is suitable for production and multi-tenant environments with high security requirements.

All

Use ossfs 2.0 dynamic volumes

Distributed Cloud Container Platform for Kubernetes

Support for accessing cloud GPU compute power

ACK One registered clusters support unified scheduling and operations management of heterogeneous computing resources. This significantly improves resource utilization in heterogeneous computing clusters.

All

Access cloud GPU compute power

Support for migrating single-cluster applications to fleets and distributing them across multiple clusters

To solve problems like repetitive operations, errors, and synchronization difficulties in multi-cluster application deployments, use the AMC command-line tool to quickly deploy applications across multiple clusters. This enables unified management and automatic synchronization of updates.

All

Migrate single-cluster applications to fleets and distribute them across multiple clusters

August 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Support for intelligent inference routing with KV cache awareness

KV cache-aware load balancing is designed for generative AI inference. It dynamically routes requests to optimal compute nodes to significantly improve large language model (LLM) service efficiency.

All

Use prefix caching-aware routing in precise mode

Support for custom CNI plugins

The default Terway and Flannel CNI plugins in ACK meet most container networking needs. However, if you need specific features from other CNI plugins, ACK supports installing custom CNI plugins using the Bring Your Own Container Network Interface (BYOCNI) model.

All

Use custom CNI plugins in ACK clusters

Intelligent managed mode clusters support managed policy governance components

To meet cluster compliance requirements and improve security, enable security policy management. Security policy rules include Infra, Compliance, PSP, and K8s-general.

All

Enable security policy management

Knative supports ACS compute power

Knative Service supports configuring Container Compute Service (ACS) compute power. Its diverse compute types and quality levels meet different workload requirements and optimize costs.

All

Use ACS resources

Gateway with Inference Extension supports more flexible configurations

  • Support custom inference extension configuration: Adjust routing policies using annotations or modify or override extension deployment configurations using ConfigMaps.

  • Support custom Gateway configuration: Adjust actual Gateway parameters such as service type, deployment replica count, and resources by modifying EnvoyProxy resource configurations.

All regions

Support for securely deploying vLLM inference services in ACK heterogeneous confidential computing clusters

Large language model (LLM) inference involves sensitive data and core model assets. Running in untrusted environments risks data and model leakage. ACK's confidential AI solution (ACK-CAI) integrates hardware confidential computing technologies such as Intel TDX and GPU TEE to provide end-to-end security for model inference.

All

Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters

Cloud-native AI Suite

Launch AI inference suite

As large language models (LLMs) become widely adopted, efficiently, stably, and massively deploying and operating them in production has become a core challenge for enterprises. The cloud-native AI inference suite (AI Serving Stack) is an end-to-end solution designed for cloud-native AI inference on Alibaba Cloud Container Service. It addresses the full lifecycle of LLM inference, offering integrated capabilities for deployment management, intelligent routing, elastic scaling, and deep observability. Whether you are just getting started or already operate large-scale AI businesses, the cloud-native AI inference suite handles complex cloud-native AI inference scenarios with ease.

All

AI inference suite

July 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Support for hardened-only mode access to ECS instance metadata

Access ECS instance metadata (such as instance ID, VPC information, and NIC information) through the metadata service inside ECS instances. In ACK clusters, node instance metadata access defaults to supporting both standard and hardened modes. You can switch to hardened-only mode (IMDSv2) to further enhance metadata service security.

All

Access ECS instance metadata in hardened-only mode

Support for subscribing to overseas source images

To regularly synchronize images from overseas source image repositories such as Docker Hub, GCR, and Quay to Enterprise instances, use the artifact subscription capability of Enterprise instances.

All

Get overseas source images using artifact subscription

Support for mounting NAS using EFC clients via CNFS

EFC provides distributed caching and other capabilities to improve File Storage NAS access performance. It supports high concurrency and parallel access to large-scale datasets, making it suitable for data-intensive and high-concurrency containerized scenarios such as big data analytics, AI training, and inference. Compared to mounting NAS using the default NFS protocol, using EFC accelerates file access and improves read and write performance.

All

Mount NAS using EFC clients via CNFS

Distributed Cloud Container Platform for Kubernetes

GitOps capabilities with GUI experience

You can use the console to manage full GitOps capabilities, such as enabling or disabling features, configuring public network access and ACLs, using the ApplicationSet UI, configuring Argo CD ConfigMap, restarting components, and monitoring and log observability.

All

Quick start with GitOps

Multi-cluster GitOps supports Argo CD ConfigMap configuration

ACK One supports managing GitOps-related features and permissions by configuring Argo CD's ConfigMap.

All

Configure Argo CD ConfigMap

Support for enabling inventory-aware elastic scheduling for multi-cluster fleets

ACK One multi-cluster fleets address multi-region resource allocation challenges in multi-region application service deployments. They implement an inventory-aware intelligent scheduler. When combined with instant elasticity, this scheduler routes application services to clusters with available inventory when existing resources in managed clusters are insufficient. Instant elasticity then scales out the required nodes in those clusters to handle the application services, improving scheduling success and reducing resource costs.

All

Inventory-aware cross-region multi-cluster elastic scheduling

Container Service for Kubernetes Edge Edition

Support for private network connection configuration using leased lines

ACK Edge clusters support network access via leased lines. This enables secure and efficient access to cloud services such as ACK and ACR from ACK Edge cluster edge nodes, addressing network conflicts and the lack of fixed IP addresses.

All

Configure private network connection using leased lines

June 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Use AI Profiling in the console

AI Profiling is a non-intrusive performance analysis tool based on eBPF and dynamic process injection, natively designed for Kubernetes container scenarios. It supports online detection of container processes running GPU tasks, covering multiple data collection capabilities. You can dynamically start and stop performance data collection on running GPU tasks. For online services, a profiling tool that can be dynamically attached and detached allows real-time, detailed analysis without modifying business code.

All

AI Profiling

GPU node self-recovery

Node self-recovery now supports automatic recovery for instance failures caused by GPU software and hardware anomalies.

ACK provides Kubernetes-side node instance failure self-recovery for GPU software and hardware anomalies on underlying EGS and Lingjun nodes. This includes full automation for fault discovery, notification and alerting, automatic isolation, node draining, and automatic repair. It also supports user authorization before performing repairs, further enhancing automated fault operations and reducing cluster O&M costs.

All

Enable node self-recovery

CPFS for Lingjun static persistent volumes

CPFS for Lingjun delivers ultra-high throughput and IOPS performance and supports end-to-end RDMA network acceleration. It is suitable for intelligent computing scenarios such as AIGC and autonomous driving. You can create CPFS for Lingjun static persistent volumes in your cluster and use them in workloads.

All

Use CPFS for Lingjun static persistent volumes

ACK VPD CNI component

The ACK VPD CNI provides container network management for Lingjun nodes in ACK managed clusters Pro Edition. As a CNI plugin for Lingjun nodes, ACK VPD CNI allocates and manages container network resources for Lingjun nodes using Lingjun connections.

All

ACK VPD CNI

ack-kms-agent-webhook-injector component

The ack-kms-agent-webhook-injector injects the KMS Agent as a sidecar container into pods. Business applications can use local HTTP interfaces to retrieve credentials from KMS instances via the KMS Agent and cache them in memory. This avoids hard coding sensitive information and improves data security.

All

Import Alibaba Cloud KMS service credentials for applications

Gateway with Inference Extension component capability expansion

Gateway with Inference Extension supports multiple generative AI inference frameworks such as vLLM and SGLang. It enhances inference services deployed on different frameworks with features including phased release strategies, inference load balancing, model-name-based routing, rate limiting, and circuit breaking.

All

Gateway with Inference Extension, Traffic Management, and Inference Service Management

CAA confidential container solution leveraging confidential virtual machines

In scenarios requiring confidential computing, such as financial risk control and healthcare, deploy confidential computing workloads in ACK clusters using the Cloud API Adaptor (CAA) solution. This protects sensitive data from external attacks or potential cloud provider threats using Intel® TDX technology to meet industry compliance requirements.

All

Confidential AI container solution using confidential VMs and CAA

Cloud-native AI Suite

Schedule Dify workflows using XXL-JOB

Dify workflows often require scheduling for automated tasks in many scenarios, such as risk monitoring, data analytics, content generation, and data synchronization. However, Dify does not natively support scheduling. To solve this, this practice shows how to integrate the XXL-JOB distributed task scheduler to schedule workflow applications and monitor their status, ensuring stable workflow operation.

All

Schedule Dify workflow applications using XXL-JOB

May 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Support for Kubernetes 1.33

New support for Kubernetes 1.33. You can create a 1.33 cluster directly when creating a cluster or upgrade lower-version clusters to version 1.33.

All

Kubernetes 1.33

Default installation of ack-ram-authenticator component

Starting with Kubernetes 1.33, newly created ACK managed clusters automatically install the latest version of the managed ack-ram-authenticator component without consuming additional cluster node resources.

All

[Service notice] Default installation of ack-ram-authenticator component for ACK managed clusters starting with version 1.33

containerd 2.1.1 released

containerd 2.1.1 supports Node Resource Interface (NRI), Container Device Interface (CDI), and Sandbox API.

All

containerd runtime release notes

Support for ossfs 2.0

ossfs 2.0 is a client based on Filesystem in Userspace (FUSE). It mounts Alibaba Cloud OSS as a local file system so business containers can access OSS data using POSIX operations like local files. Compared to ossfs 1.0, ossfs 2.0 improves performance for sequential read/write and high-concurrency small-file reads, making it suitable for scenarios with high storage access performance requirements such as AI training, inference, big data processing, and autonomous driving.

All regions

ossfs 2.0

Distributed Cloud Container Platform for Kubernetes

Use ApplicationSet to coordinate multi-environment deployments and application dependencies

New best practice showing how to build an automated deployment system for multi-application dependency management between development and staging environments. This combines Argo CD's Progressive Syncs feature with ApplicationSet's multi-environment resource orchestration capability.

All

Use ApplicationSet to coordinate multi-environment deployments and application dependencies

April 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Create and manage Lingjun node pools

Support creating and managing Lingjun node pools in ACK managed clusters Pro Edition.

All

Lingjun node pools

Configure node pools using specified instance attributes

Configure node pool instance types using specified instance attributes such as vCPU and memory. The node pool automatically selects matching instance types for scaling, improving scaling success rates.

All

Configure node pools using specified instance attributes

Real-time AI Profiling

In Kubernetes container scenarios, AI Profiling is a non-intrusive performance analysis tool based on eBPF and dynamic process injection. It supports online detection of container processes running GPU tasks. For online services, a profiling tool that can be dynamically attached and detached allows real-time, detailed analysis without modifying business code.

All

Use AI Profiling from the command line

Enable preemption

When cluster resources are tight, high-priority tasks may fail to run due to insufficient resources. After enabling preemption, the ACK Scheduler can simulate resource usage, evict low-priority pod tasks, and free up compute resources to prioritize rapid startup of high-priority tasks.

All

Enable preemption
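
Priority-based preemption as described above is driven by PriorityClass objects that pods reference. A minimal sketch; the class name, value, and image are examples.

```yaml
# A high PriorityClass whose pods may preempt lower-priority pods
# when resources are scarce. Name and value are illustrative.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
preemptionPolicy: PreemptLowerPriority
description: "For latency-sensitive jobs that must start promptly."
---
apiVersion: v1
kind: Pod
metadata:
  name: important-job
spec:
  priorityClassName: high-priority   # opts the pod into the class above
  containers:
    - name: main
      image: registry.example.com/job:latest  # hypothetical image
```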

Access services using Gateway with Inference Extension

Gateway with Inference Extension is built on the Envoy Gateway project and supports full Gateway API capabilities and open-source Envoy Gateway extension resources.

All

Access services using Gateway with Inference Extension
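
Because the component supports the full Gateway API, access follows the standard Gateway plus HTTPRoute pattern. A sketch using core Gateway API v1 resources; the gatewayClassName, route match, and backend Service are placeholders, not the component's actual defaults.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
spec:
  gatewayClassName: gateway-with-inference-extension  # placeholder class name
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: inference-route
spec:
  parentRefs:
    - name: inference-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1              # e.g. OpenAI-style API prefix
      backendRefs:
        - name: vllm-service       # placeholder backend Service
          port: 8000
```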

Generative AI service enhancements

Use the Gateway with Inference Extension component to implement intelligent routing and efficient traffic management, phased releases, request circuit breaking, and traffic mirroring for generative AI inference services.

All

Generative AI service enhancements

PVC-to-PVC volume backup and restore

Back up and restore cloud disk data within ACK clusters in the cloud, across ACK clusters in the same region, and across ACK clusters in different regions. After completing a backup in the source cluster, use the backup center to restore new persistent volume claims and corresponding volumes in the current or another cluster. No changes to workload YAML configurations are needed to mount and use them.

All

Backup center

Release alibabacloud-privateca-issuer

Released the Alibaba Cloud Private CA Issuer to support creating and managing Alibaba Cloud PCA certificates in clusters using cert-manager. It is now available in the ACK App Market.

All

None

Deploy workloads and enable load balancing in ACK managed clusters (intelligent managed mode)

Learn how to deploy a workload in an ACK managed cluster (intelligent managed mode) and enable public network access using ALB Ingress. After completion, access the application using the configured domain name to efficiently manage external traffic and enable load balancing.

All

Deploy workloads and enable load balancing

Datapath V2 best practice

Learn how to optimize cluster network configurations after enabling Datapath V2 in clusters using the Terway CNI plugin. Examples include Conntrack parameter configuration and Identity resource management to improve cluster performance and stability.

All

Datapath V2 best practice

Dify component upgrade guide

New best practice showing how to upgrade ack-dify from older versions to v1.0.0 or later. Steps include backing up data, installing the plugin migration tool in the plugin system, and enabling the new plugin ecosystem.

All

Upgrade Dify components in ACK clusters

Distributed Cloud Container Platform for Kubernetes

Use PrivateLink to resolve data center CIDR block conflicts

After connecting on-premises Kubernetes clusters to ACK One registered clusters via leased lines, conflicts may occur when using serverless compute resources because other services in the internal network use the same CIDR block. Use PrivateLink to resolve data center CIDR block conflicts.

All

Use PrivateLink to resolve data center CIDR block conflicts

Cross-region scheduling of ACS pods

ACK One registered clusters support seamlessly integrating cross-region serverless compute resources into Kubernetes clusters. This enables dynamic cross-region GPU resource scheduling and unified management.

All

Cross-region scheduling of ACS pods

Log collection

Configure log collection using SLS CRDs or environment variables to automatically collect container logs using Alibaba Cloud Simple Log Service (SLS).

All
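
The environment-variable approach can be sketched as follows. This assumes the SLS logtail components are already installed in the cluster and that logtail discovers `aliyun_logs_*` variables on containers; the workload name, image, and logstore name are hypothetical.

```yaml
# Sketch: logtail collects the container's stdout into an SLS logstore
# derived from the aliyun_logs_* variable name (assumption).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest  # hypothetical image
          env:
            - name: aliyun_logs_app-stdout   # hypothetical logstore name
              value: stdout                  # collect container stdout
```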

Container Service for Kubernetes Edge Edition

Release version 1.32

Support for version 1.32, which optimizes CoreDNS, kube-proxy, and kubelet requests to kube-apiserver and reduces cloud-edge communication traffic.

All

ACK Edge Kubernetes 1.32 release notes

Network element configuration in leased line environments

Support connecting on-premises data center (IDC) server devices to containerized management via public networks or leased lines. When connecting via leased lines, complete infrastructure network element configuration before connecting.

All

Network element configuration in leased line environments

Cloud-native AI Suite

HistoryServer component support

The Ray native Dashboard is only available while the cluster runs. After cluster termination, users cannot access historical logs and monitoring data. Use the RayCluster HistoryServer to collect node logs in real time during cluster operation and persist them to OSS.

All

Install the HistoryServer component in ACK

KubeRay component support

Support deploying the KubeRay Operator component and integrating Alibaba Cloud SLS and Prometheus monitoring to enhance log management, system observability, and high availability.

All

Install the KubeRay component in ACK

March 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

ACK managed clusters Pro Edition supports intelligent managed mode

When creating an ACK managed cluster, enable intelligent managed mode to quickly create a Kubernetes cluster following best practices.

After cluster creation, an intelligent managed node pool is created by default. This node pool dynamically scales based on workloads. ACK handles OS version upgrades, software version upgrades, and security vulnerability fixes.

All

Enable tracing for cluster control plane and data plane components

After enabling tracing for the cluster API Server or kubelet, trace data automatically reports to Managed Service for OpenTelemetry, providing visualized trace details and real-time topology monitoring data.

All

High-risk KubeConfig SMS and email notification

Send SMS and email notifications to users when high-risk KubeConfig credentials exist under their accounts, including credentials that have already been deleted.

All

None

Use ACK Gateway with Inference Extension for intelligent routing and traffic management

Use the ACK Gateway with Inference Extension component to configure inference service extensions for intelligent routing and efficient traffic management.

All

Use Gateway with Inference Extension for intelligent routing and traffic management

Distributed Cloud Container Platform for Kubernetes

Unified multi-cluster fleet component management

ACK One fleets provide cluster O&M engineers with unified and automated component management. Define baselines containing multiple components and versions, deploy them to multiple clusters, and support component configuration, deployment batches, and rollback to improve system stability.

All regions

Multi-cluster component management

Support for dynamic distribution and descheduling

ACK One fleets can split workload replicas across child clusters based on available resources using PropagationPolicy. ACK One fleets enable descheduling by default, checking every two minutes. If a pod remains unschedulable for over 30 seconds, descheduling triggers for that replica.

All

Dynamic distribution and descheduling

Cloud-native AI Suite

You can set the priority for Slurm queues.

New best practice showing how to configure appropriate queue policies in Slurm environments to maximize task scheduling and achieve optimal performance when submitting jobs or changing job states.

All

Set Slurm queue priority in ACK clusters

February 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Modify control plane security group and time zone

If the security group or time zone selected during cluster creation no longer meets your requirements, you can modify the control plane security group and the cluster time zone on the cluster's basic information page.

All

View cluster information

Node pools support custom containerd configuration

Customize containerd parameters for the nodes in a node pool. For example, you can configure multiple registry mirrors for a specific image registry or skip certificate verification for a specific registry.

All

Customize containerd parameter configurations for node pools
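For illustration, both cases can be expressed with containerd's standard `hosts.toml` mechanism. The registry and mirror hostnames below are placeholders, and the file layout generated by your node pool configuration may differ.

```toml
# /etc/containerd/certs.d/registry.example.com/hosts.toml
# (registry and mirror hostnames are placeholders)
server = "https://registry.example.com"

# Mirrors are tried in order before falling back to the server.
[host."https://mirror-1.example.com"]
  capabilities = ["pull", "resolve"]

[host."https://mirror-2.example.com"]
  capabilities = ["pull", "resolve"]
  skip_verify = true  # skip TLS certificate verification for this mirror only
```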

New node pool elasticity strength indicator

Node pool scaling may fail because of insufficient instance inventory or unsupported ECS instance types in the specified zone. Use elasticity strength to assess node pool configuration availability and instance supply health, and obtain configuration recommendations.

All

View node pool elasticity strength

Enable batch task orchestration

Argo Workflows is a Kubernetes-native workflow engine that supports orchestrating parallel tasks using YAML or Python. It simplifies automation and management of containerized applications for CI/CD pipelines, data processing, and machine learning. You can install the Argo Workflows component to enable batch task orchestration. Then, use Alibaba Cloud Argo CLI or the console to create and manage workflow tasks.

All

Enable batch task orchestration
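A minimal Workflow manifest illustrates the YAML form of orchestration; the image and command are illustrative only.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-   # Argo appends a random suffix to the name
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.19     # illustrative image
        command: [echo, "hello from Argo Workflows"]
```

Submitting this manifest (for example, with the Argo CLI) runs the single `main` step to completion; real pipelines chain multiple templates with `steps` or `dag` sections.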

GPU fault detection

The ack-node-problem-detector component provided by ACK enhances cluster node anomaly event monitoring based on the open-source node-problem-detector project. It includes rich GPU-specific fault detection items to improve fault discovery in GPU scenarios. When faults are detected, it generates corresponding Kubernetes Events or Kubernetes Node Conditions based on fault type.

All

GPU anomaly detection and automatic isolation

Distributed Cloud Container Platform for Kubernetes

Multi-cluster Spark job scheduling and distribution based on actual remaining resources

This practice shows how to use ACK One fleets and the ack-koordinator component to schedule and distribute multi-cluster Spark jobs based on the actual remaining resources (not the requested resources) of each cluster. This maximizes the use of idle resources in multi-cluster environments and ensures online service stability through priority control and colocation of online and offline workloads.

All

Multi-cluster Spark job scheduling and distribution based on actual remaining resources

Container Service for Kubernetes Edge Edition

Support for new pod virtual switches

If ACK Edge clusters use the Terway Edge plugin in ENS edge scenarios, you can add new pod virtual switches when IP addresses in existing virtual switches are insufficient or when you need to expand the pod CIDR block. This increases available IP address resources for the cluster.

All

Add pod virtual switches

GPU resource monitoring

ACK Edge clusters can manage GPU nodes in data centers and at the edge, providing unified management of heterogeneous compute power across multiple regions and environments. You can integrate Alibaba Cloud Prometheus monitoring into ACK Edge clusters to provide data center and edge computing GPU nodes with the same observability capabilities as cloud-based resources.

All regions

Best practice for GPU resource monitoring in ACK Edge clusters

Cloud-native AI Suite

Deploy DeepSeek distillation model inference services using ACK

Using the DeepSeek-R1-Distill-Qwen-7B model as an example, you can learn how to deploy production-ready DeepSeek distillation model inference services using KServe in Alibaba Cloud Container Service for Kubernetes (ACK).

All

Deploy DeepSeek distillation model inference services using ACK

Multi-node distributed deployment of DeepSeek full-version inference services using ACK

This practice shows a hands-on solution for distributed DeepSeek-R1-671B large model inference using ACK. It uses hybrid parallelism strategies combined with Alibaba Cloud Arena tools to achieve efficient distributed deployment across two nodes. It also covers integrating the deployed DeepSeek-R1 into the Dify platform to quickly build an enterprise-grade intelligent Q&A system that supports long-context understanding.

All

Multi-node distributed deployment of DeepSeek full-version inference services using ACK

January 2025

Product

Feature name

Description

Available regions

Related documentation

Container Service for Kubernetes

Node pools support on-demand image acceleration

ACK uses DADI (Data Accelerator for Disaggregated Infrastructure) image acceleration technology to support on-demand loading of container images. This eliminates the need for full image downloads and enables online decompression, significantly reducing application startup time.

All

Accelerate container startup using on-demand container image loading

New support for the Alibaba Cloud Linux 3 Container-Optimized Edition operating system

Alibaba Cloud Linux 3 Container-Optimized Edition (Alibaba Cloud Linux 3.2104 LTS 64-bit Container-Optimized Edition) is an image version optimized for container scenarios based on the default Alibaba Cloud Linux standard image. To address higher container scenario requirements—such as denser business deployments, faster startup speeds, and stronger security isolation—Alibaba Cloud has developed this cloud-native operating system based on extensive customer experience with Container Service for Kubernetes.

All

None

Support for Kubernetes 1.32

ACK now supports Kubernetes 1.32. You can create a cluster that runs version 1.32 or upgrade existing clusters from earlier versions to 1.32.

All

Kubernetes 1.32

Improve resource utilization using ElasticQuotaTree and task queues

To allow different teams and tasks to share cluster compute resources while ensuring fair allocation and isolation, you can use ack-kube-queue, ElasticQuotaTree, and ack-scheduler to allocate resources flexibly and fairly.

All

None
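As a rough sketch, an ElasticQuotaTree divides cluster resources among teams. The API version shown follows the scheduler-plugins convention, and the team names and quota figures are assumptions; verify both against your cluster.

```yaml
apiVersion: scheduling.sigs.k8s.io/v1beta1   # assumed; verify against your cluster
kind: ElasticQuotaTree
metadata:
  name: elasticquotatree
  namespace: kube-system
spec:
  root:
    name: root
    max:                  # hard ceiling for the whole tree
      cpu: 40
      memory: 160Gi
    min:                  # guaranteed minimum for the whole tree
      cpu: 40
      memory: 160Gi
    children:
      - name: team-a      # hypothetical team
        namespaces:
          - team-a        # namespaces bound to this quota node
        max:
          cpu: 30
          memory: 120Gi
        min:
          cpu: 10         # guaranteed share; idle quota can be borrowed by team-b
          memory: 40Gi
      - name: team-b
        namespaces:
          - team-b
        max:
          cpu: 30
          memory: 120Gi
        min:
          cpu: 10
          memory: 40Gi
```

Each team is guaranteed its `min` and may elastically borrow up to its `max` when the other team's quota sits idle, which is what lets shared clusters stay both fair and highly utilized.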

New best practice for fine-grained resource control using resource groups

To manage Container Service for Kubernetes resources more efficiently, you can use resource groups to organize resources. Resource groups allow you to group resources by department, project, environment, or other dimensions. Combined with Resource Access Management (RAM), they enable resource isolation and fine-grained permission management within a single Alibaba Cloud account.

All

Use resource groups for fine-grained resource control

Distributed Cloud Container Platform for Kubernetes

ACK One registered clusters access ACS compute power

You can use container compute power provided by ACS in ACK One registered clusters.

All

Schedule pods to ACS using virtual nodes

Support for native Service domain name cross-cluster service access

ACK One multi-cluster Service supports cross-cluster service access using native Service domain names via MultiClusterService. You do not need to modify business code, DNSConfig in business pods, or CoreDNS configurations. Instead, you can simply use native Services for cross-cluster traffic routing.

All

Use native Service domain names for cross-cluster service access
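The shape of such a configuration can be sketched with the Karmada-style MultiClusterService API. The API group and cluster names below are assumptions; consult the linked documentation for the exact resource that your fleet uses.

```yaml
apiVersion: networking.karmada.io/v1alpha1   # assumed API group
kind: MultiClusterService
metadata:
  name: web                  # must match the native Service name
  namespace: default
spec:
  types:
    - CrossCluster           # expose the Service across clusters
  providerClusters:
    - name: cluster-beijing  # hypothetical cluster that runs the backend pods
  consumerClusters:
    - name: cluster-hangzhou # hypothetical cluster whose pods call web.default.svc
```

Pods in the consumer cluster then reach the backend through the native domain name (`web.default.svc`) without changes to business code, pod DNSConfig, or CoreDNS.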

Support for accessing multi-cluster resources using the Go SDK

If you want to integrate ACK One fleets into your platform to access resources from child clusters, you can use the Go SDK.

All

Access multi-cluster resources using Go SDK

Container Service for Kubernetes Edge Edition

Support for cloud node scaling

When on-premises node resources are insufficient, automatic node scaling adds cloud nodes to ACK Edge clusters to supplement scheduling capacity.

All

Cloud ECS node elasticity

Support for hybrid cloud LLM elastic inference service deployment

You can install the ack-kserve component and combine it with the cloud elasticity features of ACK Edge clusters to deploy hybrid cloud LLM elastic inference services. This helps you flexibly schedule cloud and on-premises resources and reduce LLM inference service operating costs.

All

Support for shared GPU scheduling

Shared GPU scheduling lets you schedule multiple pods onto the same GPU card to share GPU compute resources, improving GPU utilization and saving costs.

  • Cloud node pools in ACK Edge clusters fully support shared GPU scheduling, GPU memory isolation, and computing power isolation.

  • Edge node pools in ACK Edge clusters support only shared GPU scheduling, without GPU memory isolation or computing power isolation.

All

Use shared GPU scheduling
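For example, a pod can request a slice of GPU memory instead of a whole card. This sketch assumes the `aliyun.com/gpu-mem` extended resource name used by ACK shared GPU scheduling (unit: GiB); the image name is a placeholder.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo
spec:
  containers:
    - name: app
      image: registry.example.com/cuda-app:latest  # placeholder image
      resources:
        limits:
          aliyun.com/gpu-mem: 4  # request 4 GiB of GPU memory; the rest of the card stays schedulable
```

Because the pod requests only part of the card's memory, the scheduler can place other `aliyun.com/gpu-mem` pods on the same GPU, raising utilization as described above.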

Support for unified management of multi-region ECS resources

This new best practice shows how to use ACK Edge clusters to unify management of compute resources distributed across multiple regions. This enables full lifecycle management of cloud-native applications and efficient resource scheduling.

All

Unify management of multi-region ECS resources

More information

For historical ACK release notes, see Historical release notes (before 2025).