Container Service for Kubernetes (ACK) strictly abides by the terms of the Certified Kubernetes Conformance Program. This topic describes the updates in Kubernetes 1.28, including update notes, major changes, new features, deprecated features and APIs, and feature gates.
Version updates
The following key components are updated and optimized by ACK to support Kubernetes 1.28.
Key component | Version |
Kubernetes | 1.28.15-aliyun.1, 1.28.9-aliyun.1, and 1.28.3-aliyun.1 |
etcd | v3.5.9 |
CoreDNS | v1.9.3.10-7dfca203-aliyun |
CRI | containerd 1.6.20 |
CSI | 1.24.10-7ae4421-aliyun |
CNI | Flannel v0.15.1.22-20a397e6-aliyun; Terway v1.5.0 and later, and TerwayControlplane v1.5.0 and later |
NVIDIA Container Runtime | v3.13.0 |
Ingress Controller | v1.8.0-aliyun.1 |
Update notes
Component | Description |
CephFS and Ceph RBD volume plug-ins | If your cluster uses the CephFS and Ceph RBD volume plug-ins, you need to check whether the plug-ins use the off-tree driver instead of the plug-in driver provided by Kubernetes. You also need to check the compatibility, stability, and performance of the off-tree driver. |
Terms
We recommend that you learn the following terms before you read this topic.
Major changes
The scheduling logic of the scheduler is optimized in Kubernetes 1.28 to reduce invalid retries and improve performance.
If your cluster uses a custom scheduler plug-in, we recommend that you optimize and update the plug-in to improve the performance. For more information, see Changes in the scheduling framework.
The Kubernetes community provides the CSI migration solution to replace existing in-tree storage plug-in drivers with off-tree drivers that use standard CSI APIs. The CSI migration feature reached GA in Kubernetes 1.25. The storage.k8s.io/v1beta1 API and the in-tree AWS Elastic Block Store (EBS) plug-in are removed in Kubernetes 1.27. The CephFS volume plug-in code is removed and the kubernetes.io/rbd plug-in is deprecated in Kubernetes 1.28. Use the CephFS CSI driver instead. In addition, support for migrating Ceph RBD volumes to a plug-in that uses the off-tree CSI driver is deprecated in Kubernetes 1.28.
The vulnerability CVE-2024-10220 was fixed in version 1.28.15-aliyun.1.
The following Common Vulnerabilities and Exposures (CVE) were fixed in version 1.28.9-aliyun.1:
CVE-2023-45288
CVE-2024-3177
CVE-2024-24786
Features
Kubernetes 1.27
The issue that pods cannot correctly transition to a terminal phase is fixed. The Failed phase is assigned to pods that are deleted while they are pending. The Succeeded or Failed phase (depending on the exit states of the pods) is assigned to pods that are deleted while they are running. This fixes the issue that pods may become stuck in the Pending state while they are being deleted if the Jobs that create the pods use a pod failure policy.
If the pods are configured with RestartPolicy=Always, the Succeeded phase may be assigned to the pods after the pods are deleted. Therefore, you may need to modify your controllers. For more information, see Give terminal phase correctly to all pods that will not be restarted.
The ReadWriteOncePod feature for persistent volumes (PVs) has reached Beta. This feature allows you to limit volume access to a single pod. For more information, see Single Pod Access Mode for PersistentVolumes Graduates to Beta.
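The ReadWriteOncePod access mode described above is requested through the accessModes field of a persistent volume claim. The following is a minimal sketch that builds such a claim as a Python dictionary and prints it as JSON, which kubectl accepts as a manifest. The claim name, size, and StorageClass name (example-csi-disk) are hypothetical placeholders, and the sketch assumes that the volume is provisioned by a CSI driver.

```python
import json


def build_rwop_pvc(name: str, size: str, storage_class: str) -> dict:
    """Return a PersistentVolumeClaim manifest that requests single-pod access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOncePod cannot be combined with other access modes.
            "accessModes": ["ReadWriteOncePod"],
            "storageClassName": storage_class,  # hypothetical StorageClass name
            "resources": {"requests": {"storage": size}},
        },
    }


pvc = build_rwop_pvc("single-writer-data", "20Gi", "example-csi-disk")
print(json.dumps(pvc, indent=2))
```

If a second pod references the same claim, that pod is expected to stay unschedulable until the first pod releases the volume.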
You can use pod topology spread constraints to control how pods in a cluster are spread across zones. The following features have reached Beta: minDomains (specifies the minimum number of domains in which pods are spread), nodeTaintsPolicy (specifies how node taints are treated during pod spreading), nodeAffinityPolicy (specifies how node affinity is treated during pod spreading), and whenUnsatisfiable (specifies how a pod is treated if the pod does not comply with the spread constraint). For more information, see More fine-grained pod topology spread policies reached beta.
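The spread constraint fields listed above are set per constraint in the topologySpreadConstraints field of a pod. The following sketch shows one way to combine them. The pod name, label value, and image are hypothetical, and the sketch relies on the upstream rule that minDomains takes effect only when whenUnsatisfiable is set to DoNotSchedule.

```python
import json


def build_spread_constraint(app_label: str, min_domains: int) -> dict:
    """Return one zone-level topology spread constraint for pods labeled app=<app_label>."""
    return {
        "maxSkew": 1,
        "topologyKey": "topology.kubernetes.io/zone",
        # minDomains is only honored together with DoNotSchedule.
        "whenUnsatisfiable": "DoNotSchedule",
        "minDomains": min_domains,
        # Honor: only nodes that match the pod's node affinity are counted.
        "nodeAffinityPolicy": "Honor",
        # Honor: tainted nodes are counted only if the pod tolerates the taint.
        "nodeTaintsPolicy": "Honor",
        "labelSelector": {"matchLabels": {"app": app_label}},
    }


constraint = build_spread_constraint("web", min_domains=3)
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-0", "labels": {"app": "web"}},
    "spec": {
        "topologySpreadConstraints": [constraint],
        "containers": [{"name": "web", "image": "registry.example.com/web:1.0"}],
    },
}
print(json.dumps(pod, indent=2))
```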
The server side field validation feature for validating resources sent to the API server has reached GA. kubectl skips client-side validation, and the API server validates requests in strict mode. An error is returned if a request fails the validation. For more information, see Server Side Field Validation and OpenAPI V3 move to GA.
OpenAPI V3 is a new OpenAPI standard. OpenAPI V3 is introduced in Kubernetes 1.23 and has reached GA in Kubernetes 1.27. For more information, see Server Side Field Validation and OpenAPI V3 move to GA.
The Horizontal Pod Autoscaler (HPA) API allows you to configure container resource metrics to enable the HPA to track the resource usage of individual containers and scale resources accordingly. This feature has reached Beta in Kubernetes 1.27. Compared with the resource metrics that indicate the average resource usage of pods, the container resource metrics indicate the resource usage of individual containers. This helps resolve the issue that the average resource usage of a pod cannot trigger scale-out activities because the resource usage of sidecar containers in the pod is low but the resource usage of application containers is high.
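The sidecar dilution effect described above can be worked through with a small calculation, followed by a sketch of an HPA manifest that uses the ContainerResource metric type of the autoscaling/v2 API. The container names, CPU figures, Deployment name, and the 60% target are hypothetical values chosen for illustration.

```python
import json


def utilization(usage_millicores: int, request_millicores: int) -> float:
    """Return CPU utilization as a percentage of the requested CPU."""
    return 100 * usage_millicores / request_millicores


# Hypothetical pod: a busy application container and a mostly idle sidecar.
containers = {"app": (450, 500), "log-agent": (50, 500)}  # (usage, request) in millicores

pod_level = utilization(
    sum(usage for usage, _ in containers.values()),
    sum(request for _, request in containers.values()),
)
app_level = utilization(*containers["app"])
print(f"pod-level: {pod_level}%, app container: {app_level}%")

hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web-hpa"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"},
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [
            {
                # Track only the named container instead of the whole pod.
                "type": "ContainerResource",
                "containerResource": {
                    "name": "cpu",
                    "container": "app",
                    "target": {"type": "Utilization", "averageUtilization": 60},
                },
            }
        ],
    },
}
print(json.dumps(hpa, indent=2))
```

With these figures, a pod-level target of 60% would not trigger a scale-out because the pod averages 50%, while the container-level metric sees the application container at 90%.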
Multiple StatefulSet features have reached Beta, including the feature for assigning sequence numbers to pods from a number other than zero, the feature for deleting the specified persistent volume claims (PVCs), and the feature for automatically deleting PVCs that are created during scale-in activities.
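The StatefulSet features above map to the spec.ordinals.start and spec.persistentVolumeClaimRetentionPolicy fields. The following sketch builds a StatefulSet that uses both and derives the pod names that result from a non-zero start ordinal. The workload name, image, and storage size are hypothetical.

```python
import json


def build_statefulset(name: str, replicas: int, start_ordinal: int) -> dict:
    """Return a StatefulSet that starts numbering pods at start_ordinal."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "ordinals": {"start": start_ordinal},
            # Delete PVCs on scale-in, keep them when the StatefulSet is deleted.
            "persistentVolumeClaimRetentionPolicy": {
                "whenScaled": "Delete",
                "whenDeleted": "Retain",
            },
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": "registry.example.com/web:1.0"}]},
            },
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": "10Gi"}},
                    },
                }
            ],
        },
    }


def expected_pod_names(statefulset: dict) -> list:
    """Return the pod names the StatefulSet controller is expected to create."""
    spec = statefulset["spec"]
    start = spec.get("ordinals", {}).get("start", 0)
    name = statefulset["metadata"]["name"]
    return [f"{name}-{i}" for i in range(start, start + spec["replicas"])]


sts = build_statefulset("web", replicas=3, start_ordinal=5)
print(expected_pod_names(sts))
print(json.dumps(sts, indent=2))
```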
A new feature is added to resize the CPU and memory resources specified in the resources field of a pod without restarting the pod and containers. A node allocates resources to a pod based on the requests of the pod and limits the resource usage of the pod based on the limits. Some fields are added to pods to support in-place resizing of pod resources. For more information, see Resize CPU and Memory Resources assigned to Containers. This feature has reached Alpha in Kubernetes 1.27 and is disabled by default.
You can set the serializeImagePulls field of the kubelet to false to enable parallel image pulls instead of using the default serial image pulls mode. The maxParallelImagePulls field is added in Kubernetes 1.27 to limit the number of the images that can be pulled in parallel. This helps prevent image pulls from consuming excessive bandwidth or disk I/O.
In addition to the volume snapshot API, a crash consistent volume group snapshot API is introduced in Kubernetes 1.27 to allow you to create snapshots for multiple PVs at a point in time. For more information, see Introducing An API For Volume Group Snapshots.
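The kubelet image pull settings described above live in the kubelet configuration file. The following sketch generates a configuration fragment with parallel pulls enabled and a cap on concurrency. The cap of 5 is an arbitrary illustrative value, and how the fragment is delivered to nodes depends on how your node pools are managed.

```python
import json


def build_kubelet_image_pull_config(max_parallel: int) -> dict:
    """Return a KubeletConfiguration fragment that enables capped parallel image pulls."""
    if max_parallel < 1:
        raise ValueError("maxParallelImagePulls must be a positive integer")
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # Parallel pulls require serial pulling to be turned off.
        "serializeImagePulls": False,
        "maxParallelImagePulls": max_parallel,
    }


kubelet_config = build_kubelet_image_pull_config(5)
print(json.dumps(kubelet_config, indent=2))
```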
Kubernetes 1.28
The non-graceful node shutdown feature has reached GA. When a node is shut down due to an exception such as a power outage, the StatefulSet needs to create pods with the same name on another node to avoid business interruptions.
The NodeOutOfServiceVolumeDetach feature gate has reached GA. After this feature is enabled, when a node is shut down due to an exception, volume detach operations are immediately performed for the terminated pods on the node. This allows pods on the out-of-service node to quickly recover on other nodes.
The retroactive default StorageClass assignment feature has reached GA. Before this feature is introduced, if you create a PVC without the storageClassName when no default StorageClass exists, the PVC remains in the Pending state. After this feature is introduced, when a default StorageClass is created, the PVC without the storageClassName automatically uses the default StorageClass.
Two features are introduced to avoid Job failures.
By default, the Job controller creates replacement pods as soon as pods are marked for deletion (a deletionTimestamp is set). The JobPodReplacementPolicy feature gate (in the Alpha stage) allows replacement pods to be created only after the pods are assigned the Failed phase (status.phase: Failed). The policy prevents two pods from using the same index and node resources at the same time.
The JobBackoffLimitPerIndex feature gate (in the Alpha stage) allows you to set .spec.backoffLimitPerIndex to limit the maximum number of retries for pod failures per index. Before this feature is introduced, if the number of consecutive pod failures of an index reaches .spec.backoffLimit, the entire indexed Job fails.
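The two Job features above can be combined in one indexed Job. The following sketch builds such a manifest and checks two constraints that upstream documents for backoffLimitPerIndex: the Job must use the Indexed completion mode, and the pod template must use restartPolicy Never. The Job name, image, and numeric limits are hypothetical, and both fields take effect only if the corresponding Alpha feature gates are enabled.

```python
import json


def build_indexed_job(name: str, completions: int, parallelism: int) -> dict:
    """Return an indexed Job with per-index retries and delayed pod replacement."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "completions": completions,
            "parallelism": parallelism,
            "completionMode": "Indexed",
            "backoffLimitPerIndex": 1,        # retries allowed for each index
            "podReplacementPolicy": "Failed",  # replace pods only after they reach Failed
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{"name": "worker", "image": "registry.example.com/worker:1.0"}],
                }
            },
        },
    }


def validate_indexed_job(job: dict) -> None:
    """Raise ValueError if backoffLimitPerIndex is used with an unsupported setup."""
    spec = job["spec"]
    if "backoffLimitPerIndex" not in spec:
        return
    if spec.get("completionMode") != "Indexed":
        raise ValueError("backoffLimitPerIndex requires completionMode: Indexed")
    if spec["template"]["spec"].get("restartPolicy") != "Never":
        raise ValueError("backoffLimitPerIndex requires restartPolicy: Never")


job = build_indexed_job("batch-import", completions=10, parallelism=3)
validate_indexed_job(job)
print(json.dumps(job, indent=2))
```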
If the completions field of an indexed Job is set to a value greater than 100,000, the parallelism field of the Job is set to a value greater than 10,000, and large numbers of pods fail, pod terminal phase tracking may fail. To prevent this issue, warnings are displayed if you set the preceding fields to excessively large values when you create a Job.
The reason and fieldPath fields are added to CustomResourceDefinition (CRD) validation rules to return the reason and field path when CRD validation fails. For more information, see CRD Validation Expression Language.
Common Expression Language (CEL) expressions can be used in webhook matching requests. Up to 64 matching conditions are supported. For more information, see Matching requests: matchConditions.
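The webhook matching conditions mentioned above are declared in the matchConditions field of each webhook. The following sketch builds a validating webhook configuration with one CEL condition that skips requests from nodes, and enforces the limit of 64 conditions on the client side. The webhook name, Service reference, and rule scope are hypothetical.

```python
import json

MAX_MATCH_CONDITIONS = 64  # upper limit stated in the text above


def build_webhook_config(match_conditions: list) -> dict:
    """Return a ValidatingWebhookConfiguration that applies the given CEL match conditions."""
    if len(match_conditions) > MAX_MATCH_CONDITIONS:
        raise ValueError(f"at most {MAX_MATCH_CONDITIONS} match conditions are supported")
    return {
        "apiVersion": "admissionregistration.k8s.io/v1",
        "kind": "ValidatingWebhookConfiguration",
        "metadata": {"name": "example-policy-webhook"},
        "webhooks": [
            {
                "name": "policy.example.com",
                "admissionReviewVersions": ["v1"],
                "sideEffects": "None",
                "clientConfig": {
                    "service": {"namespace": "policy", "name": "policy-webhook", "path": "/validate"}
                },
                "rules": [
                    {
                        "apiGroups": [""],
                        "apiVersions": ["v1"],
                        "operations": ["CREATE", "UPDATE"],
                        "resources": ["pods"],
                    }
                ],
                # A request is sent to the webhook only if every expression evaluates to true.
                "matchConditions": match_conditions,
            }
        ],
    }


conditions = [
    {
        "name": "exclude-node-requests",
        "expression": '!("system:nodes" in request.userInfo.groups)',
    }
]
webhook_config = build_webhook_config(conditions)
print(json.dumps(webhook_config, indent=2))
```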
The SidecarContainers feature gate is introduced to allow you to specify the time when sidecar containers are launched. For example, you can launch log collection containers before other containers to improve the reliability of log collection. For more information, see Kubernetes v1.28: Introducing native sidecar containers. This feature has reached Alpha in Kubernetes 1.28 and is disabled by default.
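With the SidecarContainers feature gate enabled, a sidecar is declared as an init container whose restartPolicy is set to Always. The following sketch shows the log collection case mentioned above. The pod name, container names, and images are hypothetical.

```python
import json

pod_with_sidecar = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app-with-log-agent"},
    "spec": {
        "initContainers": [
            {
                "name": "log-agent",
                "image": "registry.example.com/log-agent:1.0",
                # restartPolicy: Always marks this init container as a sidecar:
                # it starts before the main containers and keeps running alongside them.
                "restartPolicy": "Always",
            }
        ],
        "containers": [{"name": "app", "image": "registry.example.com/app:1.0"}],
    },
}
print(json.dumps(pod_with_sidecar, indent=2))
```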
The .status.resizeStatus field of a PVC is replaced with the .status.allocatedResourceStatuses map field to store the states of resources that are being resized for the PVC. For more information, see PersistentVolumeClaimStatus.
Pod indexes (sequence numbers) are added as labels to pods created by indexed Jobs and StatefulSets.
The ValidatingAdmissionPolicy feature gate (in the Beta stage) provides a declarative alternative to validating admission webhooks for validating resource requests. The feature gate also allows you to use CEL expressions to write complex validation rules. The API server validates resource requests against the CEL expressions.
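A policy of this kind consists of a ValidatingAdmissionPolicy object that holds the CEL rules and a ValidatingAdmissionPolicyBinding object that puts the policy into effect. The following sketch builds both for a hypothetical rule that caps Deployment replicas at 5. The object names, the replica limit, and the namespace label are assumptions, and the v1beta1 API version matches the Beta stage in Kubernetes 1.28.

```python
import json

API_VERSION = "admissionregistration.k8s.io/v1beta1"  # Beta API in Kubernetes 1.28

policy = {
    "apiVersion": API_VERSION,
    "kind": "ValidatingAdmissionPolicy",
    "metadata": {"name": "replica-limit.example.com"},
    "spec": {
        "failurePolicy": "Fail",
        "matchConstraints": {
            "resourceRules": [
                {
                    "apiGroups": ["apps"],
                    "apiVersions": ["v1"],
                    "operations": ["CREATE", "UPDATE"],
                    "resources": ["deployments"],
                }
            ]
        },
        # Each expression is a CEL rule evaluated by the API server.
        "validations": [
            {
                "expression": "object.spec.replicas <= 5",
                "message": "Deployments may not exceed 5 replicas.",
            }
        ],
    },
}

binding = {
    "apiVersion": API_VERSION,
    "kind": "ValidatingAdmissionPolicyBinding",
    "metadata": {"name": "replica-limit-binding.example.com"},
    "spec": {
        "policyName": policy["metadata"]["name"],
        "validationActions": ["Deny"],
        # Apply the policy only to namespaces that carry this hypothetical label.
        "matchResources": {
            "namespaceSelector": {"matchLabels": {"environment": "test"}}
        },
    },
}

for manifest in (policy, binding):
    print(json.dumps(manifest, indent=2))
```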
The --concurrent-cron-job-syncs flag is added to the Kubernetes controller manager to set the concurrency of the CronJob controller and the --concurrent-job-syncs flag is added to set the concurrency of the Job controller. For more information, see --concurrent-cron-job-syncs and --concurrent-job-syncs.
The API server is optimized:
The memory usage of getting a list (GetList) from the cache is reduced. For more information, see GetList test data.
The issue that the endpoint of a Kubernetes Service is not removed when only one replicated API server exists is fixed. This ensures that the endpoints of Kubernetes Services are removed during graceful shutdown.
The OpenAPI v2 controller is made lazy to aggregate information from CRDs and the OpenAPI v2 specifications are reduced. When no client sends requests to the OpenAPI v2, the CPU and memory usage of the API server is reduced. In addition, the efficiency of installing large numbers of CRDs is improved. However, this slows down the processing of first-time requests. We recommend that you update your client to a version that supports OpenAPI v3.
The Consistent Reads from Cache feature gate is introduced to allow you to use the watch cache to guarantee consistent reads for LIST requests.
A variety of metrics can be collected by calling the metrics API.
Deprecated features
Kubernetes 1.27
The in-tree AWS EBS storage plug-in is replaced with the off-tree CSI plug-in. For more information, see cloud-provider-aws.
The spec.externalID field of nodes is deprecated. Warnings are returned if clients send requests to update this field. For more information about how to return warnings to clients, see Helpful Warnings Ahead.
Secure Computing Mode (seccomp) has reached GA in Kubernetes 1.19. This mode allows you to limit the system calls that a pod or container can make to improve the security of workloads. The seccomp.security.alpha.kubernetes.io/pod and container.seccomp.security.alpha.kubernetes.io annotations in the Alpha stage are deprecated in Kubernetes 1.19 and removed in Kubernetes 1.27.
We recommend that you use the securityContext.seccompProfile field for pods or containers.
The following flags are removed from Kubernetes controller manager commands: --pod-eviction-timeout (specifies the grace period for pod eviction on a NotReady node) and --enable-taint-manager (evicts pods from nodes with specified taints). By default, the feature for evicting pods from nodes with specified taints is enabled.
The following flags are removed from kubelet commands: --container-runtime, --container-runtime-endpoint, and --image-service-endpoint. After dockershim was removed, remote remained the only value of --container-runtime. The flag was deprecated in Kubernetes 1.24 and is removed in Kubernetes 1.27. You can no longer specify the --container-runtime-endpoint and --image-service-endpoint flags in kubelet commands. To use these settings, modify the kubelet configuration file instead.
The SecurityContextDeny admission controller is deprecated and will be removed in later versions.
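For the seccomp change above, workloads that still carry the removed annotations need the equivalent securityContext.seccompProfile field. The following sketch maps legacy annotation values to the field form and applies the result to a hypothetical pod. The helper name and the pod are illustrative only, and the mapping covers the commonly documented annotation values.

```python
import json


def seccomp_profile_from_annotation(value: str) -> dict:
    """Translate a legacy seccomp annotation value into a seccompProfile field value."""
    if value in ("runtime/default", "docker/default"):
        return {"type": "RuntimeDefault"}
    if value == "unconfined":
        return {"type": "Unconfined"}
    if value.startswith("localhost/"):
        # The profile path is relative to the kubelet's seccomp profile directory.
        return {"type": "Localhost", "localhostProfile": value[len("localhost/"):]}
    raise ValueError(f"unrecognized seccomp annotation value: {value!r}")


pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "seccomp-demo"},
    "spec": {
        # Pod-level setting; a container-level securityContext can override it.
        "securityContext": {
            "seccompProfile": seccomp_profile_from_annotation("runtime/default")
        },
        "containers": [{"name": "app", "image": "registry.example.com/app:1.0"}],
    },
}
print(json.dumps(pod, indent=2))
```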
Kubernetes 1.28
The in-tree CephFS plug-in code is removed.
We recommend that you use the CephFS CSI driver.
Support for migrating Ceph RBD volumes to a plug-in that uses the off-tree CSI driver is deprecated and will be removed in later versions.
We recommend that you complete the migration before the removal of the in-tree plug-in code.
The RBD volume plug-in (kubernetes.io/rbd) is deprecated and will be removed in later versions.
We recommend that you use the Ceph RBD CSI driver.
KMSv1 is deprecated. If you want to continue to use KMSv1, set --feature-gates=KMSv1=true. For more information, see Mark KMS v1beta1 as deprecated with no further fixes.
We recommend that you use KMSv2.
The --volume-host-cidr-denylist and --volume-host-allow-local-loopback flags in Kubernetes controller manager commands are deprecated.
The --azure-container-registry-config flag in kubelet commands is deprecated.
We recommend that you use the --image-credential-provider-config and --image-credential-provider-bin-dir flags.
You can no longer create Windows node pools.
You can create node pools that use other operating systems, such as Alibaba Cloud Linux 3 and ContainerOS 3.1. For more information, see Create a node pool.
Deprecated APIs
The CSIStorageCapacity API allows you to query the current available storage capacity to ensure that your pods are scheduled to a node with sufficient storage resources. The storage.k8s.io/v1beta1 version of the API is deprecated in Kubernetes 1.24 and removed in Kubernetes 1.27.
We recommend that you use the storage.k8s.io/v1 version. This version is available in Kubernetes 1.24 and later versions. For more information, see Storage Capacity Constraints for Pod Scheduling KEP.
Feature gates
This section lists only the major changes. For more information, see Feature Gates.
Kubernetes 1.27
The NodeLogQuery feature gate in the Alpha stage is added. After you set enableSystemLogHandler and enableSystemLogQuery to true for the kubelet, you can use kubectl to query node logs.
The StatefulSetStartOrdinal feature gate has reached Beta. This feature gate allows you to assign sequence numbers to pods created by StatefulSets from a number other than zero. By default, this feature gate is enabled.
The StatefulSetAutoDeletePVC feature gate has reached Beta. The new policy controls whether and when StatefulSets delete PVCs created from volumeClaimTemplate.
The IPv6DualStack feature gate has reached GA in Kubernetes 1.23 and is enabled by default. The component code is removed in Kubernetes 1.27. If you manually configured IPv4/IPv6 dual stack for your cluster, you must delete the configuration before you update your cluster.
The ServiceNodePortStaticSubrange feature gate in the Alpha stage is added to reduce conflicts in assigning ports to NodePort Services. This feature gate divides the port range for NodePort Services into two bands. Dynamic port assignment uses the high band. The low band with a lower risk of port conflicts can be used to statically assign ports to NodePort Services. For more information, see Avoid Collisions Assigning Ports to NodePort Services.
The InPlacePodVerticalScaling feature gate in the Alpha stage is added to allow you to resize the CPU and memory resources of a pod without restarting the pod and containers.
The following feature gates for expanding volumes have reached GA and are enabled by default: ExpandCSIVolumes (expands CSI volumes), ExpandInUsePersistentVolumes (expands PVs that are in use), and ExpandPersistentVolumes (expands PVs).
The CSIMigration feature gate for migrating in-tree storage plug-ins to a plug-in that uses the off-tree CSI driver is always enabled by default. This feature gate is removed in Kubernetes 1.27.
The CSIInlineVolume feature gate for inline volumes has reached GA in Kubernetes 1.25 and is always enabled by default. This feature gate is removed in Kubernetes 1.27.
The EphemeralContainers feature gate for ephemeral containers has reached GA in Kubernetes 1.25 and is always enabled by default. This feature gate is removed in Kubernetes 1.27.
The LocalStorageCapacityIsolation feature gate provides support for ephemeral storage capacity isolation of emptyDir volumes. A pod can be hard limited in its local storage usage by evicting the pod when the local storage usage exceeds the limit. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The NetworkPolicyEndPort feature gate allows you to set the endPort field in network policies to specify a range of ports. Before this feature gate is introduced, you can specify only one port. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The StatefulSetMinReadySeconds feature gate allows you to configure minReadySeconds for StatefulSets. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The DaemonSetUpdateSurge feature gate allows you to configure maxSurge for DaemonSets. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The IdentifyPodOS feature gate allows you to specify an operating system for pods. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The ReadWriteOncePod feature gate has reached Beta and is enabled by default. This feature gate allows you to access PVs in ReadWriteOncePod mode.
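The two bands that the ServiceNodePortStaticSubrange feature gate creates can be computed from the NodePort range. The following sketch assumes the band offset formula given in the upstream Kubernetes documentation, min(max(16, range size / 32), 128), and the default NodePort range of 30000 to 32767. The function name is hypothetical, and the sketch rejects ranges that are too small to leave room for a dynamic band instead of modeling them.

```python
def nodeport_bands(start: int = 30000, end: int = 32767) -> dict:
    """Split a NodePort range into a static (low) band and a dynamic (high) band."""
    size = end - start + 1
    # Assumed upstream formula: the static band holds between 16 and 128 ports.
    offset = min(max(16, size // 32), 128)
    if size <= offset:
        # Simplification of this sketch: very small ranges are not handled.
        raise ValueError("range is too small to split into two bands")
    return {
        "static": (start, start + offset - 1),  # preferred for manually assigned ports
        "dynamic": (start + offset, end),       # used first for automatic assignment
    }


bands = nodeport_bands()
print(bands)
```

For the default range, the sketch yields a static band of 30000 to 30085 and a dynamic band of 30086 to 32767, so a Service that needs a fixed port is least likely to collide with an automatically assigned port if the port is chosen from the low band.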
Kubernetes 1.28
With the NodeOutOfServiceVolumeDetach feature gate, when the node.kubernetes.io/out-of-service taint is added to mark a node as out-of-service, pods on the node that do not tolerate the taint are forcefully deleted and volumes are immediately detached. This feature gate has reached GA in Kubernetes 1.28 and is always enabled by default.
The AdmissionWebhookMatchConditions feature gate is enabled by default to allow you to use CEL expressions as webhook matching conditions.
The UnknownVersionInteroperabilityProxy feature gate has reached Alpha. This feature gate allows requests to be sent to the correct API server when multiple API server versions exist. For more information, see Mixed Version Proxy.
The IPTablesOwnershipCleanup feature gate has reached GA. This feature gate causes the kubelet to no longer create KUBE-MARK-DROP and KUBE-MARK-MASQ iptables rules.
The ConsistentListFromCache feature gate has reached Alpha. This feature gate allows the API server to use the watch cache to guarantee consistent reads for LIST requests.
The ProbeTerminationGracePeriod feature gate has reached GA and is enabled by default. This feature gate allows you to use probe-level terminationGracePeriodSeconds.
The following feature gates in the GA stage are removed: DelegateFSGroupToCSIDriver, DevicePlugins, KubeletCredentialProviders, MixedProtocolLBService, ServiceInternalTrafficPolicy, ServiceIPStaticSubrange, and EndpointSliceTerminatingCondition.
References
For more information about the release notes for Kubernetes 1.27 and Kubernetes 1.28, see CHANGELOG-1.27 and CHANGELOG-1.28.