Container Service for Kubernetes: Update ACK Lingjun clusters

Last Updated: Aug 20, 2024

Outdated cluster versions may have security and stability issues. To ensure business continuity, Container Service for Kubernetes (ACK) uses in-place updates to update ACK Lingjun clusters. You can update the Kubernetes version of a cluster in the ACK console, or update the control plane and node pools of the cluster separately. This topic describes the usage notes before and after an update and the procedure for updating an ACK Lingjun cluster.

Why ACK Lingjun clusters need updates

You can update the Kubernetes versions of ACK Lingjun clusters from 1.20 to 1.22.

Proactive updates provide the following benefits:

  • Reduced security and stability risks: New Kubernetes versions are usually released to add optimizations and patch security and stability vulnerabilities. Using outdated Kubernetes clusters may pose security and stability risks to your businesses.

  • New features: New releases of open source Kubernetes usually bring new features and improvements. ACK adds support for these features to improve your development and maintenance experience.

We recommend that you perform the following steps to proactively update your clusters.

Important

When you update a cluster, ACK performs a precheck on the cluster, but ACK does not guarantee that all incompatible features, configurations, and APIs can be identified. According to the shared responsibility model, we recommend that you pay attention to the release of Kubernetes versions by checking the documentation, information in the console, and internal messages, and learn the update notes of the corresponding version before you update the cluster.

For more information about how ACK Lingjun clusters support Kubernetes versions, see Support for Kubernetes versions.

Usage notes (important)

Kubernetes versions

To view the Kubernetes version of an ACK Lingjun cluster, log on to the ACK console and check the Version column of the cluster on the Clusters page. Before you update to a Kubernetes version, read the following release notes for the corresponding Kubernetes version to learn the version details, deprecated APIs, and usage notes for updates. This helps you avoid compatibility issues caused by feature updates in new Kubernetes versions.

Note

If the YAML file of your Helm chart uses deprecated resources, modify the file at the earliest opportunity. For more information, see the preceding release notes and Deprecated APIs.
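
As a rough illustration of checking chart manifests for deprecated resources, the following Python sketch scans manifest text for apiVersion values that Kubernetes 1.22 no longer serves. The function name and the mapping are assumptions made for this example. The mapping covers only a few well-known cases, so the release notes and the Deprecated APIs topic remain the authoritative list.

```python
import re

# Illustrative subset of API versions that Kubernetes 1.22 no longer serves,
# mapped to their replacements. This is NOT a complete list.
REMOVED_IN_1_22 = {
    "extensions/v1beta1": "networking.k8s.io/v1 (Ingress)",
    "networking.k8s.io/v1beta1": "networking.k8s.io/v1",
    "apiextensions.k8s.io/v1beta1": "apiextensions.k8s.io/v1",
    "rbac.authorization.k8s.io/v1beta1": "rbac.authorization.k8s.io/v1",
    "admissionregistration.k8s.io/v1beta1": "admissionregistration.k8s.io/v1",
}

# Matches a line such as `apiVersion: extensions/v1beta1`, with or without quotes.
API_VERSION_LINE = re.compile(
    r"^[ \t]*apiVersion:[ \t]*[\"']?([^\s\"'#]+)", re.MULTILINE
)


def find_removed_api_versions(manifest_text):
    """Return (line_number, api_version, replacement) for each flagged line."""
    findings = []
    for match in API_VERSION_LINE.finditer(manifest_text):
        api_version = match.group(1)
        if api_version in REMOVED_IN_1_22:
            # Count the newlines before the match to get a 1-based line number.
            line_number = manifest_text.count("\n", 0, match.start(1)) + 1
            findings.append(
                (line_number, api_version, REMOVED_IN_1_22[api_version])
            )
    return findings
```

One way to use the sketch is to render a chart with helm template, pass the rendered text to find_removed_api_versions, and update every manifest that it reports before you start the cluster update.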

Features and custom configurations

If your ACK Lingjun cluster uses the features listed in the following table, read the considerations and suggested solutions.

Feature

Consideration

Suggested solution

FlexVolume

Object Storage Service (OSS) volumes that are mounted by using FlexVolume 1.11.2.5 or earlier are remounted during a cluster update.

After the update is complete, you need to recreate the pods that use OSS volumes.

FlexVolume is deprecated. We recommend that you upgrade from FlexVolume to CSI. For more information, see Upgrade from FlexVolume to CSI.

Auto scaling

If auto scaling is enabled, the cluster automatically updates Cluster Autoscaler to the latest version after the cluster is updated. This ensures that the auto scaling feature can work as expected.

Make sure that Cluster Autoscaler is updated to the latest version. For more information, see Auto scaling of nodes.

Resource reservation

After you update the Kubernetes version of an ACK Lingjun cluster to 1.18, ACK automatically configures resource reservation. If resource reservation is not configured for the cluster and the resource usage of nodes is high, ACK may fail to schedule evicted pods to the nodes after the cluster is updated.

Reserve sufficient resources on the nodes. We recommend that you reserve at least 50% of CPU resources and at least 70% of memory resources. For more information, see Resource reservation policy.

LoadBalancer configurations

ACK Lingjun clusters require Server Load Balancer (SLB) instances to handle external access. However, if externalTrafficPolicy: Local is specified for the LoadBalancer Service that is associated with an SLB instance, traffic that reaches a node is forwarded only to the pods on that node. If your application pods are deployed on other nodes, the traffic cannot reach these pods.

If the SLB instance cannot forward traffic to the application pods, check whether externalTrafficPolicy: Local is specified for the LoadBalancer Service. For more information, see What Can I Do if the Cluster Cannot Access the IP Address of the SLB Instance Exposed by the LoadBalancer Service.

API Server

When ACK updates a cluster, ACK attempts to update the control plane without interrupting communication with the applications in the cluster. However, communication with the API server may be temporarily interrupted. The interruption affects applications that strongly rely on the API server. For example, if your application needs to list and watch resources, the watch operation is interrupted when the API server restarts. To resolve this problem, you need to configure the application to automatically retry the watch operation when an interruption occurs.

If your application does not need to access the API server, the application is not affected by the update.

kubectl

After a cluster is updated, we recommend that you update kubectl on your on-premises machine.

If you do not update kubectl, the kubectl version may be incompatible with the API server version. As a result, the error message invalid object doesn't have additional properties may appear.

Install or update kubectl. For more information, see Install kubectl.
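
For the API Server consideration in the preceding table, the retry behavior can be sketched as follows. This is a minimal illustration, not the API of any Kubernetes client library: open_watch is a hypothetical callable that yields (event_type, resource_version, obj) tuples and raises ConnectionError when the API server drops the connection. The retry limit and delays are also assumptions.

```python
import time


def watch_with_retry(open_watch, handle_event, max_retries=5,
                     base_delay=1.0, sleep=time.sleep):
    """Consume a watch stream and reopen it after connection errors.

    open_watch(resource_version) is a hypothetical callable. It returns an
    iterable of (event_type, resource_version, obj) tuples and raises
    ConnectionError when the connection to the API server is interrupted.
    """
    last_seen = None  # resource version of the last processed event
    failures = 0      # consecutive failed attempts
    while True:
        try:
            for event_type, resource_version, obj in open_watch(last_seen):
                handle_event(event_type, obj)
                last_seen = resource_version  # resume point for the next attempt
                failures = 0                  # the stream is healthy again
            return last_seen  # the stream ended without an error
        except ConnectionError:
            failures += 1
            if failures > max_retries:
                raise
            # Exponential backoff: base_delay, then 2x, 4x, and so on.
            sleep(base_delay * 2 ** (failures - 1))
```

A real client also has to handle the case in which the stored resource version is too old for the API server (HTTP 410 Gone). In that case, the application needs to list the resources again before it resumes the watch.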

If your cluster uses custom configurations, read the descriptions in the following table.

Item

Description

Network

To update a cluster, you need to use Yum to download the required software packages. If your cluster uses custom network configurations or a custom OS image, you need to ensure that Yum can run as normal. You can run the yum makecache command to check the status of Yum.

OS image

Custom OS images are not strictly validated by ACK. ACK does not guarantee the success of cluster updates if your cluster uses a custom OS image.

Others

If your cluster uses other custom configurations, such as swap partitions or kubelet configurations modified by using the CLI, the cluster may fail to be updated or the custom configurations may be lost during the update.
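
The Yum check described for custom network configurations can be scripted so that it runs on every node before an update. The following Python sketch wraps the yum makecache command. The function name, the injectable run parameter, and the 300-second timeout are assumptions made for this example.

```python
import subprocess


def yum_cache_ok(run=subprocess.run, timeout=300):
    """Return True if `yum makecache` exits with status 0 on this node."""
    try:
        result = run(
            ["yum", "makecache"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # yum is not installed, or the repositories did not answer in time.
        return False
    return result.returncode == 0
```

A False result suggests that the node cannot download packages and that the network or repository configuration needs attention before you start the update.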

Procedure

Update control planes and all node pools

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to update and choose More > Operations > Upgrade Cluster in the Actions column.

  3. In the Update Items section of the Upgrade Cluster page, select an available Kubernetes version and set the Update Mode parameter to Control Planes and All Node Pools. In the Batch Update Policy section, configure the Maximum Number of Nodes to Repair per Batch parameter and click Precheck.

    After the precheck is complete, click View Details to view the report.

  4. After the cluster passes the precheck, click Start Update.

    During the update, do not add or remove nodes. To add or remove nodes, you need to first cancel the update. You can check the update progress in the Event Rotation section of the Upgrade Cluster page and perform the following operations based on your business requirements:

    • Pause and resume the update: Click Pause to pause the update. To resume the update, click Continue.

      After you pause the update, the cluster remains in an intermediate state. Do not perform any operations on the cluster while the update is paused, and complete the update at the earliest opportunity. If the cluster remains in the Paused state for seven days, ACK terminates the update and automatically deletes the events and logs related to the update.

    • Cancel the update: Click Cancel. In the message that appears, click OK. After you cancel the update, ACK continues to update the nodes in the current batch and the update cannot be rolled back. The remaining batches are not updated.

      Note
      • If an error occurs during the update, ACK pauses the update. The cause of the failure is displayed in the lower part of the page. You can follow the suggestions to troubleshoot the error.

      • Do not modify the resources in the kube-upgrade namespace during the update unless an error occurs.

    After the update is complete, you can go to the Clusters page and check the Kubernetes version of your cluster to verify that the control plane components are updated. You can also go to the cluster details page and choose Nodes > Nodes in the left-side navigation pane to view the Kubernetes version of the nodes.

Update only the control plane

Procedure

You must update the control plane before you update the node pools.

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to update and choose More > Operations > Upgrade Cluster in the Actions column.

  3. In the Update Items section of the Upgrade Cluster page, select an available Kubernetes version, set the Update Mode parameter to Control Planes Only, and then click Precheck.

    After the precheck is complete, click Details to view the report.

    • If the result is normal in the report, the cluster passes the precheck and you can update the cluster.

    • If the result is abnormal in the report, the cluster can still run as expected and the cluster status does not change. Click the Troubleshoot tab and follow the suggestions displayed on the page to resolve the issues. For more information, see Cluster check items and suggestions on how to fix cluster issues.

      Note

      If your cluster runs Kubernetes 1.20 or later, the precheck checks whether deprecated APIs are used in your cluster. The precheck result is for reference only and does not determine whether the cluster can be updated. For more information, see the Deprecated APIs section of the "Cluster check items and suggestions on how to fix cluster issues" topic.

  4. After the cluster passes the precheck, click Start Update.

    You can view the update progress on the Upgrade Cluster page. After the update is complete, you can go to the Clusters page, find the cluster that you want to manage, and check the Kubernetes version of your cluster to verify that the control plane components are updated.

Next step: Update node pools

After the control plane is updated, nodes that are newly added to the cluster run the updated Kubernetes version. We recommend that you update the existing nodes during off-peak hours at the earliest opportunity and confirm the kubelet version of each node after the update is complete. For more information, see Update a node pool.
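
One way to confirm the kubelet version after a node pool update is to inspect the output of kubectl get nodes -o json, in which each node reports its version in the status.nodeInfo.kubeletVersion field. The following Python sketch lists the nodes that do not yet report the expected version. The function name and the version prefix are assumptions made for this example.

```python
import json


def nodes_not_on_version(node_list_json, expected_prefix):
    """Return {node_name: kubelet_version} for nodes that are not updated.

    node_list_json is the text printed by `kubectl get nodes -o json`.
    expected_prefix is the version prefix to look for, such as "v1.22.".
    """
    node_list = json.loads(node_list_json)
    outdated = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        version = node["status"]["nodeInfo"]["kubeletVersion"]
        if not version.startswith(expected_prefix):
            outdated[name] = version
    return outdated
```

An empty result means that every node reports a kubelet version with the expected prefix. Any node that is returned still needs to be updated.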

FAQ about cluster updates

What do I do if a cluster update fails and the "the aliyun service is not running on the instance" error message is returned?

Cause

The Cloud Assistant agent becomes unavailable. As a result, the update command fails to be sent to the cluster.

Solution

Start or restart the Cloud Assistant agent. Then, update the cluster again. For more information, see Start, stop, or uninstall the Cloud Assistant Agent.

How do I handle the PLEG not healthy error?

This error indicates that the containers or the container runtime on a node do not respond. Restart the affected nodes and then initiate the update again.

Procedures for updating the control plane and node pools

Update the control plane

ACK updates the control plane of your ACK Lingjun cluster in the following order:

  1. Update the control plane and managed components, such as kube-apiserver, kube-controller-manager, and kube-scheduler.

  2. Update Kubernetes components, such as kube-proxy.

Update node pools

ACK updates the nodes in your cluster in batches. The batch update policy specifies the following rules:

  • ACK updates node pools one after another.

  • The nodes in a node pool are updated in batches. The first batch includes one node. In subsequent batches, the number of nodes increases by powers of two (2, 4, 8, and so on). The batch update policy still applies after you resume a paused update. You can specify the maximum batch size on the Node Pool Upgrade page. We recommend that you set the maximum batch size to 10. For more information, see Update a node pool.
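
The batch rule above can be illustrated with a short Python sketch that computes the batch sizes for one node pool. It assumes that each batch is capped at the maximum batch size and at the number of nodes that remain. The text above does not spell out the capping behavior, so treat the result as an approximation rather than the exact ACK schedule.

```python
def batch_sizes(total_nodes, max_batch_size):
    """Return the number of nodes in each update batch for one node pool."""
    if total_nodes < 0 or max_batch_size < 1:
        raise ValueError("total_nodes must be >= 0 and max_batch_size >= 1")
    sizes = []
    remaining = total_nodes
    next_size = 1  # the first batch includes one node
    while remaining > 0:
        # Assumption: cap each batch at the configured maximum and at the
        # number of nodes that are still waiting for the update.
        size = min(next_size, max_batch_size, remaining)
        sizes.append(size)
        remaining -= size
        next_size *= 2  # 1, 2, 4, 8, ...
    return sizes


print(batch_sizes(25, 10))  # [1, 2, 4, 8, 10]
```

Under these assumptions, a node pool of 25 nodes with the recommended maximum batch size of 10 is updated in five batches.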

References

If errors occur when you update ACK Lingjun clusters, refer to Cluster check items and suggestions on how to fix cluster issues to troubleshoot the errors.