When you update the Kubernetes version of a cluster, you may also need to update the NGINX Ingress controller because some APIs in the new Kubernetes version are deprecated. To ensure stability and use the latest features, we recommend that you update the NGINX Ingress controller to the latest version at the earliest opportunity. Unlike other components, the NGINX Ingress controller is updated in a progressive way. You need to pay close attention to the status of the component and your business to ensure that no business interruptions occur during and after the update.
Update process
You must ensure the stability of the NGINX Ingress controller because it is a key component of the data plane.
Incompatibility issues may occur if you update the NGINX Ingress controller to the latest version. This is because the latest version is a major update that introduces significant changes, including substantial custom changes, to the NGINX Ingress controller. In addition, an in-place update poses higher risks because incompatibility issues may not surface immediately after the NGINX Ingress controller is updated.
To ensure that the NGINX Ingress controller can run as expected and prevent business interruptions, we recommend that you update the NGINX Ingress controller in a progressive way. This allows you to check the status of your workloads and roll back the update if exceptions occur.
Phase 1: Precheck
A precheck is automatically performed to check whether the NGINX Ingress controller meets the prerequisites for updates. If the NGINX Ingress controller does not meet the prerequisites or the status of the component is unhealthy, you must manually fix the issues before you can update the component.
Phase 2: Verification
A pod is created for the new version of the NGINX Ingress controller. The pod is used to verify the status and the Ingress rules of the new version. After the pod is created, a proportion of user traffic is distributed to the pod. You can analyze the container logs or use Simple Log Service or Managed Service for Prometheus to check whether the user traffic is processed as expected.
After a pod is created for the new version, the update process is paused. After you confirm that no exceptions occur to the component and workloads, you can manually resume the update process. If you identify an exception, you can roll back the update and delete the pod for the new version.
To roll back an update, modify the spec.minReadySeconds and spec.strategy parameters in the Deployment.
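For reference, the following is a minimal sketch of where these fields live in a Deployment manifest. The values shown are illustrative Kubernetes defaults, not settings prescribed by the update process:

```yaml
# Fragment of the controller Deployment; only the fields mentioned
# above are shown.
spec:
  minReadySeconds: 0
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
```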
Phase 3: Release
A rolling update is performed during the release phase to replace the old version of the NGINX Ingress controller with the new version. After all pods of the NGINX Ingress controller are updated, the update process is paused to allow you to check the status of the component and workloads. After you confirm that no exceptions occur, you can complete the update. If you identify an exception, you can roll back the NGINX Ingress controller in all pods to the old version and then end the update.
Phase 4: Rollback (optional)
The update process is automatically paused during the verification and release phases to allow you to identify exceptions that occur during the update. If you identify an exception, you can roll back the NGINX Ingress controller to the old version.
Prerequisites
Before you update the NGINX Ingress controller, make sure that you have a way to monitor business traffic and identify exceptions. You can use Simple Log Service or Managed Service for Prometheus to monitor business traffic. For more information, see Analyze and monitor the access log of nginx-ingress-controller and Managed Service for Prometheus.
Make sure that the status of the NGINX Ingress controller is healthy, all pods of the NGINX Ingress controller are in the Ready state, and no error logs are generated.
Make sure that no auto scaling rules, such as HorizontalPodAutoscaler (HPA), are used. If auto scaling rules exist, delete the auto scaling rules, complete the update, and then recreate the rules.
Make sure that the LoadBalancer Service runs as expected. You can verify these prerequisites with kubectl, as shown in the sketch after this list.
Do not modify the NGINX Ingress controller or Ingress rules during the update.
If the version of your NGINX Ingress controller is earlier than 0.44, take note of the differences in prefix matching when you update it. For more information, see the Known issues section of this topic.
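You can verify several of these prerequisites with kubectl. In the following sketch, the app=ingress-nginx label is an assumption based on the common default for this component; adjust the label and resource names if your deployment differs:

```shell
# Check that all controller pods are Ready.
kubectl get pods -n kube-system -l app=ingress-nginx

# Check the LoadBalancer Service and its recent events.
kubectl describe svc nginx-ingress-lb -n kube-system

# Check for HPA resources; back them up before deleting them and
# recreate them after the update.
kubectl get hpa -n kube-system
```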
Procedure
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Operations > Add-ons.
On the Add-ons page, find the NGINX Ingress controller and click Upgrade in the lower-right part.
On the page that appears, confirm the information and click Start to initiate an update.
Note: You can return to the update page at any time during the update process, or click Progress on the Add-ons page to view the update progress.
A precheck is performed in Phase 1. After the precheck is complete, the update automatically proceeds to Phase 2.
If the NGINX Ingress controller fails to pass the precheck, you can click View Details below Precheck to go to the View Report page. Then, troubleshoot the issues displayed on this page. For more information, see the Precheck section of this topic. After you fix the issues, click Retry to reinitiate an update.
After the verification phase ends, the update process is paused. You can check the status of the NGINX Ingress controller and workloads. For more information, see the Verification section of this topic.
If pods fail to be created, see the What do I do if pods fail to be created during the verification and release phases? section of this topic. You can also analyze the pod error logs to identify why the pod fails to start. After you fix the issue, click Retry to perform the verification again.
If exceptions occur to your workloads, click Roll Back to roll back the update. After the rollback is complete, the update process ends. You can click Upgrade on the Add-ons page to reinitiate an update.
After the verification is passed, click Continue to proceed to the release phase. After the rolling update during the release phase is complete, the update process is paused. You can check the status of the NGINX Ingress controller and workloads. If exceptions occur to your workloads, click Roll Back to roll back the update. After the rollback is complete, the update process ends. You can click Upgrade on the Add-ons page to reinitiate an update.
If no exceptions occur, click Continue to complete the update.
Note: Make sure that the update process is complete within one week.
Precheck
Precheck table
| Check item | Content | Troubleshooting |
| --- | --- | --- |
| Deployment Exist | Check whether the Deployment (kube-system/nginx-ingress-controller) of the NGINX Ingress controller exists. | N/A |
| Deployment Healthy | Check whether all pods controlled by the Deployment of the NGINX Ingress controller are in the Ready state. Make sure that these pods are not performing rolling updates or running in other unstable states. | N/A |
| Error Log | Check the most recent 200 entries in the pod error logs. Make sure that no errors with a severity level of Error or Fatal exist. | If such errors exist, exceptions recently occurred on the NGINX Ingress controller, for example, due to invalid configurations. Fix the issue and then reinitiate an update. For more information, see NGINX Ingress controller troubleshooting. |
| LoadBalancer Service Healthy | Check whether the LoadBalancer Service (kube-system/nginx-ingress-lb) of the NGINX Ingress controller exists and whether error events are generated for it. If the LoadBalancer Service does not exist, a Warning event is generated. | If the LoadBalancer Service does not exist, refer to the solution for manually deleted LoadBalancer Services in the "Service troubleshooting" topic. If error events are generated for the LoadBalancer Service, resolve the issue based on the content of the event. For more information, see the Service errors and solutions section of the "Service troubleshooting" topic. This check item is skipped if the Service type is not LoadBalancer. |
| HPA | Check whether the Deployment of the NGINX Ingress controller uses an HPA. The HPA may trigger scaling activities during the update process and adversely affect the update. | Delete HPA resources from the cluster before you update the NGINX Ingress controller, and reconfigure them after the update ends. |
| Deployment Template | Check whether the Deployment of the NGINX Ingress controller includes only compatible changes. | The update cannot retain custom changes made to the Deployment of the NGINX Ingress controller. Modifying the allowed parameters does not cause this check item to show FAIL. The check shows FAIL if you make other changes to the Deployment, use an outdated version, or update the NGINX Ingress controller without meeting all update requirements. In this case, manually restore the Deployment template. For more information, see the What do I do if the Deployment template fails to pass the check? section of this topic. |
| Ingress Configuration | Check whether the Ingresses in the cluster use only compatible features. | If the Ingresses use incompatible features, the NGINX Ingress controller may fail to distribute user traffic as expected after the update, which can cause a service interruption. For more information, see the Incompatibility issues section of this topic. |
| Component Configuration | Check whether the nginx-configuration ConfigMap in the kube-system namespace contains incompatible configurations. | If the ConfigMap contains incompatible configurations, the NGINX Ingress controller may fail to distribute user traffic as expected after the update, which can cause a service interruption. For more information, see the Incompatibility issues section of this topic. |
Incompatibility issues
During development and maintenance, new versions of the NGINX Ingress controller may introduce new features, enhance existing features, or fix security issues. However, these updates may also lead to compatibility issues with previous versions due to changes in internal architecture or variations in dependency libraries. For more information about the release notes of the NGINX Ingress controller, see NGINX Ingress controller.
Snippet annotations are disallowed by default
Affected versions: versions earlier than v1.9.3-aliyun.1.
For security purposes, NGINX Ingress controller v1.9.3-aliyun.1 and later versions disallow the following snippet annotations:
- nginx.ingress.kubernetes.io/configuration-snippet
- nginx.ingress.kubernetes.io/server-snippet
- nginx.ingress.kubernetes.io/stream-snippet
- nginx.ingress.kubernetes.io/auth-snippet
- nginx.ingress.kubernetes.io/modsecurity-snippet
To ensure security and stability, if you need a specific feature, we recommend that you use the corresponding dedicated annotation or setting instead of a snippet annotation.
If you want to use snippet annotations, assess the risks and add allow-snippet-annotations: "true" to the kube-system/nginx-configuration ConfigMap to allow snippet annotations.
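A minimal sketch of the resulting ConfigMap is shown below. Keep all existing keys in the ConfigMap; only the allow-snippet-annotations key is added:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: kube-system
data:
  # Re-enables snippet annotations. Assess the security risks first.
  allow-snippet-annotations: "true"
```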
Incompatibility with earlier TLS versions
Affected versions: versions earlier than v1.7.0-aliyun.1.
TLS 1.1 and earlier versions have known security issues. The NGINX Ingress controller no longer supports TLS 1.0 and TLS 1.1. Before you update the NGINX Ingress controller, make sure that your service does not rely on TLS 1.1 or earlier, and remove TLS 1.1 and earlier versions from the TLS settings of your service. Changes to the ConfigMap take effect immediately.
The following content shows the sample TLS settings in the nginx-configuration ConfigMap of the NGINX Ingress controller in the kube-system namespace:
```yaml
ssl-protocols: SSLv3 SSLv2 TLSv1 TLSv1.1 TLSv1.2 TLSv1.3
```
Make sure that your service does not rely on TLS 1.1 or earlier versions. You can delete the ssl-protocols line to use the default TLS settings, or remove TLS 1.1, TLS 1.0, SSL 3.0, and SSL 2.0 from the list. Example:
```yaml
ssl-protocols: TLSv1.2 TLSv1.3
```
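You can make either change by editing the ConfigMap directly. For example:

```shell
kubectl edit configmap nginx-configuration -n kube-system
```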
If you want to use earlier TLS versions, complete the configurations by referring to the required documents. For more information, see the Which SSL or TLS protocol versions are supported by Ingresses? section of the "Nginx Ingress FAQ" topic.
Incompatibility issue related to the nginx.ingress.kubernetes.io/rewrite-target annotation
Affected versions: versions earlier than 0.22.0.
NGINX Ingress controller 0.22.0 changes the way in which the nginx.ingress.kubernetes.io/rewrite-target annotation is used. In NGINX Ingress controller 0.22.0 and later versions, you must explicitly specify a capture group if you want to use the rewrite-target annotation. The rewrite-target annotation in versions earlier than 0.22.0 is incompatible with the annotation in later versions. Before you update the NGINX Ingress controller, replace rewrite-target with configuration-snippet.
For example, if your NGINX Ingress controller version is earlier than 0.22.0, your Ingress rules may look like the following sketch, in which the hostname, Service name, and port are illustrative:
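```yaml
# Illustrative example for versions earlier than 0.22.0. In this form,
# rewrite-target does not use a capture group.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: rewrite
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: rewrite.example.com
    http:
      paths:
      - path: /svc
        backend:
          serviceName: http-svc
          servicePort: 80
```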
Before the update, modify the Ingress rules to replace rewrite-target with configuration-snippet, as in the following sketch. The rewrite rule reproduces the original behavior and must be adapted to your own paths:
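```yaml
# Pre-update form: configuration-snippet replaces rewrite-target.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: rewrite
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/configuration-snippet: |
      rewrite ^/svc/(.*)$ /$1 break;
spec:
  rules:
  - host: rewrite.example.com
    http:
      paths:
      - path: /svc
        backend:
          serviceName: http-svc
          servicePort: 80
```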
You can start to update the NGINX Ingress controller after you modify the Ingress rules. After the update is complete, modify the Ingress rules again to use rewrite-target with an explicit capture group, as in the following sketch:
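```yaml
# Post-update form: rewrite-target with an explicit capture group,
# using the networking.k8s.io/v1 Ingress API.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rewrite
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  ingressClassName: nginx
  rules:
  - host: rewrite.example.com
    http:
      paths:
      - path: /svc/(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: http-svc
            port:
              number: 80
```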
Verification
Verify the status of the NGINX Ingress controller and workloads
In addition to its built-in monitoring capabilities, ACK allows you to use Simple Log Service, Managed Service for Prometheus dashboards, and container logs to monitor the status of the NGINX Ingress controller. To activate these services, see Analyze and monitor the access log of nginx-ingress-controller and Managed Service for Prometheus.
Simple Log Service
You can view the logs collected by Simple Log Service in the ACK console.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Operations > Log Center.
Click the Application Logs tab. Then, select nginx-ingress from the Logstore drop-down list and click Select Logstore.
Note: If nginx-ingress does not exist, check whether the log collection feature is enabled for the NGINX Ingress controller. For more information, see Analyze and monitor the access log of nginx-ingress-controller.
You can view the access logs of applications in the log list or specify a pod in the query statement to view the access logs of the pod. For example, you can specify the pod of the new NGINX Ingress controller version in the query statement. Make sure that the success rate of requests to the new pod is as expected and the number of requests sent to the new pod is the same as that sent to the old pod. If the statistics of the new pod significantly differ from those of the old pod, roll back the update.
If a request fails to match an Ingress rule, a 404 error is returned. By default, the error is not recorded in the access logs.
Managed Service for Prometheus dashboards
You can use the dashboards provided by Managed Service for Prometheus to monitor the requests processed by the NGINX Ingress controller.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Operations > Prometheus Monitoring.
Click the Network Monitoring tab, and then click the Ingresses tab.
Note: If the Ingresses tab is not displayed, check whether Prometheus metric collection is configured for the NGINX Ingress controller. For more information, see Managed Service for Prometheus.
You can view the metrics of Ingresses on the dashboard or view the metrics of a specific pod. Make sure that the success rate of requests to the new pod is normal and the number of requests sent to the new pod is the same as that sent to the old pod. If the statistics of the new pod significantly differ from those of the old pod, roll back the update.
If no host is specified in the rules of an Ingress, the metrics of the Ingress are not collected. The default host is set to an asterisk (*).
Pod logs
You can use kubectl to print pod logs for troubleshooting.
Run the following command to print pod logs that contain NGINX errors with severity levels of Warn, Error, and Crit:

```shell
kubectl logs -n kube-system <Pod name> | grep -e warn -e error -e crit
```

Run the following command to print pod logs that contain controller errors:

```shell
kubectl logs -n kube-system <Pod name> | grep "^[EF]"
```
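To obtain <Pod name>, list the controller pods. The app=ingress-nginx label is an assumption based on the common default for this component; adjust it if your deployment uses different labels:

```shell
kubectl get pods -n kube-system -l app=ingress-nginx
```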
Update notes
When you update the NGINX Ingress controller, take note of changes in prefix matching. Make sure that the path configuration aligns with the request URLs. Otherwise, the status code 404 is returned for requests.
Known issues
Differences in prefix matching may exist across different versions of the NGINX Ingress controller, potentially leading to service access issues.
In versions earlier than 0.44, the prefix matching rules are loose. For example, the prefix /aaa/bbb can match /aaa/bbbbb in a request URL. After the update, the prefix matching rules become stricter and match only complete path segments. In the preceding example, /aaa/bbbbb in the request URL is no longer matched, and the status code 404 is returned instead.
Scope of impacts
NGINX Ingress controller versions earlier than v0.44 are affected. For more information about the release notes, see NGINX Ingress controller. For the related pull request, see kubernetes/ingress-nginx #6443.
Sample configuration
You can use the following YAML template to compare the NGINX configurations rendered by different versions of the NGINX Ingress controller, such as 0.22.0.5-552e0db-aliyun (earlier version) and 1.2.1-aliyun.1+ (later version).
```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  labels:
    ingress-controller: nginx
  name: test
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: www.example.com
    http:
      paths:
      - backend:
          serviceName: api-resources-svc
          servicePort: 8080
        path: /api/resource
```
In earlier versions
In versions such as 0.22.0.5-552e0db-aliyun, the NGINX configuration is:
```nginx
location /api/resource {   # The path does not end with a forward slash (/).
    ...
}
```
In the preceding configuration, the path /api/resource allows access from http://www.example.com/api/resources in earlier versions of NGINX.
Note: The actual requested path is /api/resources, not /api/resource.
In later versions
If the NGINX Ingress controller is updated to a later version such as 1.2.1-aliyun.1+, the NGINX configuration is:
```nginx
location /api/resource/ {   # The path ends with a forward slash (/).
    ...
}
location = /api/resource {  # Another location is added for exact matching.
    ...
}
```
HTTP status code 404 is returned if you try to access http://www.example.com/api/resources.
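If existing clients depend on the old loose matching, one possible workaround is to switch to a regular expression path. The following sketch uses the use-regex annotation of the open source NGINX Ingress controller; it is an assumption, not a step prescribed by this topic. Verify the regular expression against your real URLs before you apply it:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test
  namespace: default
  annotations:
    # Enables regular expression matching for paths.
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: www.example.com
    http:
      paths:
      - path: /api/resource.*
        pathType: ImplementationSpecific
        backend:
          service:
            name: api-resources-svc
            port:
              number: 8080
```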
FAQ
Can I update the NGINX Ingress controller to a specific version? After a successful update, can I roll back to a previous version?
You cannot update the NGINX Ingress controller to a specific version. The NGINX Ingress controller is updated progressively and can only be updated to the latest version. Once the update is complete, it cannot be rolled back.
What do I do if pods fail to be created during the verification and release phases?
| Cause | Solution |
| --- | --- |
| An exception occurs when ACK starts the pod for the new NGINX Ingress controller version. For example, ACK fails to load the configuration of the pod. As a result, the pod repeatedly crashes. | Use the methods described in the Pod logs section of this topic to print the pod logs and troubleshoot the issue. For more information, see NGINX Ingress controller troubleshooting. |
| Pod scheduling fails. This usually occurs when you deploy the NGINX Ingress controller on dedicated nodes. When ACK creates a pod for the new version of the NGINX Ingress controller, ACK may fail to find a node for the pod due to resource limits and node selectors. | Add nodes to your cluster, or reduce the number of controller pods during off-peak hours before you update the component, so that the new pod can be scheduled to a node. |
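For example, you can temporarily reduce the number of controller pods during off-peak hours. The Deployment name below matches the one checked in the precheck; choose a replica count that can still handle your traffic:

```shell
kubectl scale deployment nginx-ingress-controller -n kube-system --replicas=2
```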
What do I do if the Deployment template fails to pass the check?
If the Deployment template fails to pass the check, click the hyperlink to the right of Cause to go to the Component Comparison page. You can view the component parameters that failed to pass the check.
On the NGINX Ingress Controller Update page, click View Details below Precheck.
In the Cluster Components Result section of the Report page, click the result of the failed check. On the Result tab, click Deployment Template. Then, click the hyperlink to the right of Cause.
On the Component Comparison page, the standard component template is displayed on the left and the current component template is displayed on the right. The page highlights the differences between the templates, including compatible and incompatible parameters, lists the differences in the lower part of the page, and shows whether the component passes the check.
For example, suppose a difference is found between the templates and the parameter that causes the difference is .spec.template.spec.containers.(nginx-ingress-controller).args, where nginx-ingress-controller is the name of the container to which the parameter belongs. The comparison result shows that --v=2 is specified in the args parameter in the standard template, whereas --v=3 is specified in the args parameter in the current template. You must modify the args parameter before you update the component.
Modify the parameters that cause the differences.
In the left-side navigation pane, choose Workloads > Deployments. On the Deployments page, find the NGINX Ingress controller and choose More > View in YAML in the Actions column. In the Edit YAML dialog box, change the value of the args parameter from --v=3 to --v=2.
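After the change, the relevant fragment of the Deployment might look like the following. The first argument is illustrative; keep the other arguments in your Deployment unchanged:

```yaml
spec:
  template:
    spec:
      containers:
      - name: nginx-ingress-controller
        args:
        - /nginx-ingress-controller   # Illustrative; keep your existing arguments.
        - --v=2                       # Matches the standard template.
```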
After you modify the args parameter, refresh the Component Comparison page to check whether the difference is fixed. If "The component passes the comparison check." is displayed in the lower part of the page, the Deployment template passes the check.
Note: The system restarts the pods of the NGINX Ingress controller to apply the modifications to the Deployment. We recommend that you perform these operations during off-peak hours.
References
For more information about the release notes of the NGINX Ingress controller, see NGINX Ingress controller.
If problems such as network failures occur when you use the NGINX Ingress controller, refer to the related topics first. For more information, see Configure security groups for clusters and Nginx Ingress FAQ.