This topic describes the procedure for diagnosing the NGINX Ingress controller and how to troubleshoot issues related to the NGINX Ingress controller. This topic also describes common diagnostics methods and provides answers to some frequently asked questions (FAQ) about the NGINX Ingress controller.
Background information
Container Service for Kubernetes (ACK) provides the NGINX Ingress controller that is optimized based on the open source version. The NGINX Ingress controller provided by ACK is compatible with the open source version and supports all the annotations provided by the open source version. You can install the NGINX Ingress controller when you create an ACK cluster.
Ingresses can work as normal only if you deploy an NGINX Ingress controller in the cluster to parse the routing rules of the Ingresses. After the NGINX Ingress controller receives a request that matches a routing rule, the NGINX Ingress controller routes the request to a corresponding backend Service. The backend Service then forwards the request to pods. In a Kubernetes cluster, Services, Ingresses, and the NGINX Ingress controller work in the following process:
A Service is an abstraction of a backend application that runs on a set of replicated pods.
An Ingress contains reverse proxy rules. It controls to which Service pods HTTP or HTTPS requests are routed. For example, requests are routed to different Service pods based on the hosts and URL paths in the requests.
The NGINX Ingress controller is a reverse proxy program that parses Ingress rules. If changes are made to the Ingress rules, the NGINX Ingress controller updates its proxy configuration accordingly. After the NGINX Ingress controller receives a request, it forwards the request to Service pods based on the Ingress rules.
The NGINX Ingress controller acquires Ingress rule changes from the API server and dynamically generates configuration files, such as nginx.conf. These configuration files are required by a load balancer, such as NGINX. Then, the NGINX Ingress controller reloads the load balancer so that the new Ingress rules take effect. For example, the NGINX Ingress controller runs the nginx -s reload command to reload NGINX.
Diagnostics procedure
You can perform the following steps to check whether an issue is caused by the Ingress. Make sure that the configuration of the Ingress controller is valid.
Check whether you can access a specific pod from the controller pod. For more information, see the Manually access the Ingress and backend pod by using the Ingress controller pod section of this topic.
Check whether the NGINX Ingress controller is properly configured. For more information, see Documentation for the NGINX Ingress controller.
Use the Ingress diagnostics feature to check the configurations of the Ingress and components. Then, modify the configurations as prompted. For more information, see the Use the Ingress diagnostics feature section of this topic.
Locate the cause of the issue and refer to the relevant solution described in the Troubleshooting section of this topic.
If the issue persists, perform the following steps:
Issues that are related to TLS certificates:
Check whether the domain name is added to Web Application Firewall (WAF) in CNAME record mode or transparent proxy mode.
If the domain name is added to WAF, check whether the domain name has a TLS certificate.
If the domain name is not added to WAF, proceed to the next step.
Check whether a Layer 7 listener is configured for the Server Load Balancer (SLB) instance.
If a Layer 7 listener is configured for the SLB instance, check whether a TLS certificate is associated with the listener.
If no Layer 7 listener is configured for the SLB instance, proceed to the next step.
If the issue is not related to TLS certificates, check the error log of the controller pod. For more information, see the Diagnose the error log of the NGINX Ingress controller pod section of this topic.
If the issue persists, locate the cause of the issue by capturing packets in the controller pod and the backend pod. For more information, see the Capture packets section of this topic.
If the issue persists, submit a ticket.
Troubleshooting
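Category | Issue | Solution |
Issues related to connectivity | Pods in a cluster cannot access the Ingress. | Why am I unable to access the IP address of the LoadBalancer Service within the Kubernetes cluster? |
Issues related to connectivity | The Ingress controller cannot be accessed. | Why does the Ingress controller pod fail to access the Ingress controller? |
Issues related to connectivity | TCP and UDP Services cannot be accessed. | |
Issues related to HTTPS requests | The certificate is not updated or the default certificate is returned. | Why is the default TLS certificate or the previous TLS certificate used after I add a TLS certificate to the cluster or change the TLS certificate? |
Issues related to HTTPS requests | The following error is returned: SSL_ERROR_RX_RECORD_TOO_LONG | What do I do if an error "SSL_ERROR_RX_RECORD_TOO_LONG" is returned for HTTPS requests? |
Errors occur when you create an Ingress | The following error occurs when you create an Ingress: "failed calling webhook...". | What do I do if an error "failed calling webhook" occurs when I create an Ingress? |
Errors occur when you create an Ingress | An Ingress is created but fails to take effect. | |
Access fails to meet your expectations | Client IP addresses cannot be preserved. | Why does the Ingress controller pod fail to preserve client IP addresses? |
Access fails to meet your expectations | IP address whitelists do not take effect or fail to meet your expectations. | |
Access fails to meet your expectations | You fail to access gRPC Services that are exposed by an Ingress. | Why am I unable to access gRPC Services that are exposed by an Ingress? |
Access fails to meet your expectations | Canary release rules do not take effect. | Why do canary release rules fail to take effect? |
Access fails to meet your expectations | Canary release rules are invalid or other traffic is distributed to backend pods that are associated with the canary Ingress. | What do I do if traffic is not distributed based on canary release rules or traffic from other Ingresses is routed to the canary Service? |
Access fails to meet your expectations | The following error occurs: | |
Access fails to meet your expectations | A 502, 503, 413, or 499 status code is returned. | What do I do if a common HTTP status code is returned? |
Access fails to meet your expectations | Some pages cannot be displayed. | The |
Access fails to meet your expectations | The net::ERR_HTTP2_SERVER_REFUSED_STREAM error occurs. | What do I do if an error "net::ERR_HTTP2_SERVER_REFUSED_STREAM" occurs? |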
Commonly used diagnostics methods
Use the Ingress diagnostics feature
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, go to the Diagnostics page.
On the Diagnostics page, click Ingress diagnosis.
On the Ingress diagnosis page, click Diagnosis. In the Select Ingress panel, enter the URL that cannot be accessed, such as https://www.example.com. Select I know and agree and then click Create diagnosis.
After the diagnosis is complete, you can view the diagnostic results and try to fix the issue based on them.
Diagnose the access log of the NGINX Ingress controller pod in Simple Log Service
You can check the access log format of the NGINX Ingress controller in the nginx-configuration ConfigMap in the kube-system namespace.
The following sample code shows the default format of the access log of the NGINX Ingress controller:
$remote_addr - [$remote_addr] - $remote_user [$time_local]
"$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_length
$request_time [$proxy_upstream_name] $upstream_addr $upstream_response_length
$upstream_response_time $upstream_status $req_id $host [$proxy_alternative_upstream_name]
After you modify the log format, you must modify the log collection rules of Simple Log Service accordingly. Otherwise, the access log of the NGINX Ingress controller cannot be collected to Simple Log Service. Exercise caution when you modify the log format.
The following figure shows the page on which the access log of the NGINX Ingress controller is displayed in the Simple Log Service console. For more information, see the Step 4: View log data section of the "Collect log data from containers by using Simple Log Service" topic.
The following table describes the log fields that are displayed in the Simple Log Service console. Some log fields that are displayed in the console are different from the actual log fields.
Field | Description |
remote_addr | The IP address of the client. |
request | The details of the request, which include the request method, URL, and HTTP version. |
request_time | The amount of time that is consumed to process the request. The time is measured from when the first byte of the client request is received to when the last byte of the response is sent. The value of this field varies based on the network conditions of the client and therefore does not reflect the actual request processing capability. |
upstream_addr | The IP address of the upstream server. If no upstream server receives the request, the returned value is empty. If the request is sent to multiple upstream servers due to server failures, multiple IP addresses that are separated by commas (,) are returned. |
upstream_status | The HTTP status code in the response from the upstream server. If the HTTP status code indicates a successful request, the upstream server can be accessed. If a 502 status code is returned, no upstream server can be accessed. Multiple status codes are separated by commas (,). |
upstream_response_time | The response time of the upstream server. Unit: seconds. |
proxy_upstream_name | The name of the upstream server. The name is in the following format: <namespace>-<service name>-<port>. |
proxy_alternative_upstream_name | The name of the alternative upstream server. If the request is forwarded to an alternative upstream server, for example, when you implement a canary release, the name of the alternative upstream server is returned. |
By default, you can run the following command to query the recent access log of the NGINX Ingress controller:
kubectl logs <controller pod name> -n <namespace> | less
Expected output:
42.11.**.** - [42.11.**.**] - - [25/Nov/2021:11:40:30 +0800] "GET / HTTP/1.1" 200 615 "-" "curl/7.64.1" 76 0.001 [default-nginx-svc-80] 172.16.254.208:80 615 0.000 200 46b79dkahflhakjhdhfkah**** 47.11.**.** []
42.11.**.** - [42.11.**.**] - - [25/Nov/2021:11:40:31 +0800] "GET / HTTP/1.1" 200 615 "-" "curl/7.64.1" 76 0.001 [default-nginx-svc-80] 172.16.254.208:80 615 0.000 200 fadgrerthflhakjhdhfkah**** 47.11.**.** []
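To get a quick overview of what the controller is returning, the access log can be summarized by status code. The following is a minimal sketch that assumes the default log format shown above, in which $status is the first token after the quoted "$request" field. The function name count_by_status is hypothetical, and the sketch must be adapted if you have modified the log format.

```shell
# Summarize HTTP status codes from NGINX Ingress access log lines on stdin.
# Assumption: default log format, so the text after the closing quote of
# "$request" starts with $status.
count_by_status() {
  awk -F'"' '
    NF >= 3 {
      split($3, part, " ")                  # $3 looks like: " 200 615 "
      # Skip lines that are not access log entries (for example, controller logs).
      if (part[1] ~ /^[1-5][0-9][0-9]$/) count[part[1]]++
    }
    END { for (code in count) print code, count[code] }
  ' | sort
}

# Usage (pod name and namespace are placeholders):
#   kubectl logs <controller pod name> -n <namespace> | count_by_status
```

Each output line contains a status code followed by the number of requests that returned it.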
Diagnose the error log of the NGINX Ingress controller pod
You can diagnose the error log of the NGINX Ingress controller pod to narrow down the scope of troubleshooting. The error log of the Ingress controller pod includes the following types:
The log that records errors of the Ingress controller. Typically, this type of error log is generated due to invalid Ingress configurations. You can run the following command to query this type of error log:
kubectl logs <controller pod name> -n <namespace> | grep -E '^[WE]'
Note: During the initialization of an Ingress controller, a few warning events may be generated. For example, if you do not specify the kubeconfig file or IngressClass resource, warning events are generated. These events do not affect the Ingress controller and can be ignored.
The log that records errors of the NGINX application. Typically, this type of error log is generated due to failed requests. You can run the following command to query this type of error log:
kubectl logs <controller pod name> -n <namespace> | grep error
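When the controller log is long, counting the warning and error entries first can show whether the controller itself is reporting problems. The following sketch assumes that controller log lines begin with a severity letter followed by the date digits, as in the W and E lines matched by the preceding grep command. The function name count_controller_log_levels is hypothetical.

```shell
# Count warning (W) and error (E) entries in controller log lines on stdin.
# Assumption: controller log lines start with a severity letter followed by
# digits (for example "W1125 ..." or "E1125 ...").
count_controller_log_levels() {
  awk '
    /^W[0-9]/ { w++ }
    /^E[0-9]/ { e++ }
    END { printf "warnings=%d errors=%d\n", w, e }
  '
}

# Usage (pod name and namespace are placeholders):
#   kubectl logs <controller pod name> -n <namespace> | count_controller_log_levels
```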
Manually access the Ingress and backend pod by using the Ingress controller pod
Run the following command to log on to the Ingress controller pod:
kubectl exec <controller pod name> -n <namespace> -it -- bash
The Ingress controller pod is preinstalled with curl and OpenSSL, which allow you to test network connectivity and verify certificates.
Run the following command to test the network connectivity between the Ingress and the backend pod:
# Replace your.domain.com with the actual domain name of the Ingress.
curl -H "Host: your.domain.com" http://127.0.0.1/ # for http
curl --resolve your.domain.com:443:127.0.0.1 https://your.domain.com/ # for https
Run the following command to verify the certificate:
openssl s_client -servername your.domain.com -connect 127.0.0.1:443
Test access to the backend pod.
Note: The Ingress controller connects directly to the IP addresses of the backend pods instead of going through the ClusterIP Service.
Run the following kubectl command to query the IP address of the backend pod:
kubectl get pod -n <namespace> <pod name> -o wide
Expected output:
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE                       NOMINATED NODE   READINESS GATES
nginx-dp-7f5fcc7f-****   1/1     Running   0          23h   10.71.0.146   cn-beijing.192.168.**.**   <none>           <none>
The output shows that the IP address of the backend pod is 10.71.0.146.
To test the network connectivity between the Ingress controller pod and the backend pod, run the following command to connect to the IP address by using the Ingress controller pod:
curl http://<your pod ip>:<port>/path
Run commands to troubleshoot the NGINX Ingress controller
kubectl-plugin
The Ingress controller for Kubernetes originally used NGINX. The Ingress controller of version 0.25.0 and later uses OpenResty. The Ingress controller automatically listens to the changes in the configurations of the Ingresses on the API server, generates the corresponding NGINX configurations, and then reloads the configurations to make them take effect. For more information, see Documentation for the NGINX Ingress controller.
As the number of Ingresses increases, all configurations are aggregated in a single configuration file, which becomes excessively large and difficult to debug. In particular, the Ingress controller of version 0.14.0 and later uses lua-resty-balancer to dynamically generate the configurations of the upstream servers, which makes the configuration file even more difficult to debug. Therefore, the community provides a kubectl plug-in named ingress-nginx to simplify the debugging of the configurations of NGINX Ingresses. For more information, see The ingress-nginx kubectl plugin.
Run the following command to obtain the backends that are known to the NGINX Ingress controller:
kubectl ingress-nginx backends -n ingress-nginx
dbg commands
In addition to using the ingress-nginx kubectl plugin, you can run dbg commands to view and diagnose related information.
Run the following command to access the NGINX Ingress controller pod:
kubectl exec -it -n kube-system <nginx-ingress-pod-name> -- bash
Run the /dbg command. The following output is returned.
nginx-ingress-controller-69f46d8b7-qmt25:/$ /dbg
dbg is a tool for quickly inspecting the state of the nginx instance

Usage:
  dbg [command]

Available Commands:
  backends    Inspect the dynamically-loaded backends information
  certs       Inspect dynamic SSL certificates
  completion  Generate the autocompletion script for the specified shell
  conf        Dump the contents of /etc/nginx/nginx.conf
  general     Output the general dynamic lua state
  help        Help about any command

Flags:
  -h, --help              help for dbg
      --status-port int   Port to use for the lua HTTP endpoint configuration. (default 10246)

Use "dbg [command] --help" for more information about a command.
Run the following command to check whether the certificate associated with a domain name exists:
/dbg certs get <hostname>
Run the following command to view the information about all backends:
/dbg backends all
Check the status of the NGINX Ingress controller
NGINX contains a self-check module that can generate operation statistics. In the NGINX Ingress controller pod, you can run the curl command against port 10246 of the NGINX Ingress controller to view the statistics on its requests and connections.
Run the following command to access the NGINX Ingress controller pod:
kubectl exec -it -n kube-system <nginx-ingress-pod-name> -- bash
Run the following command to view the statistics on the requests and connections of the NGINX Ingress controller:
nginx-ingress-controller-79c5b4d87f-xxx:/etc/nginx$ curl localhost:10246/nginx_status
Active connections: 12818
server accepts handled requests
 22717127 22717127 823821421
Reading: 0 Writing: 382 Waiting: 12483
The preceding output shows that since the NGINX Ingress controller started, it has accepted 22,717,127 connections and handled all of them, which means that no connection was dropped. Over these 22,717,127 connections, 823,821,421 requests were handled, which is about 36.3 requests per connection on average.
Active connections: the total number of active connections of the server on which the NGINX Ingress controller is deployed. In the preceding output, the number of active connections is 12,818.
Reading: the number of connections during which the server on which the NGINX Ingress controller is deployed is reading the request headers. In the preceding output, the number of such connections is 0.
Writing: the number of connections during which the server on which the NGINX Ingress controller is deployed is sending responses. In the preceding output, the number of such connections is 382.
Waiting: the number of connections in the keep-alive state. In the preceding output, the number of such connections is 12,483.
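The ratios discussed above can be derived directly from the status output. The following is a minimal sketch that reads the nginx_status text on stdin and prints the average number of requests per handled connection and the number of accepted connections that were not handled. The function name nginx_status_summary is hypothetical, and the sketch assumes the output layout shown above.

```shell
# Summarize nginx_status output read on stdin.
# Assumption: the counters line is the only line that consists of exactly
# three integers (accepts, handled, requests).
nginx_status_summary() {
  awk '
    NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
      accepts = $1; handled = $2; requests = $3
    }
    END {
      if (handled > 0) {
        printf "requests_per_connection=%.2f\n", requests / handled
        # A nonzero value means that some accepted connections were dropped.
        printf "dropped_connections=%d\n", accepts - handled
      }
    }
  '
}

# Usage inside the controller pod:
#   curl -s localhost:10246/nginx_status | nginx_status_summary
```

A dropped_connections value greater than 0 is worth investigating, because accepted connections are normally all handled unless a resource limit is reached.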
Capture packets
If you cannot identify the issue, capture and diagnose packets.
Check whether the issue is related to the Ingress controller pod or the application pod. If no sufficient information is available, capture packets for both the Ingress controller pod and the application pod.
Log on to the nodes on which the application pod and Ingress controller pod run.
Run the following command on the Elastic Compute Service (ECS) instances to capture all recent packets that are received by the Ingress:
tcpdump -i any host <IP address of the application pod or Ingress controller pod> -C 20 -W 200 -w /tmp/ingress.pcap
If an error is identified in the log data, stop capturing packets.
Diagnose the packets that are transferred during the time period in which the error occurred.
Note: Packet capture does not affect your service and only causes a slight increase in CPU utilization and disk I/O.
The preceding command rotates the capture files and generates up to 200 .pcap files, each of which is about 20 MB in size. After the limit is reached, the oldest files are overwritten.
Why am I unable to access the IP address of the LoadBalancer Service within the Kubernetes cluster?
Issue
In a Kubernetes cluster, only specific nodes can access the IP address of the LoadBalancer Service if the externalTrafficPolicy parameter is set to Local for the LoadBalancer Service.
Cause
externalTrafficPolicy: Local is set for the LoadBalancer Service. In Local mode, the IP address of the LoadBalancer Service is accessible only from nodes that host a backend pod of the Service and from pods on those nodes. The IP address is inaccessible from pods on other nodes in the cluster. The IP address of the LoadBalancer Service is intended for access from outside the Kubernetes cluster. When a node or pod in the ACK cluster accesses this IP address, the request is not sent through the load balancer. Instead, the IP address is treated as an extended IP address of the Service, and kube-proxy forwards the request based on iptables or IP Virtual Server (IPVS) rules on the local node.
In this scenario, if the requested pod is not provisioned on the local node, a connectivity issue occurs. The IP address of the LoadBalancer Service is accessible only if the requested pod is provisioned on the local node. For more information about external-lb, see kube-proxy adds the IP address of external-lb to the node local iptables rules.
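To see which nodes can reach the IP address of the LoadBalancer Service in Local mode, you can list the nodes that host the backend pods of the Service. The following sketch extracts the unique node names from kubectl get pod -o wide output. The function name nodes_hosting_pods is hypothetical, and the sketch assumes that every column before NODE contains a single token, as in the sample output earlier in this topic.

```shell
# Print the unique node names from `kubectl get pod -o wide` output on stdin.
# The NODE column is located by the first header token named NODE, because
# the later "NOMINATED NODE" header also contains that word.
nodes_hosting_pods() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) if ($i == "NODE" && !col) col = i
      next
    }
    col { print $col }
  ' | sort -u
}

# Usage (the label selector is a placeholder for your controller pods):
#   kubectl get pod -n kube-system -l <label selector> -o wide | nodes_hosting_pods
```

Pods on nodes that are not in the output are the ones that fail to access the IP address of the LoadBalancer Service.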
Solution
(Recommended) Access the Service from within the Kubernetes cluster by using its cluster IP address or its Service name. The Service of the Ingress controller is named nginx-ingress-lb and resides in the kube-system namespace.
Run the kubectl edit svc nginx-ingress-lb -n kube-system command to modify the Service. Set the externalTrafficPolicy parameter to Cluster for the LoadBalancer Service. After you change the setting, client IP addresses cannot be preserved.
If the cluster uses the Terway network plug-in and the exclusive or inclusive elastic network interface (ENI) mode is enabled for the cluster, you can set the externalTrafficPolicy parameter to Cluster for the LoadBalancer Service and add the ENI annotation. The annotation adds pods that are assigned ENIs as the backend servers of the LoadBalancer Service. This way, client IP addresses can be preserved and the IP address of the LoadBalancer Service can be accessed within the cluster.
Sample code:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/backend-type: eni # Specify pods that are assigned ENIs as backend servers.
  labels:
    app: nginx-ingress-lb
  name: nginx-ingress-lb
  namespace: kube-system
spec:
  externalTrafficPolicy: Cluster
For more information about Service annotations, see Add annotations to the YAML file of a Service to configure CLB instances.
Why does the Ingress controller pod fail to access the Ingress controller?
Issue
In a cluster that uses Flannel, when the Ingress controller pod accesses the Ingress controller based on a domain name, an IP address of an SLB instance, or a cluster IP address, some or all of the requests sent from the Ingress controller pod fail.
Cause
By default, Flannel does not allow loopback requests.
Solution
We recommend that you create a new cluster that uses the Terway network plug-in and migrate your workloads to the new cluster.
If you do not want to create a new cluster, you can enable hairpinMode in the configurations of Flannel. After you modify the configurations, recreate the Flannel pod to make the modification take effect.
Run the following command to modify the configurations of Flannel:
kubectl edit cm kube-flannel-cfg -n kube-system
In the cni-conf.json file that is returned, add "hairpinMode": true in the delegate field.
Sample code:
cni-conf.json: |
  {
    "name": "cb0",
    "cniVersion": "0.3.1",
    "type": "flannel",
    "delegate": {
      "isDefaultGateway": true,
      "hairpinMode": true
    }
  }
Run the following command to delete the previous Flannel pod and recreate the Flannel pod:
kubectl delete pod -n kube-system -l app=flannel
Why is the default TLS certificate or the previous TLS certificate used after I add a TLS certificate to the cluster or change the TLS certificate?
Issue
You added a Secret to the cluster or modified a Secret in the cluster, and updated the Secret name in the secretName field in the Ingress. When you access the cluster, the default certificate (Kubernetes Ingress Controller Fake Certificate) or the previous certificate is used.
Cause
The certificate is not returned by the Ingress controller in the cluster.
The certificate is invalid and the Ingress controller cannot load the certificate.
The certificate is returned by the Ingress controller based on the Server Name Indication (SNI) field. The SNI field may not be sent as a part of the TLS handshake.
Solution
Use one of the following methods to check whether the SNI field is sent as a part of the TLS handshake:
Update your browser to a version that supports SNI.
When you run the openssl s_client command to check whether the certificate is in use, specify the -servername parameter.
When you run curl commands, add an entry to the hosts file or use the --resolve parameter to map the domain name, instead of using the IP address and the Host request header.
Make sure that no TLS certificate is specified when you connect the website to WAF in CNAME record mode or transparent proxy mode, and that no TLS certificate is associated with the Layer 7 listener of the SLB instance. The TLS certificate must be returned by the Ingress controller in the cluster.
Go to the Container Intelligence Service console and diagnose the Ingress. In the diagnostics report, check whether the configurations of the Ingress are valid and whether the log data shows errors. For more information, see the Use the Ingress diagnostics feature section of this topic.
Run the following command to view the error log of the Ingress controller pod and troubleshoot the issue based on the log data.
kubectl logs <ingress pod name> -n <pod namespace> | grep -E '^[EW]'
Why am I unable to access gRPC Services that are exposed by an Ingress?
Issue
You cannot access gRPC Services that are exposed by an Ingress.
Cause
You do not set annotations in the Ingress to specify the backend protocol.
gRPC Services can be accessed only over Transport Layer Security (TLS).
Solution
Set the following annotation in the Ingress: nginx.ingress.kubernetes.io/backend-protocol: "GRPC".
Make sure that clients use HTTPS ports to send requests and the traffic is encrypted by using TLS.
Why am I unable to access backend HTTPS services?
Issue
You fail to access backend HTTPS services based on the Ingress.
A 400 error code may be returned and the following error message may be reported: The plain HTTP request was sent to HTTPS port.
Cause
The Ingress controller sends HTTP requests to the backend pods. This is the default setting.
Solution
Set the following annotation in the Ingress: nginx.ingress.kubernetes.io/backend-protocol: "HTTPS".
Why does the Ingress controller pod fail to preserve client IP addresses?
Issue
The Ingress controller pod fails to preserve client IP addresses. Instead of the client IP address, the node IP address, an address in the 100.XX.XX.XX range, or another address is displayed.
Cause
The externalTrafficPolicy parameter is set to Cluster for the Service that is associated with the Ingress.
A Layer 7 proxy is used by the SLB instance.
Your website is connected to WAF in CNAME record mode or transparent proxy mode.
Solution
If the externalTrafficPolicy parameter is set to Cluster for the Service and a Layer 4 SLB instance is used, perform the following steps:
Set the externalTrafficPolicy parameter to Local. However, you may fail to access the IP address of the LoadBalancer Service within the cluster. For more information, see the Why am I unable to access the IP address of the LoadBalancer Service within the Kubernetes cluster? section of this topic.
If a Layer 7 proxy is used, for example, a Layer 7 SLB instance is used or your website is connected to WAF in CNAME record mode or transparent proxy mode, perform the following steps:
Make sure that the X-Forwarded-For header is enabled for the Layer 7 proxy.
Add enable-real-ip: "true" to the ConfigMap of the Ingress controller. By default, the ConfigMap is named nginx-configuration and belongs to the kube-system namespace.
Analyze the log data to check whether client IP addresses can be preserved.
If a client request traverses multiple hops before it reaches the Ingress controller pod, for example, the request passes through a reverse proxy first, check the value of remote_addr after you set enable-real-ip to true. If the value is a client IP address, the X-Forwarded-For header is being used to pass client IP addresses to the Ingress controller pod. If the value is not a client IP address, enable the X-Forwarded-For header or use other methods to add client IP addresses to requests before the requests reach the Ingress controller pod.
Why do canary release rules fail to take effect?
Issue
You set canary release rules in a cluster but the rules do not take effect.
Cause
Possible causes:
When you add canary-* annotations, you do not set nginx.ingress.kubernetes.io/canary: "true".
The version of the NGINX Ingress controller is earlier than 0.47.0. When you add canary-* annotations, you do not specify the domain name of your application in the host field of the Ingress rules.
Solution
If the issue occurs due to the preceding reasons, set nginx.ingress.kubernetes.io/canary: "true" or specify the domain name of your application in the host field of the Ingress rules. For more information, see Use the NGINX Ingress controller to implement canary releases and blue-green releases.
If none of the preceding reasons causes the issue, see the What do I do if traffic is not distributed based on canary release rules or traffic from other Ingresses is routed to the canary Service? section of this topic.
What do I do if traffic is not distributed based on canary release rules or traffic from other Ingresses is routed to the canary Service?
Issue
You configured canary release rules but traffic is not distributed based on the canary release rules, or traffic from other Ingresses is routed to the canary Service.
Cause
Canary release rules in an NGINX Ingress controller take effect on all Ingresses that are associated with the Service for which the canary release rules are created.
For more information about this issue, see An Ingress with canary will impact all ingresses with the same Service.
Solution
Canary Ingresses include Ingresses that are assigned the service-match or canary-* annotations. Before you create a canary Ingress, create two identical Services that are used for canary releases, and then map the Services to the backend pods that you want to access. For more information, see Use the NGINX Ingress controller to implement canary releases and blue-green releases.
What do I do if an error "failed calling webhook" occurs when I create an Ingress?
Issue
The following error occurs when you create an Ingress: "Internal error occurred: failed calling webhook...".
Cause
When you create an Ingress, a webhook calls a Service to check whether the Ingress is valid. By default, the Service named ingress-nginx-controller-admission is used. If a resource in the webhook call chain is missing or faulty, for example, the Service or the Ingress controller is deleted, the Ingress cannot be created.
Solution
Check whether each resource in the webhook call chain exists and works as expected: ValidatingWebhookConfiguration > Service > Pod.
Make sure that the admission feature is enabled for the Ingress controller pod and that the pod can be accessed by requests from outside the cluster.
If the Ingress controller is deleted or you do not want to use the webhook feature, you can delete the ValidatingWebhookConfiguration resource.
What do I do if an error "SSL_ERROR_RX_RECORD_TOO_LONG" is returned for HTTPS requests?
Issue
One of the following errors is returned for HTTPS requests: SSL_ERROR_RX_RECORD_TOO_LONG or routines:CONNECT_CR_SRVR_HELLO:wrong version number.
Cause
HTTPS requests are distributed to a non-HTTPS port, such as an HTTP port.
Common causes:
Port 443 of the SLB instance is mapped to port 80 of the Ingress controller pod.
Port 443 of the Service that is associated with the Ingress controller pod is mapped to port 80 of the Ingress controller pod.
Solution
Modify the configurations of the SLB instance or Service to ensure that HTTPS requests can be distributed to the proper port.
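To check the Service side of this mapping, you can list each Service port together with its target port and look for port 443 pointing at port 80. The following sketch reads "port targetPort" pairs on stdin and reports the faulty pairs. The function name find_https_to_http_mappings is hypothetical. The sketch only detects numeric target ports, so a target port that is referenced by name must be checked manually.

```shell
# Report Service port mappings that send HTTPS port 443 to plain-HTTP port 80.
# Input: one "port targetPort" pair per line on stdin.
find_https_to_http_mappings() {
  awk '$1 == 443 && $2 == 80 { print "port 443 -> targetPort 80" }'
}

# Usage (the Service name and namespace follow the defaults named in this topic):
#   kubectl get svc nginx-ingress-lb -n kube-system \
#     -o jsonpath='{range .spec.ports[*]}{.port}{" "}{.targetPort}{"\n"}{end}' \
#     | find_https_to_http_mappings
```

If the function prints nothing, the Service mapping is not the cause, and the listener configuration of the SLB instance should be checked next.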
What do I do if a common HTTP status code is returned?
Issue
HTTP status codes other than 2xx and 3xx are returned, such as 502, 503, 413, and 499.
Cause and solution
View the log and check whether the error is returned by the Ingress controller. For more information, see the Diagnose the access log of the NGINX Ingress controller pod in Simple Log Service section of this topic. If the error is returned by the Ingress controller, use the following solutions:
413 error
Cause: The request size exceeds the upper limit.
Solution: Increase the value of proxy-body-size in the ConfigMap of the Ingress controller. The default value of proxy-body-size is 1 MB for the open source NGINX Ingress controller and 20 MB for the NGINX Ingress controller provided by ACK.
499 error
Cause: The client terminates the connection in advance. The error may not be caused by the Ingress controller or backend services.
Solutions:
If the 499 error does not occur frequently and your workloads are not affected, you can ignore the error.
If the 499 error occurs frequently, check whether the amount of time that the backend pods take to process requests exceeds the request timeout period that is set on the client.
502 error
Cause: The Ingress controller cannot connect to backend pods.
Solutions:
The issue occurs occasionally:
Check whether the backend pods work as expected. If the backend pods are overloaded, add more backend pods.
By default, the Ingress controller sends HTTP 1.1 requests to backend services through HTTP persistent connections. Make sure that the timeout period of idle persistent connections configured for the backend pods is greater than that configured for the Ingress controller. By default, the timeout period configured for the Ingress controller is 900 seconds.
The issue occurs every time:
Check whether the Service port is valid and whether the Service can be accessed from the Ingress controller pod.
If the issue persists, capture and analyze packets, and then submit a ticket.
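The idle timeout requirement for occasional 502 errors comes down to a single comparison. If a backend pod closes an idle persistent connection before the Ingress controller does, the controller may reuse a connection that no longer exists. The following Python sketch is illustrative only: the function name and the sample backend values are assumptions, and 900 seconds is the default value mentioned above.

```python
def keepalive_is_safe(backend_idle_timeout_s: float,
                      controller_idle_timeout_s: float = 900.0) -> bool:
    """Return True if the backend keeps idle connections open longer than the controller."""
    # The backend must outlive the controller's idle timeout, so the
    # comparison is strict: equal values still leave a race window.
    return backend_idle_timeout_s > controller_idle_timeout_s


# Hypothetical backend settings, in seconds.
print(keepalive_is_safe(60))    # False
print(keepalive_is_safe(1000))  # True
```

If the check returns False, either increase the idle timeout of the backend application or decrease the idle timeout of the Ingress controller.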
503 error
Cause: The Ingress controller cannot discover the backend pods, or the Ingress controller fails to access all backend pods.
Solutions:
The issue occurs occasionally:
Refer to the solution for resolving the 502 error.
Check the status of the backend pods and configure health checks.
The issue occurs every time:
Check whether the Service configuration is valid and whether the endpoint exists.
If you cannot locate the cause by using the preceding methods, submit a ticket.
What do I do if an error "net::ERR_HTTP2_SERVER_REFUSED_STREAM" occurs?
Issue
When you access the website, some resources cannot be loaded and one of the following errors is reported in the console: net::ERR_HTTP2_SERVER_REFUSED_STREAM and net::ERR_FAILED.
Cause
The number of concurrent HTTP/2 streams to the resource has reached the upper limit.
Solution
We recommend that you change the value of http2-max-concurrent-streams in the ConfigMap to a greater value. The default value is 128. For more information, see http2-max-concurrent-streams.
Disable HTTP/2 by setting use-http2 to false in the ConfigMap. For more information, see use-http2.
What do I do if an error "The param of ServerGroupName is illegal" occurs?
Cause
ServerGroupName is generated in the following format: namespace+svcName+port. The server group name must be 2 to 128 characters in length and can contain letters, digits, periods (.), underscores (_), and hyphens (-). The name must start with a letter.
Solution
Modify the server group name in the required format.
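The naming rules above can be expressed as one regular expression. The following Python sketch is a minimal check under the assumption that "letters" means ASCII letters. The function name and the sample names are hypothetical, and the sketch does not reproduce how the name is assembled from the namespace, Service name, and port.

```python
import re

# 2 to 128 characters: one leading letter followed by 1 to 127 letters,
# digits, periods, underscores, or hyphens (ASCII letters are an assumption).
_SERVER_GROUP_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._-]{1,127}")


def is_valid_server_group_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rules."""
    return _SERVER_GROUP_NAME.fullmatch(name) is not None


# Hypothetical names for illustration.
print(is_valid_server_group_name("default-web-80"))   # True
print(is_valid_server_group_name("1default-web-80"))  # False
```

A name that fails the check usually points to a namespace or Service name that starts with a digit or makes the generated name too long.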
What do I do if an error "certificate signed by unknown authority" occurs when I create an Ingress?
Cause
If multiple Ingresses are deployed in the cluster and the Ingresses use the same resources, such as Secrets, Services, or webhook configurations, the preceding error occurs because different SSL certificates are used to communicate with backend servers when webhooks are triggered.
Solution
Deploy two Ingresses and make sure that the Ingresses use different resources. For more information about the resources used by Ingresses, see the What are the system updates after I update the NGINX Ingress controller on the Add-ons page of the ACK console? section of the "Nginx Ingress FAQ" topic.
Why does the Ingress controller pod restart after it fails a health check?
Issue
The Ingress controller pod restarts after it fails a health check.
Cause
The Ingress controller pod or the node on which the pod is deployed is overloaded. As a result, the pod failed to pass the health check.
Kernel parameters such as tcp_tw_reuse or tcp_tw_timestamps may be configured for the cluster node on which the Ingress controller pod is deployed. This may cause health check failures.
Solution
Add more Ingress controller pods and check whether the issue persists. For more information, see Deploy a high-reliability Ingress controller.
Disable tcp_tw_reuse or set the value of the parameter to 2, disable tcp_tw_timestamps, and then check whether the issue persists.
How do I add Services that use TCP or UDP?
Add specific entries to the tcp-services and udp-services ConfigMaps. By default, the ConfigMaps belong to the kube-system namespace.
The following sample code shows how to map port 8080 of example-go in the default namespace to port 9000:
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx  # Use the namespace in which the ConfigMap resides.
data:
  9000: "default/example-go:8080"  # Map port 8080 to port 9000.
Add port 9000 to the Deployment of the NGINX Ingress controller. By default, the Deployment is named nginx-ingress-controller and belongs to the kube-system namespace.
Add port 9000 to the Service that is associated with the Ingress.
For more information about how to add Services that use TCP or UDP, see Exposing TCP or UDP services.
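Each entry in the tcp-services ConfigMap pairs an exposed port (the key) with a value in the form namespace/service:port. The following Python sketch splits an entry into its parts so that a malformed value is caught before it is applied. The type and function names are hypothetical, and only the basic form shown in the sample above is handled.

```python
from typing import NamedTuple


class TcpServiceEntry(NamedTuple):
    exposed_port: int   # Port opened on the Ingress controller.
    namespace: str      # Namespace of the backend Service.
    service: str        # Name of the backend Service.
    service_port: str   # Port of the backend Service (number or name).


def parse_tcp_service_entry(exposed_port: str, value: str) -> TcpServiceEntry:
    """Parse one ConfigMap entry of the form '<namespace>/<service>:<port>'."""
    target, colon, service_port = value.partition(":")
    namespace, slash, service = target.partition("/")
    well_formed = colon and slash and namespace and service and service_port
    if not well_formed or ":" in service_port:
        raise ValueError(f"expected '<namespace>/<service>:<port>', got {value!r}")
    return TcpServiceEntry(int(exposed_port), namespace, service, service_port)


entry = parse_tcp_service_entry("9000", "default/example-go:8080")
print(entry.exposed_port, entry.namespace, entry.service, entry.service_port)
```

For the sample entry, the parts are port 9000, namespace default, Service example-go, and Service port 8080.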
Why do Ingress rules fail to take effect?
Issue
After you add or modify Ingress rules, the rules do not take effect.
Cause
The configuration of the Ingress contains errors. As a result, the Ingress controller fails to load the Ingress rules.
The configurations of Ingress resources contain errors.
The Ingress controller does not have the required permissions. As a result, the Ingress controller cannot monitor the changes made to Ingress resources.
The previous Ingress uses a domain name specified in the server-alias field. The domain name is in conflict with that of the new Ingress. As a result, the Ingress rules are ignored.
Solution
Go to the Container Intelligence Service console, diagnose the Ingress, and resolve the issue as prompted. For more information, see the Use the Ingress diagnostics feature section of this topic.
Check whether the configuration of the previous Ingress contains errors or whether configuration conflicts exist:
If rewrite-target is not used and the paths are specified in regular expressions, make sure that the nginx.ingress.kubernetes.io/use-regex: "true" annotation is added.
Check whether PathType is set to an expected value. By default, ImplementationSpecific has the same effect as Prefix.
Make sure that the ClusterRole, ClusterRoleBinding, Role, RoleBinding, and ServiceAccount that are associated with the Ingress controller exist. The default names are ingress-nginx.
Connect to the Ingress controller pod and view the rules that are added in the nginx.conf file.
Run the following command to view the pod log and locate the causes:
kubectl logs <ingress pod name> -n <pod namespace> | grep -E '^[EW]'
Why does the system fail to load some web page resources or return a blank white screen when requests are redirected to the root directory?
Issue
After you set the rewrite-target annotation in the Ingress to rewrite requests, some web page resources cannot be loaded or a blank white screen is displayed when you access the backend service.
Cause
You do not set rewrite-target in regular expressions.
The path of the requested resource is set to the root directory.
Solution
Check whether rewrite-target is set in regular expressions and whether capture groups are used. For more information, see Rewrite.
Check whether requests are redirected to the expected path.
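The effect of capture groups on rewrite-target can be reproduced outside the cluster. The following Python sketch only approximates the rewrite with Python regular expressions and is not how the Ingress controller is implemented. The function name, the /app path pattern, and the request paths are hypothetical.

```python
import re
from typing import Optional


def simulate_rewrite(path_regex: str, rewrite_target: str,
                     request_path: str) -> Optional[str]:
    """Approximate the path that the backend receives after the rewrite."""
    match = re.match(path_regex, request_path)
    if match is None:
        return None  # The request does not match this Ingress path.
    # NGINX refers to capture groups as $1, $2, and so on.
    # Convert them to the \g<N> form that Python templates use.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", rewrite_target)
    return match.expand(template)


path = r"/app(/|$)(.*)"
# Without a capture group, every request is rewritten to the root directory.
print(simulate_rewrite(path, "/", "/app/static/main.css"))    # /
# With a capture group, the rest of the path is preserved.
print(simulate_rewrite(path, "/$2", "/app/static/main.css"))  # /static/main.css
```

In the first call, the request for a style sheet reaches the backend as a request for the root directory, which is why static resources fail to load or a blank screen is displayed.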
How do I fix the issue that Simple Log Service cannot parse logs as expected after ingress-nginx-controller is updated?
Issue
Versions 0.20 and 0.30 of the ingress-nginx-controller component are commonly used. After you update ingress-nginx-controller from 0.20 to 0.30 on the Add-ons page of the ACK console, the Ingress dashboard does not show the actual statistics of requests to the backend servers when you perform canary releases or blue-green releases with an Ingress.
Cause
The default log format of ingress-nginx-controller 0.20 is different from that of ingress-nginx-controller 0.30. Therefore, after you update ingress-nginx-controller from 0.20 to 0.30, the Ingress dashboard may not show the actual statistics of requests to the backend servers when you perform canary releases or blue-green releases with an Ingress.
Solution
To fix the issue, perform the following steps to update the nginx-configuration ConfigMap and the configuration of k8s-nginx-ingress.
Update the nginx-configuration ConfigMap.
If you have not modified the nginx-configuration ConfigMap, copy the following content to a file named nginx-configuration.yaml and run the kubectl apply -f nginx-configuration.yaml command to deploy the file.
apiVersion: v1
kind: ConfigMap
data:
  allow-backend-server-header: "true"
  enable-underscores-in-headers: "true"
  generate-request-id: "true"
  ignore-invalid-headers: "true"
  log-format-upstream: $remote_addr - [$remote_addr] - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_length $request_time [$proxy_upstream_name] $upstream_addr $upstream_response_length $upstream_response_time $upstream_status $req_id $host [$proxy_alternative_upstream_name]
  max-worker-connections: "65536"
  proxy-body-size: 20m
  proxy-connect-timeout: "10"
  reuse-port: "true"
  server-tokens: "false"
  ssl-redirect: "false"
  upstream-keepalive-timeout: "900"
  worker-cpu-affinity: auto
metadata:
  labels:
    app: ingress-nginx
  name: nginx-configuration
  namespace: kube-system
If you have modified the nginx-configuration ConfigMap, run the following command to update the configuration. This ensures that your previous modifications are not overwritten.
kubectl edit configmap nginx-configuration -n kube-system
Append [$proxy_alternative_upstream_name] to the log-format-upstream field, save the changes, and then exit.
Modify the k8s-nginx-ingress configuration.
Copy the following content to a file named k8s-nginx-ingress.yaml and run the kubectl apply -f k8s-nginx-ingress.yaml command to deploy the ingress-nginx-controller component.
What do I do if common NGINX Ingress controller errors occur?
Issue
After you diagnose the access log based on the steps described in the Diagnose the access log of the NGINX Ingress controller pod section of this topic, you find controller errors. Example:
User "system:serviceaccount:kube-system:ingress-nginx" cannot list/get/update resource "xxx" in API group "xxx" at the cluster scope/ in the namespace "kube-system"
Cause
The NGINX Ingress controller does not have the permissions to update resources.
Solution
Check whether the issue is caused by the ClusterRole or Role based on the log data.
If the log data contains at the cluster scope, the issue is caused by the ClusterRole (ingress-nginx).
If the log data contains in the namespace "kube-system", the issue is caused by the Role (kube-system/ingress-nginx).
Make sure that the required permissions are granted and the role bindings are correct.
If the issue is caused by the ClusterRole:
Make sure that the ClusterRole ingress-nginx and ClusterRoleBinding ingress-nginx exist. If the ClusterRole and ClusterRoleBinding do not exist, you can manually create them, re-install the relevant component, or submit a ticket to request technical support.
Make sure that the ClusterRole ingress-nginx can provide the permissions indicated in the log data. In the following figure, the LIST permission on networking.k8s.io/ingresses is required. If the ClusterRole cannot provide the corresponding permissions, add the permissions.
If the issue is caused by the Role:
Make sure that the Role kube-system/ingress-nginx and RoleBinding kube-system/ingress-nginx exist. If the Role and RoleBinding do not exist, you can manually create them, re-install the relevant component, or submit a ticket to request technical support.
Make sure that the Role ingress-nginx can provide the permissions indicated in the log data. In the following figure, the UPDATE permission on the ingress-controller-leader-nginx ConfigMap is required. If the Role does not have the corresponding permissions, grant the permissions to the Role.
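The distinction between ClusterRole and Role issues depends only on the scope phrase at the end of the log message. The following Python sketch shows one way to classify such messages. The function name is hypothetical, and the sample messages reuse the placeholder form of the example log above.

```python
import re
from typing import Optional, Tuple

_NAMESPACE_SCOPE = re.compile(r'in the namespace "([^"]+)"')


def rbac_object_to_check(log_line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (kind, namespace) of the RBAC object to inspect, or None."""
    if "at the cluster scope" in log_line:
        return ("ClusterRole", None)
    match = _NAMESPACE_SCOPE.search(log_line)
    if match:
        return ("Role", match.group(1))
    return None  # The line does not contain a recognizable scope.


user = 'User "system:serviceaccount:kube-system:ingress-nginx"'
cluster_msg = user + ' cannot list resource "xxx" in API group "xxx" at the cluster scope'
namespace_msg = user + ' cannot update resource "xxx" in API group "xxx" in the namespace "kube-system"'

print(rbac_object_to_check(cluster_msg))    # ('ClusterRole', None)
print(rbac_object_to_check(namespace_msg))  # ('Role', 'kube-system')
```

The verb and resource in the same message indicate which permission is missing from the ClusterRole or Role.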
Issue
After you diagnose the access log based on the steps described in the Diagnose the access log of the NGINX Ingress controller pod section of this topic, you find controller errors. Example:
requeuing……nginx: configuration file xxx test failed (multiple lines)
Cause
The system failed to reload the NGINX configurations due to syntax errors in Ingress rules or the ConfigMap.
Solution
Check the error messages in the log and locate the syntax error. You can ignore warning messages. If the error message cannot help you locate the syntax error, you can find the corresponding file and code line in the pod based on the error message. For example, find the 449th line in the /tmp/nginx/nginx-cfg2825306115 file. The following figure shows the example.
Run the following command to check for syntax errors:
# Access the pod and run the command.
kubectl exec -n <namespace> <controller pod name> -it -- bash
# Print the file that contains syntax errors with line numbers. Then, check for syntax errors in the corresponding lines.
cat -n /tmp/nginx/nginx-cfg2825306115
Find and fix the syntax errors.
Issue
After you diagnose the access log based on the steps described in the Diagnose the access log of the NGINX Ingress controller pod section of this topic, you find controller errors. Example:
Unexpected error validating SSL certificate "xxx" for server "xxx"
Cause
A certificate configuration error occurs because the domain names associated with the certificate are different from the domain names specified in the Ingress. You can continue to use your certificate as expected when some certificate errors at the warning level occur. For example, you can continue to use your certificate even if the system prompts that the certificate does not have the SAN attribute.
Solution
Make sure that the certificate meets the following requirements:
The format and content of the .cert file and .key file are valid.
The domain names associated with the certificate are the same as those specified in the Ingress.
The certificate has not expired.
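The domain name requirement can be checked by comparing the names in the certificate with the hosts in the Ingress. The following Python sketch uses a simplified wildcard rule in which a wildcard covers exactly one leftmost label. The function names and domain names are hypothetical, and the check performed by the Ingress controller may differ in detail.

```python
from typing import Iterable, List


def name_matches(cert_name: str, host: str) -> bool:
    """Return True if a certificate name covers the given Ingress host."""
    cert_name, host = cert_name.lower(), host.lower()
    if cert_name.startswith("*."):
        # A wildcard covers one label only: *.example.com matches
        # www.example.com but not a.b.example.com or example.com.
        label, dot, rest = host.partition(".")
        return bool(label) and bool(dot) and rest == cert_name[2:]
    return cert_name == host


def uncovered_hosts(cert_names: Iterable[str],
                    ingress_hosts: Iterable[str]) -> List[str]:
    """Return the Ingress hosts that no certificate name covers."""
    names = list(cert_names)
    return [h for h in ingress_hosts
            if not any(name_matches(n, h) for n in names)]


missing = uncovered_hosts(
    ["*.example.com", "example.com"],
    ["www.example.com", "a.b.example.com", "example.com"],
)
print(missing)  # ['a.b.example.com']
```

Any host in the returned list needs a certificate that includes the corresponding domain name.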