This topic describes the Domain Name System (DNS) resolution policies and caching policies that are supported by Container Service for Kubernetes (ACK) clusters.
DNS resolution pipelines
This section shows the DNS resolution pipeline in the following scenarios:
For information about the terms in the following figures, such as timeout and attempts, see the "Resolution policies" and "Caching policies" sections in this topic.
Non-containerized applications run on Elastic Compute Service (ECS) instances.
For example, an application named App runs on an ECS instance, as shown in the following figure.
Containerized applications run in pods in Kubernetes clusters. The pods use the ClusterFirst DNS policy.
For example, an application named App runs in a pod in a Kubernetes cluster, as shown in the following figure.
Containerized applications run in pods in Kubernetes clusters. The DNS policy of the pods specifies that NodeLocal DNSCache is used for DNS resolution.
For example, an application named App runs in a pod in a Kubernetes cluster that has NodeLocal DNSCache installed, as shown in the following figure.
Resolution policies
Client side
In most cases, DNS queries of pods are handled by using the interfaces provided by glibc. The following table describes the parameters that you can configure in the /etc/resolv.conf file and the default values in different environments. These parameters specify how glibc performs DNS resolutions.
Parameter | Description | Default value in glibc | Default value on ECS instances | Default value for pods that use the ClusterFirst DNS policy | Default value for pods that use the Default DNS policy | Default value for pods that use NodeLocal DNSCache for DNS resolution | Default value for pods that use the host network and the Default DNS policy |
nameserver | The DNS server that resolves domain names. | None | DNS servers deployed in the virtual private cloud (VPC)② | The cluster IP address of CoreDNS③ | DNS servers deployed in the VPC | The IP address of NodeLocal DNSCache④ | DNS servers deployed in the VPC |
search | The search domains. Domain names other than fully qualified domain names (FQDNs) are appended with the suffixes that are specified by this parameter. | None | None | <ns>.svc.cluster.local svc.cluster.local cluster.local | None | <ns>.svc.cluster.local svc.cluster.local cluster.local | None |
ndots | If the number of dots in the domain name string is greater than or equal to the value of the ndots parameter, the domain name is treated as an FQDN and is resolved as is first. If the number of dots in the domain name string is less than the value of the ndots parameter, the domain name is appended with the suffixes that are specified by the search parameter before the domain name is resolved. | 1 | 1 | 5 | 1 | 3 | 1 |
timeout | The timeout period of each DNS resolution. Unit: seconds. | 5 | 2 | 5 | 5 | 1 | 2 |
attempts① | The maximum number of times that a DNS query is sent before the resolver gives up. | 2 | 3 | 2 | 2 | 2 | 3 |
rotate | Sends DNS queries to the DNS servers in a round-robin manner. | Disabled | Enabled | Disabled | Disabled | Disabled | Enabled |
single-request-reopen | If two DNS requests are sent by using the same socket, the client closes the socket after it sends the first request and opens a new socket to send the second request. | Disabled | Enabled | Disabled | Disabled | Disabled | Enabled |
①The attempts parameter takes effect only in specific scenarios, for example, when SERVFAIL, NOTIMP, or REFUSED is returned or when NOERROR is returned without resolution results. For more information, see Introduction to the attempts parameter.
②DNS servers deployed in the VPC are the default DNS servers for ECS instances in the VPC. The IP addresses of the DNS servers are 100.100.2.136 and 100.100.2.138. The DNS servers resolve authoritative domain names and domain names that are added to Alibaba Cloud DNS PrivateZone.
③The cluster IP address of CoreDNS is the IP address of the kube-dns Service in the kube-system namespace. The kube-dns Service forwards DNS queries to CoreDNS for internal domain names, authoritative domain names, and domain names that are added to Alibaba Cloud DNS PrivateZone.
④The IP address of NodeLocal DNSCache is 169.254.20.10. After you deploy NodeLocal DNSCache, NodeLocal DNSCache listens for DNS queries that are sent to the IP address on the node.
For more information about how to configure resolv.conf, see resolv.conf.
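The interaction between the ndots and search parameters can be illustrated with a minimal Python sketch. It assumes the behavior documented for glibc in resolv.conf(5): a name with at least ndots dots is tried as is first, and a name with fewer dots is tried with each search suffix first. The function name candidate_names and the sample namespace default are hypothetical and are used only for illustration. The sketch does not send any DNS queries.

```python
def candidate_names(name, search, ndots):
    """Return the names a glibc-style resolver would try, in order."""
    # A trailing dot marks the name as fully qualified: no suffix is appended.
    if name.endswith("."):
        return [name.rstrip(".")]
    with_suffixes = [f"{name}.{suffix}" for suffix in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as is first, then fall back to the search list.
        return [name] + with_suffixes
    # Too few dots: walk the search list first, then try the name as is.
    return with_suffixes + [name]


# Search list of a pod in the "default" namespace that uses the ClusterFirst policy.
cluster_first_search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for candidate in candidate_names("www.example.com", cluster_first_search, ndots=5):
    print(candidate)
```

With ndots set to 5, the external name www.example.com contains only two dots, so three suffixed names are tried before the name itself. Appending a trailing dot (www.example.com.) skips the search list.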
In specific cases, the DNS resolution configuration at the client side may be different from the configuration in the preceding description.
If you use Alpine Linux images to deploy containers, the DNS resolution configuration may be significantly different. This is because Alpine Linux replaces glibc with musl libc. The following list describes some of the differences:
Alpine Linux does not support the single-request and single-request-reopen parameters in the /etc/resolv.conf file.
Alpine 3.3 and earlier versions do not support the search parameter that allows you to specify search domains. As a result, service discovery that relies on search domains fails.
musl libc processes queries that are sent to the DNS servers specified in the /etc/resolv.conf file in parallel. As a result, NodeLocal DNSCache fails to optimize DNS resolution.
musl libc sends A and AAAA queries in parallel over the same socket. In earlier kernel versions, this can cause conntrack conflicts that result in packet loss.
Note: For more information, see musl libc.
If you use Golang or Node.js to develop applications, the applications may use the built-in DNS resolver. This also causes significant differences in DNS resolution.
Internal DNS servers in the cluster
By default, the /etc/resolv.conf file of CoreDNS is inherited from the /etc/resolv.conf file on the ECS instance that hosts CoreDNS. However, CoreDNS uses the built-in forward plug-in to forward DNS queries.
CoreDNS is built into NodeLocal DNSCache. You can configure NodeLocal DNSCache in the same way you configure CoreDNS.
The following table describes the parameters of the forward plug-in. For more information, see forward.
Parameter | Description | Default value in CoreDNS | Default value in NodeLocal DNSCache |
prefer_udp | Preferably uses UDP to communicate with the upstream server. | Enabled | Disabled |
force_tcp | Forcefully uses TCP to communicate with the upstream server. | Disabled | Enabled |
max_fails | The number of consecutive failed health checks that must occur before an upstream server is considered unhealthy. | 2 | 2 |
expire | The time period for which the connection to the upstream server is kept alive. | 10s | 10s |
policy | The policy that is used to select upstream servers. | random | random |
health_check | The interval at which health checks are performed. | 0.5s | 0.5s |
max_concurrent | The maximum number of concurrent queries that can be sent to the upstream server. | None | None |
dial timeout | The timeout period of connections to the upstream server. | 30s. The timeout period dynamically decreases based on the actual connection duration. | 30s. The timeout period dynamically decreases based on the actual connection duration. |
read timeout | The timeout period of requests sent to the upstream server. | 2s | 2s |
Caching policies
Client side
The DNS caching policy at the client side varies based on the configurations of containers and applications. You can configure the DNS caching policy at the client side based on your requirements.
Internal DNS servers in the cluster
Parameter | Description | Default value of CoreDNS | Default value of NodeLocal DNSCache in ACK | Default value of CoreDNS in ACK |
success Max TTL | The maximum time to live (TTL) for the cache of successful DNS resolutions. | 3600s | 30s | 30s |
success Min TTL | The minimum TTL for the cache of successful DNS resolutions. | 5s | 5s | 5s |
success Capacity | The maximum number of successful DNS resolution results that can be cached. | 9984 | 9984 | 9984 |
denial Max TTL | The maximum TTL for the cache of failed DNS resolutions. | 1800s | 5s | 30s |
denial Min TTL | The minimum TTL for the cache of failed DNS resolutions. | 5s | 5s | 5s |
denial Capacity | The maximum number of failed DNS resolution results that can be cached. | 9984 | 9984 | 9984 |
ServerError TTL | The maximum TTL for DNS resolution results that are returned from unhealthy upstream DNS servers. | 5s | 0s. If the installed NodeLocal DNSCache Helm chart version is earlier than 1.5.0, the default value is 5s. | 0s. If the installed CoreDNS version is earlier than 1.8.4.2, the default value is 5s. |
serve_stale | Uses the outdated local DNS cache if the client cannot connect to the upstream DNS server. | Disabled | Enabled. If the installed NodeLocal DNSCache Helm chart version is earlier than 1.5.0, this parameter is disabled. | Disabled |
The actual TTL for the DNS cache is determined by the TTL of the returned DNS record, the maximum TTL, and the minimum TTL:
If the TTL of the returned DNS record is greater than the maximum TTL, the maximum TTL is used as the actual TTL for the DNS cache.
If the TTL of the returned DNS record is less than the minimum TTL, the minimum TTL is used as the actual TTL for the DNS cache.
If the TTL of the returned DNS record is greater than the minimum TTL and less than the maximum TTL, the TTL of the returned DNS record is used as the actual TTL for the DNS cache.
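The three rules above amount to clamping the record TTL between the minimum TTL and the maximum TTL. The following Python sketch shows the arithmetic only. The function name effective_ttl is hypothetical, and the sample values are the success Min TTL (5s) and success Max TTL (30s) listed for CoreDNS in ACK in the preceding table.

```python
def effective_ttl(record_ttl, min_ttl, max_ttl):
    """Clamp the TTL of a returned DNS record to the range [min_ttl, max_ttl]."""
    return max(min_ttl, min(record_ttl, max_ttl))


# Values are in seconds.
print(effective_ttl(600, min_ttl=5, max_ttl=30))  # record TTL above the maximum -> 30
print(effective_ttl(1, min_ttl=5, max_ttl=30))    # record TTL below the minimum -> 5
print(effective_ttl(20, min_ttl=5, max_ttl=30))   # record TTL within the range -> 20
```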
Optimize DNS resolution
The preceding sections describe the DNS resolution pipelines in Kubernetes clusters and the relevant parameters. You can modify the parameter settings in the pod YAML template, CoreDNS ConfigMap, and NodeLocal DNSCache ConfigMap. The following example shows how to configure the pod YAML template.
If you specify dnsPolicy: Default in the client pod YAML template, the pod inherits the DNS settings of the ECS instance that hosts the pod. Therefore, the IP addresses of DNS servers in the VPC are automatically specified in the /etc/resolv.conf file in the pod.
apiVersion: v1
kind: Pod
metadata:
  name: example
  namespace: default
spec:
  containers:
  - image: registry.cn-hangzhou.aliyuncs.com/example-ns/example:v1
    name: example
  # The dnsPolicy parameter is set to Default.
  dnsPolicy: Default

# The /etc/resolv.conf file in the pod.
# cat /etc/resolv.conf
nameserver 100.100.2.136
nameserver 100.100.2.138
Compared with the /etc/resolv.conf file of the ECS instance, the /etc/resolv.conf file in the pod does not contain the following options: rotate, single-request-reopen, timeout:2, and attempts:3. This may cause resolution errors when network jitter occurs. To add these options, modify the pod YAML template based on the following content:
apiVersion: v1
kind: Pod
metadata:
  name: example
  namespace: default
spec:
  containers:
  - image: registry.cn-hangzhou.aliyuncs.com/example-ns/example:v1
    name: example
  # Set the dnsPolicy parameter to Default.
  dnsPolicy: Default
  # Add the following options.
  dnsConfig:
    options:
    - name: timeout
      value: "2"
    - name: attempts
      value: "3"
    - name: rotate
    - name: single-request-reopen

# After you add the options, redeploy the pod to apply the modification.
# The /etc/resolv.conf file in the pod then contains the options.
# cat /etc/resolv.conf
nameserver 100.100.2.136
nameserver 100.100.2.138
options rotate single-request-reopen timeout:2 attempts:3
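The relationship between the options line in /etc/resolv.conf and the dnsConfig.options list in the pod YAML template can be sketched in Python. This is a minimal illustration, not a tool provided by ACK: the function name to_dns_config_options is hypothetical, and the sketch assumes that each option is either a bare flag (such as rotate) or a name:value pair (such as timeout:2).

```python
def to_dns_config_options(options_line):
    """Convert a resolv.conf options line to a list shaped like dnsConfig.options."""
    tokens = options_line.split()
    # Drop the leading "options" keyword if it is present.
    if tokens and tokens[0] == "options":
        tokens = tokens[1:]
    result = []
    for token in tokens:
        name, separator, value = token.partition(":")
        entry = {"name": name}
        if separator:
            # Values are kept as strings, as in the pod YAML template (value: "2").
            entry["value"] = value
        result.append(entry)
    return result


ecs_options = "options rotate single-request-reopen timeout:2 attempts:3"
for entry in to_dns_config_options(ecs_options):
    print(entry)
```

Each printed dictionary corresponds to one item under dnsConfig.options. Flags such as rotate have only a name, and timeout and attempts carry a quoted string value.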