Container Intelligence Service (CIS) allows you to diagnose nodes, pods, Services, Ingresses, memory, and networks with a few clicks to locate issues in your Container Service for Kubernetes (ACK) cluster. This topic describes how to use the cluster diagnostics feature to diagnose an ACK cluster.
Prerequisites
An ACK managed cluster is created. For more information, see Create an ACK managed cluster.
The status of the ACK cluster is Running.
Note: You can log on to the ACK console, go to the Clusters page, and then check whether the Cluster Status column of your cluster displays Running.
Introduction to cluster diagnostics
CIS provides the following diagnostic items.
Diagnostic item | Description |
Node diagnostics | Diagnose node issues, such as Kubernetes nodes in the NotReady state. |
Pod diagnostics | Diagnose pod status issues, such as pod startup failures or frequent pod restarts. |
Service diagnostics | Diagnose Service issues related to Service configurations, resource quotas, and abnormal events. |
Ingress diagnostics | Diagnose Ingress issues in traffic routing configurations. |
Memory diagnostics | Diagnose node memory issues, such as memory leaks, cgroup leaks, and out of memory (OOM) errors. Diagnostic results can be visualized to display the overall memory usage. |
Network diagnostics | Diagnose common network issues, such as connectivity issues between pods, between a cluster and the Internet, and between a LoadBalancer Service and the Internet. |
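To cross-check the node and pod symptoms named in the table from the command line, the following minimal sketch may help. It is not how CIS works internally. It assumes you have saved the output of `kubectl get nodes -o json` and `kubectl get pods --all-namespaces -o json` and parsed it with `json.loads`. The function names and the restart threshold of 5 are illustrative assumptions, not values defined by ACK.

```python
def not_ready_nodes(node_list):
    """Return the names of nodes whose Ready condition is not "True".

    node_list is the parsed JSON of `kubectl get nodes -o json`.
    """
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A node with no Ready condition at all is also reported.
        if ready is None or ready.get("status") != "True":
            names.append(node["metadata"]["name"])
    return names


def restarting_pods(pod_list, threshold=5):
    """Return (namespace, name, restarts) for pods with many container restarts.

    pod_list is the parsed JSON of `kubectl get pods -o json`.
    threshold is a hypothetical cutoff chosen for this example.
    """
    result = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # Sum restarts across all containers in the pod.
        restarts = sum(s.get("restartCount", 0) for s in statuses)
        if restarts >= threshold:
            meta = pod["metadata"]
            result.append((meta.get("namespace", ""), meta["name"], restarts))
    return result
```

For example, `not_ready_nodes(json.loads(open("nodes.json").read()))` lists the nodes to select when you create a node diagnosis, where `nodes.json` is a file name assumed for this example.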
Configure diagnostics
When you use the cluster diagnostics feature, ACK runs a data collection program on each node in the cluster and collects the diagnostic results. ACK collects key error messages from the system log and operational information, such as the system version, load, and information about Docker and kubelet. ACK does not collect business information or sensitive data.
The procedures for configuring node, pod, Service, Ingress, memory, and network diagnostics are similar. The following section uses node diagnostics as an example to demonstrate how to configure the diagnostics feature.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose
and follow the on-screen instructions to complete authorization.
On the Diagnosis page, click Node diagnosis. Then, click Diagnosis in the upper-left corner.
In the Select node panel, specify Node name, read and select I know and agree, and then click Create diagnosis.
You can view the diagnostic progress on the page. After the diagnosis is complete, the page displays the diagnostic results and diagnostic items. Use the results to identify the cause and fix the issues.
View diagnostic results
On the Node Diagnosis page, click Diagnosis details in the Operation column of the diagnostic report in the list to view the diagnostic results on the details page.
The diagnostic items may vary based on the cluster configuration. The items displayed on the diagnostic page take precedence over the items described here.
Diagnostic item | Flag | Description |
Node diagnostics | | Node diagnostics consist of the Node, NodeComponent, ClusterComponent, ECSControllerManager, and GPUNode diagnostic items. These diagnostic items help you identify node anomalies based on the status of nodes, node components, cluster components, and Elastic Compute Service (ECS) instances. On the diagnostic details page, you can view the node diagnostic results, repair suggestions, and diagnostic items. Move the pointer over the icon to the right of a diagnostic item to view information about the diagnostic item. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Pod diagnostics | | Pod diagnostics consist of the Pod, ClusterComponent, Node, NodeComponent, and ECSControllerManager diagnostic items. These diagnostic items help you identify pod anomalies based on the status of pods, cluster components, nodes, and ECS instances. On the diagnostic details page, you can view the pod diagnostic results, repair suggestions, and diagnostic items. Move the pointer over the icon to the right of a diagnostic item to view information about the diagnostic item. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Service diagnostics | | Service diagnostics consist of the Service and ResourceQuotas diagnostic items. These diagnostic items help you identify Service anomalies based on the billing method of Classic Load Balancer (CLB) instances, certificates, quotas, and abnormal events. Move the pointer over the icon to the right of a diagnostic item to view information about the diagnostic item. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Ingress diagnostics | | Ingress diagnostics consist of the Ingress, Addon, and SLB diagnostic items. These diagnostic items help you identify Ingress anomalies based on the status of Ingresses, Ingress plug-ins, and Server Load Balancer (SLB) instances. Move the pointer over the icon to the right of a diagnostic item to view information about the diagnostic item. Diagnostic items with the Abnormal or Warning flag are displayed on the Troubleshoot tab. When a diagnostic item displays the Abnormal flag, you can move the pointer over Details in the Status column to view details about the issue. |
Memory diagnostics | None. | On the diagnostic details page, you can view diagnostic results in the Memory Overview, Memory Analysis, and OOM Analysis sections, including memory leaks, memory utilization, and memory occupied by each process. |
Network diagnostics | | On the Diagnosis result page, you can view the diagnostic results. The Packet paths section displays all nodes that are diagnosed. Abnormal nodes are highlighted. |
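Before or after you run a network diagnosis, a quick manual probe can confirm whether a TCP endpoint is reachable from the machine or pod where you run it. The following is a minimal sketch that uses only the Python standard library. It is independent of CIS and does not reproduce the packet path analysis shown on the Diagnosis result page. The function name `can_connect` and the default timeout of 3 seconds are assumptions made for this example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `can_connect` with the external IP address and port of a LoadBalancer Service checks the path from your client to that Service, and calling it from inside a pod with another pod's IP address checks pod-to-pod connectivity. A result of `False` only indicates that the connection failed. It does not identify which hop caused the failure.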