Container Service for Kubernetes: FAQ about the cloud-edge O&M communication component Tunnel

Last Updated: Nov 11, 2024

This topic answers frequently asked questions about Tunnel, the cloud-edge O&M communication component.

What do I do if I fail to run the kubectl exec/logs command, Managed Service for Prometheus fails to collect metrics from edge nodes, or the metrics server fails to collect metrics from edge nodes?

Issues

  • When you run the kubectl exec -it edge-tunnel-agent-xxx -n kube-system -- sh command, the following error is returned: error: unable to upgrade connection: fail to setup the tunnel: fail to setup TLS handshake through the Tunnel

  • Managed Service for Prometheus fails to collect metrics from edge nodes.

  • The metrics server fails to collect metrics from edge nodes.

Causes

These issues occur if the edge-tunnel-server component in the cloud and the edge-tunnel-agent component on the edge nodes are not connected.
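
Before you troubleshoot, it can help to confirm which tunnel pods exist and on which nodes they run. The following is a minimal sketch, not an official tool. It assumes that the pod names begin with edge-tunnel-server or edge-tunnel-agent, as in the examples in this topic, and that kubectl is configured for the cluster. The function name tunnel_pods is illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: tunnel_pods is an illustrative helper, not part of ACK.
# Assumption: tunnel pod names start with edge-tunnel-server or
# edge-tunnel-agent, as in the examples in this topic.

# Keep only the tunnel pods from a pod listing read on stdin.
tunnel_pods() {
  awk '$1 ~ /^edge-tunnel-(server|agent)/'
}

# Usage (requires cluster access), printing name, phase, and node:
#   kubectl get pods -n kube-system --no-headers \
#     -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName \
#     | tunnel_pods
```

An edge node that reports a problem but has no edge-tunnel-agent pod in the Running phase is a likely place to start.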

Solutions

  1. Run the following command to print the log of the edge-tunnel-server pod. Replace edge-tunnel-server-xxx with the name of the edge-tunnel-server pod.

    kubectl logs edge-tunnel-server-xxx -n kube-system 

    The following error appears in the output:

    tunnel.go:74] "currently no tunnels available" err="No backend available"
    interceptor.go:115] successfully setup connection to "127.0.0.1:10255" with headers: "\r\nX-Tunnel-Proxy-Host: xxxx\r\nUser-Agent: Go-http-client/1.1"
    interceptor.go:136] fail to write request to tls connection: write unix @->/tmp/interceptor-proxier.sock: write: broken pipe
  2. Collect the diagnostic information of the edge nodes that have encountered exceptions. For more information, see How do I collect the diagnostic information about nodes in an ACK Edge cluster?

  3. View the log of the edge-tunnel-agent component in the diagnostic information and check for the following error:

    clientset.go:156] "cannot sync once" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp xxx.xxx.xxx.xxx:10262: i/o timeout\""

    • If you cannot find the preceding error, submit a ticket.

    • If you find the preceding error, proceed to the next step.

  4. Run the following command to view the address of tunnel-server:

    kubectl get svc -n kube-system | grep tunnel-server-svc

    Expected output:

    x-tunnel-server-svc            LoadBalancer   172.16.XX.XX     47.99.XX.XX    10262:30164/TCP                         47h
  5. Run the ping 47.99.XX.XX command on an edge node to check the connectivity between the edge node and tunnel-server. Replace 47.99.XX.XX with the external IP address returned in the previous step.

    • If the edge node cannot reach tunnel-server, troubleshoot network issues.

    • If the edge node can reach tunnel-server, proceed to the next step.

  6. Run the telnet 47.99.XX.XX 10262 command on an edge node to check whether the port is reachable. If the port is unreachable, check the security group rules on the Alibaba Cloud side, the network access control list (ACL) rules of the Server Load Balancer (SLB) instance, and the iptables rules of tunnel-server. Make sure that these rules do not block connection requests.

  7. If no security rules on the Alibaba Cloud side block connection requests, check whether the network of the edge node blocks Internet traffic on port 10262.

    Important
    • Cloud-edge O&M channels use ports 10262, 10263, and 10264. Do not restrict ports 10262, 10263, and 10264 on the Alibaba Cloud side, and do not restrict port 10262 at the edge.

    • No network ACL is configured for the SLB instance used by cloud-edge O&M channels. Do not create network ACLs for the SLB instance.
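
The checks in the preceding steps can be sketched as a few shell helpers. This is a minimal sketch under stated assumptions, not an official tool. The function names log_signatures, tunnel_addr, and port_open are illustrative. The sketch assumes that the Service line has the column layout shown in the expected output of step 4. In place of telnet, it uses the /dev/tcp redirection of bash together with the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names are illustrative, not part of ACK.

# Report which of the error signatures quoted in steps 1 and 3 appear
# in the log text read on stdin.
log_signatures() {
  awk '/No backend available/  {server = 1}
       /:10262: i\/o timeout/  {agent = 1}
       END {
         if (server) print "server: no tunnel backend available"
         if (agent)  print "agent: dial to port 10262 timed out"
       }'
}

# Print "EXTERNAL-IP PORT" for the tunnel-server Service.
# Reads `kubectl get svc -n kube-system` output on stdin and assumes the
# columns NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE.
tunnel_addr() {
  awk '/tunnel-server-svc/ {
    split($5, ports, "[:/]")   # "10262:30164/TCP" -> ports[1] = "10262"
    print $4, ports[1]
    exit
  }'
}

# Return 0 if a TCP connection to host $1 on port $2 opens within
# 5 seconds. Run this on an edge node.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, after you obtain the address in step 4, you can run the following on an edge node:

    port_open 47.99.XX.XX 10262 && echo "port reachable" || echo "port blocked or unreachable"

If the port is blocked, the security group, ACL, and iptables checks in steps 6 and 7 still apply. The sketch only shortens the connectivity test.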