
Container Service for Kubernetes:Knative FAQ and solutions

Last Updated:Aug 14, 2024

This document provides answers to commonly raised questions when using Knative in a Container Service for Kubernetes (ACK) cluster.

What are the differences between Alibaba Cloud Knative and open source Knative?

Alibaba Cloud Knative provides enhanced service capabilities based on the open source Knative, including O&M, ease of use, elasticity, Ingress, event-driven service, and monitoring and alerting. For more information, see Comparison between Alibaba Cloud Knative and open source Knative.

Which Ingress do I need to use when I install Knative?

Alibaba Cloud Knative supports four types of Ingresses: Application Load Balancer (ALB) Ingresses, Microservices Engine (MSE) Ingresses, Service Mesh (ASM) Ingresses, and Kourier Ingresses. ALB Ingresses are suitable for load balancing at the application layer. MSE cloud-native Ingresses are suitable for microservice scenarios. ASM Ingresses provide the Istio capabilities. If you require only basic Ingress capabilities, you can use Kourier Ingresses. For more information, see Select Ingresses for Knative.

What permissions are required to use Knative with a RAM user or role?

Permissions to access all namespaces in the cluster are required. You can perform the following steps to authorize the Resource Access Management (RAM) user or role.

  1. Log on to the ACK console. In the left-side navigation pane, click Authorizations.

  2. Click the RAM Users tab. In the RAM user list, click Modify Permissions next to the target RAM user.

  3. In the Add Permissions area, select the cluster to be authorized, then select All Namespaces as the Namespace, and complete the authorization as prompted.

How long does it take to reduce the number of pods to zero?

The amount of time that is required to reduce the number of pods to zero depends on the following three parameters:

  • stable-window: the time window before the scale-in operation is performed. Before pods are scaled in, the system observes and evaluates the metrics within the time window and does not immediately perform the scale-in operation.

  • scale-to-zero-grace-period: the grace period before the number of pods is reduced to zero. During this period, the system does not immediately stop or delete the last pod even if no new requests are received. This helps respond to burst traffic.

  • scale-to-zero-pod-retention-period: the retention period of the last pod before the number of pods is reduced to zero.

To reduce the number of pods to zero, make sure that the following conditions are met:

  1. No requests are received during the time window that is specified by the stable-window parameter.

  2. The retention period of the last pod that is specified by the scale-to-zero-pod-retention-period parameter is exceeded.

  3. The amount of time that has elapsed since the serverless Kubernetes service switched to the proxy mode exceeds the grace period that is specified by the scale-to-zero-grace-period parameter.

The retention period of the last pod before the number of pods is reduced to zero does not exceed the value that is calculated based on the following formula: stable-window + Max["scale-to-zero-grace-period", "scale-to-zero-pod-retention-period"]. If you want to forcibly set a retention period of the last pod before the number of pods is reduced to zero, we recommend that you use the scale-to-zero-pod-retention-period parameter.
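These parameters can be set cluster-wide in the config-autoscaler ConfigMap in the knative-serving namespace. The following is a minimal sketch; the values shown are illustrative, not recommendations:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Observation window before any scale-in decision.
  stable-window: "60s"
  # Grace period before the last pod is removed.
  scale-to-zero-grace-period: "40s"
  # How long the last pod is retained after traffic stops.
  scale-to-zero-pod-retention-period: "1m"
```

The stable window can also be overridden for a single Revision with the autoscaling.knative.dev/window annotation.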

How do I use GPU resources in Knative?

You can add the annotation k8s.aliyun.com/eci-use-specs to the spec.template.metadata.annotation section of the configurations of a Knative Service to specify a GPU-accelerated Elastic Compute Service (ECS) instance type. You can add the nvidia.com/gpu field to the spec.containers.resources.limits section to specify the amount of GPU resources that are required by the Knative Service.

For more information, see Configure GPU resources for a Knative Service.
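For example, a Knative Service that requests a GPU-accelerated instance might look like the following sketch; the instance type, service name, and image are placeholders:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: gpu-demo          # placeholder name
spec:
  template:
    metadata:
      annotations:
        # GPU-accelerated ECS instance type (placeholder value)
        k8s.aliyun.com/eci-use-specs: ecs.gn5i-c4g1.xlarge
    spec:
      containers:
        - image: registry.example.com/gpu-app:latest  # placeholder image
          resources:
            limits:
              nvidia.com/gpu: "1"   # number of GPUs required by the Service
```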

How do I use GPU sharing in Knative?

To enable the GPU sharing feature for nodes, see Examples of using GPU sharing. You can then use the aliyun.com/gpu-mem parameter to specify the maximum amount of GPU memory that is available to a Knative Service. For more information, see Configure GPU resources for a Knative Service.
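With GPU sharing enabled on the nodes, the Service declares GPU memory instead of whole GPUs. A sketch, in which the 3 GiB value, service name, and image are placeholders:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: gpu-share-demo    # placeholder name
spec:
  template:
    spec:
      containers:
        - image: registry.example.com/gpu-app:latest  # placeholder image
          resources:
            limits:
              aliyun.com/gpu-mem: "3"  # GPU memory (GiB) allocated from a shared GPU
```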

By default, Knative scales the number of instances to zero during off-peak hours. How do I reduce the cold start latency?

By default, the open source version of Knative scales the number of instances to zero during off-peak hours to reduce the costs of instances. When the next request arrives, the application is allocated to a new instance. The system must first allocate infrastructure resources by using the Kubernetes scheduler, then pull the application image and start up the application. Although this approach reduces costs, it results in a cold start with a long latency when the application starts.

To avoid the cold start latency, we recommend that you use one of the following solutions.

  • Configure reserved instances: Reserve a low-specification and low-cost burstable instance to balance the cost and cold start latency of Knative. When the first request arrives, the reserved instance processes the request and starts to create the default specification instances. After the default specification instances are created, all subsequent new requests are forwarded to these instances. The reserved instance is released after it processes all requests that are sent to it. For more information, see Configure a reserved instance.

  • Use image caches: Elastic Container Instance provides image caches. You can create a cache snapshot from an image and then use the cache snapshot to deploy a pod on an elastic container instance. This reduces the image layers that you need to download and therefore accelerates pod creation. For more information, see Use image caches to accelerate pod creation for Knative Services.
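As a sketch, a reserved instance is typically configured through an annotation on the Knative Service. The annotation name and the burstable instance type below are assumptions based on the linked topic; verify them against Configure a reserved instance before use:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: reserve-demo      # placeholder name
spec:
  template:
    metadata:
      annotations:
        # Assumed annotation for the reserved burstable instance type;
        # confirm the exact key in "Configure a reserved instance".
        knative.aliyun.com/reserve-instance-eci-use-specs: ecs.t5-lc1m2.small
    spec:
      containers:
        - image: registry.example.com/app:latest  # placeholder image
```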

Am I charged for the Activator component of ACK Knative?

Yes. Activator is a data plane component that runs as pods and occupies your instance resources.

How do I configure the listening port in Knative?

The listening port of the application must match the containerPort specified in the Knative Service, which defaults to 8080. For more information about how to configure custom listening ports, see Configure custom listening ports.
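For example, if the application listens on port 9000 instead of the default 8080, declare the port through containerPort. The service name and image below are placeholders:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: port-demo         # placeholder name
spec:
  template:
    spec:
      containers:
        - image: registry.example.com/app:latest  # placeholder image
          ports:
            - containerPort: 9000  # must match the port the application listens on
```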