The Gateway with Inference Extension component is an enhanced component built on the Kubernetes Gateway API and its Inference Extension specification. It supports Layer 4 and Layer 7 routing services in Kubernetes and provides intelligent load balancing for large language model (LLM) inference scenarios. This topic introduces the Gateway with Inference Extension component, explains how to use it, and provides its change log.
Component information
The Gateway with Inference Extension component is built on the Envoy Gateway project. It is compatible with Gateway API features and integrates the Gateway API Inference Extension. Its primary purpose is to provide load balancing and routing for LLM inference services.
Usage instructions
The Gateway with Inference Extension component depends on the CustomResourceDefinitions (CRDs) provided by the Gateway API component. Before you install the Gateway with Inference Extension component, make sure that the Gateway API component is installed in your cluster. For more information, see Install components.
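The prerequisite above can be checked before installation. The following is a minimal sketch, not an official tool: it assumes `kubectl` is configured for the target cluster, and the `REQUIRED_CRDS` list is an illustrative assumption based on common Gateway API CRD names, not the exact set this component requires. The helper names `missing_crds`, `list_installed_crds`, and `main` are hypothetical.

```python
import subprocess

# Assumption: a few well-known Gateway API CRD names used for illustration.
# The exact set required by the component is not stated in this topic.
REQUIRED_CRDS = (
    "gatewayclasses.gateway.networking.k8s.io",
    "gateways.gateway.networking.k8s.io",
    "httproutes.gateway.networking.k8s.io",
)


def missing_crds(installed_names, required=REQUIRED_CRDS):
    """Return the required CRDs that do not appear in the installed list.

    `installed_names` is an iterable of lines as printed by
    `kubectl get crd -o name`, for example
    "customresourcedefinition.apiextensions.k8s.io/<crd-name>".
    Bare CRD names without the resource prefix are accepted as well.
    """
    installed = set()
    for line in installed_names:
        name = line.strip()
        if not name:
            continue  # skip blank lines
        # Keep only the part after the "<resource>/" prefix, if present.
        installed.add(name.split("/", 1)[-1])
    return [crd for crd in required if crd not in installed]


def list_installed_crds():
    """Ask the cluster for all CRD names (requires a configured kubectl)."""
    result = subprocess.run(
        ["kubectl", "get", "crd", "-o", "name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def main():
    """Print whether the assumed Gateway API CRDs are present."""
    missing = missing_crds(list_installed_crds())
    if missing:
        print("Missing Gateway API CRDs:", ", ".join(missing))
    else:
        print("All checked Gateway API CRDs are installed.")
```

Calling `main()` from a machine with cluster access prints any CRDs from the assumed list that are absent. If any are reported, install the Gateway API component first.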
For more information about using the Gateway with Inference Extension component, see Overview of Gateway with Inference Extension.
Change log
December 2025
Version number | Change date | Changes | Impact |
v1.4.0-apsara.4 | December 16, 2025 | | Upgrading from an earlier version restarts the gateway pod. Perform the upgrade during off-peak hours. |
September 2025
Version number | Change date | Changes | Impact |
v1.4.0-apsara.3 | September 4, 2025 | | Upgrading from an earlier version restarts the gateway pod. Perform the upgrade during off-peak hours. |
May 2025
Version number | Change date | Changes | Impact |
v1.4.0-aliyun.1 | May 27, 2025 | | Upgrading from an earlier version restarts the gateway pod. Perform the upgrade during off-peak hours. |
April 2025
Version number | Change date | Changes | Impact |
v1.3.0-aliyun.2 | May 7, 2025 | | Upgrading from an earlier version restarts the gateway pod. Perform the upgrade during off-peak hours. |
March 2025
Version number | Change date | Changes | Impact |
v1.3.0-aliyun.1 | March 12, 2025 | | This upgrade does not affect your services. |