By Yuanyun
By the end of 2020, Alibaba Cloud had put forward the "Trinity" concept, aiming to integrate the proprietary technologies, open-source projects, and commercial products into a unified technology system. By doing so, the value of technology will be maximized.
After years of being proven under Double 11 traffic, the internal HSF framework of Alibaba Group has built up core strengths in high performance and high availability. Dubbo, one of the most popular service governance frameworks in China and abroad, is well known for its affinity with the open-source ecosystem.
As the first solution to adopt the Trinity architecture, Dubbo 3.0 carries high hopes at Alibaba. It integrates the features of the internal HSF framework and retains its core capabilities of high performance and high availability. We hope to roll it out internally and unify the technology stack. It has already been implemented on a large scale in Kaola, and it will be rolled out in many core scenarios in the future, supporting complex business events such as the 618 Shopping Festival and Double 11.
Before specifying the details of the changes in Dubbo 3.0, let's discuss the benefits of upgrading to Dubbo 3.0 from two aspects:
From the perspective of business applications, what specific benefits can one gain by upgrading to Dubbo 3.0?
First, in terms of performance and resource utilization, Dubbo 3.0 effectively reduces the additional resource consumption caused by the framework itself, which improves resource utilization significantly.
From the single machine perspective, Dubbo 3.0 can save about 50% of memory usage. From the cluster perspective, it can support millions of cluster instances, laying the foundation for larger business scaling in the future. Dubbo 3.0's support for the reactive stream communication model can lead to a significant increase in the overall throughput in some business scenarios.
Second, Dubbo 3.0 brings more possibilities for business architecture upgrades. The most intuitive is the upgrade of communication protocols, which brings more options to the business architecture.
The original Dubbo protocol restricted, to a certain extent, how microservices could be accessed. For example, mobile and frontend services had to go through protocol conversion at the gateway layer to reach Dubbo backend services. As another example, Dubbo supported only request-response communication, which made it impossible to support scenarios that require streaming or reverse communication.
Finally, Dubbo 3.0 helps business products move to cloud-native. Whether the change is passive, driven by an upgrade of the underlying infrastructure, or proactive, driven by the business to solve its own pain points, Dubbo 3.0 offers a cloud-native solution that helps business products adopt cloud-native quickly.
After clarifying the benefits of upgrading to Dubbo 3.0, let's take a look at the specific changes in Dubbo 3.0:
Application-Level Service Discovery Model
The prototype of the application-level service discovery model was first proposed in Dubbo 2.7.6. After several iterations, a relatively stable model was finalized for Dubbo 3.0.
In Dubbo 2.7 and earlier versions, applications perform service registration and discovery at the interface granularity. Each interface will correspond to a piece of data on the registry, and different machines will be registered with metadata information belonging to the current machine or interface-level configuration information, such as serialization, data center, unit, and timeout configuration.
When servers providing a service are restarted or released, their addresses change independently at the interface granularity. For example, if a gateway application relies on 30 interfaces of an upstream application, then each release of the upstream application takes 30 corresponding address lists offline and brings them back online.
Treating interfaces as first-class citizens for registration and discovery dates back to the earliest way of splitting SOA or microservices, and it gives each individual service or node independence and the ability to change dynamically. As the business evolves, a single application relies on more and more services, and the number of machines per service provider also grows for business or capacity reasons. From the perspective of the client as a whole, the total number of dependent service addresses increases rapidly. Given this situation, the design of the registration and discovery process is worth optimizing.
Here, two features should be noticed:
Based on the preceding features, application-level registration and discovery was finally proposed, with the application as the basic unit of registration and discovery. The main difference is one of scale. With interface-level registration, an application that provides 100 interfaces must register 100 nodes in the registry; if the application runs on 100 machines, its clients see 10,000 virtual node changes at each release. With application-level registration and discovery, the application registers only one node, and each release produces only 100 virtual node changes. For applications that rely on a large number of services and machines, this reduces the scale by a factor of tens to a hundred, and memory consumption is reduced by at least half.
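The arithmetic behind the 100-interface, 100-machine example can be written out as a small sketch. The function names are hypothetical and exist only for illustration; they are not part of Dubbo.

```python
def interface_level_entries(interfaces: int, machines: int) -> int:
    """Address entries a client tracks when every interface is registered per machine."""
    return interfaces * machines


def application_level_entries(machines: int) -> int:
    """Address entries when each machine registers once for the whole application."""
    return machines


# Figures taken from the example in the text: 100 interfaces on 100 machines.
interfaces, machines = 100, 100
before = interface_level_entries(interfaces, machines)
after = application_level_entries(machines)
print(before, after, before // after)  # prints "10000 100 100"
```

The reduction factor equals the number of interfaces per application, which is why applications exposing tens to hundreds of interfaces benefit the most.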
However, the technical design must both function correctly and allow existing businesses to upgrade smoothly. The upgrade to application-level registration and discovery therefore has to stay aligned with the capabilities of interface-level registration and discovery. Regardless of whether the client has been upgraded or whether application-level registration and discovery is enabled, business calls must not be affected.
We have designed a new component to provide this guarantee. The metadata center can manage two parts of data:
Finally, since the new service discovery model is highly similar to the service discovery models used by Spring Cloud, Service Mesh, and other systems, Dubbo can interoperate with those systems at the level of the registry and node data.
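The compatibility requirement described above, that calls keep working whether or not a provider has moved to application-level registration, can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not Dubbo's actual implementation: the three dictionaries, the service names, and the function `resolve_addresses` are all hypothetical, and the interface-to-application mapping is simply taken as a given input.

```python
from typing import Dict, List


def resolve_addresses(
    interface: str,
    interface_to_app: Dict[str, str],
    app_registry: Dict[str, List[str]],
    interface_registry: Dict[str, List[str]],
) -> List[str]:
    """Prefer application-level addresses, fall back to interface-level ones."""
    app = interface_to_app.get(interface)
    if app is not None:
        addresses = app_registry.get(app, [])
        if addresses:
            return addresses
    # Provider not upgraded yet (or mapping unknown): use the legacy entries.
    return interface_registry.get(interface, [])


# Hypothetical sample data: one upgraded provider, one legacy provider.
interface_to_app = {"com.example.OrderService": "order-app"}
app_registry = {"order-app": ["10.0.0.1:20880", "10.0.0.2:20880"]}
interface_registry = {"com.example.LegacyService": ["10.0.0.9:20880"]}

upgraded = resolve_addresses(
    "com.example.OrderService", interface_to_app, app_registry, interface_registry
)
legacy = resolve_addresses(
    "com.example.LegacyService", interface_to_app, app_registry, interface_registry
)
print(upgraded, legacy)
```

The point of the fallback is that a consumer never ends up with an empty address list merely because the two sides are at different stages of the upgrade.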
Dubbo 3.0 is the ideal microservice framework for the cloud-native era. Currently, several trends indicate that Kubernetes has become the de-facto standard for resource scheduling, Mesh has become the mainstream trend, and Kubernetes has seen rapid growth in scale. These trends put forward higher requirements for Dubbo.
First, giving users a more convenient way to deploy and invoke Dubbo services on Kubernetes is a significant problem that must be solved, and a unified protocol and data exchange format are essential to solving it. Second, the popularity of Mesh brings diversity issues, such as how native Dubbo and Mesh-based Dubbo can coexist and how multi-language scenarios can be supported. Lastly, the increase in scale brings greater challenges to the entire Dubbo architecture, since components such as the registry and the client have to handle more data and more calls.
The top priority of the evolution of Dubbo is to provide more efficient services while maintaining stability.
These challenges of the cloud-native era have contributed to the development of the next generation of Dubbo, including new protocols, Kubernetes infrastructure support, multi-language support, and scalability.
The most basic capability of the RPC framework is to complete service calls across business processes, forming a chain and a network of services, of which the core carrier is the RPC protocol.
Meanwhile, due to the close coupling with business data, the design and implementation of RPC protocol also directly determines the business architecture in some aspects, such as the interaction from terminal equipment to the backend equipment, multilingual adoption in microservice architecture, and data transmission models between services.
Dubbo 2 provides the core semantics of RPC, including the protocol header, flag bits, request ID, and request/response data. However, in the cloud-native era, the Dubbo 2 protocol faces two main challenges. The first is that it does not interoperate with the wider ecosystem, and users find the binary protocol difficult to understand. The second is that it is not friendly to gateway-type components, such as Mesh, which must parse the complete protocol to obtain the call metadata. For example, handling some RPC contexts is challenging in terms of both performance and usability.
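To make the gateway problem concrete, the sketch below packs and parses a 16-byte frame header in the layout commonly documented for the Dubbo 2 protocol: a two-byte magic number, a flag byte, a status byte, an eight-byte request ID, and a four-byte body length. The constants, including the serialization ID used as a default, are assumptions made for illustration and should be checked against the Dubbo source before being relied on.

```python
import struct

MAGIC = 0xDABB              # assumed magic number of a Dubbo 2 frame
FLAG_REQUEST = 0x80         # assumed: set for requests, clear for responses
FLAG_TWO_WAY = 0x40         # assumed: caller expects a response
SERIALIZATION_MASK = 0x1F   # assumed: low five bits carry the serialization ID
HEADER = struct.Struct(">HBBQI")  # magic, flags, status, request ID, body length


def encode_request(request_id: int, body: bytes, serialization_id: int = 2) -> bytes:
    """Prefix an already-serialized body with a 16-byte header."""
    flags = FLAG_REQUEST | FLAG_TWO_WAY | (serialization_id & SERIALIZATION_MASK)
    return HEADER.pack(MAGIC, flags, 0, request_id, len(body)) + body


def decode_header(frame: bytes) -> dict:
    """Read only the fixed header; the body stays opaque."""
    magic, flags, status, request_id, length = HEADER.unpack_from(frame)
    if magic != MAGIC:
        raise ValueError("not a Dubbo 2 frame")
    return {
        "is_request": bool(flags & FLAG_REQUEST),
        "two_way": bool(flags & FLAG_TWO_WAY),
        "serialization_id": flags & SERIALIZATION_MASK,
        "status": status,
        "request_id": request_id,
        "body_length": length,
    }


body = b"opaque-serialized-payload"
frame = encode_request(request_id=42, body=body)
header = decode_header(frame)
print(HEADER.size, header["request_id"], header["body_length"])
```

Note what the header does not contain: there is no service or method name in it. In this layout such call metadata sits inside the serialized body, so an intermediary that wants to route on it has to deserialize the payload first.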
For a service framework such as Dubbo, providing remote communication capabilities is the most important job. Practice has shown that the design and implementation of the Dubbo 2 RPC protocol limit the business architecture in several aspects, such as the interaction from terminal equipment to the backend equipment, multilingual adoption in microservice architecture, and data transmission models between services.
While supporting existing features and addressing remaining problems, the following features are needed for the next-generation protocol:
Based on these requirements, the combination of HTTP/2 and Protobuf is the most suitable. This combination naturally brings the gRPC protocol to mind. The relationship between the new-generation protocol and gRPC is listed below:
(1) The new Dubbo protocol is a protocol extended based on gRPC, which also ensures that the new protocol and gRPC are interoperable and shared across the ecosystem.
(2) Building on the first clause, the new Dubbo protocol will more natively support Dubbo's service governance, providing greater flexibility.
(3) In terms of serialization, since most applications have not used Protobuf, the new protocol will give sufficient support in serialization, adapting existing serialization for easy migration to Protobuf.
(4) In the request model, the new protocol will support Reactive natively, which is also unavailable in the gRPC protocol.
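Interoperability with gRPC implies sharing its wire format. As a point of reference, the sketch below implements the length-prefixed message framing defined by gRPC over HTTP/2: each message is preceded by a one-byte compression flag and a four-byte big-endian length. The function names are hypothetical, and nothing here is taken from the Dubbo code base; it only illustrates the framing that a gRPC-compatible protocol would need to speak.

```python
import struct
from typing import List, Tuple

PREFIX = struct.Struct(">BI")  # compression flag (1 byte), message length (4 bytes)


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Wrap one serialized message in a gRPC length prefix."""
    return PREFIX.pack(1 if compressed else 0, len(payload)) + payload


def split_messages(data: bytes) -> List[Tuple[bool, bytes]]:
    """Split a buffer of back-to-back framed messages, as seen on a stream."""
    messages = []
    offset = 0
    while offset < len(data):
        compressed, length = PREFIX.unpack_from(data, offset)
        offset += PREFIX.size
        messages.append((bool(compressed), data[offset:offset + length]))
        offset += length
    return messages


# Two messages sent one after another on the same stream.
stream = frame_message(b"first") + frame_message(b"second")
decoded = split_messages(stream)
print(decoded)
```

Because every message carries its own length, several messages can follow one another on a single HTTP/2 stream, which is what makes streaming request models possible. In gRPC the service and method travel in the HTTP/2 request path rather than in the message body, so a gateway can route a call without deserializing the payload.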
To bring Dubbo into the Service Mesh system, many solutions were reviewed, and the two Mesh solutions best suited to Dubbo 3.0 were selected. One is the classic Sidecar-based Service Mesh, and the other is Proxyless Mesh, which runs without a Sidecar.
The Sidecar Mesh solution is deployed in the same way as the current mainstream Service Mesh deployment. Here, Dubbo 3.0 focuses on making the upgrade as transparent as possible for business applications. The upgrade is imperceptible from a programming perspective, while the entire call path benefits from the lightweight design of Dubbo 3.0 and the Triple protocol, which keeps overhead and O&M costs to a minimum. This solution is also known as the Thin SDK solution because it removes all unnecessary components.
The Proxyless Mesh deployment solution is another Mesh form planned for Dubbo 3.0, where the goal is to interact directly with the control plane from the traditional SDK without starting Sidecar.
Imagine the following scenarios where the Proxyless Mesh deployment solution is commonly used:
Viewing the two forms together, Dubbo has many Mesh solutions available for different business scenarios, different migration phases, and different infrastructure guarantees, which can be governed by a unified control plane.
The preceding figure shows the expected deployment solution of Dubbo 3.0 on Kubernetes. Dubbo 3.0 will be a Kubernetes-native service for its service discovery model, supporting mutual calls without deploying an independent registry.
The preceding figure shows the future deployment solution of Dubbo 3.0 on Istio. The hybrid deployment of Thin SDK and Proxyless is used here. As shown in Pod 1 and Pod 3, the data traffic is sent directly from Dubbo Service. While Pod 2 is deployed in Thin SDK mode, the traffic is intercepted by Sidecar and then flows out.
Cloud-native has brought about major changes in technology standardization. The core objective of all cloud-native basic components is to make it easier to create and run flexible, scalable applications on the cloud. With the elasticity of cloud-native technologies, an application can be scaled out across a large number of machines in a very short time to support business needs.
For example, to cope with flash sales at midnight or with emergencies, applications often need thousands or tens of thousands of machines to deliver the performance that users need. However, the expansion also brings a series of problems, such as more frequent node exceptions because of the extremely large number of cluster nodes, and uneven service capacity across nodes because of a variety of objective factors. These are the problems encountered when clusters are deployed on a large scale in cloud-native scenarios.
Dubbo is expected to solve these problems with a flexible cluster scheduling mechanism. This mechanism mainly addresses two problems. First, distributed services stay stable and avoid avalanche (cascading failure) when nodes are abnormal. Second, large-scale applications run in an optimal state, providing higher throughput and performance.
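The two goals can be illustrated with a deliberately simple scheduling sketch: abnormal nodes are skipped, and among the remaining nodes the one with the lowest load relative to its capacity is chosen. This is only one possible policy, written under stated assumptions; the `Node` fields, the `capacity` figure, and the functions `pick_node` and `dispatch` are hypothetical and do not describe the mechanism Dubbo implements.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    address: str
    capacity: int          # hypothetical: concurrent requests the node can absorb
    in_flight: int = 0     # requests currently assigned to the node
    healthy: bool = True


def pick_node(nodes: List[Node]) -> Node:
    """Choose the healthy node with the most spare capacity, proportionally."""
    candidates = [n for n in nodes if n.healthy and n.in_flight < n.capacity]
    if not candidates:
        # Rejecting early is preferable to overloading nodes and cascading.
        raise RuntimeError("no node has spare capacity")
    return min(candidates, key=lambda n: n.in_flight / n.capacity)


def dispatch(nodes: List[Node], requests: int) -> Dict[str, int]:
    """Assign a batch of requests and report how many each node received."""
    assigned = {n.address: 0 for n in nodes}
    for _ in range(requests):
        node = pick_node(nodes)
        node.in_flight += 1
        assigned[node.address] += 1
    return assigned


nodes = [
    Node("10.0.0.1:20880", capacity=8),
    Node("10.0.0.2:20880", capacity=2),
    Node("10.0.0.3:20880", capacity=8, healthy=False),
]
result = dispatch(nodes, 5)
print(result)
```

In this run the unhealthy node receives nothing and the larger node takes most of the traffic, so uneven capacity is reflected in the distribution instead of being ignored by a uniform round-robin.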
Apache Dubbo 3.0.0 was officially released in June 2021 as a milestone version after it was donated to Apache. This means Apache Dubbo has fully embraced cloud-native.
In November 2021, we will release Apache Dubbo 3.1 and bring the implementation and practices of Apache Dubbo deployment in Mesh scenarios.
In March 2022, we will release Apache Dubbo 3.2, which will bring a new intelligent traffic scheduling mechanism for large-scale application deployment, improving system stability and resource utilization.
Finally, Apache Dubbo 3.0 has already been integrated with the internal RPC framework of Alibaba Group. This is expected to enable its internal rollout and unify the technology stack. In the future, Apache Dubbo 3.0 will be implemented on a large scale in Alibaba Group, supporting complex business events such as the 618 Shopping Festival and Double 11.
The community will try its best to ensure a short release cycle and fix the existing problems promptly. You are welcome to submit issues and performance requirements, and the community will review and reply as soon as possible. Thanks for your support.