This article is from the Alibaba DevOps Practice Guide, written by the Alibaba Cloud Yunxiao Team.
Over more than 20 years of rapid growth in Alibaba's diversified businesses, its technology system has gone through several major shifts: the web era, the mobile era, the data intelligence era, and the cloud computing era. Through each of these shifts, the technical systems, tool systems, and knowledge systems that developers work with have kept evolving. R&D tools play a key role in scaling technology, reducing cost, and improving efficiency.
Generally, the technical personnel in an enterprise fall into several types, including frontend, mobile, server-side, data, algorithm, testing, and O&M. These roles also reflect the division of labor among the major technology areas in software engineering today. Each technology stack has its own development path and toolset. In addition to this vertical segmentation by technology, Alibaba segments horizontally along the user perception path from front to back, for example no-code/low-code programming preferred by the business side and pro-code programming preferred by the general side.
The development of the R&D tool system can be divided into three stages: technology stack standardization, integration of tools, processes, and platform, and technical diversification for specific scenarios.
In the early stage of a technical field, or when a company is newly established, various technical frameworks typically compete with one another and multiple R&D processes run in parallel. Converging on a mainstream technology stack is usually the first choice for improving R&D efficiency. For example, more than 50% of developers at Alibaba work on the Java technology stack. The middleware, programming frameworks, supporting tools, and R&D processes built on the Java stack are tightly coupled and form a unified R&D solution.
Productizing this solution gives rise to an integrated tool and process platform. The core benefit of such a platform for enterprises is that it standardizes and automates existing processes, which raises the baseline skill level of all technical employees and improves average productivity. In addition, the tool platform helps enterprises accumulate reusable assets and summarize and analyze process data, giving managers a basis for decisions.
The third stage in the development of R&D tools is deep coupling with the enterprise's business and customized scenarios, which achieves performance breakthroughs in specific fields. Examples include no-code programming in the OA field, intelligent P2C on the frontend, and function programming on the server side.
The term DevOps refers to the full process of planning, coding, development, testing, release, O&M, and monitoring. This process can be divided into three stages: the requirement analysis stage, the code development stage, and the delivery and O&M stage. These stages correspond to requirement-centric, code-centric, and application-centric platforms, respectively.
The platform must first solve the problem of managing enterprise R&D assets. Assets are generally divided into knowledge assets (requirements, documents, and design drawings), code assets (programs, configurations, and data), and application and resource assets (logical units that provide external services and the corresponding physical assets). Enterprises then need to record the data generated in the R&D process so they can analyze it and find ways to improve efficiency.
The tool platform accumulates asset data and process data in a unified data middle platform. The data is connected by the standardized DevOps process from planning to monitoring. At Alibaba, we call this the value stream. It represents the entire process from the definition of business value to its realization, and the speed of value delivery is precisely what R&D efficiency means.
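To make "speed of value delivery" concrete, the sketch below computes a lead time from timestamps recorded along a value stream. The stage names, the `events` dictionary layout, and the sample timestamps are assumptions made for illustration; they are not taken from Alibaba's platform.

```python
from datetime import datetime

# Hypothetical stage names, loosely following the planning-to-monitoring flow.
STAGES = ["planning", "coding", "testing", "release", "monitoring"]

def lead_time_hours(events):
    """Hours from the first stage's timestamp to the last stage's timestamp."""
    return (events[STAGES[-1]] - events[STAGES[0]]).total_seconds() / 3600

def stage_durations(events):
    """Hours elapsed between each pair of consecutive stages."""
    return {
        f"{a}->{b}": (events[b] - events[a]).total_seconds() / 3600
        for a, b in zip(STAGES, STAGES[1:])
    }

# Illustrative timestamps for one requirement.
requirement = {
    "planning": datetime(2024, 1, 1, 9, 0),
    "coding": datetime(2024, 1, 1, 15, 0),
    "testing": datetime(2024, 1, 2, 9, 0),
    "release": datetime(2024, 1, 2, 12, 0),
    "monitoring": datetime(2024, 1, 2, 13, 0),
}

print(lead_time_hours(requirement))   # 28.0
print(stage_durations(requirement))
```

Breaking the lead time down by stage is one simple way to see where a requirement spends most of its time.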
Today, migrating to the cloud is almost a necessity for enterprises, so they must consider how to make good use of the cloud when establishing a DevOps system. In Alibaba's experience, the key is to give development personnel and O&M personnel separate tools for using the cloud, each suited to their role.
O&M engineers and SREs are the creators and maintainers of infrastructure. They deal with a large number of scattered IT assets, and their most important task is to manage these assets and control how they are produced and operated. We use an ITIL- or ITSM-based cloud resource management platform to help O&M personnel manage more efficiently. This is called resource-oriented cloud management.
Developers and testers focus on turning business requirements into usable online services quickly and securely. A combination of one or more services is called an application. An application runs on a set of cloud resources and so becomes a logical grouping of those resources. We establish application development, testing, and O&M processes and configure them on an application management platform. This is called application-oriented cloud usage.
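The two views described above can be sketched with a small data model: a flat resource inventory for O&M, and applications that group resources for developers. The class names, fields, and sample resources here are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Resource:
    kind: str   # e.g. "vm", "database", "load_balancer"
    name: str

@dataclass
class Application:
    name: str
    services: list = field(default_factory=list)
    resources: list = field(default_factory=list)

def unassigned(inventory, applications):
    """Resource-oriented view: resources that no application claims."""
    owned = {r for app in applications for r in app.resources}
    return [r for r in inventory if r not in owned]

inventory = [
    Resource("vm", "vm-1"),
    Resource("vm", "vm-2"),
    Resource("database", "db-1"),
]
shop = Application(
    name="shop",
    services=["cart", "checkout"],
    resources=[inventory[0], inventory[2]],
)

print([r.name for r in shop.resources])                # ['vm-1', 'db-1']
print([r.name for r in unassigned(inventory, [shop])])  # ['vm-2']
```

O&M personnel would mostly work with the inventory and the unassigned list, while an application owner only needs to see the resources grouped under the application.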
At Alibaba, we connect the relevant personnel to the cloud through the cloud resource management platform and the application management platform. The platforms abstract processes to hide cloud technology details and to improve the efficiency with which each role uses the cloud. In doing so, they accumulate the two most important assets of the enterprise: resources and applications.
Fully cloud-based technical systems, such as Kubernetes, containerization, Serverless, and Service Mesh, have gradually become industry standards. Cloud-native has become the goal of many enterprises' technology upgrades, and the DevOps tool system must be upgraded to keep pace with this trend.
Kubernetes is a representative cloud-native technology. First, it has evolved beyond container orchestration: it effectively hides the underlying physical resources and offers powerful programmable extension capabilities, on which a series of middleware, O&M tools, and programming frameworks have been built. Second, it is oriented to final states. This declarative approach to resource O&M differs from the traditional process-oriented approach and offers the opportunity to move away from manual control and implement unattended changes. Therefore, cloud-native DevOps tools must adapt to cloud-native technologies and products and adopt the same final-state orientation to improve R&D and O&M efficiency.
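The final-state idea can be illustrated with a minimal reconciliation sketch: the user declares a desired state, and the system computes and applies whatever actions close the gap from the actual state. This is a simplified illustration using plain dictionaries, not the Kubernetes API; the resource names and the `plan`/`reconcile` functions are assumptions.

```python
def plan(desired, actual):
    """Compute the actions needed to move `actual` to `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, None))
    return actions

def reconcile(desired, actual):
    """Apply the planned actions to `actual` in place and return it."""
    for op, name, spec in plan(desired, actual):
        if op == "delete":
            del actual[name]
        else:
            actual[name] = spec
    return actual

desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
actual = {"web": {"replicas": 2}, "old-job": {"replicas": 1}}

print(plan(desired, actual))
reconcile(desired, actual)
print(plan(desired, actual))  # [] once the final state is reached
```

The operator never writes the create, update, or delete steps by hand. Only the desired state is declared, which is what distinguishes this mode from process-oriented O&M.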
Alibaba has integrated GitOps/IaC with cloud-native technologies and combined them with its experience in traditional application management to create a next-generation cloud-native R&D and O&M platform. Compared with the traditional mode, the new platform has the following features:
Developers can use code to describe the delivery process and runtime status of an application. The system determines its execution policy from the changes and gradually approaches the final state. During this process, the system can receive user instructions or monitor data changes and adjust the change path on its own, ensuring the security and reliability of the system.
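A rough sketch of "gradually approaching the final state while monitoring alters the change path" is a batched rollout that rolls back when a health signal turns bad. The function name, the batch policy, and the `is_healthy` callback (a stand-in for monitoring data) are all assumptions for illustration.

```python
def rollout(total, batch, is_healthy):
    """Move `total` instances to the new version in batches.

    `is_healthy(updated)` stands in for monitoring data and receives the
    number of instances updated so far. Returns (updated_count, history).
    """
    updated = 0
    history = []
    while updated < total:
        updated = min(total, updated + batch)
        history.append(("advance", updated))
        if not is_healthy(updated):
            # Monitoring changed the path: abandon the rollout and revert.
            history.append(("rollback", 0))
            return 0, history
    return updated, history

# Healthy throughout: the rollout reaches the final state.
print(rollout(5, 2, lambda n: True))
# Monitoring reports a problem once 4 instances are updated.
print(rollout(5, 2, lambda n: n < 4))
```

A real platform would also accept user instructions such as pause or resume at each step. The sketch models only the monitoring signal.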
Architects, SREs, test engineers, and security engineers can define reusable modules for the application description code. Through an import mechanism in the code, an application description pulls in the predefined content and control rules of each role, and the application owner then defines the details of the application within those rules. This hierarchical design reduces the complexity of defining an application and meets enterprises' requirements for hierarchical control.
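One way to picture this layering is as role-provided defaults and locked rules that are merged with the owner's settings. The module names (`SRE_BASE`, `SECURITY_RULES`), the keys, and the `compose` function below are hypothetical. A real platform would use its own description language and import syntax.

```python
# Hypothetical modules published by platform roles.
SRE_BASE = {"replicas": 2, "cpu": "500m"}   # defaults, owners may override
SECURITY_RULES = {"run_as_root": False}     # locked, owners may not override

def compose(owner_spec, *, defaults, locked):
    """Merge an owner's settings with role-provided defaults and locked rules."""
    overridden = sorted(
        k for k in owner_spec if k in locked and owner_spec[k] != locked[k]
    )
    if overridden:
        raise ValueError(f"locked settings overridden: {overridden}")
    # Later dictionaries win: defaults < owner settings < locked rules.
    return {**defaults, **owner_spec, **locked}

app = compose({"replicas": 4}, defaults=SRE_BASE, locked=SECURITY_RULES)
print(app)  # {'replicas': 4, 'cpu': '500m', 'run_as_root': False}
```

The owner writes only the one setting that differs, and an attempt to set `run_as_root` to `True` is rejected rather than silently ignored.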
The delivery process, rule configuration, configuration items, and resource configuration are all defined in code alone. This converges the O&M definitions into one place and reduces the effort of developing against and understanding the various cloud products. It also forms a unified operation interface, which prevents the inconsistency risks caused by different systems and different permission policies.
Once every configuration change is reduced to a code change, it can be promoted to the production environment safely and reliably through the unified CI/CD process. This process consistency helps ensure quality, controls risk as far as possible, and makes it possible to prepare automated test cases for O&M changes.
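Because a configuration change is just a code change, it can pass through automated checks like any other commit. The sketch below shows a minimal gate that a CI/CD pipeline might run before promoting a change. The check functions and configuration keys are assumptions made for illustration.

```python
def check_replicas_positive(config):
    """At least one replica must be requested."""
    return config.get("replicas", 0) >= 1

def check_has_owner(config):
    """Every application definition must name an owner."""
    return bool(config.get("owner"))

def gate(config, checks):
    """Return the names of failed checks; an empty list means proceed."""
    return [check.__name__ for check in checks if not check(config)]

CHECKS = [check_replicas_positive, check_has_owner]

print(gate({"replicas": 3, "owner": "team-a"}, CHECKS))  # []
print(gate({"replicas": 0}, CHECKS))
# ['check_replicas_positive', 'check_has_owner']
```

Each check is an ordinary function, so O&M rules can be versioned, reviewed, and tested in the same way as application code.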
Alibaba's business is still developing at high speed, and cloud technologies, especially in the cloud-native field, are also maturing rapidly. Software development methods and tool systems need to meet challenges in a timely manner, constantly lower technical thresholds, improve efficiency, and reduce risks. The standardization and openness brought by cloud-native also allow the Alibaba Cloud R&D Team to productize internal practices continuously. The team can export them externally through Apsara DevOps, serving a large number of developers on the cloud. We hope our tools and practices can help you make full use of the cloud and share its benefits with us.