By Shanlie, Wang Yongmeng, and Zhang Yu
RiverGame Co. Ltd is an emerging mobile gaming enterprise. Since its establishment in 2018, it has targeted the global gaming market and earned a place in this highly competitive field by creating interesting game experiences. After only two years, RiverGame Co. Ltd has become one of the top 30 Chinese game manufacturers, with most of its success in overseas markets coming from its game, Top War. As indicated by its slogan, "better Chinese games," RiverGame Co. Ltd is enriching its game categories and hoping to bring more happiness to players worldwide.
The scale and complexity of the game servers are changing significantly because of the rapid growth of the business. Fortunately, RiverGame Co. Ltd has a small (but powerful) technical team that keeps exploring cutting-edge technology in its field. The team also uses multiple methods to improve the system architecture so that it better supports business needs and reduces IT costs.
One very important task across the multiple iterations of the technical architecture has been to abstract the common business capabilities in game scenarios and move them from the main game server to a unified service layer. The business capabilities separated from the main server include account management, Instant Messaging (IM), content security, the membership system, information push, and game behavior analysis. This separation reduces the business complexity of the main game server and lets it concentrate on supporting the core game scenarios. In addition, the common capabilities can be reused across multiple game categories, which reduces R&D costs and improves R&D efficiency.
Splitting out these capabilities and reducing business coupling has made continuous iteration and pre-research on new technologies easier. It has also created an opportunity for RiverGame to explore the cloud-native Serverless field in depth. Serverless architecture takes full advantage of the rapid scalability of computing resources and is an important development direction of cloud computing. In games, the main game servers provide complex core business logic that requires long-running processes and low-latency data interaction among multiple player terminals. Therefore, virtual machines or containers are still needed for the main servers. The peripheral business scenarios separated from the Top War main server became the first choice for piloting the Serverless technical architecture.
The online translation service was the first scenario piloted on Serverless, and it is closely tied to the company's globalization strategy. Top War, the company's representative title, is aimed at the global market and attracts players worldwide. In the game's chat interface, players from different countries discuss various game-related topics in different languages.
In this business scenario, worldwide players are brought together through a simple online translation function that offers an excellent user experience. The simple and easy-to-use design is also one of the reasons why Top War has been highlighted repeatedly by players in various major app markets.
It is not practical for RiverGame to develop a real-time translation tool that supports dozens of languages from scratch. Fortunately, the messages exchanged between game players are often brief, so the translation results do not need to be completely accurate. The timeliness of backend processing is the real focus. After a simple preprocessing of players' requests, the translation work can be forwarded to a third-party platform, since platforms like Google Translate already provide powerful online translation capabilities.
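The preprocess-then-forward flow can be sketched as follows. This is only an illustration, not RiverGame's implementation: the two preprocessing steps are the ones the article names later (replacing symbol codes and filtering sensitive content), but treating "symbol codes" as HTML-style entities, the `SENSITIVE_WORDS` list, and the `forward` callable standing in for the third-party translation platform are all assumptions.

```python
import html
import re
from typing import Callable

# Hypothetical word list; a real service would load it from a content-security source.
SENSITIVE_WORDS = {"badword", "cheatsite"}


def replace_symbol_codes(text: str) -> str:
    """Assumption: symbol codes are HTML-style entities such as &amp; or &#33;."""
    return html.unescape(text)


def filter_sensitive(text: str, mask: str = "*") -> str:
    """Mask each sensitive word, case-insensitively, keeping its length."""
    for word in SENSITIVE_WORDS:
        pattern = re.compile(re.escape(word), re.IGNORECASE)
        text = pattern.sub(lambda m: mask * len(m.group()), text)
    return text


def translate_chat(text: str, target_lang: str,
                   forward: Callable[[str, str], str]) -> str:
    """Preprocess a chat message, then hand it to the third-party translator.

    `forward(text, target_lang)` is a placeholder for the real platform call.
    """
    cleaned = filter_sensitive(replace_symbol_codes(text.strip()))
    return forward(cleaned, target_lang)
```

Passing the translator in as a callable keeps the preprocessing testable without network access and makes it easy to swap translation platforms.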
This is a simple function, but implementing it still poses challenges for the technical architecture. The number of online players varies over time, with peaks and valleys. When many players are online, the number of chat messages is large. Moreover, the number of chat messages is not proportional to the number of players online. Hot events trigger heated discussions among global players, which causes a surge in the number of messages requiring online translation. Thus, a scalable architecture is required to process players' translation requests.
The original architecture was implemented with Server Load Balancer (SLB) and PHP application clusters based on the EasySwoole framework.
In this architecture, the main applications, written in PHP, perform a series of preprocessing steps on players' translation requests, including replacing symbol codes and filtering sensitive content. Then, the requests are forwarded to a third-party translation platform to obtain the translation results. This is a widely used technical architecture with high concurrent processing capabilities. In the era of cloud computing, the throughput of the entire cluster can be adjusted dynamically as business volume changes, thanks to the scalable nature of cloud resources. However, from a cloud-native perspective, this architecture still has some imperfections when running in a large-scale production environment.
Is there a solution that lets the technical team focus on implementing the business logic and allocate resources according to the actual requests from players, so that resource utilization is maximized? As cloud computing develops rapidly, major cloud vendors are actively exploring new solutions that address cost and efficiency problems in a more "cloud-native" manner. The Serverless solution based on Alibaba Cloud Function Compute (FC) is prominent in this field.
FC is an event-driven, fully managed computing service. When using FC, developers only need to write and upload code, without managing infrastructure such as servers. FC automatically prepares computing resources and runs the business logic in a scalable and reliable manner. FC also provides additional features, such as log query, performance monitoring, and alerting, to ensure stable system operations.
Compared with a traditional application server, which must keep running in order to provide services, the biggest difference is that FC pulls up computing resources only when there are tasks to process and recycles them automatically after the tasks are completed. This solution lines up with the Serverless concept, which maximizes resource utilization and reduces both the system maintenance workload and usage costs. Since there is no need to apply for computing resources in advance, users on the pay-as-you-go model do not need to plan for capacity or scalability.
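The utilization argument can be made concrete with a little arithmetic. The sketch below assumes a long-running cluster is sized for peak load (optionally with some headroom), so its average utilization is the mean load divided by the provisioned capacity. The function name and the traffic figures in the usage note are illustrative only and are not RiverGame data.

```python
from typing import Sequence


def fixed_cluster_utilization(load_per_period: Sequence[float],
                              headroom: float = 1.0) -> float:
    """Average utilization of a cluster whose capacity is fixed at peak * headroom.

    load_per_period: requests handled in each equal-length period.
    headroom: safety factor applied on top of the observed peak (>= 1.0).
    """
    if not load_per_period:
        raise ValueError("load_per_period must not be empty")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")
    peak = max(load_per_period)
    if peak == 0:
        return 0.0
    capacity = peak * headroom
    mean_load = sum(load_per_period) / len(load_per_period)
    return mean_load / capacity
```

For a made-up traffic pattern of `[100, 100, 200, 1000, 100]` requests per period, a cluster sized exactly for the peak averages 30% utilization, and 25% with 20% headroom. With on-demand execution, resources are held only while requests are being processed, so this idle gap largely disappears.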
For simple business logic, such as online translation, it is easy to migrate from the traditional architecture to the Serverless architecture. Each translation request from a player is processed as an FC task: FC pulls up the corresponding computing resources to process it and releases them automatically after the task is completed. Since the RiverGame team is most familiar with Java, it reimplemented online translation in Java during the migration to Serverless, making full use of the Java ecosystem. FC does not prescribe a particular development language or business logic and supports all mainstream development languages. After the Serverless transformation, the online translation system architecture became simpler.
Functions configured with HTTP triggers can respond directly to requests sent by players and schedule the corresponding computing resources in a scalable and reliable way. Since task allocation in FC follows the changes in frontend user traffic, the now-redundant SLB can be removed from the architecture. The long-running application clusters are no longer needed either. The FC platform can quickly pull up a large number of computing resources to execute tasks concurrently and ensure the high availability of the entire architecture. In this process, Redis caches simple, frequently used statements to reduce dependencies on third-party platforms. The biggest surprise brought by this architecture is that the team no longer needs to carry out capacity planning or manage auto scaling. Therefore, the team can focus on meeting business needs and achieving business innovation in more fields.
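A minimal sketch of this request path is shown below, written in Python for brevity even though the team's implementation is in Java. It assumes a WSGI-style handler signature (the exact handler shape for HTTP triggers depends on the FC runtime and version, so check the FC documentation), a JSON request body with hypothetical `text` and `lang` fields, an in-memory dict standing in for Redis, and a `forward` callable standing in for the third-party translation platform.

```python
import io
import json
from typing import Callable, Dict, Optional


class TranslationService:
    """Cache-aside lookup: serve frequent phrases from the cache, else forward."""

    def __init__(self, forward: Callable[[str, str], str],
                 cache: Optional[Dict[str, str]] = None):
        self.forward = forward                            # placeholder for the third-party call
        self.cache = cache if cache is not None else {}   # dict stands in for Redis

    def translate(self, text: str, target_lang: str) -> str:
        key = f"{target_lang}:{text}"
        hit = self.cache.get(key)
        if hit is not None:
            return hit
        result = self.forward(text, target_lang)
        self.cache[key] = result
        return result


def make_handler(service: TranslationService):
    """Build a WSGI-style handler (assumed shape for an HTTP-triggered function)."""
    def handler(environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = json.loads(environ["wsgi.input"].read(length) or b"{}")
        translated = service.translate(body["text"], body["lang"])
        payload = json.dumps({"translation": translated}).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json"),
                                  ("Content-Length", str(len(payload)))])
        return [payload]
    return handler


def call_locally(handler, body: dict) -> dict:
    """Invoke the handler without a server, for local experiments."""
    raw = json.dumps(body).encode("utf-8")
    environ = {"CONTENT_LENGTH": str(len(raw)), "wsgi.input": io.BytesIO(raw)}
    chunks = handler(environ, lambda status, headers: None)
    return json.loads(b"".join(chunks))
```

Calling `call_locally(handler, {"text": "gg", "lang": "en"})` twice invokes `forward` only once; the second response comes from the cache, which is the effect the Redis layer is meant to have for high-frequency phrases.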
Compared with Node.js or other languages, Java instances take longer to initialize and load classes. Through various optimizations, FC can pull up computing resources in milliseconds. However, a Java program often takes a few seconds to start, which is not acceptable for delay-sensitive services such as online translation. The solution proposed by Alibaba Cloud for delay-sensitive businesses is to combine single-instance multiple-concurrency with reserved instances.
Each FC instance pulled up through single-instance with multiple-concurrency can process up to 100 tasks concurrently, reducing the average execution time, costs, and the probability of a cold start. With reserved instance optimization, FC allocates computing resources in advance based on the function load changes. As a result, the system can use reserved instances to process requests during the expansion of on-demand instances, eliminating the delay caused by cold starts completely.
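The effect of these two settings on instance counts can be estimated with simple arithmetic. The sketch below takes the per-instance concurrency of 100 from the text; the function names and the sample figures in the usage note are hypothetical, and the calculation ignores how FC actually schedules and scales instances.

```python
import math


def instances_needed(concurrent_requests: int,
                     per_instance_concurrency: int = 100) -> int:
    """Instances required to serve the given number of in-flight requests."""
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be non-negative")
    if per_instance_concurrency < 1:
        raise ValueError("per_instance_concurrency must be at least 1")
    return math.ceil(concurrent_requests / per_instance_concurrency)


def on_demand_instances(concurrent_requests: int, reserved: int,
                        per_instance_concurrency: int = 100) -> int:
    """Instances that must be started on demand (and may cold start)
    once the reserved instances are fully used."""
    needed = instances_needed(concurrent_requests, per_instance_concurrency)
    return max(0, needed - reserved)
```

For example, 250 in-flight requests need 250 instances at a concurrency of 1 but only 3 at a concurrency of 100. With 2 reserved instances, only 1 instance has to be started on demand, and the reserved instances keep serving traffic while it starts.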
The transformed online translation service uses a Serverless architecture with on-demand computing resources, making full use of the scalability of cloud computing. In terms of cost, since the application no longer needs to run continuously to provide external services, the use of cloud resources closely follows the actual changes in business volume. Thus, the average resource utilization improves significantly. In terms of system throughput, FC can pull up tens of thousands of instances in a short period and support massive concurrency during peak hours or when user requests surge. Moreover, no capacity assessment is needed beforehand. In terms of system maintenance, there is no need to reserve computing resources or maintain the underlying software and hardware, which reduces operational costs significantly. Thus, the RiverGame technical team can focus on implementing complex business logic and on technological innovation. In online translation scenarios, the Serverless solution based on FC saves more than 40% of IT costs compared with the traditional architecture.
Another improvement in R&D efficiency comes from the version and alias management feature provided by FC. A version is equivalent to a snapshot of the service, and an alias lets users release one or more versions of the service under a stable name. This enables continuous integration and release in the software development lifecycle and makes grayscale (canary) iteration of services convenient.
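The idea behind grayscale release through an alias can be modeled as weighted routing between two versions. The sketch below is a simplified model rather than the FC API: the `Alias` class, its field names, and the routing rule are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alias:
    """A stable name that points at a main version and, optionally, a gray version."""
    name: str
    main_version: str
    gray_version: Optional[str] = None
    gray_weight: float = 0.0  # fraction of requests sent to gray_version, 0.0 to 1.0

    def route(self, rng: random.Random) -> str:
        """Pick the version that should serve one request."""
        if self.gray_version is not None and rng.random() < self.gray_weight:
            return self.gray_version
        return self.main_version


def gray_share(alias: Alias, requests: int, seed: int = 0) -> float:
    """Fraction of simulated requests that reach the gray version."""
    rng = random.Random(seed)
    hits = sum(alias.route(rng) == alias.gray_version for _ in range(requests))
    return hits / requests
```

Callers keep invoking the alias (for example, a hypothetical `prod`), while the operator raises `gray_weight` step by step and finally promotes the gray version to `main_version`, or sets the weight back to zero to roll back.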
In later architecture optimizations, RiverGame Co. Ltd will try to preprocess the original content as much as possible using machine learning technology to reduce dependencies on third-party platforms. In the AI inference field, the advantages of Serverless architecture can also be applied to schedule large amounts of computing resources in a short time for large-scale concurrent processing through pre-trained deep learning models.
After the successful pilot of Serverless in the online translation scenario, RiverGame has continued to look for scenarios that suit Serverless in more business areas. Serverless technology has now been introduced into fields such as push services, content security, and game behavior analysis. In the future, RiverGame will continue to explore the Serverless architecture based on its technical characteristics, so it can enjoy the benefits of cloud computing while embracing new technologies.