Analysis on the Serverless Elasticity of Cloud-Native AnalyticDB for MySQL

This article discusses the AnalyticDB for MySQL data lakehouse edition and how its serverless elasticity helps reduce costs and improve efficiency.

By Li Wei (Muyuan), a core R&D staff member of Cloud-Native AnalyticDB for MySQL with ten years of R&D experience in data warehouses, data lakes, big data, and cloud-native technologies. He currently focuses on cloud-native data warehouses and serverless elasticity.

Background

With economic growth slowing and market demand sluggish, enterprises can reduce costs effectively by strengthening their digital infrastructure and improving operational efficiency. In this context, the AnalyticDB for MySQL data lakehouse edition, a cloud-native data warehouse, can be used flexibly and on demand.

The following figure shows the data lakehouse edition. The orange parts are functions that are new compared with the data warehouse edition, and the gray parts are functions that have been iteratively upgraded from the data warehouse edition.

[Figure 1]

1. The Value and Challenges of Serverless Elasticity

Definition of Serverless

Compared with serverful computing, serverless computing improves on three aspects (according to the Berkeley paper [1]):

  • Resource Decoupling: It weakens the connection between storage and computing. Storage and computing of services are deployed separately. Storage is no longer a part of the service but has evolved into a separate service. This makes computing stateless and easier to schedule and scale.
  • Auto Scaling: Running code no longer requires manual resource allocation. Users do not need to specify the resources a service requires (such as the number of machines, bandwidth, or disk size). Only a copy of the code is needed, and the rest is handled by the serverless platform.
  • Pay-by-Usage: Serverless is billed based on the usage of services (such as the number of calls, duration, and normalized resources) instead of on the resources provisioned (such as ECS instances and VM specifications), as in traditional serverful services.

Values of Serverless Elasticity

When using a cloud-native data warehouse service, have you encountered the following problems and wished the service provider would help you solve them?

Case 1: The mixed SQL load of the service includes short queries and offline ETL. When offline ETL is running, the response time of short queries is affected.

Case 2: In order to run a large offline SQL statement, the instance is scaled out. When the offline SQL statement is not running, the resources of the instance are wasted.

Case 3: During the peak period of the online load, you need to scale out the instances manually, which is error-prone.

Case 4: In emergencies, instances may fail to scale out because the load increase has exhausted the underlying resources.

Case 5: Instance elasticity is inefficient, and the startup time is significant compared with the time the resources are actually used.

The AnalyticDB for MySQL team continues to focus on these serverless elasticity issues. By productizing its technology, it helps enterprises build a digital infrastructure with better serverless elasticity capabilities.

Challenges of Serverless Elasticity

Serverless elasticity is used to help users solve the problems above. At the same time, it poses corresponding challenges to AnalyticDB for MySQL in terms of scheduling, cost, inventory, and elasticity efficiency. For example:

  • Inventory Supply: Whether there are sufficient resources to support large-scale elastic demand
  • Load Decoupling: How to intelligently identify and decouple online and offline SQL statements on the same instance
  • Offline Elasticity: How to realize on-demand use of resident instance quotas for batch processing after load decoupling
  • Online Elasticity: How to intelligently perceive load changes and scale online resources accordingly
  • Elasticity Efficiency: How to reduce the time overhead and cost of elasticity

These challenges have been gradually solved by AnalyticDB for MySQL. We're also trying to share the technologies used in the process so everyone can use the related product capabilities of AnalyticDB for MySQL to meet business needs.

2. Serverless Elastic Architecture

To provide serverless elastic product capabilities, the architecture needs two foundations: a fine-grained definition of the elastic unit, and an end-to-end pooled scheduling architecture that spans engines, resource scheduling, and resource inventory.

Definition of ACU Normalized Resources

The ACU, where 1 ACU is approximately equal to 1 core and 4 GB of memory, is introduced as the normalized resource unit for measuring the usage of elastic resources. Because 1 ACU is a small unit, AnalyticDB for MySQL can scale at a very fine granularity and help users keep costs as low as possible.
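As a rough illustration of the normalization, the sketch below converts a CPU and memory demand into a whole number of ACUs. Only the ratio (1 ACU ≈ 1 core, 4 GB) comes from the article. The function name `acus_needed` and the rule of taking the larger of the two dimensions and rounding up are assumptions made for this example, not the product's documented behavior.

```python
import math

ACU_CORES = 1       # from the article: 1 ACU is roughly 1 core
ACU_MEMORY_GB = 4   # from the article: 1 ACU is roughly 4 GB


def acus_needed(cores: float, memory_gb: float) -> int:
    """Smallest whole number of ACUs that covers both CPU and memory demand.

    Hypothetical helper: the rounding rule is an assumption for illustration.
    """
    if cores < 0 or memory_gb < 0:
        raise ValueError("demand must be non-negative")
    by_cpu = cores / ACU_CORES
    by_memory = memory_gb / ACU_MEMORY_GB
    # The scarcer dimension decides how many units are required.
    return math.ceil(max(by_cpu, by_memory))
```

Under these assumptions, a task that needs 8 cores and 48 GB is memory-bound and maps to 12 ACUs rather than 8.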

End-to-End Pooled Scheduling Architecture

A traditional ECS-based architecture with exclusive deployment struggles to guarantee elastic inventory, elasticity efficiency, and fine-grained resource elasticity at the same time. In the AnalyticDB for MySQL data lakehouse edition, resource scheduling is built on ACK/Kubernetes, and resource pools use a two-level inventory (fixed and elastic). The overall architecture can be divided into three layers:

  • Engine Scheduling Layer: Orchestrates elastic resources for the different engines (such as on-demand elasticity for batch processing and time-sharing elastic resource requests for online computing).
  • Unified Scheduling Layer: Hybrid scheduling is built based on the capabilities of ACK/Kubernetes, and storage, computing, and network infrastructure are managed simultaneously.
  • Elastic Inventory Scheduling Layer: Two-level resource pools are used to manage both fixed and elastic resource pools. This ensures resource supply during elasticity and optimizes elasticity efficiency.

[Figure 2]

3. Technical Analysis of Serverless Elasticity

In terms of product capabilities, the serverless elasticity of AnalyticDB for MySQL is practical (backed by two-level inventory assurance), fast (high elasticity efficiency), and accurate (matched to the business load without wasting resources).

Practical - Pooled Elastic Inventory Supply Technology

Both query-level elasticity for offline loads and node-level elasticity for online instances require inventory assurance. However, stocking up on a batch of machines to meet elasticity requirements places a heavy inventory cost on the AnalyticDB for MySQL service whenever users scale in their resources. To ensure elastic resource supply while minimizing the cost to AnalyticDB for MySQL, a two-level elastic inventory supply capability based on user-profile operations has been established.

  • Resource Requirements: Resource requirements are derived from user query loads and include scheduled elasticity and automatic elasticity.
  • Inventory Operations: The inventory consists of a fixed pool and an elastic pool. The fixed pool has an inventory supply cycle of 0.5-15 days and provides better elasticity efficiency and an estimable inventory. The elastic pool has an inventory supply cycle of 7s-180s, but its cost is high and its inventory cannot be estimated. The inventory operations module uses the inventory water-level profile of each resource type to predict demand and decide how much of each resource to purchase and release.
  • Inventory Supply: After receiving the resource demand from the inventory operations module, the inventory supply module selects the appropriate X-Dragon, ECS, and ECI models to meet the demand. It takes into account factors such as the inventory of each model, the combination of different models, and their cost performance.
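The two-level inventory can be pictured with a small sketch that splits a resource demand between the fixed pool and the elastic pool. The article does not spell out the allocation order. This example assumes the fixed pool is drained first because it is described as the more efficient one, with the elastic pool covering the remainder. `Allocation` and `allocate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    from_fixed: int    # ACUs served by the pre-provisioned fixed pool
    from_elastic: int  # ACUs that must be bought from the elastic pool


def allocate(demand_acu: int, fixed_free_acu: int) -> Allocation:
    """Split a demand across the two pool levels (assumed fixed-pool-first)."""
    if demand_acu < 0 or fixed_free_acu < 0:
        raise ValueError("ACU counts must be non-negative")
    from_fixed = min(demand_acu, fixed_free_acu)
    # Whatever the fixed pool cannot cover spills over to the elastic pool.
    return Allocation(from_fixed, demand_acu - from_fixed)
```

With 300 free ACUs in the fixed pool, a 500-ACU demand would take 300 ACUs from the fixed pool and 200 from the elastic pool. A real inventory operations module would also weigh cost and predicted water levels, which this sketch leaves out.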

[Figure 3]

Fast - Pooled Elastic Efficiency Optimization Technology

Besides inventory supply, elasticity efficiency is the other key technology behind load elasticity. If it takes ten minutes to start an offline query, the user experience suffers and the extra cost is high. In the on-demand resource mode of AnalyticDB for MySQL offline queries, a query at the 1,200-ACU scale can obtain its resources in only about 10s. To achieve this efficiency, the AnalyticDB for MySQL team has made end-to-end optimizations covering the query execution model, Pod storage, and Pod networking.

  • Master Pod Cache Pool: Executing a query requires one Master Pod and several Executor Pods. Starting the Master Pod adds time overhead, whereas multiple Executor Pods can be started concurrently. We have built a cache pool of Master Pods, which reduces the Master Pod startup overhead to the 100ms level.
  • Cache Disk Cache Pool: During execution, the Executor Pod of AnalyticDB for MySQL generates data (such as spill and shuffle data) that is stored on the Pod's cache disk. Mounting a cache disk on demand is expensive because it calls the cloud disk service. We have built a cache pool of cache disks on the nodes of the fixed pool, which reduces the overhead of mounting disks at Pod startup to about 0.5s.
  • Network Interface Controller Cache Pool: The network of the AnalyticDB for MySQL execution Pod uses the cloud-native ENI. Mounting an ENI on demand is expensive because it calls the VPC service. We have built buffer pools of ENIs on the nodes of the fixed pool, which reduces the mounting time of the network interface controller to 0.5s.
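The three cache pools above share one idea: pay the slow creation cost ahead of time so that a request only has to take a ready resource. The sketch below shows that pattern in a generic form. `WarmPool` and its methods are hypothetical names for illustration and do not reflect the actual AnalyticDB for MySQL implementation, which manages real Pods, disks, and ENIs through Kubernetes and cloud services.

```python
from collections import deque
from typing import Callable, Deque, Generic, TypeVar

T = TypeVar("T")


class WarmPool(Generic[T]):
    """Keeps up to `size` pre-created resources ready for immediate use."""

    def __init__(self, create: Callable[[], T], size: int) -> None:
        self._create = create  # slow factory, e.g. start a Pod or mount a disk
        self._size = size
        # Pay the creation cost up front, before any request arrives.
        self._ready: Deque[T] = deque(create() for _ in range(size))

    def acquire(self) -> T:
        """Return a warm resource, or create one on demand if the pool is empty."""
        if self._ready:
            return self._ready.popleft()
        return self._create()  # cold path: the caller pays the full startup cost

    def refill(self) -> None:
        """Top the pool back up to its target size (run in the background)."""
        while len(self._ready) < self._size:
            self._ready.append(self._create())

    def ready_count(self) -> int:
        return len(self._ready)
```

The design trade-off is the one the article describes for inventory in general: a larger pool hides more startup latency but holds more idle resources, so the pool size has to follow the expected request rate.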

[Figure 4]

Accurate - On-Demand Elasticity Technology Catering to the Business Load

On-Demand Elasticity Technology of the Online Load

After online/offline load decoupling, offline loads can achieve extreme resource elasticity on a per-query basis. Online queries, however, have strict response time (RT) requirements, so instance-level node elasticity is recommended to cope with load changes. On-demand elasticity for the AnalyticDB for MySQL online load achieves auto scaling through a closed feedback loop of load awareness → inventory supply → instance elasticity.

  • Load Awareness: It includes two modes: scheduled elasticity rules set by users (already productized) and elasticity driven by the AnalyticDB Workload Manager, which senses the service load by itself (under development).
  • Inventory Supply: After the load awareness module generates a specific demand for resource scaling, the inventory supply module prepares resources in advance or in real time.
  • Instance Elasticity: After the resources are ready, the instance elasticity module scales the instances to meet the resource requirements identified by load awareness.
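The scheduled-rule mode of load awareness can be sketched as a lookup from the current time to a target capacity, followed by the delta that the inventory supply and instance elasticity steps would act on. The rule format (`ScheduledRule` with a start time, an exclusive end time, and a target ACU count) and the function names are assumptions for illustration. The article does not describe the product's actual rule schema.

```python
from dataclasses import dataclass
from datetime import time
from typing import List


@dataclass(frozen=True)
class ScheduledRule:
    start: time       # rule applies from this time of day (inclusive)
    end: time         # until this time of day (exclusive)
    target_acu: int   # capacity the instance should have while the rule applies


def target_acu(now: time, rules: List[ScheduledRule], baseline_acu: int) -> int:
    """Load awareness: pick the capacity required at `now`."""
    matching = [r.target_acu for r in rules if r.start <= now < r.end]
    # Never go below the baseline; overlapping rules resolve to the largest target.
    return max([baseline_acu, *matching])


def scaling_delta(current_acu: int, desired_acu: int) -> int:
    """Positive: ACUs to request from inventory. Negative: ACUs to release."""
    return desired_acu - current_acu
```

For example, with a single rule that targets 64 ACUs from 09:00 to 12:00 and a 16-ACU baseline, the loop would request 48 extra ACUs at 09:00 and release them at 12:00.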

[Figure 5]

Decoupling On-Demand Elasticity Technology of Offline Load

A hybrid load on AnalyticDB for MySQL includes both online analysis and offline ETL analysis. In an architecture that does not decouple offline loads from online loads, online queries and offline analysis tasks share the same compute nodes, which causes two problems:

  • Offline Affects the Online Load Stability: Running offline tasks consume a lot of resources (such as CPU), which causes jitter in online services because online tasks are quite sensitive to node jitter.
  • High Cost: You need to start resident resources in advance to ensure that offline queries have sufficient resources. When the offline queries finish, these resources sit idle, and users bear the cost of the idle time.

AnalyticDB for MySQL provides elastic resource supply at the offline query level to solve these problems. Resources required for offline queries are completely isolated from online resources, so online loads are not affected. Resources for offline queries are requested on demand, so users do not bear the cost of idle resources.
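The cost argument can be made concrete with a small back-of-the-envelope calculation that compares resident resources with on-demand resources for the same offline job. The sketch assumes that both modes charge the same price per ACU-minute and that on-demand usage is rounded up to whole minutes. Both are assumptions for illustration, and real pricing may differ.

```python
import math

MINUTES_PER_DAY = 24 * 60


def billed_acu_minutes(acus: int, run_seconds: float) -> int:
    """On-demand usage, rounded up to whole minutes (assumed billing rule)."""
    return acus * math.ceil(run_seconds / 60)


def resident_acu_minutes(acus: int, days: int = 1) -> int:
    """Resident resources are paid for around the clock, busy or idle."""
    return acus * MINUTES_PER_DAY * days


def saving_ratio(acus: int, run_seconds_per_day: float) -> float:
    """Fraction of the resident cost avoided by paying only while queries run."""
    if acus <= 0:
        raise ValueError("acus must be positive")
    on_demand = billed_acu_minutes(acus, run_seconds_per_day)
    return 1 - on_demand / resident_acu_minutes(acus)
```

Under these assumptions, an offline job that keeps 1,200 ACUs busy for 6 hours a day uses a quarter of the ACU-minutes that resident resources would, which corresponds to a saving ratio of 0.75.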

[Figure 6]

The four basic technical capabilities above allow AnalyticDB for MySQL to support serverless elasticity. In the future, AnalyticDB for MySQL will continue to strengthen its technology in terms of intelligence, speed, and cost savings.

4. Serverless Elasticity Effects and Best Practices

Based on the preceding technologies, the Serverless elasticity of AnalyticDB for MySQL can bring about the following effects:

  • Load Awareness: Auto-scaling based on rules and loads is supported.
  • Billing: Minute-level billing is supported. Batch processing tasks provide SQL job-level billing information.
  • Cost: Pay-as-you-go billing is supported. The cost of usage can be reduced by up to 80%.
  • Elastic Capability: A single computing task supports elastic expansion in the range of 0 to 10,000 ACUs in seconds.

AnalyticDB for MySQL is suitable for users who need low-cost offline ETL processing together with high-performance online analysis to support BI reports, interactive queries, and applications.

Reference

[1] The Berkeley Paper: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-3.pdf
