This topic describes PolarDB for Xscale and its features. In this manual, it is referred to as PolarDB-X.
Introduction
PolarDB for Xscale
PolarDB-X is a cloud-native, distributed database service independently developed by Alibaba Cloud. It offers high throughput, large storage, low latency, high scalability, and high availability to cater to a wide variety of business requirements.
PolarDB-X has supported all services related to the Double 11 Shopping Festivals held by Alibaba Group each year. After a decade of continuous improvement, PolarDB-X provides robust data consistency, superior system stability, and exceptional scalability. It is widely adopted by institutions and the public sector in fields such as justice, finance, taxation, transportation, logistics, and energy.
PolarDB-X is an independently managed service designed on the principle of building distributed capabilities around the MySQL ecosystem, which minimizes the learning curve and usage costs for users.
In addition, PolarDB-X integrates centralized and distributed architectures to better support business growth.
Features
PolarDB-X can be deployed by using containers based on Alibaba Cloud resources. It adopts a shared-nothing architecture to decouple storage and compute provisioning. This design facilitates hierarchical capacity planning tailored to specific business requirements. PolarDB-X offers full compatibility with the MySQL open source ecosystem, including deep compatibility with SQL syntax, transaction behavior, and ecosystem tools. This facilitates the migration of applications from MySQL to PolarDB-X with minimal or no code modifications.
PolarDB-X supports smooth evolution from a single-server centralized architecture to a large-scale distributed architecture, scaling from a minimum of one server to a maximum of 1,024 servers (PB-level storage).
Financial-grade high availability
PolarDB-X adopts a multi-node architecture and uses Paxos to ensure strong consistency between nodes. Paxos requires that more than half of the nodes confirm each write operation. This ensures that a cluster continues to provide services even if a minority of its nodes fail, for example one node in a three-node cluster.
PolarDB-X can be deployed in different modes for disaster recovery. For example, in your PolarDB-X instance, you can deploy three nodes in a single data center in a region, three nodes in three data centers in the same region, or five nodes in three data centers across two regions to meet the financial-grade disaster recovery requirements.
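The majority rule and the deployment modes above can be checked with a few lines of arithmetic. The following is a minimal sketch, not PolarDB-X code: the function names and the node placements per data center (for example, 2 + 2 + 1 for the five-node mode) are assumptions made for illustration.

```python
def majority(n_nodes: int) -> int:
    """Smallest acknowledgement count that is more than half of n_nodes."""
    return n_nodes // 2 + 1


def tolerated_failures(n_nodes: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return n_nodes - majority(n_nodes)


def survives_any_dc_loss(layout: dict) -> bool:
    """True if losing any single data center still leaves a majority."""
    total = sum(layout.values())
    return all(total - count >= majority(total) for count in layout.values())


# Hypothetical node placements for the three deployment modes (assumed).
LAYOUTS = {
    "3 nodes, 1 data center": {"dc1": 3},
    "3 nodes, 3 data centers": {"dc1": 1, "dc2": 1, "dc3": 1},
    "5 nodes, 3 data centers": {"dc1": 2, "dc2": 2, "dc3": 1},
}

if __name__ == "__main__":
    for name, layout in LAYOUTS.items():
        n = sum(layout.values())
        print(name, majority(n), tolerated_failures(n), survives_any_dc_loss(layout))
```

Under these assumed placements, the single-data-center mode tolerates the loss of one node but not of the data center itself, whereas both three-data-center modes keep a majority after any one data center is lost.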
Transparent distribution
PolarDB-X strives to offer the same user experience as standalone MySQL databases. Therefore, PolarDB-X offers an easy-to-use distributed architecture that is transparent to users.
By default, PolarDB-X uses a primary key for sharding. This way, you do not need to specify a partition key when you migrate business data to PolarDB-X.
PolarDB-X delivers exceptional performance and strong consistency for distributed transactions. PolarDB-X uses the self-developed X-Paxos protocol to achieve a zero recovery point objective (RPO) during a failover. Additionally, PolarDB-X uses a timestamp oracle (TSO) policy and distributed multiversion concurrency control (MVCC) to ensure the isolation and consistency of distributed transactions.
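To illustrate how a global timestamp source and MVCC can combine to give consistent reads across shards, here is a simplified sketch. It is not PolarDB-X's implementation: `TimestampOracle`, `VersionedStore`, and the two-shard transfer are hypothetical stand-ins, and commit atomicity (normally provided by a commit protocol) is assumed rather than modeled.

```python
import itertools


class TimestampOracle:
    """Hands out strictly increasing timestamps (stand-in for a TSO service)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next_ts(self) -> int:
        return next(self._counter)


class VersionedStore:
    """One shard that keeps every committed version of each key."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value)

    def commit(self, key, value, commit_ts):
        self._versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = [(ts, v) for ts, v in self._versions.get(key, []) if ts <= snapshot_ts]
        return max(visible, key=lambda pair: pair[0])[1] if visible else None


tso = TimestampOracle()
shard_a, shard_b = VersionedStore(), VersionedStore()

# Initial balances, committed with one timestamp on both shards.
t0 = tso.next_ts()
shard_a.commit("alice", 100, t0)
shard_b.commit("bob", 0, t0)

snapshot_before = tso.next_ts()

# A cross-shard transfer: both writes share a single commit timestamp.
commit_ts = tso.next_ts()
shard_a.commit("alice", 70, commit_ts)
shard_b.commit("bob", 30, commit_ts)

snapshot_after = tso.next_ts()

before = (shard_a.read("alice", snapshot_before), shard_b.read("bob", snapshot_before))
after = (shard_a.read("alice", snapshot_after), shard_b.read("bob", snapshot_after))
```

Because both writes of the transfer carry the same commit timestamp, a reader sees either both of them or neither, depending only on its snapshot timestamp. The total balance is 100 in both snapshots.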
PolarDB-X supports linear scaling and uses consistent hashing for partitioning. To mitigate hotspotting and achieve load balancing, PolarDB-X can offload requests from overburdened nodes. During a scale-out, PolarDB-X implements computation offloading while maintaining stringent data consistency. The scale-out of a PolarDB-X instance does not affect your business. PolarDB-X also supports parallel queries and throttling to ensure business continuity during scaling.
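The scale-out behavior of consistent hashing can be sketched as follows. This is a generic hash ring with virtual nodes, offered as an illustration only; the partitioning scheme PolarDB-X actually uses may differ, and the class name, the node names (`dn-0` and so on), and the `vnodes` parameter are assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit point on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owners[point] = node

    def node_for(self, key: str) -> str:
        """Return the owner of the first ring position after the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


if __name__ == "__main__":
    keys = [f"order-{i}" for i in range(10_000)]
    ring = HashRing(["dn-0", "dn-1", "dn-2"])
    before = {k: ring.node_for(k) for k in keys}

    ring.add_node("dn-3")  # scale out by one node
    after = {k: ring.node_for(k) for k in keys}

    moved = [k for k in keys if before[k] != after[k]]
    print(f"{len(moved)} of {len(keys)} keys moved, all to dn-3:",
          all(after[k] == "dn-3" for k in moved))
```

The property this shows is that adding a node moves keys only onto the new node. Keys that stay put never change owner, which is what keeps the amount of data moved during a scale-out bounded.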
PolarDB-X provides binary logging to resolve the issues of data forwarding from compute nodes to data nodes in distributed databases. In distributed databases, if you restore nodes based on backup files that are created at different points in time, data inconsistency can occur. To resolve this issue, PolarDB-X ensures data consistency before backup and creates backup files based on the globally consistent data.
Integration of centralized and distributed architectures
PolarDB-X integrates centralized and distributed architectures, and therefore combines the scalability of distributed databases with the performance of a standalone database. It can switch between the two types of architectures smoothly. Data nodes in PolarDB-X are isolated as centralized databases, which are fully compatible with standalone databases. When distributed scaling is required as your business expands, PolarDB-X transitions to the distributed architecture. The corresponding distributed components integrate with the original data nodes to facilitate scaling, which eliminates the need for data migration or application modifications.
PolarDB-X provides two editions: Standard Edition (centralized architecture) and Enterprise Edition (distributed architecture). You can upgrade your PolarDB-X instance from Standard Edition to Enterprise Edition.
HTAP
With the rising adoption of cloud-native technologies, next-generation cloud-native data warehouses like Snowflake and the HTAP architecture are undergoing continuous refinement. HTAP is poised to become the norm for databases.
PolarDB-X offers the clustered columnar index (CCI) feature. By default, row-oriented tables have primary and secondary indexes. A CCI is an additional column-oriented secondary index that covers all columns of the row-oriented table. A table can therefore have both row- and column-oriented data. In addition, cost-based optimizers and vectorized operators are developed for HTAP. A set of SQL engines is used to support HTAP.
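The idea of keeping a column-oriented copy of all columns next to the row-oriented data can be sketched in a few lines. This toy model only illustrates why the two layouts suit different queries. It does not reflect how PolarDB-X stores or maintains a CCI, and the table contents and function names are hypothetical.

```python
# Row-oriented data keyed by primary key (hypothetical sample table).
rows = {
    1: {"id": 1, "region": "east", "amount": 120},
    2: {"id": 2, "region": "west", "amount": 50},
    3: {"id": 3, "region": "east", "amount": 80},
}


def build_columnar(rows: dict) -> dict:
    """Build a column-oriented copy that covers every column of the rows."""
    columns = {}
    for row in rows.values():
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def point_lookup(rows: dict, pk: int) -> dict:
    """Transactional-style access: fetch one whole row by primary key."""
    return rows[pk]


def sum_by(columns: dict, group_col: str, value_col: str) -> dict:
    """Analytical-style access: scan only the two columns that are needed."""
    totals = {}
    for group, value in zip(columns[group_col], columns[value_col]):
        totals[group] = totals.get(group, 0) + value
    return totals


columns = build_columnar(rows)
order = point_lookup(rows, 2)
totals = sum_by(columns, "region", "amount")
```

The point lookup touches a single row, while the aggregation reads two contiguous columns and never touches `id`. An optimizer that chooses between the two layouts per query is the role the cost-based optimizer plays in the description above.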
Open source and multi-cloud deployment
To meet the business requirements of different industries, PolarDB-X provides four deployment modes: Alibaba public cloud, Apsara Stack, DBStack, and Lite edition.
Alibaba public cloud supports fast iteration of PolarDB-X and ensures the stability of PolarDB-X instances. You can deploy PolarDB-X instances on Alibaba public cloud in fully managed mode. PolarDB-X provides a high-performance cloud-native database service in 13 regions worldwide.
Apsara Stack supports core Alibaba Cloud services to meet compliance requirements for security and isolation. Note that PolarDB-X Lite on Alibaba public cloud and Apsara Stack varies based on types of deployment resources.
DBStack is a lightweight platform for database management and supports core Alibaba Cloud services. DBStack meets the business requirements for high-performance, high-availability, and cost-effective database solutions in diverse scenarios.
PolarDB-X Lite provides up-to-date features and allows you to use minimal resources to create a distributed database cluster.
Security and stability
PolarDB-X has passed several national-level security certifications. It is successfully applied in the core systems in industries that have higher security requirements, such as finance and telecommunications.
PolarDB-X provides comprehensive security measures, such as IP address whitelists, SSL, transparent data encryption (TDE), backup encryption, always-confidential, SQL audit and tracing, three-role mode, and tag permissions.
PolarDB-X provides financial-grade high availability and disaster recovery. It achieves a zero recovery point objective (RPO) during a failover and supports deployment in three data centers across two regions. Therefore, it satisfies the level 5 disaster recovery standards of the finance industry.
Scenarios
Process a large number of transactions with low latency
Scenario
Internet businesses depend on transactions, and transaction systems are a core component of information systems. Business continuity, transactional consistency, and system security are fundamental to keeping transaction systems running as expected. In the Internet era, transaction systems increasingly need to sustain heavy load at low latency over long periods.
Features
Financial-grade high availability and transparent distribution
Store data in a centralized manner
Scenario
This scenario is also referred to as data centralization or data collation. In this scenario, an operational data store (ODS) is used to store enterprise business data and aggregate data sources that are vertically split. This scenario requires databases that support highly concurrent write operations, mass storage, multidimensional queries, and low-cost data processing.
Features
Transparent distribution, HTAP, and security and stability
Implement the database and table sharding architecture
Scenario
This scenario demands high throughput, high concurrency, robust stability, and distributed O&M capabilities (such as distributed DDL statements and scaling). Open-source components can be used to implement the database and table sharding architecture.
Features
Transparent distribution
Transform databases from a traditional architecture to a distributed architecture
Scenario
Business growth generates large volumes of data. If the maximum capacity of standalone databases is about to be exceeded, or large single tables cause performance and maintenance issues, you can transform databases from a traditional architecture to a distributed architecture to resolve these issues in a cost-effective manner. Transforming the databases is one of the key difficulties in moving information systems from a traditional architecture to a distributed architecture. A core demand of database users is to use distributed databases in the same way as standalone databases.
Features
Transparent distribution and integration of centralized and distributed architectures
Implement modular disaster recovery
Scenario
In industries such as finance and telecommunications, it is necessary to ensure the continuity of core services as workloads increase. To achieve this, the current architecture is transformed to a distributed one to create a modular framework. Each module processes its own business workloads, implementing data center-level fault isolation and active geo-redundancy.
Features
Financial-grade high availability, transparent distribution, and security and stability
Run transactional and analytical queries in hybrid mode
Scenario
Business on the Internet tends to be real-time and intelligent. Therefore, you may need to perform hybrid transaction/analytical processing (HTAP) on the same data source. To meet these requirements, databases must be easy to use and must ensure data consistency and security.
Features
HTAP
Minimize business costs and enhance productivity
Scenario
The business workload tends to be stable. Databases need to be optimized to minimize business costs. The following features are required: one-click MySQL migration, data compression, and data merge.
Features
Integration of centralized and distributed architectures, HTAP, open source, and multi-cloud deployment
Multi-cloud deployment for disaster recovery
Scenario
Self-built cross-cloud disaster recovery helps prevent vendor lock-in, strengthens control over the technology stack, and provides a way to escape failures that affect a single cloud. This requires a database that can cater to diverse deployment demands.
Features
Open source and multi-cloud deployment