By Qihua
PolarDB-X V2.4 supports the column-oriented storage query acceleration. It builds a mature column-oriented storage SQL engine on the low-cost OSS storage base to provide superior performance, higher cost-effectiveness, and query acceleration capabilities for query analysis.
In PolarDB-X, to build HTAP capabilities, the Compute Node (CN) and the Columnar work together to realize an efficient data processing and storage system. Compute nodes play an important role in query and transaction management, while columnars focus on column-oriented storage and access optimization of data.
In this collaborative architecture, DDL (Data Definition Language) operations face a series of challenges: how to maintain schema synchronization between CN and Columnar and ensure complete data consistency; how to properly handle potential schema asymmetry caused by binlog transmission delay; how to ensure that column-oriented storage always returns data based on the latest schema when performing SQL accelerated queries.
To effectively address these challenges, we designed the database Multi Schema Evolution process in the column-oriented storage engine so that the schema fully supports Instant DDL. This solution takes into account the data delay issues between CN and Columnar, ensuring the consistency of data versions, and thereby maintaining the overall stability and reliability of the system.
The above figure shows the overall architecture of the PolarDB-X product:
• Global Meta Service (GMS): provides distributed metadata and a global timestamp distributor named Timestamp Oracle (TSO) and maintains Meta information such as Table/Schema and Statistic. GMS also maintains security information such as accounts and permissions.
• Compute Node (CN): provides a distributed SQL engine that contains core optimizers and executors. A CN uses a stateless SQL engine to provide distributed routing and computing and uses the two-phase commit protocol (2PC) to coordinate distributed transactions. A CN also executes DDL operations in a distributed manner and maintains global indexes.
• Data Node (DN): provides a data storage engine. A DN uses Paxos to provide highly reliable storage services and uses multiversion concurrency control (MVCC) for distributed transactions. A DN also provides the pushdown computation feature to push down operators such as Project, Filter, Join, and Agg in distributed systems, and supports local SSDs and shared storage.
• Change Data Capture (CDC): provides a primary/secondary replication protocol that is compatible with MySQL. The primary/secondary replication protocol is compatible with the protocols and data formats that are supported by MySQL binlogs. A CDC uses the primary/secondary replication protocol to exchange data.
• Columnar: provides persistent columnar indexes and maintains and updates columnar indexes in real time based on changes recorded in distributed transaction binlogs to facilitate efficient analytical query processing. By leveraging object storage and working in tandem with CNs, a columnar node provides the scalability required for real-time updates and the capability to execute snapshot-consistent queries.
Imagine that your life is changing over time; you might change jobs and domiciles, or even have new family members. To adapt to these changes, you will make changes to your home, such as transforming the study into a baby room or adding a new workroom. You may not want to move to a new house for every little change.
Similarly, in the PolarDB-X column-oriented storage engine, it is also necessary to adapt to various changes in the table structure when the business is online and avoid performance and storage issues caused by such changes. This is what we call "Schema Evolution".
The following section describes how Schema Evolution improves the capabilities of the column-oriented storage engine.
In the PolarDB-X column-oriented storage engine architecture, we first need to consider the rapid iteration of Schema versions after performing DDL on the CN. As a columnar replica of a CN, Columnar needs to update its Schema in real time after the CN changes Schema, and accurately return the new Schema data in subsequent queries.
Let's imagine a simple scenario where we have a database with a table called "Students" to record student information. Initially, for quick design and simplicity, the student ID field used the varchar type, such as "1001" for the ID of "Bob". Over time, we found that using string types for ID storage was inefficient, so we changed the data type from varchar to bigint.
Then, we need to make a schema change to the student_id column in the "Students" table:
Such a simple DDL operation is generally performed in the row-oriented storage according to the process of creating a temporary table -> copying data (type conversion) -> checking constraints and switching tables, which may involve locking tables. However, this process does not work in the column-oriented storage engine, due to its unique form.
The column-oriented storage engine receives full and incremental data from the row-oriented storage in real time and finally stores the data in the form of ORC (a column-oriented storage format) files to the cloud-native Object Storage Service (OSS). If we copy the schema change plan of row-oriented storage, then we need to copy and convert all relevant data in OSS after a DDL operation. In massive data scenarios that require complex SQL query acceleration, it is impracticable in terms of performance.
In addition, as a replica of the row-oriented storage, the column-oriented storage needs to be consistent with the changes of the CN to meet the real-time performance of writing and reading. This solution generates huge latency, which also breaks the fundamental positioning of the product.
Data Forwarding of Columnar Index
Therefore, column-oriented storage must be implemented based on the design concept of Multi Schema. After Schema changes are generated in the row-oriented storage and synchronized to the column-oriented storage, the column-oriented storage needs to quickly generate a new version of the Schema and ensure that the subsequent incremental data can generate corresponding ORC files based on the new version.
In addition, the Schema of the old version also needs to be saved, because OSS still stores a large amount of data of the old version, which needs to be read in SQL accelerated query scenarios and converted to the latest Schema version. Take the "Students" table as an example. The Schema version of the originally stored Student table is called V1. After the ID column is changed from varchar to bigint, a new Schema version V2 will be generated.
Subsequent DML operations on this table will be performed based on V2 and a new ORC file will be generated. However, the existing ORC file will not be changed for the time being. If the read process needs to read V1 data based on the V2 Schema, we will convert the V1 data type and return it to the user. This not only reduces the performance overhead but also meets the real-time requirements of data.
What about the old version of the data, will it be retained? In the background of the column-oriented storage engine, an asynchronous thread is responsible for compressing and sorting files, where we will change the OSS data of the old version to the latest Schema version, thus reducing the subsequent overhead of reading and writing column-oriented storage data.
After implementing multi Schema in column-oriented storage, we need to control multi-version evolution. Since DML and DDL are transmitted through the binlog between CN and Columnar, which are different processes running on two machines, there exist certain unstable factors, resulting in potential asymmetry in schemas in CN and Columnar. PolarDB-X solves this problem through a series of version control methods. The following is a brief analysis of two possible scenarios:
From the preceding two examples, we can see that due to the uncertainty of the latest Schema version in CN and Columnar at a certain time, we need to design the capability to control multi schema, specify which schema should be used when users read data and complete the correct data conversion. PolarDB-X solves this problem by introducing a multi schema tracking mechanism based on a global timestamp (TSO), which can restore the corresponding version of the schema according to a given TSO. With the help of TSO in the Binlog, we specify the exact point in time when CN and Columnar update the Schema version. With this mechanism, we can achieve the following key capabilities:
• When CN reads files, it can learn the Columnar Schemas of different files and convert them into the current CN Schema version for reading.
• When processing incremental data, Columnar needs to know the Columnar Schema at the corresponding moment of each DML event and write the data to the file based on the Columnar Schema.
• If a fault occurs, data may be restored from the Columnar Schema corresponding to any Binlog file position.
During Schema Evolution, it is critical to ensure data compatibility across versions. As mentioned in the example at the beginning of the article, when the ID column of the "Students" table changes from the original varchar to bigint, this change must ensure a smooth conversion, that is, the varchar type data stored in the ORC file of V1 version can be seamlessly converted and conform to the bigint type constraint of V2 version. Furthermore, the rules for this format conversion must be strictly consistent with the standard conversion behavior of MySQL.
PolarDB-X addresses two types of issues involved:
In addition, to provide users with access to the latest data status, the background compression mechanism of the column-oriented storage engine continuously updates the ORC files of the old version according to the latest schema format, and then generates and maintains the new version of the files.
The features of the column-oriented storage engine are being enriched and expanded to cover a wider range of business scenarios and requirements. This includes the snapshot and archiving features for column-oriented storage replicas. The column-oriented storage engine is able to create exact replicas of row-oriented storage data continuously and in real time, essentially providing a snapshot capability based on any point in time. Users can create a column-oriented storage snapshot at a specific point in time at any time and use this snapshot to access the data at that moment. If users want to keep the snapshot for a long time, the engine also provides an archiving feature to persistently store the snapshot.
In addition, PolarDB-X supports the Time to Live (TTL) policy, which can automatically delete expired data and use the engine to archive it to lower-cost object storage services (such as OSS), thereby optimizing storage costs and data lifecycle management. These features make the column-oriented storage engine not only superior in performance but also provide new possibilities in data management and cost-effectiveness. Schema Evolution is one of the cornerstones supporting these features:
• Based on the multi schema design, the column-oriented storage engine can build snapshots based on the schema corresponding to specific points in time, avoiding problems caused by schema and data asymmetry.
• Based on the detailed data compatibility design, the engine supports smooth conversion policies to ensure that adjustments to the table structure will not affect the existing query and analysis operations as the business develops and changes. It allows data to be smoothly migrated from the old version to the new version and ensures the stability and compatibility of the old query results. When archiving data, the column-oriented storage engine follows the compatibility design principle so that even archived data can be accessed and analyzed under the new Schema version. This provides great flexibility for enterprise information systems with long time spans and frequent structural changes, and also greatly reduces the complexity and cost of maintenance.
Currently, the column-oriented storage engine is under rapid development and iteration. It already supports common DDL operations and can handle common data type changes. At the same time, we are working on improving the support for more complex DDL operations and the snapshot and archiving features.
The following table describes the supported DDL types.
DDL Type | Statement | Supported Items |
---|---|---|
Add a column | ADD COLUMN | Supported |
Drop a column | DROP COLUMN | Supported |
Change the column type | MODIFY COLUMN | Supported |
Change the column type and column name | CHANGE COLUMN | Supported |
Alter the column set default | ALTER COLUMN SET DEFAULT, ALTER COLUMN DROP DEFAULT | Supported |
Rename the table | RENAME TABLE | Supported |
Delete all data from tables | TRUNCATE TABLE | Not supported |
Execute multiple statements by using one statement string | ALTER TABLE [Statement 1], [Statement 2] | Not supported |
Partition modification operations | ||
Not supported |
For more information about DDL support and constraints, please refer to the official documentation of the PolarDB-X column-oriented storage engine: Create and Use CCI_ Cloud-native Database PolarDB (PolarDB) - Alibaba Cloud Help Center.
This article introduces the Schema Evolution capability designed for DDL (Data Definition Language) operations in the PolarDB-X column-oriented storage engine, which improves the support of the engine for schema changes in terms of multi-version, version evolution, and data compatibility. It is not only fully compatible with the DDL capability of row-oriented storage MySQL, but also avoids performance consumption, and achieves real-time and consistency of data in column-oriented storage replicas, thus supporting Instant DDL capabilities.
In addition, Schema Evolution also supports the subsequent snapshot and archiving features of the column-oriented storage engine, thus covering more business scenarios and further improving the business value of the engine.
Best Practices for GanosBase Real-Time Heatmap Aggregation Query
ApsaraDB - April 10, 2024
ApsaraDB - June 19, 2024
ApsaraDB - November 12, 2024
ApsaraDB - January 3, 2024
ApsaraDB - October 16, 2024
ApsaraDB - June 4, 2024
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAlibaba Cloud PolarDB for Xscale (PolarDB-X) is a cloud-native high-performance distributed database service independently developed by Alibaba Cloud.
Learn MoreAlibaba Cloud PolarDB for MySQL is a cloud-native relational database service 100% compatible with MySQL.
Learn MoreTSDB is a stable, reliable, and cost-effective online high-performance time series database service.
Learn MoreMore Posts by ApsaraDB