Full Support of PolarDB-X Columnar Index for Multi Schema

By Qihua

Background

PolarDB-X V2.4 supports the column-oriented storage query acceleration. It builds a mature column-oriented storage SQL engine on the low-cost OSS storage base to provide superior performance, higher cost-effectiveness, and query acceleration capabilities for query analysis.

In PolarDB-X, to build HTAP capabilities, the Compute Node (CN) and the Columnar work together to realize an efficient data processing and storage system. Compute nodes play an important role in query and transaction management, while columnars focus on column-oriented storage and access optimization of data.

In this collaborative architecture, DDL (Data Definition Language) operations face a series of challenges: how to maintain schema synchronization between CN and Columnar and ensure complete data consistency; how to properly handle potential schema asymmetry caused by binlog transmission delay; how to ensure that column-oriented storage always returns data based on the latest schema when performing SQL accelerated queries.

To effectively address these challenges, we designed the database Multi Schema Evolution process in the column-oriented storage engine so that the schema fully supports Instant DDL. This solution takes into account the data delay issues between CN and Columnar, ensuring the consistency of data versions, and thereby maintaining the overall stability and reliability of the system.

The above figure shows the overall architecture of the PolarDB-X product:

• Global Meta Service (GMS): provides distributed metadata and a global timestamp distributor named Timestamp Oracle (TSO) and maintains Meta information such as Table/Schema and Statistic. GMS also maintains security information such as accounts and permissions.

• Compute Node (CN): provides a distributed SQL engine that contains core optimizers and executors. A CN uses a stateless SQL engine to provide distributed routing and computing and uses the two-phase commit protocol (2PC) to coordinate distributed transactions. A CN also executes DDL operations in a distributed manner and maintains global indexes.

• Data Node (DN): provides a data storage engine. A DN uses Paxos to provide highly reliable storage services and uses multiversion concurrency control (MVCC) for distributed transactions. A DN also provides the pushdown computation feature to push down operators such as Project, Filter, Join, and Agg in distributed systems, and supports local SSDs and shared storage.

• Change Data Capture (CDC): provides a primary/secondary replication protocol that is compatible with MySQL. The primary/secondary replication protocol is compatible with the protocols and data formats that are supported by MySQL binlogs. A CDC uses the primary/secondary replication protocol to exchange data.

• Columnar: provides persistent columnar indexes and maintains and updates columnar indexes in real time based on changes recorded in distributed transaction binlogs to facilitate efficient analytical query processing. By leveraging object storage and working in tandem with CNs, a columnar node provides the scalability required for real-time updates and the capability to execute snapshot-consistent queries.

Multi Schema Evolution

Imagine that your life is changing over time; you might change jobs and domiciles, or even have new family members. To adapt to these changes, you will make changes to your home, such as transforming the study into a baby room or adding a new workroom. You may not want to move to a new house for every little change.

Similarly, in the PolarDB-X column-oriented storage engine, it is also necessary to adapt to various changes in the table structure when the business is online and avoid performance and storage issues caused by such changes. This is what we call "Schema Evolution".

The following section describes how Schema Evolution improves the capabilities of the column-oriented storage engine.

Multi Schema

In the PolarDB-X column-oriented storage engine architecture, we first need to consider the rapid iteration of Schema versions after performing DDL on the CN. As a columnar replica of a CN, Columnar needs to update its Schema in real time after the CN changes Schema, and accurately return the new Schema data in subsequent queries.

Let's imagine a simple scenario where we have a database with a table called "Students" to record student information. Initially, for quick design and simplicity, the student ID field used the varchar type, such as "1001" for the ID of "Bob". Over time, we found that using string types for ID storage was inefficient, so we changed the data type from varchar to bigint.

Then, we need to make a schema change to the student_id column in the "Students" table:

Such a simple DDL operation is generally performed in the row-oriented storage according to the process of creating a temporary table -> copying data (type conversion) -> checking constraints and switching tables, which may involve locking tables. However, this process does not work in the column-oriented storage engine, due to its unique form.

The column-oriented storage engine receives full and incremental data from the row-oriented storage in real time and finally stores the data in the form of ORC (a column-oriented storage format) files to the cloud-native Object Storage Service (OSS). If we copy the schema change plan of row-oriented storage, then we need to copy and convert all relevant data in OSS after a DDL operation. In massive data scenarios that require complex SQL query acceleration, it is impracticable in terms of performance.

In addition, as a replica of the row-oriented storage, the column-oriented storage needs to be consistent with the changes of the CN to meet the real-time performance of writing and reading. This solution generates huge latency, which also breaks the fundamental positioning of the product.

Data Forwarding of Columnar Index

Therefore, column-oriented storage must be implemented based on the design concept of Multi Schema. After Schema changes are generated in the row-oriented storage and synchronized to the column-oriented storage, the column-oriented storage needs to quickly generate a new version of the Schema and ensure that the subsequent incremental data can generate corresponding ORC files based on the new version.

In addition, the Schema of the old version also needs to be saved, because OSS still stores a large amount of data of the old version, which needs to be read in SQL accelerated query scenarios and converted to the latest Schema version. Take the "Students" table as an example. The Schema version of the originally stored Student table is called V1. After the ID column is changed from varchar to bigint, a new Schema version V2 will be generated.

Subsequent DML operations on this table will be performed based on V2 and a new ORC file will be generated. However, the existing ORC file will not be changed for the time being. If the read process needs to read V1 data based on the V2 Schema, we will convert the V1 data type and return it to the user. This not only reduces the performance overhead but also meets the real-time requirements of data.

What about the old version of the data, will it be retained? In the background of the column-oriented storage engine, an asynchronous thread is responsible for compressing and sorting files, where we will change the OSS data of the old version to the latest Schema version, thus reducing the subsequent overhead of reading and writing column-oriented storage data.

Version Evolution

After implementing multi Schema in column-oriented storage, we need to control multi-version evolution. Since DML and DDL are transmitted through the binlog between CN and Columnar, which are different processes running on two machines, there exist certain unstable factors, resulting in potential asymmetry in schemas in CN and Columnar. PolarDB-X solves this problem through a series of version control methods. The following is a brief analysis of two possible scenarios:

"CN Schema" version ahead of Columnar

Before time T1, CN executes the DDL statement.
At time T1, CN completes its internal execution logic and updates the CN Schema to version V2.
At time T2, Columnar receives the DDL event in the Binlog, updates the Columnar Schema to version V2, and generates the OSS file of the new version (V2).

"CN Schema" version behind Columnar

At time T1, CN executes the DDL statement.
At time T2, Columnar receives the DDL event in the Binlog, updates the Columnar Schema to version V2, and generates the OSS file of the new version (V2).
At time T3, CN completes its internal execution logic and updates the CN Schema to version V2.

From the preceding two examples, we can see that due to the uncertainty of the latest Schema version in CN and Columnar at a certain time, we need to design the capability to control multi schema, specify which schema should be used when users read data and complete the correct data conversion. PolarDB-X solves this problem by introducing a multi schema tracking mechanism based on a global timestamp (TSO), which can restore the corresponding version of the schema according to a given TSO. With the help of TSO in the Binlog, we specify the exact point in time when CN and Columnar update the Schema version. With this mechanism, we can achieve the following key capabilities:

• When CN reads files, it can learn the Columnar Schemas of different files and convert them into the current CN Schema version for reading.

• When processing incremental data, Columnar needs to know the Columnar Schema at the corresponding moment of each DML event and write the data to the file based on the Columnar Schema.

• If a fault occurs, data may be restored from the Columnar Schema corresponding to any Binlog file position.

Data Compatibility

During Schema Evolution, it is critical to ensure data compatibility across versions. As mentioned in the example at the beginning of the article, when the ID column of the "Students" table changes from the original varchar to bigint, this change must ensure a smooth conversion, that is, the varchar type data stored in the ORC file of V1 version can be seamlessly converted and conform to the bigint type constraint of V2 version. Furthermore, the rules for this format conversion must be strictly consistent with the standard conversion behavior of MySQL.

PolarDB-X addresses two types of issues involved:

During the conversion process involving multiple Schema versions, PolarDB-X ensures the accuracy of the data format and properly handles issues such as numeric precision and data truncation to maintain data integrity.
Given that the ORC file format is not directly compatible with MySQL, PolarDB-X is responsible for ensuring that the conversion and storage procedure of column-oriented storage files follows the MySQL format standard.

In addition, to provide users with access to the latest data status, the background compression mechanism of the column-oriented storage engine continuously updates the ORC files of the old version according to the latest schema format, and then generates and maintains the new version of the files.

Snapshot and Archive Features

The features of the column-oriented storage engine are being enriched and expanded to cover a wider range of business scenarios and requirements. This includes the snapshot and archiving features for column-oriented storage replicas. The column-oriented storage engine is able to create exact replicas of row-oriented storage data continuously and in real time, essentially providing a snapshot capability based on any point in time. Users can create a column-oriented storage snapshot at a specific point in time at any time and use this snapshot to access the data at that moment. If users want to keep the snapshot for a long time, the engine also provides an archiving feature to persistently store the snapshot.

In addition, PolarDB-X supports the Time to Live (TTL) policy, which can automatically delete expired data and use the engine to archive it to lower-cost object storage services (such as OSS), thereby optimizing storage costs and data lifecycle management. These features make the column-oriented storage engine not only superior in performance but also provide new possibilities in data management and cost-effectiveness. Schema Evolution is one of the cornerstones supporting these features:

• Based on the multi schema design, the column-oriented storage engine can build snapshots based on the schema corresponding to specific points in time, avoiding problems caused by schema and data asymmetry.

• Based on the detailed data compatibility design, the engine supports smooth conversion policies to ensure that adjustments to the table structure will not affect the existing query and analysis operations as the business develops and changes. It allows data to be smoothly migrated from the old version to the new version and ensures the stability and compatibility of the old query results. When archiving data, the column-oriented storage engine follows the compatibility design principle so that even archived data can be accessed and analyzed under the new Schema version. This provides great flexibility for enterprise information systems with long time spans and frequent structural changes, and also greatly reduces the complexity and cost of maintenance.

DDL Capability Support

Currently, the column-oriented storage engine is under rapid development and iteration. It already supports common DDL operations and can handle common data type changes. At the same time, we are working on improving the support for more complex DDL operations and the snapshot and archiving features.

The following table describes the supported DDL types.

DDL Type	Statement	Supported Items
Add a column	ADD COLUMN	Supported
Drop a column	DROP COLUMN	Supported
Change the column type	MODIFY COLUMN	Supported
Change the column type and column name	CHANGE COLUMN	Supported
Alter the column set default	ALTER COLUMN SET DEFAULT, ALTER COLUMN DROP DEFAULT	Supported
Rename the table	RENAME TABLE	Supported
Delete all data from tables	TRUNCATE TABLE	Not supported
Execute multiple statements by using one statement string	ALTER TABLE [Statement 1], [Statement 2]	Not supported
Partition modification operations
Not supported

For more information about DDL support and constraints, please refer to the official documentation of the PolarDB-X column-oriented storage engine: Create and Use CCI_ Cloud-native Database PolarDB (PolarDB) - Alibaba Cloud Help Center.

Summary

This article introduces the Schema Evolution capability designed for DDL (Data Definition Language) operations in the PolarDB-X column-oriented storage engine, which improves the support of the engine for schema changes in terms of multi-version, version evolution, and data compatibility. It is not only fully compatible with the DDL capability of row-oriented storage MySQL, but also avoids performance consumption, and achieves real-time and consistency of data in column-oriented storage replicas, thus supporting Instant DDL capabilities.

In addition, Schema Evolution also supports the subsequent snapshot and archiving features of the column-oriented storage engine, thus covering more business scenarios and further improving the business value of the engine.

Community

Full Support of PolarDB-X Columnar Index for Multi Schema

Background

Multi Schema Evolution

Multi Schema

Version Evolution

"CN Schema" version ahead of Columnar

"CN Schema" version behind Columnar

Data Compatibility

Snapshot and Archive Features

DDL Capability Support

Summary

Read previous post:

Read next post:

ApsaraDB

You may also like

Comments

ApsaraDB

Related Products

PolarDB for PostgreSQL

PolarDB for Xscale

PolarDB for MySQL

Time Series Database (TSDB)

A Free Trial That Lets You Build Big!