
AnalyticDB: XUANWU analytical storage engine

Last Updated: Jun 21, 2024

The XUANWU analytical storage engine provides highly reliable, highly available, enterprise-class data storage with high performance at low cost. It enables AnalyticDB for MySQL to support high-throughput, real-time data writes and high-performance, real-time queries.

High-throughput, real-time data writes

AnalyticDB for MySQL uses a three-layer architecture to provide high throughput capabilities. The access node layer, storage node layer, and persistent distributed storage layer can be scaled out in parallel. AnalyticDB for MySQL supports the hybrid row-column storage format and the asynchronous migration of incremental data to implement high-throughput, high-concurrency, real-time data writes.

AnalyticDB for MySQL uses the Raft consensus protocol and the apply method to write data synchronously. Data can therefore be queried immediately after it is written, and write consistency is ensured. The XUANWU analytical storage engine uses mark-for-delete to support high-throughput, real-time updates and deletes, and uses multiversion concurrency control (MVCC) to ensure data atomicity and integrity.
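The interaction between mark-for-delete and MVCC can be illustrated with a minimal sketch. It is not the XUANWU implementation: the `VersionedTable` and `RowVersion` classes, the integer version counter, and the in-memory lists are all assumptions made for illustration. The sketch only shows the general idea that updates and deletes mark old row versions instead of rewriting them, while readers see a consistent snapshot.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RowVersion:
    key: str
    value: Any
    created_at: int                   # version at which this row became visible
    deleted_at: Optional[int] = None  # version at which it was marked deleted


@dataclass
class VersionedTable:
    """Toy append-only table (hypothetical): old versions are marked, never rewritten."""
    rows: List[RowVersion] = field(default_factory=list)
    live: Dict[str, int] = field(default_factory=dict)  # key -> index of the live row
    version: int = 0

    def _mark_deleted(self, key: str, at: int) -> None:
        idx = self.live.pop(key, None)
        if idx is not None:
            self.rows[idx].deleted_at = at

    def upsert(self, key: str, value: Any) -> int:
        self.version += 1
        self._mark_deleted(key, self.version)  # mark the previous version, if any
        self.rows.append(RowVersion(key, value, self.version))
        self.live[key] = len(self.rows) - 1
        return self.version

    def delete(self, key: str) -> int:
        self.version += 1
        self._mark_deleted(key, self.version)
        return self.version

    def snapshot(self, at: int) -> Dict[str, Any]:
        """Return the rows visible to a reader whose snapshot version is `at`."""
        return {
            r.key: r.value
            for r in self.rows
            if r.created_at <= at and (r.deleted_at is None or r.deleted_at > at)
        }


table = VersionedTable()
v1 = table.upsert("user:1", {"age": 30})
v2 = table.upsert("user:1", {"age": 31})
v3 = table.delete("user:1")
```

In this sketch, a reader pinned to `v1` still sees the original row after the later update and delete, because those operations only set `deleted_at` on older versions.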

Hybrid row-column storage format

The XUANWU analytical storage engine supports a hybrid row-column storage format similar to the Optimized Row Columnar (ORC) and Parquet formats used in the Apache Hadoop ecosystem. The format supports column pruning for analytical queries, high-throughput data scanning, and row alignment, which enables high-performance random queries, especially in scenarios that involve multidimensional index filtering.

The following figure shows the hybrid row-column storage format.

Figure: Hybrid row-column storage format
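The general idea behind a hybrid row-column layout can be sketched as follows. This is an illustrative model only, not the on-disk format of XUANWU: the function names, the `group_size` parameter, and the sample rows are assumptions. Rows are split into groups, each group stores its values column by column, and the same offset in every column block of a group refers to the same row.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]
RowGroup = Dict[str, List[Any]]


def to_row_groups(rows: Sequence[Row], group_size: int) -> List[RowGroup]:
    """Split rows into groups; within each group, store values column by column."""
    groups: List[RowGroup] = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        groups.append({name: [r[name] for r in chunk] for name in chunk[0]})
    return groups


def scan_column(groups: List[RowGroup], column: str) -> List[Any]:
    """Column pruning: read one column's blocks and skip all other columns."""
    return [value for group in groups for value in group[column]]


def fetch_row(groups: List[RowGroup], group_size: int, row_id: int) -> Row:
    """Row alignment: offset i in every column block of a group is the same row."""
    group = groups[row_id // group_size]
    offset = row_id % group_size
    return {name: values[offset] for name, values in group.items()}


# Hypothetical sample data.
rows = [
    {"id": 1, "city": "Hangzhou", "age": 25},
    {"id": 2, "city": "Beijing", "age": 41},
    {"id": 3, "city": "Shanghai", "age": 33},
]
groups = to_row_groups(rows, group_size=2)
ages = scan_column(groups, "age")                 # touches only the "age" blocks
third = fetch_row(groups, group_size=2, row_id=2)  # reassembles one full row
```

An analytical scan such as `scan_column` reads a single column sequentially, while a point lookup such as `fetch_row` only needs to touch one group, which is the trade-off a hybrid layout aims for.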

Adaptive indexing

Online analytical processing (OLAP) scenarios require multidimensional queries, which the single-column or composite indexes traditionally used in online transaction processing (OLTP) scenarios cannot serve well. The XUANWU analytical storage engine uses adaptive indexing on columns to automatically select the index data structure for column types such as STRING, NUMBER, TEXT, JSON, and VECTOR. Additionally, the XUANWU analytical storage engine uses column-level indexes to implement multidimensional combined retrieval and multiway merges in a progressive streaming manner. This greatly improves data filtering performance.

The following types of indexes are supported: inverted indexes, BKD-tree indexes, and bitmap indexes. Index performance varies with data distribution characteristics, such as column cardinality and the number of table records involved in range queries. In some scenarios, using an index costs more than scanning the data. For example, a condition such as age > 0 AND age < 100 is likely to match almost every record, so the index filters out very little. The XUANWU analytical storage engine uses cost-based optimization (CBO) to determine whether to use an index or scan the data.
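A cost-based choice between an index and a scan can be sketched as follows. The cost model, the constants, the uniform-distribution assumption, and the names `ColumnStats`, `estimate_selectivity`, and `choose_access_path` are all hypothetical and chosen only for illustration. The actual cost model of the XUANWU analytical storage engine is not described here.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    row_count: int
    min_value: float
    max_value: float


def estimate_selectivity(stats: ColumnStats, low: float, high: float) -> float:
    """Estimate the fraction of rows between low and high, assuming uniform values."""
    span = stats.max_value - stats.min_value
    if span <= 0:
        return 1.0
    overlap = min(high, stats.max_value) - max(low, stats.min_value)
    return max(0.0, min(1.0, overlap / span))


def choose_access_path(
    stats: ColumnStats,
    low: float,
    high: float,
    scan_cost_per_row: float = 1.0,     # assumed cost of reading one row in a scan
    index_cost_per_match: float = 4.0,  # assumed cost of one index hit
    index_lookup_cost: float = 50.0,    # assumed fixed cost of opening the index
) -> str:
    """Return "index" or "scan", whichever has the lower estimated cost."""
    selectivity = estimate_selectivity(stats, low, high)
    scan_cost = stats.row_count * scan_cost_per_row
    index_cost = index_lookup_cost + stats.row_count * selectivity * index_cost_per_match
    return "index" if index_cost < scan_cost else "scan"


# Hypothetical statistics for an "age" column with values from 1 to 99.
age_stats = ColumnStats(row_count=1_000_000, min_value=1, max_value=99)
wide = choose_access_path(age_stats, 0, 100)    # matches almost every row
narrow = choose_access_path(age_stats, 30, 31)  # matches a small fraction of rows
```

Under these assumed costs, the wide range falls back to a scan because nearly every row matches, while the narrow range is cheaper to answer through the index.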

The following figure shows how to use multiway merges for different types of indexes.

Figure: XUANWU adaptive indexing
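A streaming multiway merge over index results can be sketched as follows. The sketch assumes that each index returns row IDs as a strictly increasing stream and that the conditions are combined with AND. The function name `intersect_sorted` and the sample row IDs are hypothetical. Because each stream is consumed lazily, matching row IDs are produced progressively, without any index materializing its full result first.

```python
from typing import Iterable, Iterator, List


def intersect_sorted(streams: List[Iterable[int]]) -> Iterator[int]:
    """Yield row IDs that appear in every strictly increasing stream."""
    iterators = [iter(s) for s in streams]
    if not iterators:
        return
    try:
        current = [next(it) for it in iterators]
        while True:
            target = max(current)
            if all(value == target for value in current):
                yield target
                current = [next(it) for it in iterators]
                continue
            # Advance every stream that is behind the largest candidate.
            for i, it in enumerate(iterators):
                while current[i] < target:
                    current[i] = next(it)
    except StopIteration:
        # Once any stream is exhausted, no further common row ID can exist.
        return


# Hypothetical row-ID streams from three different index types.
inverted_hits = [2, 5, 8, 13, 21]    # for example, city = 'Hangzhou'
bkd_hits = [1, 5, 8, 9, 21, 34]      # for example, a numeric range on age
bitmap_hits = [5, 7, 8, 21, 22]      # for example, a low-cardinality flag
matches = list(intersect_sorted([inverted_hits, bkd_hits, bitmap_hits]))
```

In this example, `matches` contains the row IDs 5, 8, and 21, which are the only IDs present in all three streams.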

Fusion of structured and unstructured indexes

The index manager of the XUANWU analytical storage engine centrally manages structured and unstructured indexes at the storage layer. These indexes include BKD-tree indexes for numeric values, inverted indexes for strings, unstructured indexes for JSON and vector data, and full-text indexes for text data. The index manager exposes a unified expression to the compute layer, so the SQL logic of the compute layer works across different data types and queries are accelerated. AnalyticDB for MySQL can also perform correlation analysis between full-text data and structured tables to support complex SQL logic.
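The idea of a single entry point over different index types can be sketched as follows. This is not the XUANWU index manager API: the `ColumnIndex` protocol, the `IndexManager`, `InvertedIndex`, and `SortedRangeIndex` classes, and the predicate shapes are hypothetical. A sorted list stands in for a BKD tree to keep the sketch short.

```python
from bisect import bisect_left, bisect_right
from typing import Any, Dict, List, Protocol, Set, Tuple


class ColumnIndex(Protocol):
    def lookup(self, predicate: Any) -> Set[int]: ...


class InvertedIndex:
    """String column: maps each value to the row IDs that contain it."""

    def __init__(self, values: List[str]) -> None:
        self.postings: Dict[str, Set[int]] = {}
        for row_id, value in enumerate(values):
            self.postings.setdefault(value, set()).add(row_id)

    def lookup(self, predicate: str) -> Set[int]:
        return set(self.postings.get(predicate, set()))


class SortedRangeIndex:
    """Numeric column: sorted (value, row_id) pairs stand in for a BKD tree."""

    def __init__(self, values: List[float]) -> None:
        self.entries = sorted((v, i) for i, v in enumerate(values))
        self.keys = [v for v, _ in self.entries]

    def lookup(self, predicate: Tuple[float, float]) -> Set[int]:
        low, high = predicate  # inclusive bounds
        start = bisect_left(self.keys, low)
        end = bisect_right(self.keys, high)
        return {row_id for _, row_id in self.entries[start:end]}


class IndexManager:
    """Single entry point: callers do not need to know each column's index type."""

    def __init__(self) -> None:
        self.indexes: Dict[str, ColumnIndex] = {}

    def register(self, column: str, index: ColumnIndex) -> None:
        self.indexes[column] = index

    def filter(self, predicates: Dict[str, Any]) -> Set[int]:
        """Combine the per-column lookups with AND."""
        results = [self.indexes[col].lookup(pred) for col, pred in predicates.items()]
        return set.intersection(*results) if results else set()


# Hypothetical sample columns.
manager = IndexManager()
manager.register("city", InvertedIndex(["Hangzhou", "Beijing", "Hangzhou", "Shanghai"]))
manager.register("age", SortedRangeIndex([25, 41, 33, 29]))
matches = manager.filter({"city": "Hangzhou", "age": (30, 40)})
```

The caller passes one set of predicates and receives matching row IDs, regardless of which index structure serves each column.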