This topic describes the features of AnalyticDB for MySQL.
Compute engines
AnalyticDB for MySQL supports three types of compute engines: Spark, XIHE massively parallel processing (MPP), and XIHE bulk synchronous parallel (BSP).
Spark engine
Data Lakehouse Edition (V3.0) is fully compatible with Apache Spark, a multi-language programmable compute engine, and supports seamless migration of applications. This engine is ideal for complex extract-transform-load (ETL) processing and machine learning scenarios.
XIHE MPP engine
The XIHE MPP engine uses an MPP architecture to execute all tasks of a query concurrently. Data pipelines stream intermediate results between tasks, which meets low-latency ad hoc analysis requirements.
XIHE BSP engine
The XIHE BSP engine organizes the tasks of a query into a directed acyclic graph (DAG) and uses a batch computing architecture to compute each task stage by stage. This engine supports disk storage and is ideal for complex analysis scenarios that involve large data volumes, high throughput, and limited resources. You are charged for the resources you use on a pay-as-you-go basis. Exclusive resources are allocated to each query, preventing resource contention among queries. Furthermore, the XIHE BSP engine provides powerful failover capabilities and can retry failed tasks.
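The batch model described above can be illustrated with a short Python sketch. The task names, the retry count, and the in-memory "materialization" are all hypothetical simplifications, not the actual XIHE implementation: tasks of a DAG run stage by stage, each stage's output is materialized before downstream tasks start, and a failed task is retried individually instead of restarting the whole query.

```python
from collections import defaultdict, deque

def bsp_execute(tasks, deps, retries=2):
    """Run the tasks of a DAG stage by stage (BSP style).

    tasks: {name: callable(inputs_dict) -> result}
    deps:  {name: [upstream task names]}
    Each stage's results are materialized (here: a dict; a real engine
    would persist them to disk) before downstream tasks may start.
    """
    indegree = {t: len(deps.get(t, [])) for t in tasks}
    children = defaultdict(list)
    for t, ups in deps.items():
        for u in ups:
            children[u].append(t)
    ready = deque(t for t, d in indegree.items() if d == 0)
    results = {}
    while ready:
        stage = list(ready)      # all currently runnable tasks form one stage
        ready.clear()
        for t in stage:
            inputs = {u: results[u] for u in deps.get(t, [])}
            for attempt in range(retries + 1):
                try:
                    results[t] = tasks[t](inputs)
                    break        # failover: retry only the failed task
                except Exception:
                    if attempt == retries:
                        raise
            for c in children[t]:
                indegree[c] -= 1
                if indegree[c] == 0:
                    ready.append(c)
    return results
```

For example, two scan tasks can feed a union task that feeds an aggregation task; the scans form one stage, the union the next, and the aggregation the last.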
The XIHE BSP engine does not allow you to use user-defined functions (UDFs) or write Hudi tables.
Comparison between XIHE MPP and XIHE BSP
When you submit a query to the frontend node and specify an interactive resource group to execute the query, the query is executed in XIHE MPP mode on the permanent compute nodes. If you specify a job resource group to execute the query, the query is executed in XIHE BSP mode on the temporary compute nodes.
Item | XIHE MPP | XIHE BSP |
Resource group | Interactive resource group | Job resource group |
Response time (RT) | Within milliseconds | Within seconds or minutes |
Scenario | Latency-sensitive applications with high queries per second (QPS) | High-throughput applications that have relatively low latency requirements |
Isolation | Among resource groups | Among queries |
SQL syntax | No difference | No difference |
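The routing rule above can be summarized in a minimal sketch. The function name and return strings are illustrative; the real dispatch happens on the frontend node:

```python
def dispatch(resource_group_type):
    """Return the execution mode for a query based on the type of the
    resource group it is submitted to (simplified sketch of the routing
    rule described above)."""
    if resource_group_type == "interactive":
        return "XIHE MPP"   # runs on the permanent compute nodes
    if resource_group_type == "job":
        return "XIHE BSP"   # runs on the temporary compute nodes
    raise ValueError(f"unknown resource group type: {resource_group_type}")
```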
Storage engines
XUANWU storage engine (C-Store)
The XUANWU storage engine provides cost-effective enterprise-grade data storage with high reliability, availability, and performance, laying a foundation for high-throughput real-time writes and high-performance real-time queries.
Hudi storage engine
Data Lakehouse Edition (V3.0) uses cost-effective Object Storage Service (OSS) and open source Apache Hudi storage solutions to implement incremental data processing in near real time.
Connection methods
The following table describes the connection methods that are supported by AnalyticDB for MySQL.
Connection method | References |
MySQL command-line tool | Use the MySQL command-line tool to connect to AnalyticDB for MySQL |
Business system | Java, Druid connection pool, Python, PHP, C#, and Go |
Client | |
BI tool | FineBI, Quick BI, Yonghong BI, DataV, Tableau, QlikView, FineReport, Power BI, and Smartbi |
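Because AnalyticDB for MySQL speaks the MySQL protocol, any MySQL driver can connect to it. The following Python sketch builds connection parameters; the endpoint, account, and database names are placeholders, not real values, and the actual connect call (shown commented out) requires a reachable cluster:

```python
def make_conn_params(host, user, password, database, port=3306):
    """Build MySQL-protocol connection parameters for an AnalyticDB for
    MySQL cluster. All values passed in below are placeholders."""
    return {"host": host, "port": port, "user": user,
            "password": password, "database": database, "charset": "utf8mb4"}

params = make_conn_params("am-xxxx.ads.aliyuncs.com", "adb_account",
                          "your_password", "adb_demo")
# Any MySQL driver accepts these parameters. For example, with PyMySQL:
# import pymysql
# conn = pymysql.connect(**params)
# with conn.cursor() as cur:
#     cur.execute("SELECT 1")
```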
Security
IP address whitelists
By default, AnalyticDB for MySQL clusters do not allow access from external connections. To access an AnalyticDB for MySQL cluster, you must add the IP address or CIDR block of your client to a whitelist of the cluster. For more information, see Configure a whitelist.
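A whitelist entry can be a single IP address or a CIDR block. The matching rule can be sketched with the Python standard library `ipaddress` module; this illustrates the concept, not the service's actual implementation:

```python
import ipaddress

def is_allowed(client_ip, whitelist):
    """Return True if client_ip matches any IP address or CIDR block
    in the whitelist (illustrative sketch of the whitelist rule)."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(entry, strict=False)
               for entry in whitelist)
```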
SQL audit
SQL audit records the information about DML and DDL operations, helping you perform fault analysis, behavior analysis, and security audit and improving database security. For more information, see Configure SQL audit.
Disk encryption
AnalyticDB for MySQL provides the disk encryption feature, which encrypts the data on each disk in your cluster based on block storage. Encrypted data cannot be deciphered even if it is leaked. For more information, see Disk encryption.
Access control
Resource Access Management (RAM) controls the access permissions of RAM users on cloud resources. After a RAM user creates an AnalyticDB for MySQL cluster, only that RAM user and the Alibaba Cloud account to which the RAM user belongs can view and manage the cluster. If multiple users in your organization need to use the cluster, you can create multiple RAM users and grant them the permissions to view or manage the cluster. For more information, see Manage RAM users and permissions.
Database permission control
After you create a database account, you can perform database operations. For example, you can create or delete databases, tables, and views, and insert, update, and query data. AnalyticDB for MySQL provides two types of database accounts: privileged accounts and standard accounts. A privileged account can grant standard accounts permissions at different levels, such as the cluster, database, table, and column levels. For more information, see Account types and Database permission model.
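The four permission scopes nest from cluster down to column. The following sketch models a grant as a `(database, table, column)` tuple in which `"*"` matches any name; this is an illustration of the scope hierarchy, not the service's permission model:

```python
def has_permission(grants, db, table, column):
    """Check whether any grant covers the requested column, honoring the
    cluster-, database-, table-, and column-level scopes described above.
    A grant is a tuple (db, table, column); '*' matches any name, so
    ('*', '*', '*') represents a cluster-level grant."""
    for gdb, gtable, gcol in grants:
        if gdb in ("*", db) and gtable in ("*", table) and gcol in ("*", column):
            return True
    return False
```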
Monitoring and alerting
AnalyticDB for MySQL provides performance metrics to help you obtain a clear understanding of the health status and performance of cluster nodes. For more information, see View monitoring information. In addition, AnalyticDB for MySQL can send alerts to contacts if the values of CPU utilization, disk usage, IOPS, query duration, and number of database connections exceed specified thresholds. For more information, see Configure an alert rule.
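The threshold rule behind alerting can be sketched as follows. The metric names and values are illustrative, not the service's actual metric identifiers:

```python
def check_alerts(metrics, thresholds):
    """Return the names of metrics whose current values exceed their
    configured thresholds (illustrative sketch of the alerting rule)."""
    return [name for name, value in metrics.items()
            if name in thresholds and value > thresholds[name]]
```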
Backup and restoration
AnalyticDB for MySQL supports periodic full backup and log backup, which can effectively prevent data loss.
Full data backup
AnalyticDB for MySQL compresses full data snapshots of a cluster and stores them on offline storage media. If you want to perform a full data restoration, AnalyticDB for MySQL clones a cluster, downloads the backup sets, and then restores data to the new cluster. For more information, see Manage backups.
Log backup
AnalyticDB for MySQL uploads redo logs in parallel from multiple nodes to OSS to store the logs in real time. You can restore the data of a cluster to a specific point in time by using a full backup together with the redo logs generated up until that point in time, which ensures data security throughout this period. For more information, see Manage backups.
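The point-in-time restore procedure described above can be sketched in a few lines: pick the latest full backup taken at or before the target time, then replay only the redo logs between that backup and the target. The data model (a dict of key-value pairs and `(key, value)` log entries) is a deliberate simplification, not the actual backup format:

```python
from bisect import bisect_right

def restore_point_in_time(full_backups, redo_logs, target_ts):
    """Sketch of point-in-time restore.

    full_backups: list of (timestamp, snapshot_dict), sorted by timestamp
    redo_logs:    list of (timestamp, (key, value)) entries, sorted
    """
    times = [ts for ts, _ in full_backups]
    i = bisect_right(times, target_ts) - 1
    if i < 0:
        raise ValueError("no full backup at or before the target time")
    base_ts, snapshot = full_backups[i]
    state = dict(snapshot)
    for ts, (key, value) in redo_logs:
        if base_ts < ts <= target_ts:   # replay only logs after the backup
            state[key] = value
    return state
```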
Data restoration from backups
AnalyticDB for MySQL supports full data restoration and restoration to a point in time. AnalyticDB for MySQL clones a cluster and restores data of the downloaded backup sets to the new cluster. For more information, see Clone a cluster.
Intelligent diagnostics and optimization
Schema optimization
Schema design and optimization can significantly reduce database costs and improve query performance. AnalyticDB for MySQL periodically analyzes your workloads based on the performance metrics of SQL queries and relevant information such as tables and indexes. It automatically offers optimization suggestions that you can apply with one click, which greatly simplifies the optimization process. For more information, see Schema optimization.
SQL pattern
The SQL pattern feature is designed for real-time SQL statements. AnalyticDB for MySQL can group similar SQL statements into an SQL pattern and perform intelligent diagnostics and analysis on the pattern, improving the diagnostics efficiency. The diagnostic results of SQL patterns can serve as an effective basis for database optimization. For more information, see Schema optimization.
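The grouping idea can be illustrated by normalizing literal constants out of SQL statements so that structurally similar queries produce the same pattern. This regex-based sketch is a simplification of the idea, not the service's exact algorithm:

```python
import re

def sql_pattern(sql):
    """Normalize an SQL statement into a pattern by replacing literal
    constants with '?', so similar statements group together."""
    s = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)   # string literals
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)     # numeric literals
    return re.sub(r"\s+", " ", s).strip()
```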
SQL diagnostics
AnalyticDB for MySQL provides diagnostic results and optimization suggestions based on statistics of queries, stages, and operators. For more information, see Introduction to SQL diagnostics.
Data import and export
You can import data from other databases, OSS, Tablestore, Apsara File Storage for HDFS, MaxCompute, ApsaraMQ for Kafka, or Log Service to AnalyticDB for MySQL, and export data from AnalyticDB for MySQL to other databases, OSS, Apsara File Storage for HDFS, or MaxCompute. For more information, see Data import overview and Data export overview.
Data ingestion
Data source management
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) supports ApsaraMQ for Kafka, Log Service, and Hive data sources. Each data synchronization or migration job requires a data source, and a data source can be reused across multiple jobs, which saves you from repeatedly configuring the same connection information. You can create, query, modify, and delete data sources.
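The create/query/modify/delete lifecycle of reusable data sources can be sketched as a small registry. The class and method names are hypothetical, not the console or API names:

```python
class DataSourceRegistry:
    """Minimal sketch of reusable data source management: create a
    source once, then reuse it across synchronization and migration
    jobs (hypothetical API for illustration)."""

    def __init__(self):
        self._sources = {}

    def create(self, name, kind, config):
        # Mirrors the three supported source types described above:
        # ApsaraMQ for Kafka, Log Service (SLS), and Hive.
        if kind not in {"kafka", "sls", "hive"}:
            raise ValueError(f"unsupported data source type: {kind}")
        self._sources[name] = {"type": kind, "config": dict(config)}

    def get(self, name):
        return self._sources[name]

    def update(self, name, config):
        self._sources[name]["config"].update(config)

    def delete(self, name):
        del self._sources[name]
```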
Data synchronization
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to create data synchronization jobs to synchronize data from ApsaraMQ for Kafka and Log Service to Data Lakehouse Edition in real time. This helps meet requirements such as near-real-time data ingestion, full data archiving, and elastic analysis. For more information, see Use data synchronization to synchronize Kafka data to Data Lakehouse Edition and Use data synchronization to synchronize Log Service data to Data Lakehouse Edition.
Data migration
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to create data migration jobs to migrate Hive data to OSS. For more information, see Use data migration to migrate Hive data to Data Lakehouse Edition.
Metadata discovery
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) can automatically discover OSS buckets and data objects that are stored in the same region, and create and update data lake metadata. For more information, see Use metadata discovery to import data to Data Lakehouse Edition.
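The discovery idea can be sketched by inferring database and table names from object keys laid out as `<database>/<table>/<file>`. Real metadata discovery also infers schemas and partitions; this layout is an assumed convention for illustration only:

```python
def discover_metadata(object_keys):
    """Infer database/table metadata from OSS-style object keys laid
    out as '<db>/<table>/<file>' (simplified illustration)."""
    meta = {}
    for key in object_keys:
        parts = key.strip("/").split("/")
        if len(parts) >= 3:
            db, table = parts[0], parts[1]
            meta.setdefault(db, set()).add(table)
    return meta
```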
Data management
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) allows you to manage databases and tables in a visualized manner. You can view information about databases, tables, and views in the AnalyticDB for MySQL console. For more information, see Manage data.
Job development
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) provides two engines that support different job development methods: the open source Spark engine and the self-developed XIHE engine. For more information, see Job development.
Job scheduling
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) provides job scheduling for batch SQL and Spark applications to help you perform complex ETL processing. For more information, see Perform job scheduling in Data Lakehouse Edition.
SQL
XIHE SQL
When you execute XIHE SQL statements, AnalyticDB for MySQL uses the self-developed XIHE engine, which is highly compatible with MySQL syntax, functions, and data types.
Spark SQL
AnalyticDB for MySQL Data Lakehouse Edition (V3.0) supports the Spark engine that is fully compatible with Apache Spark. You can use Spark SQL to perform data lake operations. For more information, see SQL Reference.