
AnalyticDB: Getting started with Enterprise Edition, Basic Edition, and Data Lakehouse Edition

Last Updated: Feb 10, 2026

AnalyticDB for MySQL is a cloud-hosted, petabyte-scale, real-time data warehouse built for high-concurrency online analytical processing (OLAP). This topic walks you through creating and using an AnalyticDB for MySQL cluster, from setup to job development.

Important

Data Lakehouse Edition is no longer available for new purchases. If you already purchased a Data Lakehouse Edition cluster, you can still follow this topic to use it.

Preparations

  • You have an Alibaba Cloud account.

    If you do not have one, go to the Alibaba Cloud website to register. If your organization requires multiple users to access resources collaboratively, create Resource Access Management (RAM) users under your Alibaba Cloud account (root account) and assign them permissions. This lets each RAM user access only the resources they need.

  • If you are using AnalyticDB for MySQL Enterprise Edition, Basic Edition, or Data Lakehouse Edition for the first time, read Product introduction to learn about the product concepts, benefits, and common scenarios of AnalyticDB for MySQL.

Costs

Creating a cluster incurs compute and storage costs. For details, see Pricing for Enterprise Edition and Basic Edition and Pricing for Data Lakehouse Edition.

Procedure

Step 1: Create a cluster

  1. Log on to the AnalyticDB for MySQL console. In the upper-right corner of the page, click Create Cluster.

  2. On the purchase page, configure key parameters to quickly select your cluster type. Keep other parameters at their default values. For more parameter details, see Create a cluster.

    Configuration item

    Description

    Product Type

    • Pay-as-you-go: You are billed by the hour based on actual usage. Best for short-term needs. Release the cluster when you are done to save costs.

    • Subscription: Pay upfront when you create the cluster. Best for long-term needs. Offers lower prices than pay-as-you-go, with bigger discounts for longer subscription durations.

    Edition

    • Enterprise Edition: Multi-replica edition. Uses multi-replica storage and distributed computing. Provides high availability (HA).

    • Basic Edition: Single-replica edition. Uses single-replica storage. Does not provide HA.

    For more information about Enterprise Edition and Basic Edition, see Editions.

    Deployment Mode

    • Enterprise Edition: Supports single-zone deployment or multi-zone deployment.

      Important

      Multi-zone deployment is supported only in China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Hong Kong), and Singapore.

    • Basic Edition: Fixed to single-zone deployment.

    Region

    The geographic location where the cluster runs. You cannot change the region after purchase. Choose the region closest to your business to improve access speed.

    Primary Zone

    Select the primary zone for your cluster.

    Secondary Zone

    Select the secondary zone for your cluster. If the primary zone fails, service automatically switches to the secondary zone. During the switch, the cluster may become unavailable for reads or writes, or experience timeouts.

    Important

    Set this parameter only if Deployment Mode is set to multi-zone deployment.

    Virtual private cloud (VPC)

    vSwitch in primary zone

    vSwitch in secondary zone

    • If you already created a VPC that fits your network plan, select it. For example, if you created an ECS instance in a VPC that meets your plan, select that VPC.

    • If you did not create a VPC that fits your plan, use the default VPC and vSwitch.

    • If the default VPC and vSwitch do not meet your needs, create your own VPC and vSwitch.

    Important
    • If you use other Alibaba Cloud services such as Elastic Compute Service (ECS) or ApsaraDB RDS, make sure your AnalyticDB for MySQL cluster and those services are in the same VPC. Otherwise, they cannot interconnect over the internal network and will not perform optimally.

    • Set vSwitch in secondary zone only if Deployment Mode is set to multi-zone deployment.

    Reserved resource node specifications

    The default reserved resource node specification is 8 ACU. Reserved resources support:

    • Data computation. Increasing reserved resources improves query speed.

    • Hot data storage. Each group of reserved resources supports up to 8 TB of hot data storage; you pay for hot data storage based on actual usage.

    Note

    If you previously purchased Data Warehouse Edition or Data Lakehouse Edition, see Mapping between Enterprise Edition and Data Lakehouse Edition or Data Warehouse Edition specifications when selecting Enterprise Edition specifications.

    Number of reserved resource nodes

    • Enterprise Edition clusters: Default is 3 nodes. Step size is 3.

      Note

      In Enterprise Edition, you can set the number of reserved resource nodes to 0. When you do, the purchase price no longer appears in the lower-right corner of the page. Note the following:

      • If the number of reserved resource nodes is 0, you can only use external tables. To use internal tables in AnalyticDB for MySQL, set the number to greater than 0.

      • If you set the number of reserved resource nodes to 0 when creating the cluster, you can add reserved resources later by scaling out. For details, see Scaling for Enterprise Edition and Basic Edition.

    • Basic Edition clusters: Default is 1 node. Step size is 1.

  3. Complete the remaining purchase steps as prompted in the console.

    After payment succeeds, wait about 10–15 minutes. On the Clusters page, the cluster status changes to Running when creation completes.

Step 2: Create a database account

Important

AnalyticDB for MySQL supports these database accounts:

  • Privileged account: Can manage all standard accounts and databases. Equivalent to the root account in MySQL.

  • Standard account: Requires manual permission grants. For differences between privileged and standard accounts and how to create them, see Privileged and standard accounts.

The following steps use a privileged account as an example.

  1. On the Clusters page, click the ID of your target cluster to open the cluster details page.

  2. In the navigation pane on the left, click Accounts.

  3. On the Database Accounts tab, click Create Account.

  4. In the Create Account panel, set the parameters.

    Parameter

    Description

    Database Account

    Name for the privileged account. Enter a valid name as prompted.

    Account Type

    Select Privileged Account.

    New Password

    Password for the privileged account. Enter a valid password as prompted.

    Confirm Password

    Re-enter the privileged account password.

    Description

    Optional notes about the account to help with future management.

  5. Click OK to create the account.

    Note

    After you create a standard account, click Actions next to the account, then click Permissions to set its permissions.
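You can also grant permissions to a standard account with SQL after connecting to the cluster as the privileged account. The following is a minimal sketch using MySQL-style GRANT syntax; test_db and test_user are hypothetical names, and the exact supported statements are described in Privileged and standard accounts.

      -- Hypothetical names: replace test_db and test_user with your own.
      -- Grant read and write access on all tables in test_db.
      GRANT SELECT, INSERT ON test_db.* TO 'test_user';

      -- Check which permissions the account now holds.
      SHOW GRANTS FOR 'test_user';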

Step 3: Create a resource group

AnalyticDB for MySQL isolates compute resources across resource groups. This protects core workloads from complex computations and unexpected spikes.

To support the job development and testing later in this topic, this section creates a Job resource group as an example. You can also create a resource group for the engine that matches your job type. The supported resource group types are listed below. For more information about resource groups and computing engines, see Create and manage resource groups and Features.

  • Interactive resource group

    Executed tasks: You must specify an engine during creation.

      • Xihe engine: supports Xihe MPP SQL.

      • Spark engine: supports Spark SQL.

    Common scenarios: Online or interactive analytics with high queries per second (QPS) and low response time (RT).

  • Job resource group

    Executed tasks: No engine is required during creation. Supports:

      • Xihe BSP SQL

      • Spark SQL and Spark applications

    Common scenarios: High-throughput offline scenarios.

  • AI resource group

    Executed tasks: MLSQL model execution and Ray-managed compute.

    Common scenarios: Heterogeneous computing scenarios.

  1. In the navigation pane on the left of the cluster details page, click Cluster Management > Resource Management.

  2. Click the Resource Groups tab. In the upper-left corner of the resource group list, click Create Resource Group.

  3. Enter a Resource Group Name. For Job Type, select Job.

    Note

    In this tutorial, set Minimum Computing Resources to 0 ACU and Maximum Computing Resources to 8 ACU. These values are for quick-start testing only.

  4. Click OK.

Step 4: (Optional) Run the built-in test script in the SQL editor

Note

If you only want to quickly try or test the cluster, follow this section to load and analyze the built-in dataset.

  1. In the navigation pane on the left of the cluster details page, click Job Development > SQL Development.

  2. On the Databases and Tables tab, click Load Built-in Dataset.

  3. On the Scripts tab, double-click the script and run the SQL statement in the SQLConsole to test the cluster.

    Note

    For table details in the built-in dataset, see Built-in dataset table details.
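For example, after the dataset finishes loading, a quick sanity check might look like the following. The database and table names here are assumptions for illustration only; take the actual names from Built-in dataset table details.

      -- Hypothetical names; use the schema that Load Built-in Dataset actually creates.
      SELECT COUNT(*) FROM adb_sampledata_tpch.lineitem;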

Step 5: Develop jobs

Choose a development method based on the table type, and make sure the prerequisites are met:

  • External table

    • Spark SQL job development: You have created a Job resource group or an Interactive resource group with the Spark engine.

    • Xihe BSP SQL job development: You have created a Job resource group.

  • Internal table

    • Spark SQL job development: You have created a Job resource group or an Interactive resource group with the Spark engine. In addition:

      • The reserved resources of an Enterprise Edition cluster cannot be 0 ACU.

      • A Data Lakehouse Edition cluster must have at least 16 ACU of compute reserved resources and 24 ACU of storage reserved resources.

    • Xihe BSP SQL job development: You have created a Job resource group. A Data Lakehouse Edition cluster must have at least 16 ACU of compute reserved resources and 24 ACU of storage reserved resources.

Job development (external tables)

Develop external-table jobs using Spark SQL

Follow these steps:


  1. Create an OSS bucket and directory in the same region as your AnalyticDB for MySQL cluster. For details, see Enable OSS, Create an OSS bucket in the console, and Create a directory.

  2. Create an OSS Hudi external table.

    1. In the navigation pane on the left of the cluster details page, click Job Development > SQL Development.

    2. In the SQLConsole window, select the Spark engine and either a Job resource group or an Interactive resource group with the Spark engine.

    3. Enter the following statement and click Run SQL (F8) to create an external database named spark_external_db to store Hudi data. This example uses the path oss://testBucketName/adb-test-1/.

      CREATE DATABASE spark_external_db
      LOCATION 'oss://testBucketName/adb-test-1/';
    4. Create an external table named spark_hudi_table in the spark_external_db database to store Hudi data. This example uses the path oss://testBucketName/adb-test-1/spark_external_db/.

      CREATE TABLE spark_external_db.spark_hudi_table (
        id int,
        name string,
        score int,
        city string
      )
      USING hudi
      PARTITIONED BY (id)
      TBLPROPERTIES (
        primaryKey = 'id',
        preCombineField = 'city'
      )
      LOCATION 'oss://testBucketName/adb-test-1/spark_external_db/';
    5. Insert data.

      INSERT OVERWRITE spark_external_db.spark_hudi_table PARTITION(id) VALUES
      (001,'Anna',99,'London'),
      (002,'Bob',67,'USA'),
      (003,'Cindy',78,'Spain'),
      (004,'Dan',100,'China');
  3. Query OSS Hudi external table data.

    1. Enter the following statement and click Run SQL (F8) to query data in the external table spark_hudi_table.

      SELECT * FROM spark_external_db.spark_hudi_table;
      Note

      If your Data Lakehouse Edition cluster has compute reserved resources, or your Enterprise Edition cluster has reserved resources, you can use Xihe MPP SQL online mode (select Xihe engine and Interactive resource group) to query external table data.

    2. On the Job Development > Spark JAR Development page, in the Applications list, click Actions next to your SQL query, then click Logs. View the table data in the logs.
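Because spark_hudi_table defines primaryKey and preCombineField, a write that reuses an existing key is typically resolved as an upsert rather than appended. The following hedged sketch illustrates this; the new score value is made up, and the exact behavior depends on your Hudi write configuration.

      -- Re-insert primary key 2 with a new score (92 is an illustrative value).
      -- Hudi deduplicates on the primary key, resolving conflicts via
      -- preCombineField, so the table keeps one row per id.
      INSERT INTO spark_external_db.spark_hudi_table PARTITION(id) VALUES
      (002,'Bob',92,'USA');

      -- id 2 should now appear once; view the output in the Spark logs as above.
      SELECT * FROM spark_external_db.spark_hudi_table WHERE id = 002;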

Develop external-table jobs using Xihe BSP SQL

Follow these steps:


  1. Create an OSS bucket and directory in the same region as your AnalyticDB for MySQL cluster. For details, see Enable OSS, Create an OSS bucket in the console, and Create a directory.

  2. Upload sample data.

    This example uploads the data file xihe_oss.txt to the OSS path oss://adb-test-1/test_xihe/. Rows are separated by line feeds. Columns are separated by commas (,). Sample data:

    001,Anna,99,London
    002,Bob,67,USA
    003,Cindy,78,Spain
    004,Dan,100,China
  3. Create an OSS external table.

    1. In the navigation pane on the left of the cluster details page, click Job Development > SQL Development.

    2. In the SQLConsole window, select a Job resource group and the Xihe engine.

    3. Enter the following statement and click Run SQL (F8) to create an external database named xihe_external_db.

      CREATE EXTERNAL DATABASE xihe_external_db;
    4. Enter the following statement and click Run SQL (F8) to create an external table named xihe_oss_table.

      CREATE EXTERNAL TABLE xihe_external_db.xihe_oss_table (
        id int,
        name string,
        score int,
        city string
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE
      LOCATION 'oss://adb-test-1/test_xihe/xihe_oss.txt';
  4. Query the OSS external table.

    Enter the following statement and click Run SQL (F8) to query data in the external table xihe_oss_table.

    SELECT * FROM xihe_external_db.xihe_oss_table;
    Note

    If your Data Lakehouse Edition cluster has compute reserved resources, you can use Xihe MPP SQL online mode (select Xihe engine and Interactive resource group) to query external table data.
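External tables such as xihe_oss_table are also a common entry point for importing data into AnalyticDB for MySQL internal tables. The following is a minimal sketch; the internal database and table names are hypothetical, and the DDL mirrors the internal-table examples later in this topic.

      -- Hypothetical internal database and table; adjust the names to your environment.
      CREATE DATABASE xihe_internal_db;

      CREATE TABLE xihe_internal_db.xihe_target_table (
        id int,
        name string,
        score int,
        city string
      )
      DISTRIBUTED BY HASH (id);

      -- Batch-load the OSS data into the internal table using Xihe BSP SQL.
      INSERT OVERWRITE INTO xihe_internal_db.xihe_target_table
      SELECT id, name, score, city FROM xihe_external_db.xihe_oss_table;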

Job development (internal tables)

AnalyticDB for MySQL uses a three-layer data warehouse architecture:

  • ODS layer: Operational Data Store (ODS) layer. Performs extract, transform, and load (ETL) on source data. In practice, the ODS layer pulls data from data sources. To keep things simple, this tutorial loads test data directly into the ODS layer.

  • DWD layer: Data warehouse detail (DWD) layer. Cleans and filters ODS layer data. This tutorial uses filtering as an example.

  • ADS layer: Application data service (ADS) layer. Stores data used for business analytics and report generation.

Develop internal-table jobs using Spark SQL

  1. Create an ODS layer data table.

    1. In the navigation pane on the left of the cluster details page, click Job Development > SQL Development.

    2. In the SQLConsole window, select the Spark engine and a Job resource group.

    3. Enter the following statement and click Run SQL (F8) to create a database named test_spark_db.

      CREATE DATABASE test_spark_db;
    4. Create an index-free, partitioned table named adb_spark_ods in the test_spark_db database for the ODS layer, then insert data.

      CREATE TABLE test_spark_db.adb_spark_ods (
        id int,
        name string,
        age int
      )
      USING adb
      TBLPROPERTIES (
        'distributeType' = 'HASH',
        'distributeColumns' = 'id',
        'partitionType' = 'VALUE',
        'partitionColumn' = 'age',
        'partitionCount' = '200',
        'indexAll' = 'false');

      INSERT OVERWRITE test_spark_db.adb_spark_ods PARTITION(age) VALUES
      (001,'Anna',18),
      (002,'Bob',22),
      (003,'Cindy',12),
      (004,'Dan',25);
  2. Create a DWD layer data table.

    This tutorial reads and filters data from the ODS layer table, then writes it to the DWD layer table.

    1. Create an index-free, partitioned table named adb_spark_dwd in the test_spark_db database.

      CREATE TABLE test_spark_db.adb_spark_dwd (
        id int,
        name string,
        age int
      )
      USING adb
      TBLPROPERTIES(
        'distributeType'='HASH', 
        'distributeColumns'='id', 
        'partitionType'='value',
        'partitionColumn'='age',
        'partitionCount'='200',
        'indexAll'='false');
    2. (Optional) Before querying offline hot data, in the SQLConsole window, select the Xihe engine and the (Interactive) user_default resource group, then run these statements:

      SET adb_config CSTORE_HOT_TABLE_ALLOW_SINGLE_REPLICA_BUILD=true;
      SET adb_config ELASTIC_ENABLE_HOT_PARTITION_HAS_HDD_REPLICA=true;
      SET adb_config ELASTIC_PRODUCT_ENABLE_MIXED_STORAGE_POLICY=true;
      Important

      If you do not set these parameters correctly before querying offline hot data, subsequent SQL statements will fail.

    3. Switch to the Spark engine, read data from the ODS layer table adb_spark_ods where the id column is not 002, and write the data to the DWD layer table.

      INSERT OVERWRITE test_spark_db.adb_spark_dwd partition(age) 
      SELECT 
        id,
        name,
        age
      FROM test_spark_db.adb_spark_ods WHERE id != 002;
    4. Query data in the adb_spark_dwd table.

      SELECT * FROM test_spark_db.adb_spark_dwd;
      Note

      When you run a SELECT statement using Spark SQL, results do not display table data. To view table data, follow the optional step below.

    5. (Optional) On the Job Development > Spark JAR Development page, in the Applications list, click Actions next to your SQL query, then click Logs. View the table data in the logs.

  3. Create ADS layer data.

    Data in the ADS layer is derived from the DWD layer through refined filtering and is ready for direct use in business analysis. Because ADS layer tables require fast query performance, you must add an index when creating an ADS layer table. This tutorial reads data from the DWD layer table adb_spark_dwd where the value in the age column is greater than 15, and writes the data to the ADS layer table adb_spark_ads.

    1. Create an indexed, partitioned table named adb_spark_ads in the test_spark_db database.

      CREATE TABLE test_spark_db.adb_spark_ads (
        id int,
        name string,
        age int
      )
      USING adb
      TBLPROPERTIES(
        'distributeType'='HASH', 
        'distributeColumns'='id', 
        'partitionType'='value',
        'partitionColumn'='age',
        'partitionCount'='200',
        'indexAll'='true');
    2. Read data with an age greater than 15 from the DWD layer data table adb_spark_dwd and write it to the ADS layer data table adb_spark_ads.

      INSERT OVERWRITE test_spark_db.adb_spark_ads partition(age) 
      SELECT 
        id,
        name,
        age 
      FROM test_spark_db.adb_spark_dwd WHERE age > 15;
  4. Query ADS layer table data.

    AnalyticDB for MySQL Enterprise Edition, Basic Edition, and Data Lakehouse Edition clusters support both offline data queries (using Spark SQL or Xihe BSP SQL) and online real-time queries (using Xihe MPP SQL). To ensure real-time results, this tutorial uses Xihe MPP SQL online mode to query ADS layer table data.

    1. In the SQLConsole window, select the Xihe engine and the (Interactive) user_default resource group.

    2. Run the following statement to query ADS layer table data.

      SELECT * FROM test_spark_db.adb_spark_ads;

      Results:

      +------+-------+------+
      | id   | name  | age  |
      +------+-------+------+
      |    4 | Dan   |   25 |
      |    1 | Anna  |   18 |
      +------+-------+------+
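With the ADS layer populated, report-style queries can run against it directly. The following aggregation is illustrative and not part of the original tutorial; it shows the kind of query the ADS layer is designed for.

      -- Illustrative report query: row count per age on the ADS table.
      SELECT age, COUNT(*) AS user_count
      FROM test_spark_db.adb_spark_ads
      GROUP BY age
      ORDER BY age;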

Develop jobs using Xihe BSP SQL

  1. Create an ODS layer data table.

    1. In the navigation pane on the left of the cluster details page, click Job Development > SQL Development.

    2. In the SQLConsole window, select the Xihe engine and a Job resource group.

    3. Enter the following statement and click Run SQL (F8). Create a database named test_xihe_db.

      CREATE DATABASE test_xihe_db;
    4. Enter the following statement and click Run SQL (F8). In the test_xihe_db database, create an index-free, partitioned table named adb_xihe_ods for the ODS layer, then insert data.

      CREATE TABLE test_xihe_db.adb_xihe_ods (
        id int,
        name string,
        age int
      )
      DISTRIBUTED BY HASH (id)
      PARTITION BY VALUE (age) LIFECYCLE 4
      INDEX_ALL = 'N';

      INSERT INTO test_xihe_db.adb_xihe_ods (id, name, age) VALUES
      (001,'Anna',18),
      (002,'Bob',22),
      (003,'Cindy',12),
      (004,'Dan',25);
  2. Create a DWD layer data table.

    This tutorial reads and filters data from the ODS layer table, then writes it to the DWD layer table.

    1. Enter the following statement and click Run SQL (F8). In the test_xihe_db database, create an index-free, partitioned table named adb_xihe_dwd.

      CREATE TABLE test_xihe_db.adb_xihe_dwd (
        id int,
        name string,
        age int
      )
      DISTRIBUTED BY HASH (id)
      PARTITION BY VALUE (age) LIFECYCLE 4
      INDEX_ALL = 'N';
    2. Enter the following statement and click Run SQL (F8). This reads data from the ODS layer table adb_xihe_ods where the id column is not 002 and writes it to the DWD layer table adb_xihe_dwd.

      INSERT INTO test_xihe_db.adb_xihe_dwd
      SELECT id, name, age FROM test_xihe_db.adb_xihe_ods WHERE id != 002;
    3. Enter the following statement and click Run SQL (F8). Query data in the adb_xihe_dwd table.

      SELECT * FROM test_xihe_db.adb_xihe_dwd;

      Results:

      +------+-------+------+
      | id   | name  | age  |
      +------+-------+------+
      |    4 | Dan   |   25 |
      |    1 | Anna  |   18 |
      |    3 | Cindy |   12 |
      +------+-------+------+
  3. Create an ADS layer data table.

    Table data at the ADS layer is written from the DWD layer after more fine-grained filtering and can be directly used for business analysis. Tables at the ADS layer must have indexes to meet requirements for fast query response. In this tutorial, you will read data from the DWD layer table adb_xihe_dwd where the age column is greater than 15 and write it to the ADS layer table adb_xihe_ads.

    1. Enter the following statement and click Run SQL (F8). In the test_xihe_db database, create an indexed, partitioned table named adb_xihe_ads.

      CREATE TABLE test_xihe_db.adb_xihe_ads (
        id int,
        name string,
        age int
      )
      DISTRIBUTED BY HASH (id)
      PARTITION BY VALUE (age) LIFECYCLE 4;
    2. Enter the following statement and click Run SQL (F8). This reads data where the age column is greater than 15 from the DWD layer table adb_xihe_dwd and writes it to the ADS layer table adb_xihe_ads.

      INSERT INTO test_xihe_db.adb_xihe_ads
      SELECT id, name, age FROM test_xihe_db.adb_xihe_dwd WHERE age > 15;
  4. Query ADS layer data.

    AnalyticDB for MySQL Enterprise Edition, Basic Edition, and Data Lakehouse Edition clusters support both offline data queries (using Spark SQL or Xihe BSP SQL) and online real-time queries (using Xihe MPP SQL). To ensure real-time results, this tutorial uses Xihe MPP SQL online mode to query ADS layer table data.

    1. In the SQLConsole window, select the Xihe engine and the (Interactive) user_default resource group.

    2. Enter the following statement and click Run SQL (F8). Query ADS layer table data.

      SELECT * FROM test_xihe_db.adb_xihe_ads;

      Results:

      +------+-------+------+
      | id   | name  | age  |
      +------+-------+------+
      |    4 | Dan   |   25 |
      |    1 | Anna  |   18 |
      +------+-------+------+
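If you created these databases only to follow this tutorial, you can drop them when you are done. A minimal cleanup sketch follows; DROP DATABASE may require the database to be empty first, hence the explicit DROP TABLE statements, and the Spark-side database test_spark_db can be removed the same way.

      -- Optional cleanup; run only if you no longer need the tutorial data.
      DROP TABLE test_xihe_db.adb_xihe_ods;
      DROP TABLE test_xihe_db.adb_xihe_dwd;
      DROP TABLE test_xihe_db.adb_xihe_ads;
      DROP DATABASE test_xihe_db;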

What to do next

After you complete this tutorial, see Data import for more information about the next steps.