All Products
Search
Document Center

ApsaraDB for SelectDB:Test performance with the TPC-DS benchmark

Last Updated:Dec 20, 2025

ApsaraDB for SelectDB is designed to provide high-performance and user-friendly data analytics services. SelectDB delivers high performance in scenarios such as wide table aggregation, multi-table join queries, and high-concurrency point queries. This topic describes how to use the TPC-DS benchmark to test the performance of complex queries in SelectDB within a Linux environment and provides the test results.

Overview

The Transaction Processing Performance Council Decision Support Benchmark (TPC-DS) is a decision support benchmark developed by the Transaction Processing Performance Council (TPC). It evaluates the performance of different systems in handling complex queries and large-scale data analytics.

TPC-DS is designed to simulate complex decision-making in real-world business scenarios. It tests system performance using a series of complex queries and data operations. These operations include joins, aggregations, sorting, filtering, and subqueries. The queries cover a range of scenarios from simple to complex, such as report generation, data mining, and online analytical processing (OLAP).

Preparations

Step 1: Prepare a target instance

  1. Prepare a SelectDB instance.

    If you already have a target instance, verify that its configuration meets the following requirements.

    If you do not have a target instance, you can create one.

    The instance used for this performance test must meet the following requirements.

    • The SelectDB Core version must be 4.1 or later.

      If your target instance has a version earlier than 4.1, you must upgrade the instance. For more information, see Upgrade the kernel version.

    • Specifications must be 96 cores and 384 GB or higher. This test uses an instance with 96 cores and 384 GB.

    • Cluster cache space must be 1,200 GB or larger. This test uses 1,200 GB of cache space.

  2. Set the streaming_load_max_mb parameter to its maximum value.

    During the test, the tool uploads the test dataset to SelectDB using Stream Load. The data volume for this test exceeds the default Stream Load limit of 10,240 MB. You must change the BE parameter streaming_load_max_mb to the maximum value of 10,240,000 MB. For more information about how to change parameters, see Parameter settings.

  3. Create a destination database for the test data.

    If you already have a destination database, you can skip this step.

    1. Connect to the instance. For more information, see Connect to an ApsaraDB for SelectDB instance using a MySQL client.

    2. Create a database.

      The destination database for this test is named test_db. The statement is as follows:

      CREATE DATABASE test_db;

Step 2: Prepare a test server

Important

The following installation scripts are for Linux servers. If your server uses a different operating system, you must modify the scripts as needed.

Usage notes

Take note of the following for your server:

  • If you plan to use Git to download the TPC-DS test tool on the server, you must enable public network access for the server.

    • New ECS instance: When you purchase the ECS instance, for the Public IP parameter, select Assign Public IPv4 Address.

    • Existing ECS instance without public network access: To enable public network access for an existing ECS instance, see Enable public network access.

  • The data files generated for this test dataset are approximately 1,000 GB. Ensure that the server has sufficient memory.

  1. Create a target server.

    If you already have a target server, you can skip this step.

    If you do not have a target server, you can create a custom ECS instance. Select Alibaba Cloud Linux for the image.

  2. Install the required tools for the test.

    • Install the MySQL client.

      yum install mysql
    • Install unzip.

      yum install unzip
    • Install jq.

      yum install jq
    • Install CURL.

      yum install curl
  3. (Optional) Install Git.

    This test uses Git to download the TPC-DS tool. If you have already obtained the TPC-DS tool by other means and plan to upload it to the server manually, you can skip this step.

    yum install git

Step 3: Ensure network connectivity

Ensure that the target server where the TPC-DS test tool is installed can communicate with the SelectDB instance.

  1. Request a public endpoint for the SelectDB instance. For more information, see Apply for and release a public endpoint.

    If the server on which you want to install the TPC-DS test tool is an Alibaba Cloud server and is in the same virtual private cloud (VPC) as the ApsaraDB for SelectDB instance, you can skip this step.

  2. Add the IP address of the server where you want to install the TPC-DS test tool to the IP address whitelist of the ApsaraDB for SelectDB instance. For more information, see Configure an IP address whitelist.

Step 4: Understand the test dataset

Important

Standard TPC-DS test datasets are often very different from actual business scenarios. Some tests also involve parameter tuning for the test dataset. Therefore, the results from a standard test dataset reflect database performance only in specific scenarios. We recommend that you use your actual business data for further testing.

In this test, 1,000 GB of data is generated by TPC-DS and imported into a SelectDB instance to evaluate the performance of the SelectDB instance. The following section describes the data tables in the 1,000 GB test dataset.

TPC-DS table name

Number of rows

customer_demographics

1,920,800

reason

65

warehouse

20

date_dim

73,049

catalog_sales

1,439,980,416

call_center

42

inventory

783,000,000

catalog_returns

143,996,756

household_demographics

7,200

customer_address

6,000,000

income_band

20

catalog_page

30,000

item

300,000

web_returns

71,997,522

web_site

54

promotion

1,500

web_sales

720,000,376

store

1,002

web_page

3,000

time_dim

86,400

store_returns

287,999,764

store_sales

2,879,987,999

ship_mode

20

customer

12,000,000

Procedure

Important

The following scripts are for Linux servers. If your server uses a different operating system, you must modify the scripts as needed.

Step 1: Log on to the target server

If your server is an Alibaba Cloud ECS instance, see Connect to an ECS instance for logon instructions.

For other types of servers, see their respective product documentation.

Step 2: Install the TPC-DS test tool

  1. Download the tool.

    This test uses Git to download the tool. The script is as follows:

    git clone https://github.com/apache/doris.git && cd ./doris/tools/tpcds-tools

    Alternatively, you can download the tool from tpcds-tools and then manually upload it to the target server.

  2. Compilation tools

    Run the following script to compile the tool.

    sh bin/build-tpcds-tools.sh

Step 3: Generate the TPC-DS test dataset

Important

The larger the data volume, the longer it takes to generate the data. The time required depends on server performance.

Run the script in the installation directory of the test tool to generate the test dataset.

The syntax is as follows:

sh bin/gen-tpcds-data.sh -s <yourAimDataNum>

Parameters:

yourAimDataNum:

  • Description: The size of the data to generate using TPC-DS.

  • Unit: GB

This is a medium-scale test that requires generating a 1,000 GB (1 TB) test dataset. This step may take a long time. We recommend that you run this task in the background. The statement is as follows:

nohup sh bin/gen-tpcds-data.sh -s 1000 > gen-tpcds-data.log 2>&1 &

The execution result is saved to the gen-tpcds-data.log file in the tool's installation directory. You can view this file to check whether the process ran correctly.

The test dataset is saved to the tpcds-data directory within the bin directory of the tool's installation directory. The data files have a .dat extension.

Step 4: Use a script to create test tables for SelectDB

  1. Configure the SelectDB instance.

    Before you run the table creation script, you must configure the SelectDB instance in the doris-cluster.conf file. This file is located in the tpcds-tools/conf/ directory of the test tool installation directory. The following is an example:

    # Any FE host
    export FE_HOST='selectdb-cn-******.selectdbfe.rds.aliyuncs.com'
    # The http_port from fe.conf
    export FE_HTTP_PORT=8080
    # The query_port from fe.conf
    export FE_QUERY_PORT=9030
    # Doris username
    export USER='admin'
    # Doris password
    export PASSWORD='******'
    # The database where the TPC-DS tables are located
    export DB='test_db'

    Parameters:

    Parameter name

    Parameter description

    FE_HOST

    The endpoint of the SelectDB instance.

    You can obtain the VPC endpoint or public endpoint of the instance from the Network Information section of the instance details page in the SelectDB console.

    FE_HTTP_PORT

    The HTTP protocol port of the SelectDB instance.

    The default port of the SelectDB instance is 8080.

    You can obtain the HTTP protocol port of the instance from the Network Information section of the instance details page in the SelectDB console.

    FE_QUERY_PORT

    The MySQL protocol port of the SelectDB instance.

    The default port of the SelectDB instance is 9030.

    You can obtain the MySQL protocol port of the instance from the Network Information section of the instance details page in the SelectDB console.

    USER

    The account of the SelectDB instance.

    After you create a SelectDB instance, the system automatically creates an admin account for the instance.

    PASSWORD

    The password of the SelectDB instance account.

    If you set USER to the admin account but have forgotten the password, you can reset the admin password for the instance in the console.

    DB

    The name of the database in the SelectDB instance where you will import data.

  2. Create tables.

    In the installation directory of the test tool, run the following script to create the test tables. After you run the script, the tables described in Step 4: Understand the test dataset are created in the destination database of the SelectDB instance.

    sh bin/create-tpcds-tables.sh -s 1000

Step 5: Import data into SelectDB

Important

The larger the data volume, the longer it takes to import the data. The time required depends on server performance.

In the installation directory of the test tool, run the following script to import the TPC-DS test dataset into the SelectDB instance:

sh bin/load-tpcds-data.sh

This is a medium-scale test, and you need to import the generated 1,000 GB (1 TB) test dataset into SelectDB. This step may take a long time. We recommend that you run this task in the background. The statement is as follows:

nohup sh bin/load-tpcds-data.sh > load-tpcds-data.log 2>&1 &

The execution result is saved to the load-tpcds-data.log file in the tool's installation directory. You can view this file to check whether the process ran correctly.

Step 6: Test query performance

  • Test the performance of SQL queries in batches

    Important

    The larger the data volume, the longer the batch test takes. The time required depends on server performance.

    You can run the TPC-DS test SQL scripts to execute the SQL queries in the test set in batches.

    The syntax is as follows.

    sh bin/run-tpcds-queries.sh -s <yourAimDataNum>

    Parameters:

    yourAimDataNum: Specifies the scale of the dataset for the query. This value must be the same as the scale used to generate the data. For example, if you used -s 1000 to generate the data, you must also use -s 1000 to run the queries.

    After the script is executed, the console window displays the performance of each SQL statement in the test set against SelectDB.

    This medium-scale test queries a 1000 GB (1 TB) test dataset and may take a long time to complete. You can run this task in the background. The statement is as follows.

    nohup sh bin/run-tpcds-queries.sh -s 1000 > run-tpcds-queries.log 2>&1 &

    The query results are saved to the run-tpcds-queries.log file in the tool installation directory. You can view this file to check the query status and review the test results. For the test results on the 1000 GB dataset, see Test results.

  • Test the performance of a single SQL query

    You can also follow these steps to test the performance of a specific SQL statement in SelectDB:

    1. Connect to the SelectDB instance. For more information, see Connect to an ApsaraDB for SelectDB instance using DMS.

    2. Execute the target SQL statement. You can obtain the target SQL statement from TPCDS-Query-SQL.

      Note

      The SQL statements in TPCDS-Query-SQL are the same as those executed in the batch test script.

Test results

The following TPC-DS 1000 GB query performance test results were obtained using a SelectDB version 4.1.1 instance configured with 96 cores, 384 GB of memory, and a 1200 GB cluster cache.

Query

TPC-DS 1000 GB (s)

Q1

0.4

Q2

3.05

Q3

0.22

Q4

7.16

Q5

0.51

Q6

0.19

Q7

0.38

Q8

0.32

Q9

4.48

Q10

0.28

Q11

4.51

Q12

0.1

Q13

0.49

Q14_1

4.73

Q14

4.79

Q15

0.26

Q16

0.27

Q17

0.9

Q18

0.48

Q19

0.18

Q20

0.1

Q21

0.08

Q22

0.98

Q23_1

12.94

Q23

12.78

Q24_1

2.29

Q24

2.32

Q25

0.63

Q26

0.16

Q27

0.39

Q28

4.12

Q29

0.59

Q30

0.2

Q31

1.19

Q32

0.06

Q33

0.28

Q34

0.51

Q35

0.88

Q36

0.34

Q37

0.07

Q38

4.09

Q39_1

0.27

Q39

0.3

Q40

0.12

Q41

0.04

Q42

0.1

Q43

0.88

Q44

1.34

Q45

0.2

Q46

0.91

Q47

1.71

Q48

0.45

Q49

0.77

Q50

0.75

Q51

4.32

Q52

0.1

Q53

0.2

Q54

0.94

Q55

0.1

Q56

0.25

Q57

0.96

Q58

0.22

Q59

6

Q60

0.29

Q61

0.47

Q62

0.77

Q63

0.18

Q64

0.99

Q65

3.42

Q66

0.31

Q67

17.92

Q68

0.79

Q69

0.52

Q70

1.69

Q71

0.39

Q72

2.73

Q73

0.44

Q74

3.97

Q75

3.55

Q76

0.99

Q77

0.27

Q78

11.98

Q79

2.06

Q80

0.49

Q81

0.36

Q82

0.19

Q83

0.17

Q84

0.11

Q85

0.48

Q86

0.41

Q87

4.79

Q88

3.43

Q89

0.29

Q90

0.13

Q91

0.1

Q92

0.05

Q93

1.04

Q94

0.2

Q95

0.27

Q96

0.43

Q97

3.39

Q98

0.17

Q99

1.59

Total

171.45