TPC-H tests

Updated at: 2024-09-30 07:35

This topic describes the TPC-H test design, procedure, and results for a PolarDB-X instance.

Background information

TPC-H is a benchmark that is widely used to evaluate the analytic query capabilities of databases. It is developed and released by the Transaction Processing Performance Council (TPC). In the TPC-H benchmark, 22 complex SQL queries are performed on eight tables. Most queries contain joins on several tables, subqueries, and GROUP BY clauses.

Note

The TPC-H performance tests described in this topic are implemented based on the TPC-H benchmark test but do not meet all requirements of the TPC-H benchmark test. Therefore, the test results described in this topic cannot be compared with the published results of the TPC-H benchmark test.
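
For reference, the following sketch shows the general shape of such a query: a simplified variant in the style of TPC-H Q3 that joins three tables, filters, and aggregates with a GROUP BY clause. It is for illustration only; the query files shipped in the tpch-queries.tar.gz package used later in this topic may differ in parameters and exact text.

    -- Illustrative sketch in the style of TPC-H Q3 (shipping priority):
    -- a three-table join with filtering, aggregation, and ordering.
    SELECT l_orderkey,
           SUM(l_extendedprice * (1 - l_discount)) AS revenue,
           o_orderdate,
           o_shippriority
    FROM customer
    JOIN orders   ON c_custkey = o_custkey
    JOIN lineitem ON l_orderkey = o_orderkey
    WHERE c_mktsegment = 'BUILDING'
      AND o_orderdate < DATE '1995-03-15'
      AND l_shipdate > DATE '1995-03-15'
    GROUP BY l_orderkey, o_orderdate, o_shippriority
    ORDER BY revenue DESC, o_orderdate
    LIMIT 10;

Queries of this shape exercise join ordering, data shuffling between nodes, and aggregation, which is why the benchmark is commonly used as a proxy for analytic workloads.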

Test design

  • Amount of test data

    The test is performed at a scale factor of 100, which corresponds to approximately 100 GB of raw data. The following list describes the number of rows in each major table:

    • The lineitem table contains 600 million rows of data.

    • The orders table contains 150 million rows of data.

    • The partsupp table contains 80 million rows of data.

  • Instance specifications for the test

    Node specifications    Number of nodes    Dataset size
    8C64G                  6                  100 GB

  • ECS instance type for stress testing

    ecs.g7.4xlarge (16 vCPUs and 64 GB of memory)

Procedure

  1. Create an Elastic Compute Service (ECS) instance for stress testing.

    Create an ECS instance that is used to prepare data and perform stress testing.

    Note

    The ECS instance must be deployed in a virtual private cloud (VPC). Record the name and ID of the VPC for subsequent use. All database instances described in the subsequent steps must be deployed in the same VPC.

  2. Create a PolarDB-X instance for stress testing.

    1. Create a PolarDB-X instance. For information about how to create a PolarDB-X instance, see Create a PolarDB-X instance.

      Note

      Make sure that the ECS instance and the PolarDB-X instance reside in the same VPC.

    2. Create the database to be tested in the PolarDB-X instance. In this example, a database named tpch_100g is created in AUTO mode. For more information, see Create a database.

      CREATE DATABASE tpch_100g MODE = 'auto';
    3. Execute the following statements to create the TPC-H tables in the tpch_100g database. An optional check of the resulting table definitions is shown after the statements:

      CREATE TABLE `customer` (
        `c_custkey` int(11) NOT NULL,
        `c_name` varchar(25) NOT NULL,
        `c_address` varchar(40) NOT NULL,
        `c_nationkey` int(11) NOT NULL,
        `c_phone` varchar(15) NOT NULL,
        `c_acctbal` decimal(15,2) NOT NULL,
        `c_mktsegment` varchar(10) NOT NULL,
        `c_comment` varchar(117) NOT NULL,
        PRIMARY KEY (`c_custkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`c_custkey`)
        PARTITIONS 192;
      
      CREATE TABLE `lineitem` (
        `l_orderkey` bigint(20) NOT NULL,
        `l_partkey` int(11) NOT NULL,
        `l_suppkey` int(11) NOT NULL,
        `l_linenumber` bigint(20) NOT NULL,
        `l_quantity` decimal(15,2) NOT NULL,
        `l_extendedprice` decimal(15,2) NOT NULL,
        `l_discount` decimal(15,2) NOT NULL,
        `l_tax` decimal(15,2) NOT NULL,
        `l_returnflag` varchar(1) NOT NULL,
        `l_linestatus` varchar(1) NOT NULL,
        `l_shipdate` date NOT NULL,
        `l_commitdate` date NOT NULL,
        `l_receiptdate` date NOT NULL,
        `l_shipinstruct` varchar(25) NOT NULL,
        `l_shipmode` varchar(10) NOT NULL,
        `l_comment` varchar(44) NOT NULL,
        KEY `IDX_LINEITEM_PARTKEY` (`l_partkey`),
        KEY `IDX_SUPPKEY` (`l_suppkey`),
        KEY `IDX_LINEITEM_SHIPDATE` (`l_shipdate`),
        PRIMARY KEY (`l_orderkey`,`l_linenumber`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`l_orderkey`)
        PARTITIONS 192;
      
      CREATE TABLE `orders` (
        `o_orderkey` bigint(20) NOT NULL,
        `o_custkey` int(11) NOT NULL,
        `o_orderstatus` varchar(1) NOT NULL,
        `o_totalprice` decimal(15,2) NOT NULL,
        `o_orderdate` date NOT NULL,
        `o_orderpriority` varchar(15) NOT NULL,
        `o_clerk` varchar(15) NOT NULL,
        `o_shippriority` bigint(20) NOT NULL,
        `o_comment` varchar(79) NOT NULL,
        PRIMARY KEY (`o_orderkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`o_orderkey`)
        PARTITIONS 192;
      
      CREATE TABLE `part` (
        `p_partkey` int(11) NOT NULL,
        `p_name` varchar(55) NOT NULL,
        `p_mfgr` varchar(25) NOT NULL,
        `p_brand` varchar(10) NOT NULL,
        `p_type` varchar(25) NOT NULL,
        `p_size` int(11) NOT NULL,
        `p_container` varchar(10) NOT NULL,
        `p_retailprice` decimal(15,2) NOT NULL,
        `p_comment` varchar(23) NOT NULL,
        PRIMARY KEY (`p_partkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`p_partkey`)
        PARTITIONS 192;
      
      CREATE TABLE `partsupp` (
        `ps_partkey` int(11) NOT NULL,
        `ps_suppkey` int(11) NOT NULL,
        `ps_availqty` int(11) NOT NULL,
        `ps_supplycost` decimal(15,2) NOT NULL,
        `ps_comment` varchar(199) NOT NULL,
        KEY `IDX_PARTSUPP_SUPPKEY` (`ps_suppkey`),
        PRIMARY KEY (`ps_partkey`,`ps_suppkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`ps_partkey`)
        PARTITIONS 192;
      
      CREATE TABLE `supplier` (
        `s_suppkey` int(11) NOT NULL,
        `s_name` varchar(25) NOT NULL,
        `s_address` varchar(40) NOT NULL,
        `s_nationkey` int(11) NOT NULL,
        `s_phone` varchar(15) NOT NULL,
        `s_acctbal` decimal(15,2) NOT NULL,
        `s_comment` varchar(101) NOT NULL,
        PRIMARY KEY (`s_suppkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`s_suppkey`)
        PARTITIONS 192;
      
      CREATE TABLE `nation` (
        `n_nationkey` int(11) NOT NULL,
        `n_name` varchar(25) NOT NULL,
        `n_regionkey` int(11) NOT NULL,
        `n_comment` varchar(152) DEFAULT NULL,
        PRIMARY KEY (`n_nationkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
      
      CREATE TABLE `region` (
        `r_regionkey` int(11) NOT NULL,
        `r_name` varchar(25) NOT NULL,
        `r_comment` varchar(152) DEFAULT NULL,
        PRIMARY KEY (`r_regionkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
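
      After the tables are created, you can optionally confirm that each table uses the intended distribution: the partitioned tables should show a PARTITION BY KEY ... PARTITIONS 192 clause in their definitions, and the nation and region tables should be broadcast tables. A minimal check:

      -- Optional check: the partition clause should appear in the printed definition.
      SHOW CREATE TABLE lineitem;
      -- nation and region are small dimension tables replicated (broadcast) to every data node.
      SHOW CREATE TABLE nation;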
  3. Modify instance parameters.

    Note

    To optimize performance during stress testing, modify the following compute-layer parameters of the PolarDB-X instance.

    1. Set the XPROTO_MAX_DN_CONCURRENT and XPROTO_MAX_DN_WAIT_CONNECTION parameters to 4000. For more information, see Parameter settings.

    2. Connect to the PolarDB-X instance by using a command-line client. Then, execute the following SQL statements in that session to disable SQL logging and CPU statistical sampling and to configure MPP parallelism:

      SET GLOBAL RECORD_SQL = false;
      SET GLOBAL ENABLE_HTAP = true;
      SET GLOBAL ENABLE_MASTER_MPP = true;
      SET GLOBAL MPP_METRIC_LEVEL = 0;
      SET GLOBAL ENABLE_CPU_PROFILE = false;
      SET GLOBAL ENABLE_SORT_AGG = false;
      SET GLOBAL MPP_PARALLELISM = 192;
      SET GLOBAL GROUP_PARALLELISM = 8;
  4. Prepare data for stress testing.

    1. Import the stress testing data.

      1. Download the data import tool Batch Tool and upload it to the ECS instance. Download link: https://github.com/polardb/polardbx-tools/releases. For more information, see Use Batch Tool to export and import data.

      2. Install a Java environment on the ECS instance. Sample command for an ECS instance that runs Alibaba Cloud Linux 3:

        yum install -y java-11-openjdk.x86_64
      3. Run the following command to import the 100-GB TPC-H dataset:

        java -Xmn8g -Xmx16g -jar batch-tool.jar -h ${HOST} -P${PORT} -u ${USER} -p ${PASSWORD} -D tpch_100g -o import -benchmark tpch -scale 100 -s , -pro 1 -con 80 -minConn 81 -maxConn 81 -batchSize 500
        Note

        The following list describes the parameters. Replace ${HOST}, ${PORT}, ${USER}, and ${PASSWORD} with the actual values.

        • ${HOST}: the host of the PolarDB-X instance.

        • ${PORT}: the port number used to connect to the PolarDB-X instance.

        • ${USER}: the username used to connect to the PolarDB-X instance.

        • ${PASSWORD}: the password used to connect to the PolarDB-X instance.

    2. Verify data integrity.

      Connect to the PolarDB-X instance by using a command-line client and check whether the number of rows in each table matches the expected values.

      select (select count(*) from customer) as customer_cnt,
             (select count(*) from lineitem) as lineitem_cnt,
             (select count(*) from nation) as nation_cnt,
             (select count(*) from orders) as order_cnt,
             (select count(*) from part) as part_cnt,
             (select count(*) from partsupp) as partsupp_cnt,
             (select count(*) from region) as region_cnt,
             (select count(*) from supplier) as supplier_cnt;
      +--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
      | customer_cnt | lineitem_cnt | nation_cnt | order_cnt | part_cnt | partsupp_cnt | region_cnt | supplier_cnt |
      +--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
      |     15000000 |    600037902 |         25 | 150000000 | 20000000 |     80000000 |          5 |      1000000 |
      +--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
    3. Collect statistics.

      Connect to the PolarDB-X instance by using a command-line client and execute the ANALYZE TABLE statement to collect statistics for each table. An optional plan check is shown after the statements.

      analyze table customer;
      analyze table lineitem;
      analyze table nation;
      analyze table orders;
      analyze table part;
      analyze table partsupp;
      analyze table region;
      analyze table supplier;
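
      Optionally, you can run EXPLAIN on a sample query as a rough confirmation that the optimizer now has statistics to work with and produces a distributed plan for aggregation queries. This is only a sanity check; the exact plan output format depends on the PolarDB-X version.

      -- Optional sanity check: inspect the plan for a simple aggregation query.
      -- The plan output format varies by PolarDB-X version.
      EXPLAIN SELECT o_orderpriority, COUNT(*)
              FROM orders
              GROUP BY o_orderpriority;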
  5. Perform stress testing.

    1. Download and decompress the test script package tpch-queries.tar.gz.

      tar xzvf tpch-queries.tar.gz
    2. Run the script to execute the queries and record the execution time of each query.

      cd tpch-queries
      'time' -f "%e" sh all_query.sh {HOST} {USER} {PASSWORD} {DB} {PORT}

Test results

MySQL 5.7
Note
  • Version: polardb-2.4.0_5.4.19-20240718_xcluster5.4.19-20240630. For more information, see Release notes.

  • The SQL column in the following table displays the SQL query files in the tpch-queries.tar.gz package.

SQL             Execution duration (s)
01.sql          41.44
02.sql          1.66
03.sql          12.35
04.sql          2.84
05.sql          6.95
06.sql          7.71
07.sql          27.13
08.sql          9.44
09.sql          37.43
10.sql          4.92
11.sql          3.02
12.sql          9.7
13.sql          2.88
14.sql          1.72
15.sql          5.46
16.sql          1.63
17.sql          1.74
18.sql          11.28
19.sql          2.78
20.sql          9.77
21.sql          15.01
22.sql          2.39
all_query.sh    219.25

MySQL 8.0

Note
  • Version: polardb-2.4.0_5.4.19-20240718_xcluster8.4.19-20240630. For more information, see Release notes.

  • The SQL column in the following table displays the SQL query files in the tpch-queries.tar.gz package.

SQL             Execution duration (s)
01.sql          33.19
02.sql          1.96
03.sql          11.87
04.sql          2.6
05.sql          5.92
06.sql          8.83
07.sql          27.26
08.sql          10.34
09.sql          33.32
10.sql          4.89
11.sql          3.26
12.sql          8.67
13.sql          2.78
14.sql          1.62
15.sql          5.29
16.sql          1.79
17.sql          1.68
18.sql          11.19
19.sql          3.1
20.sql          11.14
21.sql          14.84
22.sql          2.13
all_query.sh    207.67
