All Products
Search
Document Center

PolarDB:Use hybrid plans to accelerate wide table queries

Last Updated:Jun 26, 2024

This topic describes the hybrid plan feature, including how it works, its suitable scenarios, its limits, and its performance improvement effect.

How a hybrid plan works

image.png

For queries that use WHERE, JOIN, or LIMIT clauses on a small number of columns in a wide table, In-Memory Column Indexes (IMCIs) can significantly improve the query performance. However, if IMCIs are used in queries that involve details of a filtered column, the project must obtain all column information, which causes excessive read workloads. In this case, row indexes are more suitable. The method that combines both column indexing and row indexing in a query is called a hybrid plan.

Hybrid plans accelerate wide table queries. IMCIs are used to execute and obtain intermediate results during the part of the execution plan in which column indexes work well. The intermediate results contain only the primary key information. Then, all column information in the project is queried and returned based on the primary key and the primary index of the InnoDB engine, as shown in the preceding figure.

Scenarios

The query is on a wide table and the result contains more than 15 columns.

Prerequisites

The cluster is of the Enterprise Edition and runs PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.37 or later.

Limits

  • The tables involved in queries based on hybrid plans must have explicitly defined primary keys.

  • You cannot use hybrid plans on partitioned tables.

Note

If the hybrid plan feature fails to accelerate slow queries due to limits such as the table type, you can contact us for technical support.

Parameters

To enable the hybrid plan feature, set the imci_enable_hybrid_plan parameter to ON. After you enable the Hybrid Plan feature, the optimizer uses the feature on appropriate query statements to accelerate wide table queries. To forcibly use the hybrid plan feature for a specific query statement, add a hint to the query statement to set the force_hybrid_index_search option in the imci_optimizer_switch parameter to ON.

Parameter

Description

imci_enable_hybrid_plan

Specifies whether to enable the Hybrid Plan feature. Valid values:

  • ON (default value)

  • OFF

imci_optimizer_switch

Specifies whether to use Hybrid Plan in query statements that meet the conditions. Valid values:

  • 'force_hybrid_index_search=OFF (default): does not use the hybrid plan feature to accelerate wide table queries. The optimizer automatically selects the best execution plan.

  • 'force_hybrid_index_search=ON ': forcibly uses the hybrid plan feature to accelerate wide table queries.

Procedure

The following section describes how to use the hybrid plan feature to accelerate the execution of the SELECT * FROM t1; statement on wide tables. Perform the following steps:

  1. Connect to the PolarDB cluster. For more information, see Connect to a cluster.

  2. Set the imci_enable_hybrid_plan parameter to ON to enable the hybrid plan feature. After you enable the hybrid plan feature, the optimizer uses the feature on appropriate query statements to accelerate wide table queries.

    SET imci_enable_hybrid_plan=ON;
  3. To forcibly use the hybrid plan feature for a specific query, add a hint to the SQL statement to set the force_hybrid_index_search option in the imci_optimizer_switch parameter to ON.

    SELECT /*+ SET_VAR(imci_optimizer_switch='force_hybrid_index_search=ON') */ * FROM t1;

    You can use the EXPLAIN statement to check whether the hybrid plan feature has taken effect on the SQL statement. If the execution plan contains HybridIndexSearch, the hybrid plan feature has taken effect on the SQL statement.

    EXPLAIN SELECT /*+ SET_VAR(imci_optimizer_switch='force_hybrid_index_search=ON') */ * FROM t1;

    Sample result:

    +----+-------------------------+------+---------------------------------------------------------------+
    | ID | Operator                | Name | Extra Info                                                    |
    +----+-------------------------+------+---------------------------------------------------------------+
    |  1 | Select Statement        |      | IMCI Execution Plan (max_dop = 32, max_query_mem = unlimited) |
    |  2 | └─HybridIndexSearch     |      | Used table: t1                                                |
    |  3 |   └─Compute Scalar      |      |                                                               |
    |  4 |     └─Table Scan        | t1   |                                                               |
    +----+-------------------------+------+---------------------------------------------------------------+
    4 rows in set (0.02 sec)

Performance test

The following test uses the OnTime dataset from ClickHouse to compare query performance in the following scenarios: IMCIs or the Hybrid Plan feature is not used, only IMCIs are used, and the hybrid plan feature is used.

The table in the OnTime dataset is a common wide table that contains 109 columns. In this test, the definition of the table is changed and a new primary key is explicitly defined. The following statement describes the modified schema:

CREATE TABLE `ontime` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Year` year(4) DEFAULT NULL,
  `Quarter` tinyint(4) DEFAULT NULL,
  `Month` tinyint(4) DEFAULT NULL,
  `DayofMonth` tinyint(4) DEFAULT NULL,
  `DayOfWeek` tinyint(4) DEFAULT NULL,
  `FlightDate` date DEFAULT NULL,
  `UniqueCarrier` char(7) DEFAULT NULL,
  `AirlineID` int(11) DEFAULT NULL,
  `Carrier` char(2) DEFAULT NULL,
  `TailNum` varchar(50) DEFAULT NULL,
  `FlightNum` varchar(10) DEFAULT NULL,
  `OriginAirportID` int(11) DEFAULT NULL,
  `OriginAirportSeqID` int(11) DEFAULT NULL,
  `OriginCityMarketID` int(11) DEFAULT NULL,
  `Origin` char(5) DEFAULT NULL,
  `OriginCityName` varchar(100) DEFAULT NULL,
  `OriginState` char(2) DEFAULT NULL,
  `OriginStateFips` varchar(10) DEFAULT NULL,
  `OriginStateName` varchar(100) DEFAULT NULL,
  `OriginWac` int(11) DEFAULT NULL,
  `DestAirportID` int(11) DEFAULT NULL,
  `DestAirportSeqID` int(11) DEFAULT NULL,
  `DestCityMarketID` int(11) DEFAULT NULL,
  `Dest` char(5) DEFAULT NULL,
  `DestCityName` varchar(100) DEFAULT NULL,
  `DestState` char(2) DEFAULT NULL,
  `DestStateFips` varchar(10) DEFAULT NULL,
  `DestStateName` varchar(100) DEFAULT NULL,
  `DestWac` int(11) DEFAULT NULL,
  `CRSDepTime` int(11) DEFAULT NULL,
  `DepTime` int(11) DEFAULT NULL,
  `DepDelay` int(11) DEFAULT NULL,
  `DepDelayMinutes` int(11) DEFAULT NULL,
  `DepDel15` int(11) DEFAULT NULL,
  `DepartureDelayGroups` int(11) DEFAULT NULL,
  `DepTimeBlk` varchar(20) DEFAULT NULL,
  `TaxiOut` int(11) DEFAULT NULL,
  `WheelsOff` int(11) DEFAULT NULL,
  `WheelsOn` int(11) DEFAULT NULL,
  `TaxiIn` int(11) DEFAULT NULL,
  `CRSArrTime` int(11) DEFAULT NULL,
  `ArrTime` int(11) DEFAULT NULL,
  `ArrDelay` int(11) DEFAULT NULL,
  `ArrDelayMinutes` int(11) DEFAULT NULL,
  `ArrDel15` int(11) DEFAULT NULL,
  `ArrivalDelayGroups` int(11) DEFAULT NULL,
  `ArrTimeBlk` varchar(20) DEFAULT NULL,
  `Cancelled` tinyint(4) DEFAULT NULL,
  `CancellationCode` char(1) DEFAULT NULL,
  `Diverted` tinyint(4) DEFAULT NULL,
  `CRSElapsedTime` int(11) DEFAULT NULL,
  `ActualElapsedTime` int(11) DEFAULT NULL,
  `AirTime` int(11) DEFAULT NULL,
  `Flights` int(11) DEFAULT NULL,
  `Distance` int(11) DEFAULT NULL,
  `DistanceGroup` tinyint(4) DEFAULT NULL,
  `CarrierDelay` int(11) DEFAULT NULL,
  `WeatherDelay` int(11) DEFAULT NULL,
  `NASDelay` int(11) DEFAULT NULL,
  `SecurityDelay` int(11) DEFAULT NULL,
  `LateAircraftDelay` int(11) DEFAULT NULL,
  `FirstDepTime` varchar(10) DEFAULT NULL,
  `TotalAddGTime` varchar(10) DEFAULT NULL,
  `LongestAddGTime` varchar(10) DEFAULT NULL,
  `DivAirportLandings` varchar(10) DEFAULT NULL,
  `DivReachedDest` varchar(10) DEFAULT NULL,
  `DivActualElapsedTime` varchar(10) DEFAULT NULL,
  `DivArrDelay` varchar(10) DEFAULT NULL,
  `DivDistance` varchar(10) DEFAULT NULL,
  `Div1Airport` varchar(10) DEFAULT NULL,
  `Div1AirportID` int(11) DEFAULT NULL,
  `Div1AirportSeqID` int(11) DEFAULT NULL,
  `Div1WheelsOn` varchar(10) DEFAULT NULL,
  `Div1TotalGTime` varchar(10) DEFAULT NULL,
  `Div1LongestGTime` varchar(10) DEFAULT NULL,
  `Div1WheelsOff` varchar(10) DEFAULT NULL,
  `Div1TailNum` varchar(10) DEFAULT NULL,
  `Div2Airport` varchar(10) DEFAULT NULL,
  `Div2AirportID` int(11) DEFAULT NULL,
  `Div2AirportSeqID` int(11) DEFAULT NULL,
  `Div2WheelsOn` varchar(10) DEFAULT NULL,
  `Div2TotalGTime` varchar(10) DEFAULT NULL,
  `Div2LongestGTime` varchar(10) DEFAULT NULL,
  `Div2WheelsOff` varchar(10) DEFAULT NULL,
  `Div2TailNum` varchar(10) DEFAULT NULL,
  `Div3Airport` varchar(10) DEFAULT NULL,
  `Div3AirportID` int(11) DEFAULT NULL,
  `Div3AirportSeqID` int(11) DEFAULT NULL,
  `Div3WheelsOn` varchar(10) DEFAULT NULL,
  `Div3TotalGTime` varchar(10) DEFAULT NULL,
  `Div3LongestGTime` varchar(10) DEFAULT NULL,
  `Div3WheelsOff` varchar(10) DEFAULT NULL,
  `Div3TailNum` varchar(10) DEFAULT NULL,
  `Div4Airport` varchar(10) DEFAULT NULL,
  `Div4AirportID` int(11) DEFAULT NULL,
  `Div4AirportSeqID` int(11) DEFAULT NULL,
  `Div4WheelsOn` varchar(10) DEFAULT NULL,
  `Div4TotalGTime` varchar(10) DEFAULT NULL,
  `Div4LongestGTime` varchar(10) DEFAULT NULL,
  `Div4WheelsOff` varchar(10) DEFAULT NULL,
  `Div4TailNum` varchar(10) DEFAULT NULL,
  `Div5Airport` varchar(10) DEFAULT NULL,
  `Div5AirportID` int(11) DEFAULT NULL,
  `Div5AirportSeqID` int(11) DEFAULT NULL,
  `Div5WheelsOn` varchar(10) DEFAULT NULL,
  `Div5TotalGTime` varchar(10) DEFAULT NULL,
  `Div5LongestGTime` varchar(10) DEFAULT NULL,
  `Div5WheelsOff` varchar(10) DEFAULT NULL,
  `Div5TailNum` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=58592318 DEFAULT CHARSET=latin1 COMMENT='columnar=1'

Execute the following SQL statement:

SELECT * FROM ontime ORDER BY ArrTime LIMIT 1000;

The following table describes the amount of time required to execute the query on cold start in the following scenarios: IMCIs or the Hybrid plan feature is not used, only IMCIs are used, and the hybrid plan feature is used.

IMCIs or the hybrid plan feature is not used

Only IMCIs are used

The hybrid plan feature is used

232.48 seconds

2.56 seconds

0.33 seconds

Based on the preceding table, the hybrid plan feature can significantly accelerate queries on wide tables.