This topic describes the index diagnostics feature of PolarDB-X. You can use this feature to diagnose and identify inefficient indexes in your database and take measures to improve database performance.
Background information
Indexing is a method that is commonly used in databases to speed up query operations and greatly enhance the efficiency of data retrieval. However, each time data in a table is updated, the corresponding indexes must also be updated to reflect the changes. This can slow down write operations and reduce the overall efficiency of the database. In addition, indexes occupy the storage space of databases. Therefore, you must carefully consider the use of indexes to avoid excess or unnecessary indexes.
The index diagnostics feature of PolarDB-X can help you identify unnecessary and inefficient indexes (including local indexes and global secondary indexes) in your database. This feature can also provide optimization suggestions to help you make necessary adjustments to the relevant indexes, such as deleting or rebuilding the indexes. This improves write performance and saves storage space without compromising query performance.
To maintain good database performance, we recommend that you perform index diagnostics on a regular basis.
Prerequisites
The index diagnostics feature is available in PolarDB-X 5.4.17-16859297 and later. For information about how to view instance versions, see View and update the version of an instance.
Usage notes
The index diagnostics feature can be used for databases in AUTO mode and DRDS mode.
Syntax
INSPECT [FULL] INDEX [FROM table_name] [MODE= {STATIC|DYNAMIC|MIXED}]
Parameters
MODE: the diagnostics mode. Three diagnostics modes are supported. If you do not specify the MODE parameter, the STATIC mode is used.
STATIC: This mode is used to diagnose and identify static issues, such as duplicate indexes. These types of issues can be detected as soon as indexes are created.
DYNAMIC: This mode is used to identify dynamic issues, such as less selective column indexes or unused indexes. This mode requires collection of data on index usage over a period of time to provide accurate diagnostic results. For more information, see DYNAMIC mode.
MIXED: This mode is used to diagnose both static and dynamic issues. This mode also requires collection of data on index usage over a period of time, ideally spanning a complete business cycle, to provide accurate diagnostic results. For more information, see DYNAMIC mode.
FROM: the data tables that you want to diagnose.
If you do not specify the FROM parameter, all tables in the current database are diagnosed.
If you specify the FROM parameter, only the specified data tables are diagnosed.
FULL: the level of detail of the diagnostic output that is generated during a diagnostic operation.
If you specify the FULL parameter, a comprehensive diagnostic report about indexes is generated, including detailed information on identified issues and specific suggestions.
If you do not specify the FULL parameter, the diagnostic output includes only suggestions.
STATIC mode
Execute the following statement to create the tb1 table:
create table tb1(
id int,
name varchar(20),
code varchar(50),
primary key(id),
global index idx_name(`name`) partition by key(name),
global index idx_name_code(`name`, `code`) partition by key(name, code),
global index idx_id(`id`) partition by key(id)
) partition by key(id);
Run the inspect full index from tb1\G
command to return the diagnostic results in STATIC mode.
inspect full index from tb1\G
*************************** 1. row ***************************
SCHEMA: d5
TABLE: tb1
INDEX: idx_id
INDEX_TYPE: GLOBAL INDEX
INDEX_COLUMN: id
COVERING_COLUMN:
USE_COUNT: 0
LAST_ACCESS_TIME: NULL
DISCRIMINATION: 0.0
PROBLEM: ineffective gsi `idx_id` because it has the same rule as primary table
ADVICE (STEP1): alter table `tb1` alter index `idx_id` invisible;
ADVICE (STEP2): alter table `tb1` drop index `idx_id`;
alter table `tb1` add local index `idx_id` (`id`);
*************************** 2. row ***************************
SCHEMA: d5
TABLE: tb1
INDEX: idx_name
INDEX_TYPE: GLOBAL INDEX
INDEX_COLUMN: name
COVERING_COLUMN: id
USE_COUNT: 0
LAST_ACCESS_TIME: NULL
DISCRIMINATION: 0.0
PROBLEM: index columns duplicate: idx_name, idx_name_code;
ADVICE (STEP1): alter table `tb1` alter index `idx_name` invisible;
ADVICE (STEP2): alter table `tb1` drop index `idx_name`;
alter table `tb1` add local index `idx_name` (`name`);
*************************** 3. row ***************************
SCHEMA: d5
TABLE: tb1
INDEX: idx_name_code
INDEX_TYPE: GLOBAL INDEX
INDEX_COLUMN: name,code
COVERING_COLUMN: id
USE_COUNT: 0
LAST_ACCESS_TIME: NULL
DISCRIMINATION: 0.0
PROBLEM: None
ADVICE (STEP1): None
ADVICE (STEP2): None
3 rows in set (0.42 sec)
The following section describes the diagnostic results:
The PROBLEM line shows issues with indexes.
Issue with the idx_id global index: The idx_id index is reported as ineffective because it has the same partitioning rule as the primary table.
Issue with the idx_name global index: The idx_name index is reported as redundant because the indexed column of idx_name is a duplicate of a column that is already included in the indexed column of idx_name_code.
The ADVICE lines provide suggestions for optimizing the performance of problematic indexes. The suggestions are presented in two steps and may vary based on the specific scenario.
STEP1 of the optimization advice suggests that you use the INVISIBLE INDEX statement to hide the problematic index. The statement hides the index from the optimizer. This way, your business SQL queries do not utilize or rely on the index for query execution. By hiding the index, you can evaluate the impact of index deletion without the need to permanently delete the index.
NoteFor more information about the INVISIBLE INDEX feature and how to evaluate the impact of index deletion on your business, see INVISIBLE INDEX.
STEP2 of the optimization advice provided by PolarDB-X may include suggestions such as deleting the problematic index or replacing the problematic index with an optimized index.
WarningTo prevent unintended consequences or negative impacts on the database system, we recommend that you evaluate the impact of index deletion before you delete an index.
DYNAMIC mode
Index diagnostics in DYNAMIC mode rely on index usage data. Before you perform the diagnostics, we recommend that you collect a complete set of usage data that covers a full business cycle, such as one day or one week, based on your business characteristics.
Usage
Run the
set global GSI_STATISTICS_COLLECTION=true
command to enable index usage statistics collection in your database. This option allows for the collection of usage data over a period of time.Run the
inspect full index mode=dynamic\G
command to perform index diagnostics in DYNAMIC mode.Enabling the GSI_STATISTICS_COLLECTION option can have a minor impact on the query performance of the database (a potential decrease of approximately 1%). After index diagnostics are completed, run the
set global GSI_STATISTICS_COLLECTION=false
command to disable this option.
Scenarios
Improve testing performance during the business stress testing phase. Perform index diagnostics in DYNAMIC mode before business stress testing begins. This allows you to identify inefficient indexes and make adjustments based on the diagnostic suggestions before stress testing begins.
Optimize application performance during runtime. Collect information about index usage, wait for a complete business traffic cycle, and then adjust the index in the DYNAMIC mode to optimize application performance.
Others
PolarDB-X provides the INFORMATION_SCHEMA.GLOBAL_INDEXES view to facilitate the optimization of global indexes. This view allows you to easily access and examine the usage information of global indexes in PolarDB-X.
Execute the following statement to view the global index usage:
select * from information_schema.global_indexes where table="tb1"\G
*************************** 1. row ***************************
SCHEMA: testdb
TABLE: tb1
NON_UNIQUE: 1
KEY_NAME: idx_id_$3449
INDEX_NAMES: id
COVERING_NAMES:
INDEX_TYPE: NULL
DB_PARTITION_KEY:
DB_PARTITION_POLICY:
DB_PARTITION_COUNT: NULL
TB_PARTITION_KEY:
TB_PARTITION_POLICY:
TB_PARTITION_COUNT: NULL
STATUS: PUBLIC
SIZE_IN_MB: 10.03
USE_COUNT: 5
LAST_ACCESS_TIME: 2023-06-08 10:06:33
CARDINALITY: 389090
ROW_COUNT: 404508
*************************** 2. row ***************************
SCHEMA: testdb
TABLE: tb1
NON_UNIQUE: 1
KEY_NAME: idx_name_code_$2986
INDEX_NAMES: name, code
COVERING_NAMES: id
INDEX_TYPE: NULL
DB_PARTITION_KEY:
DB_PARTITION_POLICY:
DB_PARTITION_COUNT: NULL
TB_PARTITION_KEY:
TB_PARTITION_POLICY:
TB_PARTITION_COUNT: NULL
STATUS: PUBLIC
SIZE_IN_MB: 0.03
USE_COUNT: 15
LAST_ACCESS_TIME: 2023-06-08 10:10:06
CARDINALITY: -1
ROW_COUNT: 404508
2 rows in set (0.10 sec)
Response parameters
SIZE_IN_MB: the amount of storage space that is occupied by the index.
USE_COUNT: the number of times that the index has been used since the GSI_STATISTICS_COLLECTION option was enabled.
LAST_ACCESS_TIME: the time when the index was last used since the GSI_STATISTICS_COLLECTION option was enabled.
CARDINALITY: the cardinality of the index.
ROW_COUNT: the number of rows in the indexed table.