All Products
Search
Document Center

AnalyticDB:Storage diagnostics

Last Updated:Aug 09, 2024

If table issues occur, such as data skew, unreasonable partition fields, and excessive indexes, you can perform diagnostics on partition field reasonability, distribution field reasonability, and replicated table reasonability on the Storage Diagnostics page of the AnalyticDB for MySQL console. You can also perform schema optimization by using hot and cold data optimization suggestions and index optimization suggestions to reduce costs and improve efficiency.

Usage notes

  • Only AnalyticDB for MySQL clusters of V3.1.4 or later support the index diagnostics and hot and cold data optimization features.

  • Hot and cold data optimization suggestions and index optimization suggestions are obtained from analysis of historical data and query characteristics. If data and query characteristics are stable, the relevant suggestions remain valid. If data and query characteristics change significantly, the suggestions may also significantly become less valuable as a reference. We recommend that you determine whether to adopt the suggestions based on your business characteristics.

Table diagnostics

Table skew diagnostics

When you create a table, you can use the DISTRIBUTED BY HASH clause to specify a distribution key. Then, AnalyticDB for MySQL uses the hash values of the distribution key values to distribute rows of data across different shards. If data is unevenly distributed across storage nodes, disk storage skew occurs. As a result, disks are locked and data cannot be properly written.

Diagnostic criteria

AnalyticDB for MySQL performs table skew diagnostics on tables that contain more than 10,000 rows of data. You can use the following method to diagnose data skew:

  1. Remove the shard that has the largest size and calculate the average size of the remaining shards.

  2. If the size of a shard is larger than the average shard size multiplied by the threshold or smaller than the average shard size divided by the threshold, the shard is considered skewed. Default value of the threshold: 3. Valid values of the threshold: 0 to 10000000000. You can execute the SET ADB_CONFIG RC_DATA_SKEW_THRESHOLD=Value; statement to change the threshold.

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Table Diagnostics tab to view information on the Table Skew Diagnostics tab.

    Storage Node Disk Usage

    You can view the disk usage of each storage node in the chart to determine whether the disk storage is skewed. If the disk storage is skewed, you can optimize tables by using the information in the Top 10 Skewed Tables section. If the disk storage is not skewed but the Top 10 Skewed Tables section contains skewed tables, you must also optimize the tables to ensure query performance.

    Top 10 Skewed Tables

    This section displays the skewed tables that are sorted in descending order based on the total data size. You can find a table and click View Skew Details in the Actions column to view the number of rows of the table on each shard to determine the skew degree of the table.

Optimization methods

You can use one of the following methods to resolve the issue:

  • Increase the storage capacity.

    • For AnalyticDB for MySQL Data Lakehouse Edition clusters, scale up and out reserved storage resources. For more information, see the "Scale up a cluster" section of the Scale a Data Lakehouse Edition cluster topic.

    • For AnalyticDB for MySQL Data Warehouse Edition clusters in elastic mode, scale out elastic I/O resources. For more information, see the "Scale a cluster in elastic mode" section of the Scale a Data Warehouse Edition cluster topic.

    • For AnalyticDB for MySQL Data Warehouse Edition clusters in reserved mode, scale up and out node groups. For more information, see "Scale a cluster in reserved mode" section of the Scale a Data Warehouse Edition cluster topic.

  • Delete idle indexes or partitions to reduce storage usage. For more information, see the "" section of this topic.

  • Create a table and migrate data to the table. For more information, see CREATE TABLE.

Hot and cold table optimization

AnalyticDB for MySQL analyzes the number of times tables are accessed and provides optimization suggestions for the less frequently accessed tables. You can change the hot and cold data storage policies of tables based on the optimization suggestions. For more information, see Separation of hot and cold data storage.

Diagnostic criteria

AnalyticDB for MySQL provides optimization suggestions on the hot tables that are not accessed within the last 15 days and whose access rate is less than 1%.

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Table Diagnostics tab and click the Hot and Cold Table Optimization tab.

  4. In the Available Optimization Suggestions section, turn on Enable to enable the hot and cold table optimization feature. If the hot and cold table optimization feature is already enabled for the cluster, you can skip this step.

  5. View available and applied optimization suggestions in the Available Optimization Suggestions and Applied Optimization Suggestions sections.

    Parameter

    Description

    Suggestion ID

    The ID of the optimization suggestion.

    SQL

    The SQL statement that specifies the detailed information of the table to modify.

    Optimization Type

    The type of optimization. This parameter is set to Hot and Cold Data Optimization.

    Optimization Suggestion

    The specific optimization suggestion given for the optimization type.

    Expected Optimization Benefits

    The expected benefits obtained after the optimization suggestion is applied.

    Note

    The expected optimization benefits are estimated values measured based on historical data and for reference only.

    Actions

    You can click Apply to apply the optimization suggestion for a table.

    Note
    • After you click Apply for a table, AnalyticDB for MySQL changes the storage policy of the table to COLD. If you want to change the storage policy to MIXED or HOT, execute the ALTER statement to manually change the storage policy. For more information, see the "Storage policies" section of the ALTER TABLE topic.

    • If you agree to adopt the optimization suggestion, click Apply. After you click Apply for a table, an ALTER statement is executed on the cluster and the suggestion is displayed on the Applied Optimization Suggestions tab.

    • The Apply operation has the same effect as executing the ALTER statement on the client. This operation cannot be revoked. Proceed with caution.

    • The application of an optimization suggestion with the execution of its SQL statement can be complete only after BUILD operations automatically triggered on tables are complete. Before BUILD operations are triggered, the suggestion is in the Running state. After BUILD operations are triggered, the status of the suggestion changes to Completed.

Replicated table diagnostics

AnalyticDB for MySQL allows you to specify the DISTRIBUTED BY BROADCAST clause to create replicated tables. A copy of replicated table data is stored on each shard. If you want to perform a large number of concurrent queries that join a large table with a small table, you can create a replicated table to store the data of the small table. This reduces data transmission of the small table within the cluster and improves concurrency performance. However, replicated tables provide lower write performance and occupy a large amount of storage space, which affects the overall write performance of AnalyticDB for MySQL.

Diagnostic criteria

If a replicated table contains more than 20,000 records, the table is considered unreasonable.

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Table Diagnostics tab and click the Replicated Table Diagnostics tab.

Optimization methods

You can create a standard table and migrate data to the table. For more information, see CREATE TABLE.

Partition diagnostics

Partitioned table diagnostics

If you create a partitioned table that contains unreasonable partition fields, the following issues occur:

  • If a partition contains a large amount of data and a BUILD job runs on the partition, the job may require an extended period of time to complete and consume substantial CPU and disk I/O resources of storage nodes. For example, a table is partitioned by year and each partition contains a large amount of data. In this case, the cluster stability is affected.

  • If a partition contains a small amount of data, the cluster may cache the information of many partitions and consume substantial memory resources. For example, a table is partitioned by hour and each partition contains a small amount of data. In this case, many partitions are scanned and the query performance is affected.

Criteria for reasonable partition size

Partition size refers to the number of rows in the partitions of a table 1, which is proportional to the number of shards 2. For a table that has N shards, the partition size is considered reasonable if the number of rows in the partitions is within the range of 1,000,000 × N to 5,000,000 × N.

For example, if the number of shards in a table is 64 and the number of rows in the partitions is within the range of 64 million to 320 million, the partition size is reasonable.

Note
  • 1: To query the number of rows in the partitions of a table, execute the following statement: SELECT partition_id, row_count FROM information_schema.kepler_partitions WHERE schema_name = '$DB' AND table_name ='$TABLE' AND partition_id > 0;.

  • 2: To query the number of shards of a table, execute the following statement: SELECT COUNT(1) FROM information_schema.kepler_meta_shards;.

Partition field reasonability diagnostics

Diagnostic criteria

If 10% or more partitions in a table have an unreasonable size, the partition field of the table is considered unreasonable.

For example, if a table has 100 partitions and 10 or more partitions have an unreasonable size, the partition field of the table is unreasonable.

Procedure
  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Partition Diagnostics tab to view information about tables that contain unreasonable partitions and partitions that are unreasonable on the Partitioned Table Diagnostics tab.

Adjust the partition size to a reasonable range

If unreasonable partitions are detected in partitioned table diagnostics, use the following methods to adjust the partition settings:

  • If the number of rows in the partitions of a table does not reach the lower limit of the reasonable range, the partition size is considered excessively small. We recommend that you increase the partition granularity. For example, if the number of shards is 64, the reasonable range of partition rows is 64 million to 320 million. If the number of partition rows is less than 64 million, the partition size is excessively small. We recommend that you change the granularity from day to month.

  • If the number of rows in the partitions of a table exceeds the upper limit of the reasonable range, the partition size is considered excessively large. We recommend that you reduce the partition granularity. For example, if the number of shards is 64, the reasonable range of partition rows is 64 million to 320 million. If the number of partition rows is greater than 320 million, the partition size is excessively large. We recommend that you change the granularity from month to day.

    For information about how to change the partition granularity, see the "Change the partition function format of a table" section of the ALTER TABLE topic.

  • If the total number of rows in a partitioned table does not reach and is not expected to reach the lower limit of a reasonable range, you can create a non-partitioned table and migrate data from the partitioned table to the non-partitioned table.

Non-partitioned table diagnostics

If you do not specify the PARTITION BY clause when you create a table, the table is a non-partitioned table. If you perform DML operations, such as INSERT, UPDATE, and DELETE, on a non-partitioned table, the operations may trigger BUILD jobs on the entire table. If the non-partitioned table contains a large amount of data, the BUILD jobs may temporarily occupy a large amount of storage space. As a result, the disk usage increases and disks are locked. Additionally, BUILD jobs on large tables consume substantial CPU and disk I/O resources, which reduces the overall cluster performance.

Diagnostic criteria

If a non-partitioned table contains more than one billion rows of data, the table is considered unreasonable.

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Partition Diagnostics tab and click the Non-partitioned Table Diagnostics tab.

Optimization methods

You can create a partitioned table and migrate data from an unreasonable non-partitioned table to the partitioned table. For more information, see CREATE TABLE.

Index diagnostics

AnalyticDB for MySQL analyzes the usage of data indexes and provides optimization suggestions for data indexes that are not used for an extended period of time. You can delete idle indexes based on the optimization suggestions to reduce the storage costs of data indexes.

Idle index diagnostics

Diagnostic criteria

Data indexes that are not used within the last 15 days and whose usage rate is less than 1% are considered idle indexes.

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Index Diagnostics tab to view information on the Idle Index Optimization tab.

  4. In the Available Optimization Suggestions section, turn on Enable to enable the index diagnostics feature. If the index diagnostics feature is already enabled for the cluster, you can skip this step.

  5. View available and applied optimization suggestions in the Available Optimization Suggestions and Applied Optimization Suggestions sections.

    Parameter

    Description

    Suggestion ID

    The ID of the optimization suggestion.

    SQL

    The SQL statement that specifies the detailed information of the table to modify.

    Optimization Type

    The type of optimization. This parameter is set to Index Optimization.

    Optimization Suggestion

    The specific optimization suggestion given for the optimization type.

    Expected Optimization Benefits

    The expected benefits obtained after the optimization suggestion is applied.

    Note

    The expected optimization benefits are estimated values measured based on historical data and for reference only.

    Actions

    You can click Apply to apply the optimization suggestion for a table.

    Note
    • After a data index is deleted, it takes longer to filter a table by using the data referenced in the index.

    • If you agree to adopt the optimization suggestion, click Apply. After you click Apply for a table, an ALTER statement is executed on the cluster and the suggestion is displayed on the Applied Optimization Suggestions tab.

    • The Apply operation has the same effect as executing the ALTER statement on the client. This operation cannot be revoked. Proceed with caution.

    • The application of an optimization suggestion with the execution of its SQL statement can be complete only after BUILD operations automatically triggered on tables are complete. Before BUILD operations are triggered, the suggestion is in the Running state. After BUILD operations are triggered, the status of the suggestion changes to Completed.

Primary key diagnostics

Diagnostic criteria

If a table has more than three primary key fields and the number of primary key fields reaches half of the number of fields in the table, the primary key of the table is considered excessive.

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Clusters page, click an edition tab. Find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Storage Analysis > Storage Diagnostics.

  3. Click the Index Diagnostics tab and click the Primary Key Diagnostics tab.

Related operations

Operation

Description

DescribeTablePartitionDiagnose

Queries the information about partition diagnostics for an AnalyticDB for MySQL Data Warehouse Edition cluster.

DescribeExcessivePrimaryKeys

Queries the information about tables that have excessive primary key fields in an AnalyticDB for MySQL Data Lakehouse Edition cluster.