If the response or query speed of a Hologres instance slows down, you can query and analyze slow query logs to locate the issues. Slow query logs allow you to identify and analyze both slow queries and failed queries so that you can take targeted optimization measures to improve system performance. This topic describes how to view and analyze slow query logs in Hologres.
Release notes
Hologres version | Description | Remarks
V0.10 | The feature of querying and analyzing slow query logs is added. | None.
V2.2.7 | The default value of the log_min_duration_statement parameter is optimized. | Slow query logs record all data definition language (DDL) statements that are executed, all SQL statements that fail to be executed, and SQL statements whose execution duration exceeds the value of the log_min_duration_statement parameter. In Hologres V2.2.7 and later, the default value of this parameter is optimized.
Limits
Take note of the following items when you query slow query logs in Hologres:
By default, the slow query logs of the last month are retained. However, HoloWeb allows you to query only the slow query logs of the last seven days. To query earlier slow query logs, execute SQL statements.
To ensure system stability and prevent traffic overloads, a maximum of 10,000 slow query logs are returned from the server for each query. The length of some fields in slow query logs is limited. For more information, see the field descriptions of the hologres.hg_query_log table in this topic. For an example of working within these limits, see the sketch after this list.
Slow query logs belong to the metadata warehouse system of Hologres. A failure in querying slow query logs does not negatively affect queries in instances. Therefore, the stability of slow query log queries is not covered by the service-level agreement (SLA) of Hologres.
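The following is a minimal sketch of a query that respects both limits: it uses SQL rather than HoloWeb to reach logs older than seven days, and it narrows the time window and adds a LIMIT clause to stay below the 10,000-row cap. The interval values are placeholders that you can adjust:
SELECT query_id, query_start, duration
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '14 days'
  AND query_start < now() - interval '13 days'
ORDER BY query_start
LIMIT 1000;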
Query the hologres.hg_query_log table
Slow query logs of Hologres are stored in the hologres.hg_query_log system table.
In Hologres V0.10 and later, Hologres collects the logs of data manipulation language (DML) queries whose execution time is longer than 100 ms and the logs of all DDL statements.
In Hologres V3.0.2 and later, Hologres also collects and aggregates the logs of DML and data query language (DQL) queries that consume less than 100 ms. The following aggregation rules apply:
Only the logs of successfully executed DQL and DML queries that have SQL fingerprints are collected and aggregated.
In most cases, Hologres aggregates values of the following fields: server_addr, usename, datname, warehouse_id, application_name, and digest.
Hologres aggregates and reports field values every minute for each connection.
The following table describes the fields that are contained in the hologres.hg_query_log system table.
Field | Data type | Description for DML and DQL queries that consume more than 100 ms | Description for DML and DQL queries that consume less than 100 ms |
usename | text | The name of the user who initiated the query. | The name of the user who initiated the query. |
status | text | The status of the query. Valid values: SUCCESS and FAILED. | The status of the query, which is fixed as SUCCESS. Only the logs of successful queries are aggregated. |
query_id | text | The unique ID of the query. All failed queries have IDs, but successful queries may not have IDs. | The ID of the first query among queries that have the same primary key value in an aggregation period. Field values are aggregated based on the primary key. |
digest | text | The SQL fingerprint. For SELECT, INSERT, DELETE, or UPDATE queries, the system calculates MD5 hash values as the SQL fingerprints of the queries. For more information about how SQL fingerprints are collected and calculated, see the SQL fingerprint section of this topic. Note: In Hologres V2.2 and later, the digest field is added to this table to store SQL fingerprints of queries. If the version of your Hologres instance is V2.1 or earlier, contact Hologres technical support to upgrade your instance. | The SQL fingerprint. |
datname | text | The name of the database from which data is queried. | The name of the database from which data is queried. |
command_tag | text | The type of the query, such as SELECT or INSERT. | The type of the first query among queries that have the same primary key value in an aggregation period. |
duration | integer | The amount of time consumed by the query, in milliseconds. By default, the system displays all DDL statements and only the DML statements that consume longer than 1s. You can modify this threshold by using the Grand Unified Configuration (GUC) parameter log_min_duration_statement. For more information, see the Configuration items section of this topic. | The average amount of time consumed by queries that have the same primary key value in an aggregation period. |
message | text | The returned error message. | N/A. |
query_start | timestamptz | The point in time when the query was initiated. | The point in time when the first query among queries that have the same primary key value in an aggregation period was initiated. |
query_date | text | The date when the query was initiated. | The date when the first query among queries that have the same primary key value in an aggregation period was initiated. |
query | text | The statement executed in the query. The statement can contain a maximum of 51,200 characters. If the length of the SQL statement exceeds 51,200 characters, only the first 51,200 characters are retained. | The statement executed in the first query among queries that have the same primary key value in an aggregation period. |
result_rows | bigint | The number of rows that were returned. If the INSERT statement is executed in the query, the number of inserted rows is returned. | The average value of the result_rows field for all queries that have the same primary key value in an aggregation period. |
result_bytes | bigint | The number of bytes that were returned. | The average value of the result_bytes field for all queries that have the same primary key value in an aggregation period. |
read_rows | bigint | The number of rows that were read by the query. The value is not an accurate value. If a bitmap index is used, the value is not the same as the actual number of rows that were read by the query. | The average value of the read_rows field for all queries that have the same primary key value in an aggregation period. |
read_bytes | bigint | The number of bytes that were read by the query. | The average value of the read_bytes field for all queries that have the same primary key value in an aggregation period. |
affected_rows | bigint | The number of rows that were affected by the DML statement. | The average value of the affected_rows field for all queries that have the same primary key value in an aggregation period. |
affected_bytes | bigint | The number of bytes that were affected by the DML statement. | The average value of the affected_bytes field for all queries that have the same primary key value in an aggregation period. |
memory_bytes | bigint | The accumulated peak memory usage of nodes. The value is not an accurate value. This field specifies the accumulated peak memory usage consumed by the query on all running nodes. The value of the field roughly reflects the amount of data that is read in the query. | The average value of the memory_bytes field for all queries that have the same primary key value in an aggregation period. |
shuffle_bytes | bigint | The number of bytes that were shuffled. The value is not an accurate value. The value of the field roughly reflects the amount of data transmitted over a network. | The average value of the shuffle_bytes field for all queries that have the same primary key value in an aggregation period. |
cpu_time_ms | bigint | The total CPU time, in milliseconds. The value is not an accurate value. This field specifies the CPU time consumed by all computing tasks, which is the sum of the time for multiple CPU cores. The value of this field basically reflects the complexity of computing tasks. | The average value of the cpu_time_ms field for all queries that have the same primary key value in an aggregation period. |
physical_reads | bigint | The number of physical reads. The field specifies the number of times that batch records are read from disks. The value of the field roughly reflects the number of memory cache misses. | The average value of the physical_reads field for all queries that have the same primary key value in an aggregation period. |
pid | integer | The process identifier (PID) of the query. | The PID of the first query among queries that have the same primary key value in an aggregation period. |
application_name | text | The type of the application that initiated the query, such as psql. For other applications that are connected to a Hologres instance, we recommend that you explicitly specify the application_name parameter in the connection configuration so that the application type is recorded in slow query logs. | The type of the application that initiated the query. |
engine_type | text[] | The engine used in the query, such as HQE or PQE. | The engine used by the first query among queries that have the same primary key value in an aggregation period. |
client_addr | text | The IP address from which the query was initiated. The field specifies the egress IP address of the application. The value may not be the actual IP address of the application. | The IP address from which the first query among queries that have the same primary key value in an aggregation period was initiated. |
table_write | text | The table to which data was written. | The table to which data was written in the first query among queries that have the same primary key value in an aggregation period. |
table_read | text[] | The table from which data was read. | The table from which data was read in the first query among queries that have the same primary key value in an aggregation period. |
session_id | text | The ID of the query session. | The session ID of the first query among queries that have the same primary key value in an aggregation period. |
session_start | timestamptz | The point in time when the query session started. The field specifies the point in time when the connection was established. | The points in time when sessions of all queries that have the same primary key value in an aggregation period started. |
command_id | text | The ID of the command or the statement. | The IDs of the commands or statements of all queries that have the same primary key value in an aggregation period. |
optimization_cost | integer | The amount of time consumed to generate an execution plan for the query. It may take a long time to generate an execution plan if the SQL statement that you executed is complex. | The amounts of time consumed to generate execution plans for all queries that have the same primary key value in an aggregation period. |
start_query_cost | integer | The amount of time consumed to start the query. The startup of the query may be time-consuming if the process is waiting for a lock to be released or resources. | The amounts of time consumed to start queries that have the same primary key value in an aggregation period. |
get_next_cost | integer | The amount of time consumed to perform the query. The query may be time-consuming if you want to read a large amount of data. In this case, you can optimize the SQL statement based on your business requirements. | The amounts of time consumed to perform queries that have the same primary key value in an aggregation period. |
extended_cost | text | The amount of time consumed by other operations of the query. The total amount of time consumed by the query is the sum of the values of the optimization_cost, start_query_cost, get_next_cost, and extended_cost fields. Therefore, the value of this field equals the duration minus the values of the other three fields. | The time consumption details of the first query among queries that have the same primary key value in an aggregation period. |
plan | text | The execution plan of the query. An execution plan can contain a maximum of 102,400 characters. Characters beyond this limit are truncated. You can modify the length limit by using a GUC parameter. | The execution plan of the first query among queries that have the same primary key value in an aggregation period. |
statistics | text | The execution statistics about the query. The statistics can contain a maximum of 102,400 characters. Characters beyond this limit are truncated. You can modify the length limit by using a GUC parameter. | The execution statistics about the first query among queries that have the same primary key value in an aggregation period. |
visualization_info | text | The visualization information about the execution plan. | The visualization information about the execution plan of the first query among queries that have the same primary key value in an aggregation period. |
query_detail | text | The extended information about the query, in the JSON format. Note: The extended information can contain a maximum of 10,240 characters. If the length exceeds 10,240 characters, only the first 10,240 characters are retained. | The extended information about the first query among queries that have the same primary key value in an aggregation period. |
query_extinfo | text[] | The extended information about the query, in the form of an array. Note: If you log on to the Hologres console by using a local account or a service-linked role (SLR) or by using Security Token Service (STS), the AccessKey ID is not recorded. If you log on to the Hologres console by using a temporary account, the temporary AccessKey ID is recorded. | The extended information about the first query among queries that have the same primary key value in an aggregation period. |
calls | integer | The value is 1 because detailed information is not aggregated. | The number of queries that have the same primary key value in an aggregation period. |
agg_stats | jsonb | N/A. | Values of aggregation operations such as MIN, MAX, and AVG on the duration, memory_bytes, cpu_time_ms, physical_reads, optimization_cost, start_query_cost, and get_next_cost fields for queries that have the same primary key value in an aggregation period. |
The calls and agg_stats fields are new in Hologres V3.0.2.
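To inspect the aggregated records directly, you can filter on these fields. The following is a minimal sketch that assumes an instance of V3.0.2 or later, in which the calls and agg_stats fields exist:
SELECT usename, datname, application_name, digest, calls, agg_stats
FROM hologres.hg_query_log
WHERE calls > 1
ORDER BY calls DESC
LIMIT 10;
Because agg_stats is of the jsonb type, inspect a sample row first to see which aggregate keys are available before you extract specific values.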
Grant query permissions to users
Only authorized users can query slow query logs. When you grant query permissions to users, take note of the following items:
A user can be granted permissions to query the slow query logs of all databases in a Hologres instance.
Grant superuser permissions to a user.
After the superuser permissions on a Hologres instance are granted to a user, the user can query the slow query logs of all databases in the Hologres instance.
ALTER USER "Account ID" SUPERUSER;-- Replace Account ID with the Alibaba Cloud account ID of the user. If the user is a Resource Access Management (RAM) user, prefix the account ID with p4_.
Add the user to the pg_read_all_stats user group.
In addition to superusers, users in the pg_read_all_stats user group have the permissions to query the slow query logs of all databases in the relevant Hologres instance. If you are a superuser, you can add regular users to this user group. To add a user to this user group, execute one of the following statements:
GRANT pg_read_all_stats TO "Account ID"; -- Use the standard PostgreSQL authorization model to grant related permissions to the user.
CALL spm_grant('pg_read_all_stats', 'Account ID'); -- Use the simple permission model (SPM) to grant related permissions to the user.
CALL slpm_grant('pg_read_all_stats', 'Account ID'); -- Use the schema-level permission model (SLPM) to grant related permissions to the user.
A user can be granted permissions to query the slow query logs of a specific database in a Hologres instance.
Users in the db_admin user group have the permissions to query the slow query logs of the relevant database. If you are a superuser, you can add regular users to this user group by using the SPM or SLPM.
CALL spm_grant('<db_name>_admin', 'Account ID'); -- Use the SPM to grant related permissions to the user.
CALL slpm_grant('<db_name>.admin', 'Account ID'); -- Use the SLPM to grant related permissions to the user.
A regular user can query only the logs of the slow queries that the user initiated in the relevant database within the Alibaba Cloud account. To verify whether a grant has taken effect, see the sketch below.
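The following is a minimal sketch that uses the standard PostgreSQL pg_has_role function to check whether an account is a member of the pg_read_all_stats role. Replace Account ID with the actual account ID:
SELECT pg_has_role('Account ID', 'pg_read_all_stats', 'member');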
Query slow query logs in the HoloWeb console
You can query slow query logs in the HoloWeb console in a visualized manner.
Slow query logs of the last seven days can be queried in the HoloWeb console.
Only a superuser can query slow query logs in the HoloWeb console. If you are a regular user who has related permissions, you can query slow query logs by executing SQL statements.
Log on to the HoloWeb console. For more information, see Connect to HoloWeb and perform queries.
In the top navigation bar, click Diagnostics and Optimization.
In the left-side navigation pane, click Historical Slow Query.
In the upper part of the Historical Slow Query page, set the query conditions.
For more information about the parameters on the Historical Slow Query page, see Query and analyze slow query logs.
Click Query and view the query results in the Query Trend Analysis and Queries sections.
Query Trend Analysis
The Query Trend Analysis section displays the trends of slow queries or failed queries in the specified time range in a visualized manner. This allows you to find the periods during which slow queries and failed queries frequently occurred and troubleshoot issues in a centralized manner.
Queries
The Queries section displays the details of slow queries and failed queries. For more information about the parameters in the Queries section, see Query and analyze slow query logs. You can also click Customize Columns to select the fields that you want to display in the Queries section.
SQL fingerprint
In Hologres V2.2 and later, the digest field is added to the system table hologres.hg_query_log to store SQL fingerprints of queries. For queries that are performed by executing the SELECT, INSERT, DELETE, or UPDATE statement, the system calculates MD5 hash values as the SQL fingerprints of the queries. This facilitates classification and analysis of queries for your business.
SQL fingerprints are collected and calculated based on the following rules:
By default, only the SQL fingerprints of queries that are performed by executing the SELECT, INSERT, DELETE, or UPDATE statement are collected.
If you insert constant values into a table by executing the INSERT statement, the SQL fingerprint of this statement is not affected by the amount of inserted data.
The calculation of SQL fingerprints follows the same case sensitivity rule as the rule that is used by Hologres to process SQL statements.
During the calculation of SQL fingerprints, the system ignores all whitespace characters and considers only structure information in SQL statements. Whitespace characters include spaces, line breaks, and tabs.
During the calculation of SQL fingerprints, the impacts of constant values on SQL fingerprints are ignored.
For example, the SQL fingerprint of the SELECT * FROM t WHERE a > 1; statement is the same as that of the SELECT * FROM t WHERE a > 2; statement.
If a query statement contains an array that consists of constants, the calculation of the SQL fingerprint is not affected by the number of elements in the array.
For example, the SQL fingerprint of the SELECT * FROM t WHERE a IN (1, 2); statement is the same as that of the SELECT * FROM t WHERE a IN (3, 4, 5); statement.
During the calculation of SQL fingerprints, a table in a query is identified based on the table name, the name of the database to which the table belongs, and the name of the schema to which the table belongs.
For example, the SQL fingerprint of the SELECT * FROM t; statement is the same as that of the SELECT * FROM public.t; statement only when the table t in the former statement belongs to the public schema.
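Because statements that differ only in constant values share one fingerprint, grouping by the digest field is a convenient way to classify a workload. The following is a minimal sketch that lists the most frequent fingerprints of the last day, together with one sample statement for each fingerprint:
SELECT digest,
       count(*) AS exec_count,
       avg(duration) AS avg_duration_ms,
       min(query) AS sample_query
FROM hologres.hg_query_log
WHERE digest IS NOT NULL
  AND query_start >= now() - interval '1 day'
GROUP BY digest
ORDER BY exec_count DESC
LIMIT 10;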
Analyze slow query logs
You can obtain slow query logs by querying the hologres.hg_query_log table. This section describes the SQL statements that are commonly used to analyze slow query logs. For example, you can perform the following operations:
Query the slow queries that are most time-consuming in the last 10 minutes.
Query the access amount and data read amount of slow queries initiated in each of the last 3 hours.
Query the slow queries that are most time-consuming at different query stages in the last 10 minutes.
Execute the following statement to query the total number of slow queries. By default, the number of slow queries for the previous month is returned.
SELECT count(*) FROM hologres.hg_query_log;
The following result is returned. The result indicates that 44 queries whose execution time exceeds the specified threshold were initiated in the previous month.
count
-------
    44
(1 row)
Execute the following statement to query the number of slow queries initiated by each user:
SELECT usename AS "User", count(1) as "Number of slow queries" FROM hologres.hg_query_log GROUP BY usename order by count(1) desc;
The following result is returned. The value returned by the count(1) function indicates the number of slow queries initiated by each user.
User | Number of slow queries
-----------------------+-----
1111111111111111 | 27
2222222222222222 | 11
3333333333333333 |  4
4444444444444444 |  2
(4 rows)
Execute the following statement to query the details of a specified slow query:
SELECT * FROM hologres.hg_query_log WHERE query_id = '13001450118416xxxx';
The following result is returned. For more information about the fields, see Query the hologres.hg_query_log table in this topic.
usename | status | query_id | datname| command_tag | duration | message | query_start | query_date | query | result_rows | result_bytes | read_rows |read_bytes | affected_rows | affected_bytes | memory_bytes | shuffle_bytes | cpu_time_ms | physical_reads | pid | application_name | engine_type | client_addr | table_write | table_read | session_id | session_start | command_id | optimization_cost | start_query_cost | get_next_cost | extended_cost | plan | statistics | visualization_info | query_detail | query_extinfo
-----------------------+---------+--------------------+---------+-------------+----------+---------+------------------------+------------+---------------------------------------------------------------------------------------------------------+-------------+--------------+-----------+------------+---------------+----------------+--------------+---------------+-------------+----------------+---------+------------------+-------------+---------------+-------------+------------+-----------------+------------------------+------------+-------------------+------------------+---------------+---------------+------+------------+--------------------+--------------+---------------
p4_11111111111xxxx | SUCCESS | 13001450118416xxxx | dbname | SELECT | 149 | | 2021-03-30 23:45:01+08 | 20210330 | explain analyze SELECT * FROM tablename WHERE user_id = '20210330010xxxx' limit 1000; | 1000 | 417172 | 0 | 0 | -1 | -1 | 26731376 | 476603616 | 321626 | 0 | 1984913 | psql | {HQE} | 33.41.xxx.xxx | | | 6063475a.1e4991 | 2021-03-30 23:44:26+08 | 0 | 58 | 22 | 67 | | | | | |
(1 row)
Execute the following statement to query the slow queries that are most time-consuming in a recent time range, such as the last 10 minutes. You can specify a time range based on your business requirements.
SELECT status AS "Status", duration AS "Time consumed (ms)", query_start AS "Started at", (read_bytes/1048576)::text || ' MB' AS "Read bytes", (memory_bytes/1048576)::text || ' MB' AS "Memory", (shuffle_bytes/1048576)::text || ' MB' AS "Shuffle", (cpu_time_ms/1000)::text || ' s' AS "CPU time", physical_reads as "Physical reads", query_id as "QueryID", query::char(30) FROM hologres.hg_query_log WHERE query_start >= now() - interval '10 min' ORDER BY duration DESC, read_bytes DESC, shuffle_bytes DESC, memory_bytes DESC, cpu_time_ms DESC, physical_reads DESC LIMIT 100;
The following result is returned:
Status | Time consumed (ms) | Started at | Read bytes | Memory | Shuffle | CPU time | Physical reads | QueryID | query
---------+---------+------------------------+--------+-------+---------+---------+------+--------------------+--------------------------------
SUCCESS | 149 | 2021-03-30 23:45:01+08 | 0 MB | 25 MB | 454 MB | 321 s | 0 | 13001450118416xxxx | explain analyze SELECT * FROM
SUCCESS | 137 | 2021-03-30 23:49:18+08 | 247 MB | 21 MB | 213 MB | 803 s | 7771 | 13001491818416xxxx | explain analyze SELECT * FROM
FAILED | 53 | 2021-03-30 23:48:43+08 | 0 MB | 0 MB | 0 MB | 0 s | 0 | 13001484318416xxxx | SELECT ds::bigint / 0 FROM pub
(3 rows)
Query the number of new queries that were initiated on the previous day.
Execute the following statement to query the total number of new queries that were initiated on the previous day. A query is considered new if its SQL fingerprint (digest) did not appear before the previous day:
SELECT COUNT(1)
FROM (
    SELECT DISTINCT t1.digest
    FROM hologres.hg_query_log t1
    WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day'
      AND t1.query_start < CURRENT_DATE
      AND NOT EXISTS (
          SELECT 1
          FROM hologres.hg_query_log t2
          WHERE t2.digest = t1.digest
            AND t2.query_start < CURRENT_DATE - INTERVAL '1 day')
      AND digest IS NOT NULL
) AS a;
The following result indicates that a total of 10 new queries were initiated on the previous day.
count
-------
    10
(1 row)
Execute the following statement to query the number of new queries of each type that were initiated on the previous day. The query type is specified by command_tag.
SELECT a.command_tag, COUNT(1)
FROM (
    SELECT DISTINCT t1.digest, t1.command_tag
    FROM hologres.hg_query_log t1
    WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day'
      AND t1.query_start < CURRENT_DATE
      AND NOT EXISTS (
          SELECT 1
          FROM hologres.hg_query_log t2
          WHERE t2.digest = t1.digest
            AND t2.query_start < CURRENT_DATE - INTERVAL '1 day')
      AND t1.digest IS NOT NULL) AS a
GROUP BY 1
ORDER BY 2 DESC;
The following result indicates that eight new INSERT queries and two new SELECT queries were initiated on the previous day.
command_tag | count
------------+--------
INSERT      |     8
SELECT      |     2
(2 rows)
Execute the following statement to query the details of the new queries that were initiated on the previous day:
SELECT a.usename, a.status, a.query_id, a.digest, a.datname, a.command_tag, a.query, a.cpu_time_ms, a.memory_bytes
FROM (
    SELECT DISTINCT t1.usename, t1.status, t1.query_id, t1.digest, t1.datname, t1.command_tag, t1.query, t1.cpu_time_ms, t1.memory_bytes
    FROM hologres.hg_query_log t1
    WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day'
      AND t1.query_start < CURRENT_DATE
      AND NOT EXISTS (
          SELECT 1
          FROM hologres.hg_query_log t2
          WHERE t2.digest = t1.digest
            AND t2.query_start < CURRENT_DATE - INTERVAL '1 day'
      )
      AND t1.digest IS NOT NULL
) AS a;
The following result is returned:
usename |status |query_id |digest |datname|command_tag |query |cpu_time_ms |memory_bytes
-----------------+--------+--------------------+------------------------------------+-------+-------------+-----------------------------------+--------------+--------------
111111111111xxxx |SUCCESS |100100425827776xxxx |md58cf93d91c36c6bc9998add971458ba1a |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 1748| 898808596
111111111111xxxx |SUCCESS |100100425827965xxxx |md5f7e87e2c9e0b3d9eddcd6b3bc7f04b3b |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 59891| 6819529886
111111111111xxxx |SUCCESS |100100425829654xxxx |md55612dc09d2d81074fd5deed1aa3eca9b |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 3| 2100039
111111111111xxxx |SUCCESS |100100425829664xxxx |md58d3bf67fbdf2247559bc916586d40011 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 10729| 2052861937
111111111111xxxx |SUCCESS |100100425830099xxxx |md503bd45d6b2d7701c2617d079b4d55a10 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 2196| 897948034
111111111111xxxx |SUCCESS |100100425830186xxxx |md5c62169eaf3ea3a0c59bdc834a8141ac4 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 5268| 1734305972
111111111111xxxx |SUCCESS |100100425830448xxxx |md59aa0c73b24c9c9eba0b34c8fdfc23bb0 |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 2| 2098402
111111111111xxxx |SUCCESS |100100425830459xxxx |md57d22c1d37b68984e9472f11a4c9fd04e |dbname |INSERT |INSERT INTO xxx SELECT * FROM xxx | 113| 76201984
111111111111xxxx |SUCCESS |100100525468694xxxx |md5ac7d6556fae123e9ea9527d8f1c94b1c |dbname |SELECT |SELECT * FROM xxx limit 200 | 6| 1048576
111111111111xxxx |SUCCESS |100101025538840xxxx |md547d09cdad4d5b5da74abaf08cba79ca0 |dbname |SELECT |SELECT * FROM xxx limit 200 |\N |\N
(10 rows)
Execute the following statement to query the trend of new queries that were initiated in each hour of the previous day.
SELECT to_char(a.query_start, 'HH24') AS query_start_hour, a.command_tag, COUNT(1)
FROM (
    SELECT DISTINCT t1.query_start, t1.digest, t1.command_tag
    FROM hologres.hg_query_log t1
    WHERE t1.query_start >= CURRENT_DATE - INTERVAL '1 day'
      AND t1.query_start < CURRENT_DATE
      AND NOT EXISTS (
          SELECT 1
          FROM hologres.hg_query_log t2
          WHERE t2.digest = t1.digest
            AND t2.query_start < CURRENT_DATE - INTERVAL '1 day'
      )
      AND t1.digest IS NOT NULL
) AS a
GROUP BY 1, 2
ORDER BY 3 DESC;
The following result indicates that one new SELECT query was initiated in the 11th hour, another new SELECT query was initiated in the 13th hour, and eight new INSERT queries were initiated in the 21st hour of the previous day.
query_start_hour |command_tag |count
------------------+-------------+-----
21 |INSERT | 8
11 |SELECT | 1
13 |SELECT | 1
(3 rows)
Execute the following statement to classify slow queries:
SELECT digest, command_tag, count(1)
FROM hologres.hg_query_log
WHERE query_start >= CURRENT_DATE - INTERVAL '1 day'
  AND query_start < CURRENT_DATE
GROUP BY 1, 2
ORDER BY 3 DESC;
Execute the following statement to query the top 10 queries that generate the largest CPU overheads on the previous day:
SELECT digest, avg(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= CURRENT_DATE - INTERVAL '1 day'
  AND query_start < CURRENT_DATE
  AND digest IS NOT NULL
  AND usename != 'system'
  AND cpu_time_ms IS NOT NULL
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
Execute the following statement to query the top 10 queries that consumed the most memory in the last week. The memory usage is specified by memory_bytes.
SELECT digest, avg(memory_bytes)
FROM hologres.hg_query_log
WHERE query_start >= CURRENT_DATE - INTERVAL '7 day'
  AND query_start < CURRENT_DATE
  AND digest IS NOT NULL
  AND memory_bytes IS NOT NULL
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
Execute the following statement to query the access amount and data read amount of slow queries initiated in each of the last 3 hours. This helps determine whether the data amount changes in each hour.
SELECT date_trunc('hour', query_start) AS query_start,
       count(1) AS query_count,
       sum(read_bytes) AS read_bytes,
       sum(cpu_time_ms) AS cpu_time_ms
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '3 h'
GROUP BY 1;
Execute the following statement to query the data access amount of slow queries in the past 3 hours and that in the same period on the previous day for comparison. This helps determine whether the data access amount changes compared with that in the same period on the previous day.
SELECT query_date,
       count(1) AS query_count,
       sum(read_bytes) AS read_bytes,
       sum(cpu_time_ms) AS cpu_time_ms
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '3 h'
GROUP BY query_date
UNION ALL
SELECT query_date,
       count(1) AS query_count,
       sum(read_bytes) AS read_bytes,
       sum(cpu_time_ms) AS cpu_time_ms
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '1d 3h'
  AND query_start <= now() - interval '1d'
GROUP BY query_date;
Execute the following statement to query the slow queries that are most time-consuming at different query stages in a recent time range, such as the last 10 minutes. You can specify a time range based on your business requirements.
SELECT status AS "Status", duration AS "Time consumed (ms)", optimization_cost AS "Optimization time (ms)", start_query_cost AS "Startup time (ms)", get_next_cost AS "Execution duration (ms)", duration - optimization_cost - start_query_cost - get_next_cost AS "Others (ms)", query_id AS "QueryID", query::char(30) FROM hologres.hg_query_log WHERE query_start >= now() - interval '10 min' ORDER BY duration DESC, start_query_cost DESC, optimization_cost, get_next_cost DESC, duration - optimization_cost - start_query_cost - get_next_cost DESC LIMIT 100;
The following result is returned:
Status | Time consumed (ms) | Optimization time (ms) | Startup time (ms) | Execution duration (ms) | Others (ms) | QueryID | query
---------+----------+--------------+--------------+--------------+--------------+--------------------+--------------------------------
SUCCESS | 4572 | 521 | 320 | 3726 | 5 | 6000260625679xxxx | -- /* user: wang ip: xxx.xx.x
SUCCESS | 1490 | 538 | 98 | 846 | 8 | 12000250867886xxxx | -- /* user: lisa ip: xxx.xx.x
SUCCESS | 1230 | 502 | 95 | 625 | 8 | 26000512070295xxxx | -- /* user: zhang ip: xxx.xx.
(3 rows)
Execute the following statement to query the earliest failed queries within a specified time range:
SELECT status AS "Status", regexp_replace(message, '\n', ' ')::char(150) AS "Error message", duration AS "Time consumed (ms)", query_start AS "Started at", query_id AS "QueryID", query::char(100) AS "Query" FROM hologres.hg_query_log WHERE query_start BETWEEN '2021-03-25 17:00:00'::timestamptz AND '2021-03-25 17:42:00'::timestamptz + interval '2 min' AND status = 'FAILED' ORDER BY query_start ASC LIMIT 100;
The following result is returned:
Status | Error message | Time consumed (ms) | Started at | QueryID | Query
--------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------+------------------------+-------------------+-------
FAILED | Query:[1070285448673xxxx] code: kActorInvokeError msg: "[holo_query_executor.cc:330 operator()] HGERR_code XX000 HGERR_msge internal error: status { c | 1460 | 2021-03-25 17:28:54+08 | 1070285448673xxxx | S...
FAILED | Query:[1016285560553xxxx] code: kActorInvokeError msg: "[holo_query_executor.cc:330 operator()] HGERR_code XX000 HGERR_msge internal error: status { c | 131 | 2021-03-25 17:28:55+08 | 1016285560553xxxx | S...
(2 rows)
Export slow query logs
Hologres allows you to execute the INSERT statement to export slow query logs from the hologres.hg_query_log table to a custom internal table or a foreign table such as a MaxCompute foreign table or an Object Storage Service (OSS) foreign table.
Usage notes
To correctly and efficiently export slow query logs, take note of the following items:
The account that is used to execute the INSERT INTO ... SELECT ... FROM hologres.hg_query_log; statement must have access permissions on the hologres.hg_query_log table. For more information, see the Grant query permissions to users section of this topic. To export all slow query logs of an instance, the account that is used to execute the INSERT statement must have the superuser permissions or the pg_read_all_stats permission. Otherwise, only part of the data in the hologres.hg_query_log table can be queried and exported.
An index is configured for the query_start field in the hologres.hg_query_log table. If you want to export data generated within a specific time range, use the index to query the data. This ensures better performance and consumes fewer resources.
If you use the index to query data, do not apply nested expressions to the query_start field. Otherwise, the index may not work. For example, we recommend that you replace WHERE to_char(query_start, 'yyyymmdd') = '20220101'; with WHERE query_start >= to_date('20220101', 'yyyymmdd') AND query_start < to_date('20220102', 'yyyymmdd');. You can confirm that a filter hits the index by checking the plan, as shown in the sketch below.
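The following is a minimal sketch that uses EXPLAIN to inspect the plan of an export filter before you run it; the date values are placeholders:
EXPLAIN SELECT query_id
FROM hologres.hg_query_log
WHERE query_start >= '2022-01-01'
  AND query_start < '2022-01-02';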
Example 1: Export slow query logs to a Hologres internal table
Execute the following statement in Hologres to export slow query logs to the query_log_download internal table:
--Hologres SQL
CREATE TABLE query_log_download (
    usename text,
    status text,
    query_id text,
    datname text,
    command_tag text,
    duration integer,
    message text,
    query_start timestamp with time zone,
    query_date text,
    query text,
    result_rows bigint,
    result_bytes bigint,
    read_rows bigint,
    read_bytes bigint,
    affected_rows bigint,
    affected_bytes bigint,
    memory_bytes bigint,
    shuffle_bytes bigint,
    cpu_time_ms bigint,
    physical_reads bigint,
    pid integer,
    application_name text,
    engine_type text[],
    client_addr text,
    table_write text,
    table_read text[],
    session_id text,
    session_start timestamp with time zone,
    trans_id text,
    command_id text,
    optimization_cost integer,
    start_query_cost integer,
    get_next_cost integer,
    extended_cost text,
    plan text,
    statistics text,
    visualization_info text,
    query_detail text,
    query_extinfo text[]
);

INSERT INTO query_log_download
SELECT usename, status, query_id, datname, command_tag,
       duration, message, query_start, query_date, query,
       result_rows, result_bytes, read_rows, read_bytes, affected_rows,
       affected_bytes, memory_bytes, shuffle_bytes, cpu_time_ms, physical_reads,
       pid, application_name, engine_type, client_addr, table_write,
       table_read, session_id, session_start, trans_id, command_id,
       optimization_cost, start_query_cost, get_next_cost, extended_cost, plan,
       statistics, visualization_info, query_detail, query_extinfo
FROM hologres.hg_query_log
WHERE query_start >= '2022-08-03'
  AND query_start < '2022-08-04';
Example 2: Export slow query logs to a MaxCompute foreign table
Execute the following statement in MaxCompute to create a MaxCompute table to which you want to export slow query logs:
CREATE TABLE IF NOT EXISTS mc_holo_query_log (
    username STRING COMMENT 'The name of the user who initiated the query.'
    ,status STRING COMMENT 'The status of the query. The value indicates whether the query is successful.'
    ,query_id STRING COMMENT 'The ID of the query.'
    ,datname STRING COMMENT 'The name of the queried database.'
    ,command_tag STRING COMMENT 'The type of the statement executed in the query.'
    ,duration BIGINT COMMENT 'The amount of time consumed by the query, in milliseconds.'
    ,message STRING COMMENT 'The returned error message.'
    ,query STRING COMMENT 'The statement executed in the query.'
    ,read_rows BIGINT COMMENT 'The number of rows that were read by the query.'
    ,read_bytes BIGINT COMMENT 'The number of bytes that were read by the query.'
    ,memory_bytes BIGINT COMMENT 'The accumulated peak memory usage of nodes. The value is not an accurate value.'
    ,shuffle_bytes BIGINT COMMENT 'The number of bytes that were shuffled. The value is not an accurate value.'
    ,cpu_time_ms BIGINT COMMENT 'The total CPU time, in milliseconds. The value is not an accurate value.'
    ,physical_reads BIGINT COMMENT 'The number of physical reads.'
    ,application_name STRING COMMENT 'The type of the application that initiated the query.'
    ,engine_type ARRAY<STRING> COMMENT 'The engine used in the query.'
    ,table_write STRING COMMENT 'The table to which data was written.'
    ,table_read ARRAY<STRING> COMMENT 'The table from which data was read.'
    ,plan STRING COMMENT 'The execution plan of the query.'
    ,optimization_cost BIGINT COMMENT 'The amount of time consumed to generate an execution plan for the query.'
    ,start_query_cost BIGINT COMMENT 'The amount of time consumed to start the query.'
    ,get_next_cost BIGINT COMMENT 'The amount of time consumed to execute the query.'
    ,extended_cost STRING COMMENT 'The amount of time consumed by other operations of the query.'
    ,query_detail STRING COMMENT 'The extended information about the query in the JSON format.'
    ,query_extinfo ARRAY<STRING> COMMENT 'The extended information about the query in the form of an array.'
    ,query_start STRING COMMENT 'The point in time when the query was initiated.'
    ,query_date STRING COMMENT 'The date when the query was initiated.'
)
COMMENT 'hologres instance query log daily'
PARTITIONED BY (
    ds STRING COMMENT 'stat date'
)
LIFECYCLE 365;

ALTER TABLE mc_holo_query_log ADD PARTITION (ds=20220803);
Execute the following statement in Hologres to export slow query logs to the MaxCompute foreign table:
IMPORT FOREIGN SCHEMA project_name LIMIT TO (mc_holo_query_log)
FROM SERVER odps_server INTO public;

INSERT INTO mc_holo_query_log
SELECT usename AS username, status, query_id, datname, command_tag,
       duration, message, query, read_rows, read_bytes,
       memory_bytes, shuffle_bytes, cpu_time_ms, physical_reads, application_name,
       engine_type, table_write, table_read, plan, optimization_cost,
       start_query_cost, get_next_cost, extended_cost, query_detail, query_extinfo,
       query_start, query_date, '20220803'
FROM hologres.hg_query_log
WHERE query_start >= '2022-08-03'
  AND query_start < '2022-08-04';
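After the export completes, you can spot-check the target partition on the MaxCompute side. The following is a minimal sketch that assumes the table and partition created in the preceding example:
SELECT COUNT(*) FROM mc_holo_query_log WHERE ds = '20220803';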
Configuration items
Hologres provides the following configuration items that allow you to configure the collection of slow query logs:
log_min_duration_statement
Description
This configuration item is used to specify a threshold of the amount of time consumed by slow queries. By default, Hologres collects the logs of slow queries that consume more than 100 ms, and only queries that consume 1 second or longer are displayed in the query result. You can configure this configuration item to modify the duration. Take note of the following items:
Modification of this configuration item takes effect only for new queries. For queries that have been collected, the original value takes effect.
Only superusers can modify this configuration item.
If you set this configuration item to -1, no slow query logs are collected. If you set this configuration item to a positive value, the minimum value is 100 ms.
Examples
Enable the system to collect the logs of slow queries that consume 250 ms or longer:
-- Execute the following statement as a superuser for a database:
ALTER DATABASE dbname SET log_min_duration_statement = '250ms';
-- Execute the following statement as a regular user for a session:
SET log_min_duration_statement = '250ms';
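To check the value that is currently in effect for your session, you can use the standard PostgreSQL SHOW statement:
SHOW log_min_duration_statement;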
log_min_duration_query_stats
Description
This configuration item is used to specify a threshold of the amount of time consumed by queries for the collection of execution statistics. By default, Hologres collects execution statistics about the slow queries that consume more than 10 seconds. Take note of the following items:
Modification of this configuration item takes effect only for new queries. For queries that have been collected, the original value takes effect.
If you set this configuration item to -1, no execution statistics about slow queries are collected.
Such statistics require large storage space. If Hologres collects a large amount of execution statistics, the query and analysis of slow query logs may slow down. Therefore, we recommend that you specify a threshold of at least 10 seconds in general scenarios. Specify a threshold of less than 10 seconds only when you troubleshoot specific issues.
Examples
Enable the system to collect execution statistics about the slow queries that consume more than 20 seconds:
-- Execute the following statement as a superuser for a database:
ALTER DATABASE dbname SET log_min_duration_query_stats = '20s';
-- Execute the following statement as a regular user for a session:
SET log_min_duration_query_stats = '20s';
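After the collection is enabled, the gathered statistics are stored in the statistics field of the hologres.hg_query_log table. The following is a minimal sketch that spot-checks the most recent entries that carry statistics:
SELECT query_id, duration, statistics
FROM hologres.hg_query_log
WHERE statistics IS NOT NULL
ORDER BY query_start DESC
LIMIT 5;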
log_min_duration_query_plan
Description
This configuration item is used to specify a threshold of the amount of time consumed by queries for the collection of execution plans. By default, Hologres displays the execution plans of slow queries that consume 10 seconds or longer. Take note of the following items:
Modification of this configuration item takes effect only for new queries. For queries that have been collected, the original value takes effect.
If the amount of time consumed by a slow query is greater than or equal to the specified value, the execution plan of the slow query is collected.
By default, you can execute the EXPLAIN statement to query execution plans in real time, and you do not need to collect such data.
If you set this configuration item to -1, no execution plans of slow queries are collected.
Examples
Enable the system to collect the execution plans of slow queries that consume more than 10 seconds:
-- Execute the following statement as a superuser for a database:
ALTER DATABASE dbname SET log_min_duration_query_plan = '10s';
-- Execute the following statement as a regular user for a session:
SET log_min_duration_query_plan = '10s';
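As noted in the description, you can also obtain an execution plan in real time instead of collecting it. The following is a minimal sketch that assumes a hypothetical table named t:
EXPLAIN SELECT * FROM t WHERE a > 1;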
FAQ
Problem description:
In Hologres V1.1, detailed statistics such as the number of queried rows and the number of returned rows cannot be returned in queried slow query logs.
Cause:
The collected slow query logs are incomplete.
Solution:
In Hologres V1.1.36 to Hologres V1.1.49, you can use a GUC parameter to instruct the system to return complete slow query logs. Complete slow query logs are directly returned for Hologres V1.1.49 and later.
Note: If the version of your Hologres instance is earlier than V1.1.36, manually upgrade your Hologres instance or join the Hologres DingTalk group for technical support. For more information, see Instance upgrades and Obtain online support for Hologres.
-- (Recommended) Configure the GUC parameter at the database level. You need to execute the following statement only once for each database:
ALTER DATABASE <db_name> SET hg_experimental_force_sync_collect_execution_statistics = ON;
-- Configure the GUC parameter at the session level:
SET hg_experimental_force_sync_collect_execution_statistics = ON;
The db_name parameter specifies the name of the database.
References
For more information about how to diagnose and manage queries in an instance, see Manage queries.