To analyze the performance of your Tair instance, identify slow commands, or locate performance bottlenecks, examine the slow logs. They can provide clues for resolving performance issues and optimizing queries. Slow logs record commands whose execution time exceeds the threshold specified by slowlog-log-slower-than. The default threshold is 20 milliseconds, and you can change it to suit your workload.
Overview
Slow logs record requests that take longer than a specified threshold to execute in Tair. Slow logs are classified into slow logs of data nodes and slow logs of proxy nodes.
Slow logs of data nodes
The execution duration recorded in a slow log of a data node covers only the time that the data node spends running the command. It does not include the time that the data node spends communicating with a proxy node or client, or the time that the command waits in the single-threaded queue.
Slow logs of data nodes are retained for 72 hours. The number of slow logs that can be stored is unlimited.
In most cases, few slow logs are generated on data nodes due to the high performance of Tair.
Parameters
Parameter | Description |
slowlog-log-slower-than | The threshold of the command execution duration for slow logs of data nodes. If a command runs for a period of time that exceeds this threshold, the command is recorded in a slow log. Default value: 20000. Unit: microseconds. 20000 microseconds is equal to 20 milliseconds. Note: In most cases, the actual latency is higher than the specified value of this parameter because this value does not include the amount of time required to transmit and process data among clients, proxy nodes, and data nodes. |
slowlog-max-len | The maximum number of slow log entries that can be stored. Default value: 1024. |
For more information, see Modify the values of parameters for an instance.
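The unit of slowlog-log-slower-than is easy to misread, so the following minimal sketch shows the microsecond-to-millisecond conversion and the threshold check described in the table. The function names are hypothetical, and treating a duration exactly equal to the threshold as not slow is an assumption based on the word "exceeds" above.

```python
# Default from the parameter table above, in microseconds.
DEFAULT_SLOWLOG_LOG_SLOWER_THAN_US = 20000


def us_to_ms(microseconds: int) -> float:
    """Convert microseconds to milliseconds."""
    return microseconds / 1000.0


def is_recorded_as_slow(
    duration_us: int,
    threshold_us: int = DEFAULT_SLOWLOG_LOG_SLOWER_THAN_US,
) -> bool:
    """Return True if the execution time exceeds the threshold.

    Assumption: a duration exactly equal to the threshold is not recorded.
    """
    return duration_us > threshold_us


if __name__ == "__main__":
    print(us_to_ms(DEFAULT_SLOWLOG_LOG_SLOWER_THAN_US))  # threshold in ms
    print(is_recorded_as_slow(25000))   # 25 ms command
    print(is_recorded_as_slow(15000))   # 15 ms command
```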
Slow logs of proxy nodes
The command execution duration collected in slow logs of proxy nodes starts from the time when a proxy node sends a request to a data node and ends at the time when the proxy node receives the response from the data node. This includes the command execution duration on the data node, the duration of data transmission over the network, and the queuing latency of the command.
Slow logs of proxy nodes are retained for 72 hours. The number of slow logs that can be stored is unlimited.
In most cases, the latency value recorded in a slow log of proxy nodes is closer to the actual latency of the application. Therefore, we recommend that you check this slow log type when you troubleshoot timeout issues of Tair.
Standard instances of Tair do not involve slow logs of proxy nodes.
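Because a proxy node measurement contains the data node execution time plus network transfer and queuing, subtracting one from the other gives a rough estimate of the non-execution overhead. The sketch below assumes that you have matched a proxy slow log entry with the data node entry for the same command; the function name and inputs are hypothetical.

```python
def non_execution_overhead_ms(
    proxy_latency_ms: float,
    data_node_execution_us: int,
) -> float:
    """Estimate the time spent on network transfer and queuing, in ms.

    proxy_latency_ms: latency recorded by the proxy node (milliseconds).
    data_node_execution_us: execution time recorded by the data node
        (microseconds), for the same command.
    """
    overhead = proxy_latency_ms - data_node_execution_us / 1000.0
    # Clock granularity can make the difference slightly negative; clamp to 0.
    return max(overhead, 0.0)


if __name__ == "__main__":
    # A 120 ms proxy measurement for a command that ran 30 ms on the data node.
    print(non_execution_overhead_ms(120.0, 30000))
```

A large overhead relative to the execution time suggests that the delay comes from the network or from queuing rather than from the command itself.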
Parameters
Parameter | Description |
rt_threshold_ms | The threshold of the command execution duration for slow logs of proxy nodes. Default value: 500. Unit: milliseconds. We recommend that you set the threshold to a value close to the client timeout period, which typically ranges from 200 milliseconds to 500 milliseconds. |
For more information, see Modify the values of parameters for an instance.
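One way to read the recommendation for rt_threshold_ms: if the threshold is higher than the client timeout, a request can time out on the client without ever reaching the proxy slow log. The following sketch checks for that situation. It is an illustration of this reasoning only; the function name is hypothetical and is not part of any Tair tooling.

```python
# Default from the parameter table above, in milliseconds.
DEFAULT_RT_THRESHOLD_MS = 500


def timeouts_may_go_unlogged(
    client_timeout_ms: int,
    rt_threshold_ms: int = DEFAULT_RT_THRESHOLD_MS,
) -> bool:
    """Return True if a request could exceed the client timeout
    while staying below the proxy slow log threshold."""
    return rt_threshold_ms > client_timeout_ms


if __name__ == "__main__":
    print(timeouts_may_go_unlogged(client_timeout_ms=200))  # default threshold
    print(timeouts_may_go_unlogged(client_timeout_ms=200, rt_threshold_ms=200))
```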
Procedure
Log on to the Tair console and go to the Instances page. In the top navigation bar, select the region in which the instance that you want to manage resides. Then, find the instance and click the instance ID.
In the left-side navigation pane, choose .
On the Slow Logs page, filter slow logs by time range or keyword. For cluster and read/write splitting instances, you can also filter slow logs by node type and node ID.
Note: By default, the Host Address parameter for cluster and read/write splitting instances displays the IP addresses of proxy nodes. To obtain the IP address of a specific client, set the ptod_enabled parameter to 1 in the Parameter Settings section. For more information, see Modify the values of parameters for an instance.
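If you export slow log entries and want to apply the same filters offline, the filtering by time range, keyword, node type, and node ID can be sketched as follows. The SlowLogRecord fields and the node type values are assumptions made for illustration and do not represent an official export schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class SlowLogRecord:
    """Hypothetical shape of one exported slow log entry."""
    executed_at: datetime
    node_type: str   # assumed values: "db" or "proxy"
    node_id: str
    command: str
    duration_us: int


def filter_slow_logs(
    records: Iterable[SlowLogRecord],
    start: datetime,
    end: datetime,
    keyword: Optional[str] = None,
    node_type: Optional[str] = None,
    node_id: Optional[str] = None,
) -> List[SlowLogRecord]:
    """Keep records inside [start, end] that match every given filter."""
    matches = []
    for record in records:
        if not (start <= record.executed_at <= end):
            continue
        # Keyword match is case-insensitive on the command text.
        if keyword is not None and keyword.lower() not in record.command.lower():
            continue
        if node_type is not None and record.node_type != node_type:
            continue
        if node_id is not None and record.node_id != node_id:
            continue
        matches.append(record)
    return matches
```

For example, passing keyword="keys" with a one-day time range returns only the entries from that day whose command text contains KEYS.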
Irrelevant slow SQL statements
Some slow SQL statements reflect the internal engine logic of an instance rather than the actual execution speed of your requests. You can ignore the following slow SQL statements.
Slow SQL statement | Description |
latency:eventloop | Tair uses an event-driven model during runtime. An event loop consists of reading, parsing, and running commands and returning outputs. The execution duration of a |
latency:pipeline | Tair allows the client to work in pipeline mode. In this mode, the client sends commands and receives outputs in batches. After all commands are executed, outputs begin to be returned. If your Tair instance uses the cluster architecture, proxy nodes use the pipeline mode to send requests in batches to the backend of Tair. The execution duration of a |
latency:fork | The execution duration of a |
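When you review exported entries, you may want to drop the engine-related statements listed in the table before you look for slow requests. The sketch below assumes that these entries appear in the statement text with exactly the names shown in the table; the function name is hypothetical.

```python
from typing import Iterable, List

# Names taken from the table above; matching by prefix is an assumption.
IGNORABLE_ENTRIES = ("latency:eventloop", "latency:pipeline", "latency:fork")


def drop_ignorable(statements: Iterable[str]) -> List[str]:
    """Return only the statements that are not engine-related entries."""
    return [
        statement
        for statement in statements
        if not statement.strip().lower().startswith(IGNORABLE_ENTRIES)
    ]


if __name__ == "__main__":
    print(drop_ignorable(["latency:fork", "GET k", "latency:eventloop"]))
```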
Related API operations
API operation | Description |
Queries the slow logs of a Tair instance that were generated within a specified period of time. |