Troubleshoot high traffic usage on a Tair instance
Last Updated: Nov 02, 2022
Tair instances sit in the data layer, close to the application layer, so applications
read from and write to them frequently. This consumes a large amount of bandwidth.
The maximum bandwidth available to a Tair instance depends on the instance type. If
an instance exceeds its maximum bandwidth, applications may be unable to access data
that resides on the instance.
Step 1: Analyze traffic usage
Check the traffic usage of a Tair instance within a specific period of time. For more
information, see View monitoring data.
In this example, both the inbound and outbound traffic usage stays at 100%, as shown
in the following figure.
Note
In most cases, if the average traffic usage stays around 80%, bandwidth resources
are at risk of being exhausted. We recommend that you troubleshoot the issue at this point.
Check both the Intranet In Ratio and Intranet Out Ratio metrics, which indicate the inbound and outbound traffic usage, respectively.
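The check described in the preceding note can be sketched in Python. This is a minimal illustration under stated assumptions, not part of Tair: the function name usage_alerts, the sample values, and the threshold parameter are hypothetical. The two lists stand in for Intranet In Ratio and Intranet Out Ratio percentages that you read from the monitoring data.

```python
from statistics import mean

def usage_alerts(in_ratio, out_ratio, threshold=80.0):
    """Return the directions whose average usage (in percent) is at or above threshold.

    in_ratio and out_ratio are lists of percentage samples (hypothetical inputs).
    An empty list is treated as "no data" and never raises an alert.
    """
    alerts = []
    if in_ratio and mean(in_ratio) >= threshold:
        alerts.append("inbound")
    if out_ratio and mean(out_ratio) >= threshold:
        alerts.append("outbound")
    return alerts

# Made-up samples for illustration only.
inbound_samples = [78.0, 85.0, 92.0, 88.0]    # average 85.75
outbound_samples = [40.0, 55.0, 61.0, 48.0]   # average 51.0
print(usage_alerts(inbound_samples, outbound_samples))  # ['inbound']
```

Checking the two directions separately matters because one direction can be saturated while the other stays low, as in the sample data.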
Step 2: Optimize traffic usage
Increase the bandwidth of the Tair instance to reduce the impact on your business and
to gain more time to troubleshoot the issue. For more information,
see Manually adjust the bandwidth of a Tair instance.
The bandwidth consumption may not match what you expect from the amount of user traffic.
For example, traffic usage and queries per second (QPS) may grow at different rates.
In this case, use the offline key analysis feature to identify large
keys on the Tair instance. For more information, see Offline key analysis.
Optimize large keys. For example, you can split large keys, reduce access to large
keys, or delete large keys that you no longer need. A key is typically classified
as a large key when its size exceeds 10 KB.
Optional: Connect to cluster instances in direct connection mode to deal with heavy network
traffic. For more information, see Enable the direct connection mode.
Note
In direct connection mode, the bandwidth limit of a cluster instance is equal to the
bandwidth limit of each data shard multiplied by the number of data shards. For example,
if a cluster instance contains 128 data shards and the bandwidth limit of each data
shard is 96 Mbit/s, the bandwidth limit of the cluster instance is 12,288 Mbit/s after
you enable the direct connection mode.
If the traffic usage is still high after you perform the preceding optimizations,
upgrade your instance to an instance type that has more memory. An upgrade improves
instance performance and allows the instance to handle more traffic. For more information,
see Change the configurations of an instance.
Note
Before you upgrade your Tair instance, you can purchase a pay-as-you-go instance to
test whether the target specifications meet your workload requirements. You can release
the pay-as-you-go instance after you complete the test. For more information, see
Release pay-as-you-go instances.
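The two numeric rules in Step 2, the 10 KB large-key threshold and the bandwidth limit in direct connection mode, can be sketched in Python. This is an illustrative sketch only: the function names, the sample key names, and the sizes are hypothetical, and the sketch assumes that 10 KB means 10 × 1,024 bytes. In practice, key sizes would come from an offline key analysis report.

```python
# Assumption: "10 KB" is interpreted as 10 * 1,024 bytes.
LARGE_KEY_BYTES = 10 * 1024

def find_large_keys(key_sizes, threshold=LARGE_KEY_BYTES):
    """Return (key, size) pairs whose size exceeds threshold, largest first.

    key_sizes maps key names to sizes in bytes (hypothetical input).
    """
    large = [(key, size) for key, size in key_sizes.items() if size > threshold]
    return sorted(large, key=lambda item: item[1], reverse=True)

def direct_mode_bandwidth(shard_count, per_shard_mbit):
    """Bandwidth limit of a cluster instance in direct connection mode, in Mbit/s."""
    return shard_count * per_shard_mbit

# Made-up key sizes for illustration only.
sizes = {"user:1:profile": 512, "feed:global": 2_500_000, "cart:42": 15_000}
print(find_large_keys(sizes))          # [('feed:global', 2500000), ('cart:42', 15000)]
print(direct_mode_bandwidth(128, 96))  # 12288, matching the example in the note
```

Sorting the large keys by size puts the most expensive candidates for splitting or deletion first.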