gig is a plug-in developed by Alibaba Cloud Elasticsearch that implements throttling on the client nodes of an Elasticsearch cluster. The plug-in integrates the core throttling capabilities that the Taobao team uses to handle searches. If accidental node exceptions cause query jitter, the gig plug-in can perform a switchover within seconds. This minimizes the probability of query jitter and keeps queries stable. In addition, the plug-in uses detection traffic to handle query latency surges caused by nodes that are newly enabled and not yet warmed up, which provides query warm-up for online business. This topic describes how to use the gig plug-in.
Background information
- The gig plug-in runs on client nodes. For applications that require high queries per second (QPS), you can increase the number of replica shards for each primary shard to scale out the cluster. This helps query throughput increase linearly. The gig plug-in helps client nodes select the most appropriate replica shards to serve each query.
- The plug-in determines the service capabilities of nodes based on query latency and uses the proportional-integral-derivative (PID) algorithm to coordinate which nodes provide services. This ensures rapid and accurate coordination. If exceptions such as surging query latency or rising error rates occur on nodes, the gig plug-in collects and analyzes the metrics of the nodes in real time by using the PID algorithm. Then, the plug-in rapidly isolates the anomalous nodes and performs a switchover within seconds.
- When new nodes join the cluster, the plug-in samples online query traffic in real time, replicates some of the query traffic to the new nodes, and discards the results of the replicated queries. The replicated traffic is called detection traffic. This prevents online traffic from being sent directly to nodes that are not yet ready to provide services, and therefore avoids the query latency increase that such nodes would cause. If the detection results and metrics show that the latency of the new nodes is in a normal range, the plug-in transmits online query traffic to these nodes. Then, these nodes can provide online services.
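The latency-based coordination described above can be illustrated with a textbook discrete PID controller. The following Python sketch is only an illustration under stated assumptions and is not the implementation used by the gig plug-in: the names PIDController and adjust_weight, the gain values, the target latency, and the idea of a routing weight between 0 and 1 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Textbook discrete PID controller. Gains are illustrative assumptions."""
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float = 1.0) -> float:
        # Accumulate the error over time (integral term).
        self.integral += error * dt
        # Rate of change of the error (derivative term).
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def adjust_weight(weight: float, latency_ms: float, target_ms: float,
                  pid: PIDController) -> float:
    """Return a new routing weight in [0, 1] for one node (hypothetical model).

    The error is positive when the node is faster than the target latency
    and negative when it is slower, so slow nodes lose weight.
    """
    error = (target_ms - latency_ms) / target_ms
    new_weight = weight + pid.update(error)
    # Clamp so that the weight stays a valid traffic share.
    return min(1.0, max(0.0, new_weight))


if __name__ == "__main__":
    pid = PIDController(kp=0.5, ki=0.1, kd=0.05)
    weight = 1.0
    # Observed latencies in milliseconds; the node degrades at the third sample.
    for latency in [20, 20, 200, 200]:
        weight = adjust_weight(weight, latency, target_ms=50, pid=pid)
        print(f"latency={latency}ms weight={weight:.2f}")
```

In this sketch, a node whose latency jumps far above the target has its weight driven to 0 in a single update, which mirrors the idea of isolating an anomalous node quickly. A new node could be modeled in the same way by starting it at a weight of 0 and raising the weight only after detection traffic shows normal latency.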
Limits
- Cluster version: V6.7.0 or V7.10.0
- Kernel version: V1.3.0 or later, but earlier than V1.6.0
Notice: If the kernel version of your cluster does not meet the requirements, you must upgrade the kernel of your cluster before you use the plug-in. For more information, see Upgrade the version of a cluster. You can upgrade only the kernels of Standard Edition V6.7.0 clusters whose kernel versions are V0.3.0, V1.0.2, or V1.2.0.
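The version limits above can be expressed as a small check. The following Python sketch assumes that both limits must hold at the same time and that version strings use the Vx.y.z form shown in this topic. The function names parse_version and meets_gig_limits are hypothetical helpers and are not part of any Alibaba Cloud tool.

```python
from typing import Tuple

# Limits taken from this topic.
SUPPORTED_CLUSTER_VERSIONS = {(6, 7, 0), (7, 10, 0)}
MIN_KERNEL = (1, 3, 0)            # inclusive lower bound
MAX_KERNEL_EXCLUSIVE = (1, 6, 0)  # exclusive upper bound


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a string such as 'V1.3.0' into a comparable tuple (1, 3, 0)."""
    return tuple(int(part) for part in text.lstrip("Vv").split("."))


def meets_gig_limits(cluster_version: str, kernel_version: str) -> bool:
    """Return True if both the cluster and kernel versions are within limits."""
    cluster_ok = parse_version(cluster_version) in SUPPORTED_CLUSTER_VERSIONS
    kernel = parse_version(kernel_version)
    kernel_ok = MIN_KERNEL <= kernel < MAX_KERNEL_EXCLUSIVE
    return cluster_ok and kernel_ok


if __name__ == "__main__":
    print(meets_gig_limits("V6.7.0", "V1.3.0"))
    print(meets_gig_limits("V6.7.0", "V1.2.0"))
```

Versions are compared as integer tuples rather than strings so that, for example, V1.10.0 sorts after V1.6.0.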
Precautions
- After you upgrade the kernel of an Elasticsearch V6.7.0 cluster to V1.3.0, the gig plug-in is automatically installed. After the plug-in is installed, the throttling feature of the plug-in is disabled by default. If you want to use the feature, you must manually enable it.
- If the version of your Elasticsearch cluster is V7.10.0, the gig plug-in is integrated into the aliyun-qos plug-in by default. You do not need to manually install the gig plug-in.
- Before you use the gig plug-in, make sure that sufficient resources are reserved for the data nodes in the cluster. If exceptions occur on one of the data nodes, its query traffic is transmitted to the other data nodes, which increases their load. Reserved resources allow the remaining nodes to absorb this traffic and keep your business stable.
- All commands provided in this topic can be run in the Kibana console. For more information about how to log on to the Kibana console, see Log on to the Kibana console.