If the disk usage of your Alibaba Cloud Elasticsearch cluster exceeds 85%, the cluster or Kibana may become unavailable. This topic describes how to resolve this issue.
Important Disclaimer: This topic may contain information about third-party products. Such information is only for reference. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.
Problem description
- After the system receives an index request, it returns an error message similar to `index read_only`, such as `[FORBIDDEN/12/index read-only / allow delete (api)];`.
- The cluster is in the red state. In severe cases, some nodes do not join the cluster. You can run the `GET _cat/nodes?` command to view the nodes in the cluster. In addition, some shards are not allocated to nodes. You can run the `GET _cat/allocation?v` command to view the allocation of shards.
  Note: If a cluster is in the red state, at least one primary shard of the cluster is unavailable, and data on the cluster may be lost.
- When a pipeline is created or a Beat is enrolled in the Kibana console, the `internal server error` message is returned.
- On the Cluster Monitoring page of the cluster or the Monitoring page in the Kibana console of the cluster, the disk usage has reached 100% recently.
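The shard allocation check described above can also be scripted. The following is a minimal sketch, assuming that the `_cat/allocation` API is called with the `format=json` parameter and that each returned row contains the `shards`, `disk.percent`, and `node` fields, with unallocated shards reported under the node name `UNASSIGNED`. The sample data and the function name `summarize_allocation` are made up for illustration.

```python
import json

# Made-up sample shaped like the assumed output of
# GET _cat/allocation?format=json (values are returned as strings).
SAMPLE = """
[
  {"shards": "12", "disk.percent": "96", "node": "es-node-1"},
  {"shards": "10", "disk.percent": "72", "node": "es-node-2"},
  {"shards": "3", "disk.percent": null, "node": "UNASSIGNED"}
]
"""


def summarize_allocation(rows):
    """Return (number of unallocated shards, {node name: disk usage in %})."""
    unassigned = 0
    disk_percent = {}
    for row in rows:
        if row.get("node") == "UNASSIGNED":
            # Shards that are not allocated to any node.
            unassigned += int(row.get("shards") or 0)
        else:
            value = row.get("disk.percent")
            disk_percent[row["node"]] = int(value) if value is not None else None
    return unassigned, disk_percent


unassigned, disk_percent = summarize_allocation(json.loads(SAMPLE))
print(unassigned, disk_percent)  # 3 {'es-node-1': 96, 'es-node-2': 72}
```

In this sample, three shards are not allocated and one node reports a disk usage of 96%, which matches the symptoms listed above.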
Cause
The preceding issues are caused by high disk usage. The disk usage of nodes has the following thresholds:
- 85%: If the disk usage of a node exceeds 85%, the system no longer allocates new shards to the node.
- 90%: If the disk usage of a node exceeds 90%, the system migrates the shards on the node to other data nodes with low disk usage.
- 95%: If the disk usage of a node exceeds 95%, the system forcibly adds the `read_only_allow_delete` attribute to all indexes in the cluster. As a result, data cannot be written to the indexes, and you can only read data from the indexes or delete the indexes.
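The three thresholds can be expressed as a small lookup. The following sketch only restates the rules in the preceding list. The function name and the returned labels are hypothetical and are not part of any Elasticsearch API.

```python
# Thresholds from the preceding list, in percent of disk usage per node.
NO_NEW_SHARDS_ABOVE = 85
RELOCATE_SHARDS_ABOVE = 90
READ_ONLY_ABOVE = 95


def describe_disk_usage(percent):
    """Map the disk usage of a node to the behavior described above.

    The labels are illustrative only.
    """
    if percent > READ_ONLY_ABOVE:
        return "indexes_read_only"   # read_only_allow_delete is added
    if percent > RELOCATE_SHARDS_ABOVE:
        return "shards_relocated"    # shards are migrated to other data nodes
    if percent > NO_NEW_SHARDS_ABOVE:
        return "no_new_shards"       # no new shards are allocated to the node
    return "normal"


for usage in (72, 88, 93, 97):
    print(usage, describe_disk_usage(usage))
```

The comparisons use "greater than" because each rule applies only when the usage exceeds the threshold, so a node at exactly 85% is still reported as `normal`.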
Solution
- Run the following command to delete data that you no longer need:
    curl -u <username>:<password> -XDELETE http://<host>:<port>/<index-name>
  Warning: Deleted data cannot be restored. Proceed with caution. You can also retain the data, but you must then resize the disks. For more information, see Upgrade the configuration of a cluster.
  - Set `<host>` to the internal or public endpoint of the cluster. We recommend that you configure the related whitelist before you run this command.
  - If the cluster does not respond after you run the preceding command, we recommend that you trigger a forced restart and try to run this command during the restart.
- Check whether indexes are still read-only. If they are, run the following command to set the `index.blocks.read_only_allow_delete` attribute to `null` for all indexes to ensure that all indexes on the cluster are not read-only:
    PUT _settings
    {
      "index.blocks.read_only_allow_delete": null
    }
- Check whether the cluster is still in the red state. If it is, run the `GET _cat/allocation?v` command to check whether the cluster contains shards that are not allocated.
- If the cluster contains shards that are not allocated, run the `GET _cluster/allocation/explain` command to view the reason. If the reason is similar to that shown in the following figure, run the `POST /_cluster/reroute?retry_failed=true` command.
- After shards are allocated, view the cluster status. If the cluster is still in the red state, contact Alibaba Cloud technical support engineers.
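If you prefer a script to the Kibana console, the write operations in the preceding steps can be prepared with the Python standard library. The following is a sketch under stated assumptions: the endpoint `http://es.example.internal:9200`, the credentials, and the index name `old-logs` are placeholders, and the helper names `build_request` and `recovery_requests` are hypothetical. The code only builds the requests and does not send them, because each step should be run only after you check the condition described for it.

```python
import base64
import json
import urllib.request


def build_request(base_url, method, path, username, password, body=None):
    """Build (but do not send) an HTTP request with basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode()
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def recovery_requests(base_url, username, password, index_name):
    """Return the requests for the steps above, in order."""
    auth = (username, password)
    return [
        # Step 1: delete an index. Deleted data cannot be restored.
        build_request(base_url, "DELETE", index_name, *auth),
        # Step 2: remove the read-only attribute from all indexes.
        build_request(base_url, "PUT", "_settings", *auth,
                      body={"index.blocks.read_only_allow_delete": None}),
        # Step 4: retry the allocation of shards that failed to be allocated.
        build_request(base_url, "POST", "_cluster/reroute?retry_failed=true", *auth),
    ]


# Placeholder values; replace them with your own endpoint and credentials.
requests_to_send = recovery_requests(
    "http://es.example.internal:9200", "elastic", "your-password", "old-logs")
for request in requests_to_send:
    print(request.get_method(), request.full_url)
    # To send a request: urllib.request.urlopen(request, timeout=10)
```

Building the requests separately from sending them makes it possible to review the exact method and URL of each call, which is useful for an operation as destructive as deleting an index.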
Additional information
To avoid the impact of high disk usage on Alibaba Cloud Elasticsearch, we recommend that you enable disk usage monitoring and alerting. In addition, check the alert text messages promptly and take appropriate measures before the disk usage reaches the preceding thresholds. For more information, see Configure cluster alerting.