Time series data grows continuously. You can use the index lifecycle management (ILM)
feature to periodically roll over the data to new indexes, which keeps queries efficient
and reduces query costs. As indexes age and are queried less often, you can migrate
them to less expensive disks and reduce the number of primary and replica shards.
This topic describes how to use ILM to manage Heartbeat indexes.
Background information
In this topic, the following test scenario is used:
Your Elasticsearch cluster contains a large number of time series indexes whose names
start with heartbeat-, and a single index holds about 4 MB of data per day. The
number of shards increases with the data volume, which may overload the cluster.
To prevent this, configure an ILM policy that handles indexes differently in each of
the following four phases: hot, warm, cold, and delete. In the hot phase, data written
to indexes whose names start with heartbeat- is rolled over to new indexes. In the warm
phase, indexes are shrunk, and the segments in each index are merged. In the cold phase,
indexes are migrated from hot nodes to warm nodes. In the delete phase, indexes that
have reached the configured age are deleted.
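The phase sequence above can be sketched as a small lookup. This is a minimal illustration, not how Elasticsearch implements ILM: the function name phase_for is hypothetical, and the thresholds (60 seconds, 3 minutes, 1 hour) are the test values from the policy created in Step 3, measured from the time an index is rolled over.

```python
from datetime import timedelta

# Thresholds measured from the time an index is rolled over.
# Assumption: these are the short test values from the Step 3 policy,
# checked from the latest phase to the earliest.
PHASE_MIN_AGE = [
    ("delete", timedelta(hours=1)),
    ("cold", timedelta(minutes=3)),
    ("warm", timedelta(seconds=60)),
]


def phase_for(age_since_rollover):
    """Return the phase an index is expected to be in.

    Pass None for an index that has not been rolled over yet.
    """
    if age_since_rollover is None:
        return "hot"
    for phase, min_age in PHASE_MIN_AGE:
        if age_since_rollover >= min_age:
            return phase
    return "hot"  # rolled over, but younger than the warm threshold


print(phase_for(timedelta(minutes=5)))
```

In practice the transition happens only when ILM next checks the index, so the actual phase can lag behind what this lookup returns.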
Precautions
- An ILM policy can be attached to an index only after an index template and an alias
are configured for the index.
- If you modify an ILM policy during a rollover, the new policy takes effect from the
next rollover.
Step 1: Create an Elasticsearch cluster that uses the hot-warm architecture
- Create an Elasticsearch cluster that uses the hot-warm architecture and view the hot
or warm attribute of nodes in the cluster.
The cluster that uses the hot-warm architecture contains hot nodes and warm nodes.
This architecture improves the performance and stability of your Elasticsearch cluster.
The following table lists the differences between hot nodes and warm nodes.
Node type | Type of data stored | Read and write performance | Specifications | Disk |
Hot node | Recent data, such as log data over the last two days. | High | High, such as 32 vCPUs and 64 GiB of memory | We recommend that you use a standard SSD. You can specify the storage space based on the volume of data. |
Warm node | Historical data, such as log data before the last two days. | Low | Low, such as 8 vCPUs and 32 GiB of memory | We recommend that you use an ultra disk. You can specify the storage space based on the volume of data. |
- When you purchase an Elasticsearch cluster, you can purchase warm nodes to create
a cluster that uses the hot-warm architecture.
After you create a cluster that contains warm nodes, the system adds the
-Enode.attr.box_type parameter to the startup parameters of nodes.
- Hot node: -Enode.attr.box_type=hot
- Warm node: -Enode.attr.box_type=warm
Note
- Data nodes become hot nodes only after you purchase warm nodes.
- In this topic, an Alibaba Cloud Elasticsearch V6.7.0 cluster is used. The operations
and figures in this topic apply only to clusters of this version. If you use a cluster
of another version, the operations in the Elasticsearch console prevail.
- Log on to the Kibana console of the Elasticsearch cluster.
- In the left-side navigation pane, click Dev Tools.
- On the Console tab of the page that appears, run the following command to view the attributes of
nodes:
GET _cat/nodeattrs?v&h=host,attr,value
If the command is successfully run, the result shown in the following figure is returned.
This figure shows that the Elasticsearch cluster contains three hot nodes and two
warm nodes to support the hot-warm architecture.
- Enable the Auto Indexing feature for the Elasticsearch cluster.
- Configure a public IP address whitelist for the Elasticsearch cluster and add the
IP address of the server on which Heartbeat is installed to the whitelist.
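The node attribute check in this step can also be scripted. The following sketch counts hot and warm nodes in the plain-text output of GET _cat/nodeattrs?v&h=host,attr,value. The function name count_box_types and the sample output, including the IP addresses, are hypothetical; only the three requested columns are assumed.

```python
from collections import Counter


def count_box_types(cat_output):
    """Count nodes per box_type value in _cat/nodeattrs text output."""
    counts = Counter()
    lines = cat_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row that ?v adds
        parts = line.split()
        # Keep only rows for the box_type attribute; ignore other attributes.
        if len(parts) == 3 and parts[1] == "box_type":
            counts[parts[2]] += 1
    return dict(counts)


# Hypothetical output; the IP addresses are placeholders.
SAMPLE = """\
host        attr             value
172.16.0.1  box_type         hot
172.16.0.1  xpack.installed  true
172.16.0.2  box_type         hot
172.16.0.3  box_type         hot
172.16.0.4  box_type         warm
172.16.0.5  box_type         warm
"""

print(count_box_types(SAMPLE))  # {'hot': 3, 'warm': 2}
```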
Step 2: Enable and configure the ILM feature in the heartbeat.yml file
To manage Heartbeat indexes by using the ILM feature of Elasticsearch, you can configure
the feature in the heartbeat.yml file. For more information, see Set up index lifecycle management.
- Download the Heartbeat installation package and decompress it.
- Specify the heartbeat.monitors, setup.template.settings, setup.kibana, and output.elasticsearch configurations in the heartbeat.yml file.
The following configurations are used in this example:
heartbeat.monitors:
- type: icmp
  schedule: '*/5 * * * * * *'
  hosts: ["47.111.xx.xx"]
setup.template.settings:
  index.number_of_shards: 3
  index.codec: best_compression
  index.routing.allocation.require.box_type: "hot"
# Top-level setting. It must not be nested under output.elasticsearch.
setup.template.overwrite: true
setup.kibana:
  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "https://es-cn-4591jumei00xxxxxx.kibana.elasticsearch.aliyuncs.com:5601"
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["es-cn-4591jumei00xxxxxx.elasticsearch.aliyuncs.com:9200"]
  # Enable ilm (beta) to use index lifecycle management instead of daily indices.
  #ilm.enabled: false
  ilm.enabled: true
  ilm.rollover_alias: "heartbeat"
  ilm.pattern: "{now/d}-000001"
  # Optional protocol and basic auth credentials.
  #protocol: "https"
  username: "elastic"
  password: "<your_password>"
The following table describes some parameters in the preceding configurations. For
more information about other parameters, see open source Heartbeat configuration documentation.
Parameter | Description |
index.number_of_shards | The number of primary shards. Default value: 1. |
index.routing.allocation.require.box_type | The box_type attribute that a node must have to store the index. In this example, the value hot causes new index data to be written to hot nodes. |
host | The public endpoint that is used to access the Kibana service. You can obtain the endpoint on the Kibana Configuration page. |
hosts | The internal or public endpoint that is used to access the Elasticsearch cluster. You can obtain the endpoint on the Basic Information page of the cluster. For more information, see View the basic information of a cluster. Note If you set the hosts parameter to the public endpoint of the cluster, you must configure a public IP address whitelist for the cluster. For more information, see Configure a public or private IP address whitelist for an Elasticsearch cluster. If you set the hosts parameter to the internal endpoint of the cluster, you must make sure that the cluster resides in the same virtual private cloud (VPC) as the server on which Heartbeat is installed. |
ilm.enabled | Specifies whether to enable the ILM feature. If this parameter is set to true, the feature is enabled. |
setup.template.overwrite | Specifies whether to overwrite the existing index template. If an index template of a specific version is already loaded to Elasticsearch, you must set this parameter to true so that the newly loaded template replaces it. |
ilm.rollover_alias | The alias of the indexes that are generated during rollovers. Default value: heartbeat-{beat.version}. |
ilm.pattern | The pattern that is used to name the indexes generated during rollovers. Date math is supported. Default value: {now/d}-000001. Each time a rollover condition is met, the system increments the numeric suffix of the index name by one to generate the name of the new index. For example, if the first index is named heartbeat-2020.04.29-000001, the index that Elasticsearch creates at the next rollover is named heartbeat-2020.04.29-000002. |
username | The username that is used to access the cluster. The default username is elastic. |
password | The password is specified when you create the cluster. If you forget the password, you can reset it. For more information about the procedure and precautions for resetting a password, see Reset the access password for an Elasticsearch cluster. |
Notice If you change the setting of ilm.rollover_alias or ilm.pattern after an index template is loaded, you must set setup.template.overwrite to true so that the index template is reloaded with the new settings.
- Start the Heartbeat service.
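The naming rule described for ilm.pattern can be sketched as follows. This is an illustration only: the function name next_rollover_index is hypothetical, and the sketch keeps the date part of the name unchanged, as in the example in the preceding table.

```python
import re


def next_rollover_index(index_name):
    """Increment the zero-padded numeric suffix of a rollover index name."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end with -<number>")
    prefix, counter = match.groups()
    # Keep the original padding width, for example 000001 -> 000002.
    return f"{prefix}{int(counter) + 1:0{len(counter)}d}"


print(next_rollover_index("heartbeat-2020.04.29-000001"))
# heartbeat-2020.04.29-000002
```

A name that does not end with a hyphen and a number raises ValueError, which mirrors the requirement that rollover index names end with a numeric suffix.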
Step 3: Create an ILM policy
Elasticsearch allows you to use API calls or the Kibana console to create an ILM policy.
This step describes how to call the ILM policy API to create an ILM policy.
Note Heartbeat allows you to run the ./heartbeat setup --ilm-policy command to load
the default policy and write it to Elasticsearch. You can also run the
./heartbeat export ilm-policy command to export the default policy to stdout and
then modify it to create your own ILM policy.
Run the following command in the Kibana console of the Elasticsearch cluster to create
an ILM policy:
PUT /_ilm/policy/hearbeat-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5mb",
            "max_age": "1d",
            "max_docs": 100
          }
        }
      },
      "warm": {
        "min_age": "60s",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "3m",
        "actions": {
          "allocate": {
            "require": {
              "box_type": "warm"
            }
          }
        }
      },
      "delete": {
        "min_age": "1h",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
The following table describes the configurations in the preceding ILM policy.
Parameter | Description |
hot | A rollover is triggered if an index to which the ILM policy is attached meets any one of the following conditions: The volume of data in the index reaches 5 MB, the index has been in use for more than one day, or the number of documents in the index reaches 100. During the rollover, the system creates an index and enables the ILM policy for the new index. The original index enters the warm phase 60 seconds after the rollover. Notice When any one of max_docs, max_size, or max_age is reached, Elasticsearch rolls over the index, and new data is written to the new index through the alias. |
warm | After the index enters the warm phase, the system shrinks it to a new index that has only one primary shard and merges the segments in the index into one segment. The index enters the cold phase 3 minutes after the rollover. |
cold | After the index enters the cold phase, the system migrates the index from hot nodes to warm nodes. The index enters the delete phase 1 hour after the rollover. |
delete | After the index enters the delete phase, it is deleted. |
Note
- After an ILM policy is created, you cannot change the policy name.
- When you call the API to create an ILM policy, as in this step, you can specify the
max_age parameter in units as small as seconds. If you use the Kibana console to create
an ILM policy, the smallest unit that you can specify is hours.
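The rollover conditions in the hot phase are combined with OR: reaching any one of them is enough. The following sketch shows that logic with the thresholds from the example policy. The function name should_roll_over is hypothetical, and size_bytes is assumed to be the total size of the primary shards of the index.

```python
from datetime import timedelta

# Thresholds from the example policy in this step.
MAX_SIZE_BYTES = 5 * 1024 * 1024  # max_size: 5mb
MAX_AGE = timedelta(days=1)       # max_age: 1d
MAX_DOCS = 100                    # max_docs: 100


def should_roll_over(size_bytes, age, doc_count):
    """Return True if at least one rollover condition is met."""
    return (
        size_bytes >= MAX_SIZE_BYTES
        or age >= MAX_AGE
        or doc_count >= MAX_DOCS
    )


# A 4 MB index that is two hours old rolls over once it holds 100 documents.
print(should_roll_over(4 * 1024 * 1024, timedelta(hours=2), 100))  # True
```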
Step 4: Attach the ILM policy to an index template
After you start Heartbeat, the system creates a Heartbeat index template in your Elasticsearch
cluster. You must attach the ILM policy created in Step 3: Create an ILM policy to this index template.
- Log on to the Kibana console of the Elasticsearch cluster.
- In the left-side navigation pane, click Management.
- In the Elasticsearch section, click Index Lifecycle Policies.
- In the Index lifecycle policies section, find the ILM policy you created, and choose .
- In the dialog box that appears, select an index template from the Index template drop-down list and enter an alias for indexes in the Alias for rollover index field.
- Click Add policy.
Step 5: Attach the ILM policy to an index
After you start Heartbeat, the system creates Heartbeat indexes in your Elasticsearch
cluster. You must attach the ILM policy that you attached to the index template in
Step 4: Attach the ILM policy to an index template to the first index that is created
from the template.
- In the Elasticsearch section of the Management page, click Index Management.
- In the Index management section, find the desired index and click its name.
- On the Summary tab of the pane that appears, choose to remove the default policy of Heartbeat.
- In the dialog box that appears, click Remove policy.
- Choose again.
- In the dialog box that appears, select the ILM policy you created in Step 3: Create an ILM policy from the Lifecycle policy drop-down list and set Index rollover alias to the alias that you specify in Step 4: Attach the ILM policy to an index template. Then, click Add policy.
If the ILM policy is attached to the index, the information shown in the following
figure appears.
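The same result can probably be reached from the Dev Tools console instead of the Management page. The following sketch only builds the text of two console requests: one that removes the current policy from an index and one that attaches a policy through the index.lifecycle.name and index.lifecycle.rollover_alias index settings. Treating these requests as equivalent to the UI steps above is an assumption, and the function name attach_policy_requests is hypothetical. The index name, policy name, and alias are the example values used in this topic.

```python
import json


def attach_policy_requests(index, policy, rollover_alias):
    """Build console requests that replace the ILM policy of one index."""
    settings = {
        "index.lifecycle.name": policy,
        "index.lifecycle.rollover_alias": rollover_alias,
    }
    return [
        # Detach whatever policy currently manages the index.
        f"POST {index}/_ilm/remove",
        # Attach the new policy and tell ILM which alias to roll over.
        f"PUT {index}/_settings\n{json.dumps(settings, indent=2)}",
    ]


for request in attach_policy_requests(
    "heartbeat-2020.04.29-000001", "hearbeat-policy", "heartbeat"
):
    print(request)
```

Paste the printed requests into the Kibana console one at a time and check the response of each before you run the next.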
Step 6: View indexes in different phases
To view indexes in the hot phase, select Hot from the Lifecycle phase drop-down list
in the Index management section. You can use the same method to view indexes in other
phases.
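If you prefer the API, the ILM explain operation (GET <index>/_ilm/explain) reports the current phase of each index. The following sketch groups index names by phase from such a response. The function name group_by_phase is hypothetical, and the sample response is a trimmed, made-up example that contains only the fields the sketch reads.

```python
from collections import defaultdict


def group_by_phase(explain_response):
    """Map each lifecycle phase to the sorted names of managed indexes in it."""
    phases = defaultdict(list)
    for name, info in explain_response["indices"].items():
        if info.get("managed"):  # unmanaged indexes report no phase
            phases[info["phase"]].append(name)
    return {phase: sorted(names) for phase, names in phases.items()}


# Hypothetical, trimmed response of the explain operation.
SAMPLE_RESPONSE = {
    "indices": {
        "heartbeat-2020.04.29-000001": {"managed": True, "phase": "warm"},
        "heartbeat-2020.04.29-000002": {"managed": True, "phase": "hot"},
        "heartbeat-legacy": {"managed": False},
    }
}

print(group_by_phase(SAMPLE_RESPONSE))
```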
FAQ
Q: How do I configure a check interval for an ILM policy?
A: The system periodically checks for indexes that match an ILM policy. The default
interval is 10 minutes. If an index meets a condition in its policy when a check runs,
the system performs the corresponding action. For example, if you set max_docs to 100
when you create an ILM policy, a rollover is triggered for an index only when a check
detects that the number of documents in the index has reached 100. You can use the
indices.lifecycle.poll_interval parameter to change the check interval so that indexes
are rolled over in a timely manner.
Notice Set this parameter to an appropriate value. A small value may cause node overload.
In this example, this parameter is set to 1m.
PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": "1m"
  }
}
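The effect of the check interval can be estimated with simple arithmetic. The following sketch assumes a simplified model in which checks run at exact multiples of the interval, counted in seconds from an arbitrary start. The function name detection_delay is hypothetical, and real check times depend on when the cluster started the ILM service.

```python
def detection_delay(condition_met_at, poll_interval):
    """Seconds between a condition being met and the next periodic check.

    Both arguments are whole seconds. Checks are assumed to run at
    0, poll_interval, 2 * poll_interval, and so on.
    """
    # Ceiling division with integers avoids floating-point rounding.
    next_check = -(-condition_met_at // poll_interval) * poll_interval
    return next_check - condition_met_at


# An index reaches 100 documents 130 seconds after the start.
print(detection_delay(130, 600))  # default 10-minute interval: 470
print(detection_delay(130, 60))   # 1-minute interval: 50
```

In this model the delay is always shorter than one interval, which is why a shorter interval makes rollovers more timely at the cost of more frequent checks.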