Use ILM to manage Heartbeat indexes - Elasticsearch - Alibaba Cloud Documentation Center

Background information

In this topic, the following test scenario is used:

Your Elasticsearch cluster contains a large number of time series indexes whose names start with heartbeat-, and each daily index is about 4 MB in size. The number of shards increases with the data volume, which may overload the cluster. To prevent this, you can use an ILM policy that manages the indexes across four phases: hot, warm, cold, and delete. In the hot phase, indexes whose names start with heartbeat- are rolled over to new indexes. In the warm phase, indexes are shrunk, and the segments in each index are merged. In the cold phase, data is migrated from hot nodes to warm nodes. In the delete phase, data is deleted on a regular basis.

Precautions

An ILM policy can be attached to an index only after an index template and an alias are configured for the index.
If you modify an ILM policy during a rollover, the new policy takes effect from the next rollover.

Procedure

Step 1: Create an Elasticsearch cluster that uses the hot-warm architecture
Create an Elasticsearch cluster that uses the hot-warm architecture, enable the Auto Indexing feature for the cluster, and configure a public IP address whitelist for the cluster.
Step 2: Enable and configure the ILM feature in the heartbeat.yml file
In the heartbeat.yml file, enable and configure the ILM feature for the cluster. After the configuration is complete, the system generates a Heartbeat index template for the cluster.
Step 3: Create an ILM policy
Call the ILM policy operation to create an ILM policy. This policy defines the conditions to roll over data and archive indexes.
Step 4: Attach the ILM policy to an index template
Attach the ILM policy to the Heartbeat index template.
Step 5: Attach the ILM policy to an index
Attach the ILM policy to the first index that is created by using the Heartbeat index template. This way, the policy can apply to all indexes that are created by using this template.
Step 6: View indexes in different phases
View the indexes that are archived in the hot, warm, cold, and delete phases.

Step 1: Create an Elasticsearch cluster that uses the hot-warm architecture

Create an Elasticsearch cluster that uses the hot-warm architecture and view the hot or warm attribute of nodes in the cluster.

The cluster that uses the hot-warm architecture contains hot nodes and warm nodes. This architecture improves the performance and stability of your Elasticsearch cluster. The following table lists the differences between hot nodes and warm nodes.


Node type	Type of data stored	Read and write performance	Specifications	Disk
Hot node	Recent data, such as log data over the last two days.	High	High, such as 32 vCPUs and 64 GiB of memory	We recommend that you use a standard SSD. You can specify the storage space based on the volume of data.
Warm node	Historical data, such as log data before the last two days.	Low	Low, such as 8 vCPUs and 32 GiB of memory	We recommend that you use an ultra disk. You can specify the storage space based on the volume of data.

When you purchase an Elasticsearch cluster, you can purchase warm nodes to create an Elasticsearch cluster that uses the hot-warm architecture.
After you create a cluster that contains warm nodes, the system adds the -Enode.attr.box_type parameter to the startup parameters of nodes.
- Hot node: -Enode.attr.box_type=hot
- Warm node: -Enode.attr.box_type=warm
Note
- Data nodes become hot nodes only after you purchase warm nodes.
- In this topic, an Alibaba Cloud Elasticsearch V6.7.0 cluster is used. The operations and figures in this topic apply only to clusters of this version. If you use a cluster of another version, the operations shown in the Elasticsearch console prevail.
Log on to the Kibana console of the Elasticsearch cluster.
For more information about how to log on to the Kibana console, see Log on to the Kibana console.
In the left-side navigation pane, click Dev Tools.
On the Console tab of the page that appears, run the following command to view the attributes of nodes:
```
GET _cat/nodeattrs?v&h=host,attr,value
```
If the command is successfully run, the result shown in the following figure is returned. This figure shows that the Elasticsearch cluster contains three hot nodes and two warm nodes to support the hot-warm architecture.
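As a rough illustration of how to read the command output, the following Python sketch counts nodes by their box_type value. The sample text and host addresses are hypothetical placeholders, not output from a real cluster, and the helper name count_box_types is introduced only for this example.

```python
from collections import Counter

# Hypothetical sample of `GET _cat/nodeattrs?v&h=host,attr,value` output.
# The host addresses are placeholders, not values from a real cluster.
SAMPLE_OUTPUT = """\
host        attr     value
10.0.0.1    box_type hot
10.0.0.2    box_type hot
10.0.0.3    box_type hot
10.0.0.4    box_type warm
10.0.0.5    box_type warm
"""


def count_box_types(cat_output: str) -> Counter:
    """Count nodes per box_type value in whitespace-separated _cat output."""
    counts = Counter()
    lines = cat_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row that ?v adds
        parts = line.split()
        # Ignore other node attributes; only box_type marks hot or warm nodes.
        if len(parts) >= 3 and parts[1] == "box_type":
            counts[parts[2]] += 1
    return counts


node_counts = count_box_types(SAMPLE_OUTPUT)
print(dict(node_counts))
```

With the sample above, the counts match the layout described in this step: three hot nodes and two warm nodes.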

Enable the Auto Indexing feature for the Elasticsearch cluster.
For more information, see Configure the YML file.
Configure a public IP address whitelist for the Elasticsearch cluster and add the IP address of the server on which Heartbeat is installed to the whitelist.
For more information, see Configure a public or private IP address whitelist for an Elasticsearch cluster.

Step 2: Enable and configure the ILM feature in the heartbeat.yml file

To manage Heartbeat indexes by using the ILM feature of Elasticsearch, you can configure the feature in the heartbeat.yml file. For more information, see Set up index lifecycle management.

Download the Heartbeat installation package and decompress it.

Specify the heartbeat.monitors, setup.template.settings, setup.kibana, and output.elasticsearch configurations in the heartbeat.yml file.

The following configurations are used in this example:

```
heartbeat.monitors:
- type: icmp
  schedule: '*/5 * * * * * *'
  hosts: ["47.111.xx.xx"]

setup.template.settings:
  index.number_of_shards: 3
  index.codec: best_compression
  index.routing.allocation.require.box_type: "hot"

# Overwrite the index template that is already loaded in Elasticsearch.
# This is a top-level setting, not an option of output.elasticsearch.
setup.template.overwrite: true

setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "https://es-cn-4591jumei00xxxxxx.kibana.elasticsearch.aliyuncs.com:5601"

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["es-cn-4591jumei00xxxxxx.elasticsearch.aliyuncs.com:9200"]

  # Enable ILM (beta) to use index lifecycle management instead of daily indices.
  ilm.enabled: true
  ilm.rollover_alias: "heartbeat"
  ilm.pattern: "{now/d}-000001"

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  username: "elastic"
  password: "<your_password>"
```

The following table describes some parameters in the preceding configurations. For more information about other parameters, see the open source Heartbeat configuration documentation.


Parameter	Description
index.number_of_shards	The number of primary shards. Default value: 1.
index.routing.allocation.require.box_type	The box_type attribute that a node must have to store the shards of the index. In this example, the value is hot, so new indexes are created on hot nodes.
host	The public endpoint that is used to access the Kibana service. You can obtain the endpoint on the Kibana Configuration page.
hosts	The internal or public endpoint that is used to access the Elasticsearch cluster. You can obtain the endpoint on the Basic Information page of the cluster. For more information, see View the basic information of a cluster. Note If you set the hosts parameter to the public endpoint of the cluster, you must configure a public IP address whitelist for the cluster. For more information, see Configure a public or private IP address whitelist for an Elasticsearch cluster. If you set the hosts parameter to the internal endpoint of the cluster, you must make sure that the cluster resides in the same virtual private cloud (VPC) as the server on which Heartbeat is installed.
ilm.enabled	Specifies whether to enable the ILM feature. If this parameter is set to true, the feature is enabled.
setup.template.overwrite	Specifies whether to overwrite the original index template. If you have loaded an index template of a specific version to Elasticsearch, you must set this parameter to true to overwrite the original index template with the loaded template.
ilm.rollover_alias	Specifies the alias of the indexes that are generated during rollovers. Default value: heartbeat-{beat.version}.
ilm.pattern	The pattern that is appended to the names of indexes generated during rollovers. Date math is supported. Default value: {now/d}-000001. Each time a rollover condition is met, the system increments the number at the end of the index name by one to generate a new index name. For example, if the first index is named heartbeat-2020.04.29-000001, Elasticsearch creates an index named heartbeat-2020.04.29-000002 when a rollover condition is met.
username	The default username is elastic.
password	The password is specified when you create the cluster. If you forget the password, you can reset it. For more information about the procedure and precautions for resetting a password, see Reset the access password for an Elasticsearch cluster.

Notice If you change the setting of ilm.rollover_alias or ilm.pattern after an index template is loaded, you must set setup.template.overwrite to true to overwrite the original index template with the loaded index template.
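The naming rule described for ilm.pattern can be sketched in Python as follows. This is a simplified illustration, not the Elasticsearch implementation: the helper next_rollover_name is hypothetical, it assumes a six-digit zero-padded counter, and it leaves the date part unchanged, whereas a date math pattern may resolve to a different date at rollover time.

```python
import re


def next_rollover_name(index_name: str) -> str:
    """Return the index name that follows index_name after one rollover.

    Assumes the name ends in "-<number>", as in heartbeat-2020.04.29-000001.
    Only the trailing counter is incremented; the date part is kept as-is.
    """
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end in -<number>")
    prefix, counter = match.groups()
    # Re-pad to six digits so that 000009 becomes 000010.
    return f"{prefix}{int(counter) + 1:06d}"


print(next_rollover_name("heartbeat-2020.04.29-000001"))
# heartbeat-2020.04.29-000002
```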

Start the Heartbeat service.
```
sudo ./heartbeat -e
```

Step 3: Create an ILM policy

Elasticsearch allows you to use API calls or the Kibana console to create an ILM policy. This step describes how to call the ILM policy operation to create an ILM policy.

Note Heartbeat allows you to run the ./heartbeat setup --ilm-policy command to load the default policy and write it to Elasticsearch. You can run the ./heartbeat export ilm-policy command to export the default policy to stdout. Then, you can modify the default policy to manually create an ILM policy.

Run the following command in the Kibana console of the Elasticsearch cluster to create an ILM policy:

```
PUT /_ilm/policy/heartbeat-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5mb",
            "max_age": "1d",
            "max_docs": 100
          }
        }
      },
      "warm": {
        "min_age": "60s",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "3m",
        "actions": {
          "allocate": {
            "require": {
              "box_type": "warm"
            }
          }
        }
      },
      "delete": {
        "min_age": "1h",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

The following table describes the configurations in the preceding ILM policy.


Parameter	Description
hot	A rollover is triggered if an index to which the ILM policy is attached meets any one of the following conditions: The volume of data in the index reaches 5 MB, the index has been used for more than one day, or the number of documents in the index reaches 100. During the rollover, the system creates an index and enables the ILM policy for the new index. The original index enters the warm phase 60 seconds after the rollover. Notice If the value of max_docs, max_size, or max_age is reached during a rollover, Elasticsearch archives the index.
warm	After the index enters the warm phase, the system shrinks it down to a new index that has only one primary shard and merges segments in the index into one segment. The index enters the cold phase 3 minutes after the rollover starts.
cold	After the index enters the cold phase, the system migrates the index from hot nodes to warm nodes. The index enters the delete phase 1 hour after the rollover starts.
delete	After the index enters the delete phase, it is deleted.
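The rollover rule in the hot phase is an "any one of" check, which the following Python sketch makes explicit. The thresholds are taken from the policy in this step. The IndexStats class and the function names are hypothetical, introduced only for illustration, and the sketch assumes that 5mb means 5 × 1024 × 1024 bytes.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the rollover action of the policy in this step.
MAX_SIZE_BYTES = 5 * 1024 * 1024  # "5mb"
MAX_AGE_SECONDS = 24 * 60 * 60    # "1d"
MAX_DOCS = 100


@dataclass
class IndexStats:
    """Hypothetical container for the three values that a rollover inspects."""
    size_bytes: int
    age_seconds: int
    doc_count: int


def met_conditions(stats: IndexStats) -> List[str]:
    """Return the names of all rollover conditions that the index meets."""
    met = []
    if stats.size_bytes >= MAX_SIZE_BYTES:
        met.append("max_size")
    if stats.age_seconds >= MAX_AGE_SECONDS:
        met.append("max_age")
    if stats.doc_count >= MAX_DOCS:
        met.append("max_docs")
    return met


def should_rollover(stats: IndexStats) -> bool:
    """A rollover is triggered as soon as any single condition is met."""
    return bool(met_conditions(stats))


# A small, young index that already holds 100 documents still rolls over.
sample = IndexStats(size_bytes=1_000_000, age_seconds=600, doc_count=100)
print(met_conditions(sample), should_rollover(sample))
```

Because Heartbeat in this example writes a document every few seconds, max_docs is likely to be the condition that is met first.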

Note

After an ILM policy is created, you cannot change the policy name.
If you call the API operation as described in this step, you can specify max_age in units as small as seconds. If you use the Kibana console to create an ILM policy, the smallest unit that you can specify is hours.
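To see how the min_age values in this policy translate into a timeline, the following Python sketch converts the time strings to seconds and reports which phase an index would be in at a given time after rollover. It is a simplified model: the helper names are hypothetical, only the s, m, h, and d units are handled, and the delay introduced by the periodic ILM check is ignored.

```python
# Seconds per unit for the time units used in this topic.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

# min_age values copied from the warm, cold, and delete phases of the policy.
PHASE_MIN_AGE = [("warm", "60s"), ("cold", "3m"), ("delete", "1h")]


def to_seconds(value: str) -> int:
    """Convert a time string such as "60s" or "1d" to seconds."""
    number, unit = value[:-1], value[-1:]
    if unit not in UNIT_SECONDS or not number.isdigit():
        raise ValueError(f"unsupported time value: {value!r}")
    return int(number) * UNIT_SECONDS[unit]


def phase_at(seconds_since_rollover: int) -> str:
    """Return the phase of a rolled-over index at the given age."""
    phase = "hot"  # the index stays in the hot phase until warm min_age passes
    for name, min_age in PHASE_MIN_AGE:
        if seconds_since_rollover >= to_seconds(min_age):
            phase = name
    return phase


for age in (30, 60, 200, 3600):
    print(age, phase_at(age))
```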

Step 4: Attach the ILM policy to an index template

After you start Heartbeat, the system creates a Heartbeat index template in your Elasticsearch cluster. You must attach the ILM policy created in Step 3: Create an ILM policy to this index template.

Log on to the Kibana console of the Elasticsearch cluster.
For more information, see Log on to the Kibana console.
In the left-side navigation pane, click Management.
In the Elasticsearch section, click Index Lifecycle Policies.
In the Index lifecycle policies section, find the ILM policy you created, and choose Actions > Add policy to index template.
In the dialog box that appears, select an index template from the Index template drop-down list and enter an alias for indexes in the Alias for rollover index field.
Click Add policy.
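For reference, attaching a policy to an index template amounts to setting index.lifecycle.name and index.lifecycle.rollover_alias in the template settings. The following Python sketch shows the resulting settings on a trimmed-down, hypothetical template. The policy name and alias are placeholders for the values you use in Step 3 and in this step, and the sketch does not claim to reproduce what Kibana does internally.

```python
import copy
import json

# Placeholders: replace with the policy name from Step 3 and your alias.
POLICY_NAME = "<your_policy_name>"
ROLLOVER_ALIAS = "heartbeat"


def with_lifecycle(template: dict, policy: str, alias: str) -> dict:
    """Return a copy of the template with ILM settings added."""
    updated = copy.deepcopy(template)  # leave the input template untouched
    settings = updated.setdefault("settings", {})
    settings["index.lifecycle.name"] = policy
    settings["index.lifecycle.rollover_alias"] = alias
    return updated


# Hypothetical, heavily trimmed Heartbeat index template.
template = {
    "index_patterns": ["heartbeat-*"],
    "settings": {"index.number_of_shards": 3},
}

body = with_lifecycle(template, POLICY_NAME, ROLLOVER_ALIAS)
print(json.dumps(body, indent=2))
```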

Step 5: Attach the ILM policy to an index

After you start Heartbeat, the system creates Heartbeat indexes in your Elasticsearch cluster. In addition to the index template, you must attach the ILM policy to the first index that is created by using the template. For more information about how to attach the policy to the index template, see Step 4: Attach the ILM policy to an index template.

In the Elasticsearch section of the Management page, click Index Management.
In the Index management section, find the desired index and click its name.
On the Summary tab of the pane that appears, choose Manage > Remove lifecycle policy to remove the default policy of Heartbeat.
In the dialog box that appears, click Remove policy.
Choose Manage > Add lifecycle policy again.
In the dialog box that appears, select the ILM policy you created in Step 3: Create an ILM policy from the Lifecycle policy drop-down list and set Index rollover alias to the alias that you specify in Step 4: Attach the ILM policy to an index template. Then, click Add policy.

If the ILM policy is attached to the index, the information shown in the following figure appears.
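The same two actions, removing the default policy and adding your own, can also be expressed as API requests. The following Python sketch only builds and prints the requests in Kibana Console form and does not send them. The index name, policy name, and alias are placeholders, and the helper names are hypothetical.

```python
import json
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def attach_policy_requests(index: str, policy: str, alias: str) -> List[Request]:
    """Build the requests that replace the lifecycle policy of one index."""
    return [
        # Remove the policy that is currently attached to the index.
        ("POST", f"/{index}/_ilm/remove", None),
        # Attach the new policy and the rollover alias through index settings.
        ("PUT", f"/{index}/_settings", {
            "index.lifecycle.name": policy,
            "index.lifecycle.rollover_alias": alias,
        }),
    ]


def to_console(method: str, path: str, body: Optional[dict]) -> str:
    """Format one request the way it is typed in the Kibana Console."""
    text = f"{method} {path}"
    if body is not None:
        text += "\n" + json.dumps(body, indent=2)
    return text


requests_to_run = attach_policy_requests(
    "heartbeat-2020.04.29-000001",  # placeholder index name
    "<your_policy_name>",           # placeholder for the policy from Step 3
    "heartbeat",                    # placeholder alias from Step 4
)
for request in requests_to_run:
    print(to_console(*request))
```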

Step 6: View indexes in different phases

To view indexes in the hot phase, select Hot from the Lifecycle phase drop-down list in the Index management section.

You can use this method to view indexes in other phases.
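If you prefer the API to the console filter, the ILM explain API (for example, GET heartbeat-*/_ilm/explain) reports the current phase of each managed index. The following Python sketch groups the indexes in such a response by phase. The sample response is hypothetical and reduced to the fields that the sketch reads, and the helper name group_by_phase is introduced only for this example.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_phase(explain_response: dict) -> Dict[str, List[str]]:
    """Group managed index names by the phase reported for each index."""
    groups = defaultdict(list)
    for name, info in explain_response.get("indices", {}).items():
        if info.get("managed"):  # skip indexes that have no ILM policy
            groups[info.get("phase", "unknown")].append(name)
    return {phase: sorted(names) for phase, names in groups.items()}


# Hypothetical response, trimmed to the fields used above.
sample_response = {
    "indices": {
        "heartbeat-2020.04.29-000003": {"managed": True, "phase": "hot"},
        "heartbeat-2020.04.29-000002": {"managed": True, "phase": "warm"},
        "heartbeat-2020.04.29-000001": {"managed": True, "phase": "cold"},
        "other-index": {"managed": False},
    }
}

phases = group_by_phase(sample_response)
print(phases)
```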

FAQ

Q: How do I configure a check interval for an ILM policy?

A: The system periodically checks for indexes that match an ILM policy. The default interval is 10 minutes. If the system detects matched indexes, it rolls over data for the indexes. For example, you set max_docs to 100 when you create an ILM policy. In this case, if the system detects that the number of documents in an index reaches 100 during a check, it triggers a rollover for the index. You can use the indices.lifecycle.poll_interval parameter to control the check interval. This ensures that data is rolled over for indexes in a timely manner.

Notice Set this parameter to an appropriate value. A small value may cause node overload. In this example, this parameter is set to 1m.

```
PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": "1m"
  }
}
```
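The effect of the check interval can be estimated with a small Python sketch. It assumes, as a simplification, that checks run at exact multiples of the interval starting from time zero. The function name detection_time is hypothetical.

```python
import math


def detection_time(condition_met_at: int, poll_interval: int) -> int:
    """Return the time in seconds of the first check at or after the moment
    a rollover condition is met, with checks at multiples of poll_interval."""
    return math.ceil(condition_met_at / poll_interval) * poll_interval


# A condition that is met 61 seconds in is noticed much sooner with 1m checks.
print(detection_time(61, 600))  # default interval of 10 minutes: 600
print(detection_time(61, 60))   # interval of 1m: 120
```

A shorter interval reduces the delay between a condition being met and the rollover, at the cost of more frequent checks on the cluster.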