Use ILM to separate hot data and cold data - Elasticsearch

Background information

In the era of big data, data constantly changes, and the volume of data stored in your Elasticsearch cluster increases over time. When the data volume reaches a specific level, the memory usage, CPU utilization, and I/O throughput of your cluster also increase, which degrades the full-text search performance of the cluster. To address this issue, Elasticsearch V6.6.0 and later provide the index lifecycle management (ILM) feature, which divides the lifecycle of an index into four phases: hot, warm, cold, and delete. An index in the hot phase can be rolled over: when the index meets a rollover condition, the system creates a new index and writes subsequent data to it. For an index in the warm, cold, or delete phase, the system performs the operations that are configured for that phase. The following table describes these phases.


Phase	Description
hot	If an index is in this phase, time series data can be written to the index in real time. The index can be rolled over based on the number of documents in the index, the volume of data stored in the index, or the age of the index. The rollover is performed by using the rollover API.
warm	If an index is in this phase, data is no longer written to the index, and only data queries can be performed on the index.
cold	If an index is in this phase, the index is no longer updated, few queries are performed on the index, and queries are slower.
delete	If an index is in this phase, the index will be deleted.

You can use one of the following methods to attach an ILM policy to one or more indexes:

Attach an ILM policy to an index template: If you use this method, the ILM policy takes effect on all indexes that match the index template, including the new indexes generated during rollovers. In this topic, this method is used.
Attach an ILM policy to a single index: If you use this method, the ILM policy takes effect only on that index. The policy is not applied to the new index generated during a rollover.

You can use the ILM feature for time series data, cold data, and hot data to significantly reduce data storage costs. This topic provides an example of how to use the ILM feature to separate hot data and cold data. The business scenario is as follows:

Write data to an index in an Elasticsearch cluster in real time. When the volume of data in the index reaches a specific level, the system rolls over excess data to a new index.
The original index stays in the hot phase for 30 minutes after the rollover and then enters the warm phase.
In the warm phase, the system shrinks the original index and merges the segments in the index. The index enters the cold phase 1 hour after the rollover starts.
In the cold phase, the system migrates the index from hot nodes to warm nodes to separate hot data and cold data. The index is deleted 2 hours after the rollover starts.
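The timeline of this scenario can be sketched in Python. This is only an illustration: the offsets (30 minutes, 1 hour, and 2 hours after the rollover) come from the scenario above, whereas the function names and the sample rollover time are hypothetical, and the sketch ignores the delay introduced by the periodic ILM check.

```python
from datetime import datetime, timedelta

# Offsets measured from the rollover, taken from the scenario in this topic.
PHASE_OFFSETS = {
    "warm": timedelta(minutes=30),
    "cold": timedelta(hours=1),
    "delete": timedelta(hours=2),
}


def phase_schedule(rollover_time):
    """Return the earliest time at which the rolled-over index can enter each phase."""
    return {phase: rollover_time + offset for phase, offset in PHASE_OFFSETS.items()}


def phase_at(rollover_time, now):
    """Return the phase the original index is expected to be in at `now`."""
    current = "hot"  # the index stays hot until the warm offset has elapsed
    for phase, offset in PHASE_OFFSETS.items():  # dicts keep insertion order
        if now - rollover_time >= offset:
            current = phase
    return current


if __name__ == "__main__":
    rolled_over = datetime(2024, 1, 1, 12, 0)  # hypothetical rollover time
    for phase, when in phase_schedule(rolled_over).items():
        print(phase, when.isoformat())
```

In this sketch, an index that is rolled over at 12:00 can enter the warm phase at 12:30, the cold phase at 13:00, and the delete phase at 14:00.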

Precautions

You must configure ILM policies based on your business model. For example, we recommend that you configure different aliases and ILM policies for indexes with different structures. This facilitates index management.
If you want to use the rollover feature, the name of an initial index must end with an auto-increment six-digit number, such as -000001. Otherwise, ILM policies cannot take effect. For example, an initial index is named myindex-000001. During a rollover, a new index named myindex-000002 is generated. If the names of your indexes do not meet the preceding requirements, we recommend that you reindex the data in the indexes.
For indexes in the hot phase, the system writes data to the indexes. To ensure that data is written in chronological order, we recommend that you do not write data to indexes in the warm or cold phase. For example, configure the shrink or readonly action for the warm phase. This way, indexes become read-only after they enter the warm phase.
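The naming requirement for rollovers can be illustrated with a short Python sketch. It only models the rule described above (a name that ends with a hyphen and a six-digit number, which is incremented for the next index). The function name is hypothetical, and the sketch is not the Elasticsearch implementation.

```python
import re

# A name that ends with "-" followed by six digits, as required in this topic.
SUFFIX_PATTERN = re.compile(r"^(.+)-(\d{6})$")


def next_rollover_name(index_name):
    """Return the name that the next index in the series would get."""
    match = SUFFIX_PATTERN.match(index_name)
    if match is None:
        # Names without the numeric suffix cannot be rolled over this way.
        raise ValueError(f"{index_name!r} does not end with a six-digit number")
    prefix, number = match.groups()
    return f"{prefix}-{int(number) + 1:06d}"


if __name__ == "__main__":
    print(next_rollover_name("myindex-000001"))
```

A name such as myindex, which has no numeric suffix, raises a ValueError in this sketch. This corresponds to the case in which you must reindex the data into an index whose name meets the requirement.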

Procedure

Step 1: Create an Alibaba Cloud Elasticsearch cluster that uses the hot-warm architecture and view the hot or warm attribute of nodes in the cluster
Create an Alibaba Cloud Elasticsearch cluster, and specify the hot or warm attribute for data nodes in the cluster.
Step 2: Configure an ILM policy for indexes
Define an ILM policy and attach the policy to an index template.
Step 3: Verify data distribution
Check whether the shards of indexes in the cold phase are distributed on warm nodes.
Step 4: Update the ILM policy
Update the ILM policy.
Step 5: Switch the ILM policy
Switch the ILM policy.

Step 1: Create an Alibaba Cloud Elasticsearch cluster that uses the hot-warm architecture and view the hot or warm attribute of nodes in the cluster

An Elasticsearch cluster that uses the hot-warm architecture contains both hot nodes and warm nodes. This architecture improves the performance and stability of Elasticsearch clusters. The following table describes the differences between hot nodes and warm nodes.


Node type	Type of data stored	Read and write performance	Specifications	Disk
Hot node	Recent data, such as log data over the last two days.	High	High, such as 32 vCPUs and 64 GiB of memory	We recommend that you use a standard SSD. You can specify the storage space based on the volume of data.
Warm node	Historical data, such as log data before the last two days.	Low	Low, such as 8 vCPUs and 32 GiB of memory	We recommend that you use an ultra disk. You can specify the storage space based on the volume of data.

Create an Elasticsearch cluster that uses the hot-warm architecture. When you purchase an Elasticsearch cluster, you can purchase warm nodes to create an Elasticsearch cluster that uses the hot-warm architecture. For more information, see Create an Alibaba Cloud Elasticsearch cluster.
After you create a cluster that contains warm nodes, the system adds the -Enode.attr.box_type parameter to the startup parameters of nodes.
- Hot node: -Enode.attr.box_type=hot
- Warm node: -Enode.attr.box_type=warm
Note
- Data nodes become hot nodes only after you purchase warm nodes.
- In this topic, an Alibaba Cloud Elasticsearch V6.7.0 cluster is used. All operations described and figures provided in this topic apply only to clusters of this version. If you use a cluster of another version, the operations in the Elasticsearch console prevail.
Log on to the Kibana console of the cluster. In the left-side navigation pane of the Kibana console, click Dev Tools.
For more information about how to log on to the Kibana console, see Log on to the Kibana console.
On the Console tab of the page that appears, run the following command to view the hot or warm attribute of the nodes in the cluster:
```
GET _cat/nodeattrs?v&h=host,attr,value
```
If the command is successfully run, the result shown in the following figure is returned. This figure shows that the Elasticsearch cluster contains three hot nodes and two warm nodes to support the hot-warm architecture.

Step 2: Configure an ILM policy for indexes

In the Kibana console of the cluster, run the following command to define an ILM policy:

```
PUT /_ilm/policy/game-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "1GB",
            "max_age": "1d",
            "max_docs": 1000
          }
        }
      },
      "warm": {
        "min_age": "30m",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "1h",
        "actions": {
          "allocate": {
            "require": {
              "box_type": "warm"
            }
          }
        }
      },
      "delete": {
        "min_age": "2h",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```


Parameter	Description
hot	A rollover is triggered if an index to which the ILM policy is attached meets any one of the following conditions: The volume of data in the index reaches 1 GB, the index has been used for more than one day, or the number of documents in the index reaches 1,000. During the rollover, the system creates an index and enables the ILM policy for the new index. The original index enters the warm phase 30 minutes after the rollover. Notice: If the value of max_docs, max_size, or max_age is reached, Elasticsearch archives the index.
warm	After the index enters the warm phase, the system shrinks it down to an index that has only one primary shard and merges segments in the index into one segment. The index enters the cold phase 1 hour after the rollover starts.
cold	After the index enters the cold phase, the system migrates the index from hot nodes to warm nodes. The index enters the delete phase 2 hours after the rollover starts.
delete	After the index enters the delete phase, it is deleted.
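The rollover conditions of the hot phase are combined with OR: reaching any single threshold is enough. The following Python sketch models this check with the thresholds of game-policy. The class and function names are hypothetical, and the sketch assumes that 1GB is interpreted as 1024^3 bytes and that a threshold counts as reached when the value is greater than or equal to it.

```python
from dataclasses import dataclass

# Thresholds from the rollover action of game-policy.
MAX_SIZE_BYTES = 1 * 1024 ** 3  # "max_size": "1GB" (assumed to be binary units)
MAX_AGE_SECONDS = 24 * 60 * 60  # "max_age": "1d"
MAX_DOCS = 1000                 # "max_docs": 1000


@dataclass
class IndexStats:
    """Hypothetical container for the values that the conditions are checked against."""
    size_bytes: int
    age_seconds: int
    doc_count: int


def should_roll_over(stats):
    """Return True if at least one rollover condition is met."""
    return (
        stats.size_bytes >= MAX_SIZE_BYTES
        or stats.age_seconds >= MAX_AGE_SECONDS
        or stats.doc_count >= MAX_DOCS
    )


if __name__ == "__main__":
    # Small and young, but the document count has reached the threshold.
    print(should_roll_over(IndexStats(size_bytes=5_000_000, age_seconds=600, doc_count=1000)))
```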

Note

After an ILM policy is created, you cannot change the policy name.
In this step, you can specify the max_age parameter in the minimum unit of seconds. If you use the Kibana console to create an ILM policy, you can specify this parameter only in the minimum unit of hours.

Create an index template.

In the settings configuration, set the box_type attribute to hot. This way, newly created indexes are allocated to hot nodes, and the data written to them is stored on hot nodes.

```
PUT _template/gamestabes_template
{
  "index_patterns": ["gamestabes-*"],
  "settings": {
    "index.number_of_shards": 5,
    "index.number_of_replicas": 1,
    "index.routing.allocation.require.box_type": "hot",
    "index.lifecycle.name": "game-policy",
    "index.lifecycle.rollover_alias": "gamestabes"
  }
}
```


Parameter	Description
index.routing.allocation.require.box_type	The type of nodes to which the index generated during a rollover is allocated.
index.lifecycle.name	The name of the ILM policy.
index.lifecycle.rollover_alias	The alias that is used to roll over the index. Data is written to the current write index by using this alias.

Create an index based on an auto-increment number.
```
PUT gamestabes-000001
{
  "aliases": {
    "gamestabes": {
      "is_write_index": true
    }
  }
}
```
You can also create an index based on time. For more information, see Using date math.
Write data to the index based on the index alias.
The system periodically checks whether the index meets the rollover conditions that are defined in the ILM policy. If the index meets one of the conditions, the system rolls over the index, and subsequent data is written to a new index.
```
PUT gamestabes/_doc/1
{
    "EU_Sales" : 3.58,
    "Genre" : "Platform",
    "Global_Sales" : 40.24,
    "JP_Sales" : 6.81,
    "Name" : "Super Mario Bros.",
    "Other_Sales" : 0.77,
    "Platform" : "NES",
    "Publisher" : "Nintendo",
    "Year_of_Release" : "1985",
    "na_Sales" : 29.08
}
```
Note: By default, the system checks for indexes that meet the conditions of an ILM policy at 10-minute intervals. You can configure the indices.lifecycle.poll_interval parameter to change the check interval. After an index is rolled over, it moves to the next phase when the min_age of that phase is reached.
Filter indexes based on lifecycle phases and view detailed index configurations.
1. In the left-side navigation pane, click Management.
2. In the Elasticsearch section, click Index Management.
3. In the Index management section, click Lifecycle phase next to Lifecycle status and select a phase.
4. Click an index name to view the details about the index.

Step 3: Verify data distribution

Query indexes in the cold phase and view the configurations of the indexes.
Query the distribution of shards for indexes in the cold phase.
```
GET _cat/shards/shrink-gamestabes-000012?v
```
If the command is successfully run, the result shown in the following figure is returned. The figure shows that the shards of the indexes in the cold phase are distributed on warm nodes.
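If you prefer to check the result programmatically instead of reading it by eye, the following Python sketch parses the text returned by the _cat/shards API and lists the shards that are not on warm nodes. It assumes the default column order (index, shard, prirep, state, docs, store, ip, node). The sample output, IP addresses, and node names are hypothetical.

```python
def shards_not_on_warm_nodes(cat_shards_output, warm_nodes):
    """Return (index, shard, prirep, node) tuples for started shards outside `warm_nodes`."""
    misplaced = []
    for line in cat_shards_output.strip().splitlines():
        fields = line.split()
        # Assumed default columns: index shard prirep state docs store ip node.
        if len(fields) != 8 or fields[3] != "STARTED":
            continue  # this sketch skips header, unassigned, and relocating shards
        index, shard, prirep, node = fields[0], fields[1], fields[2], fields[7]
        if node not in warm_nodes:
            misplaced.append((index, shard, prirep, node))
    return misplaced


# Hypothetical sample output and node names, for illustration only.
SAMPLE_OUTPUT = """
shrink-gamestabes-000012 0 p STARTED 1000 1.2mb 10.0.0.4 warm-node-1
shrink-gamestabes-000012 0 r STARTED 1000 1.2mb 10.0.0.5 warm-node-2
"""

if __name__ == "__main__":
    print(shards_not_on_warm_nodes(SAMPLE_OUTPUT, {"warm-node-1", "warm-node-2"}))
```

An empty list means that every started shard in the output is on one of the nodes that you identified as warm nodes in Step 1.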

Step 4: Update the ILM policy

Update the running ILM policy.
View the version of the updated policy.
1. In the left-side navigation pane, click Management.
2. In the Elasticsearch section, click Index Lifecycle Policies.
3. In the Index lifecycle policies section, view the version of the updated policy.
  The version number of the updated policy is one more than the version number of the original policy. The updated policy takes effect from the next rollover.

Step 5: Switch the ILM policy

Create another ILM policy.

```
PUT /_ilm/policy/game-new
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "3GB",
            "max_age": "1d",
            "max_docs": 1000
          }
        }
      },
      "warm": {
        "min_age": "30m",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "1h",
        "actions": {
          "allocate": {
            "require": {
              "box_type": "warm"
            }
          }
        }
      },
      "delete": {
        "min_age": "2h",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Attach the new ILM policy to the index template.
```
PUT _template/gamestabes_template
{
  "index_patterns" : ["gamestabes-*"],
  "settings": {
    "index.number_of_shards": 5,
    "index.number_of_replicas": 1,
    "index.routing.allocation.require.box_type":"hot",
    "index.lifecycle.name": "game-new", 
    "index.lifecycle.rollover_alias": "gamestabes"
  }
}
```
Notice
- The new policy takes effect from the next rollover.
- If you want to attach the new policy to the indexes that are created based on the original policy, you can run the PUT gamestabes-*/_settings command. For more information, see Switching policies for an index.
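The request for existing indexes needs a body that sets the policy name. The following Python sketch builds such a request as console text. It assumes that index.lifecycle.name, the same setting that is used in the index template, is the setting to change. The function name is hypothetical, and you should verify the request against Switching policies for an index for your cluster version.

```python
import json


def switch_policy_request(index_pattern, policy_name):
    """Build the console text of a settings request that attaches `policy_name`."""
    # Assumption: the policy is selected by the index.lifecycle.name setting.
    body = {"index": {"lifecycle": {"name": policy_name}}}
    return f"PUT {index_pattern}/_settings\n{json.dumps(body, indent=2)}"


if __name__ == "__main__":
    print(switch_policy_request("gamestabes-*", "game-new"))
```

You can paste the printed text into the Console tab of Kibana to run the request.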

FAQ

Q: How do I configure a check interval for an ILM policy?

A: The system periodically checks for indexes that match an ILM policy. The default interval is 10 minutes. If matched indexes are detected, the system rolls over data for the indexes. For example, you set max_docs to 1000 when you create an ILM policy. In this case, if the system detects that the number of documents in an index reaches 1,000 during a check, the system triggers a rollover for the index. You can configure the indices.lifecycle.poll_interval parameter to change the check interval. This ensures that data is rolled over for indexes in a timely manner.

Notice: Set this parameter to an appropriate value. A value that is too small may overload nodes. In this example, this parameter is set to 1m.

```
PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": "1m"
  }
}
```
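The effect of the check interval can be estimated with a small Python sketch. It assumes, for simplicity, that checks run at fixed multiples of the interval starting from time 0. The function names are hypothetical, and the real scheduling in Elasticsearch may differ.

```python
def first_check_at_or_after(condition_met_s, poll_interval_s):
    """Return the time, in seconds, of the first periodic check at or after the condition is met.

    Assumes checks run at 0, poll_interval_s, 2 * poll_interval_s, and so on.
    """
    checks_needed = -(-condition_met_s // poll_interval_s)  # ceiling division
    return checks_needed * poll_interval_s


def detection_delay(condition_met_s, poll_interval_s):
    """Return how long a met rollover condition waits before the next check."""
    return first_check_at_or_after(condition_met_s, poll_interval_s) - condition_met_s


if __name__ == "__main__":
    # The condition is met 61 seconds after a check.
    print(detection_delay(61, 600))  # default interval of 10 minutes
    print(detection_delay(61, 60))   # interval of 1 minute, as set in this example
```

Under these assumptions, a condition that is met 61 seconds after a check waits 539 seconds with the default 10-minute interval and 59 seconds with a 1-minute interval. A shorter interval reduces the delay but increases the number of checks that the cluster runs.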