Alibaba Cloud Elasticsearch provides commands that you can use to create manual snapshots of the index data stored on your Elasticsearch cluster, store the snapshots in a shared repository, and restore data from the snapshots. This topic describes how to create manual snapshots and restore data from the snapshots.
Background information
The data backup and restoration of Alibaba Cloud Elasticsearch clusters depend on the elasticsearch-repository-oss plug-in. The plug-in is installed on Alibaba Cloud Elasticsearch clusters by default and cannot be removed. For more information about this plug-in, see elasticsearch-repository-oss.
Prerequisites
Object Storage Service (OSS) is activated, and an OSS bucket whose storage class is Standard and access control list (ACL) is Public Read is created in the region where your Elasticsearch cluster resides. Elasticsearch does not support OSS buckets of the Archive storage class. For more information, see Activate OSS and Create a bucket.
If you are using a RAM user, you must make sure that the AliyunOSSFullAccess policy is attached to the RAM user. For more information, see Grant permissions to a RAM user.
Precautions
Snapshots store only index data. The following information of an Elasticsearch cluster is not stored in snapshots: monitoring data (such as indexes whose names start with .monitoring or .security_audit), metadata, translogs, configurations, software packages, built-in and custom plug-ins, and logs.
You can run all the code provided in this topic in the Kibana console of your Elasticsearch cluster. For more information, see Log on to the Kibana console.
Create a repository
Create a repository named my_backup.
Create a repository for a cluster in the cloud.
PUT _snapshot/my_backup/
{
"type": "oss",
"settings": {
"endpoint": "http://oss-cn-hangzhou-internal.aliyuncs.com",
"access_key_id": "xxxx",
"secret_access_key": "xxxxxx",
"bucket": "xxxxxx",
"compress": true,
"chunk_size": "500mb",
"base_path": "snapshot/"
}
}
Create a repository for a self-managed V8.X cluster. In this case, you must install the elasticsearch-repository-oss plug-in on the cluster. For more information, see Install the elasticsearch-repository-oss plug-in.
Note: For more information about the plug-in, see elasticsearch-repository-oss.
PUT /_snapshot/my_backup
{
"type": "oss",
"settings": {
"oss.client.endpoint": "oss-cn-shanghai.aliyuncs.com",
"oss.client.access_key_id": "xxx",
"oss.client.secret_access_key": "xxx",
"oss.client.bucket": "xxxxxx",
"oss.client.base_path": "snapshot/",
"oss.client.compress": true
}
}
Parameter | Description |
endpoint | The internal endpoint of the OSS bucket. For more information about how to obtain the endpoint, see Regions and endpoints. |
access_key_id | The AccessKey ID of your account. For more information about how to obtain the AccessKey ID, see Obtain an AccessKey pair. |
secret_access_key | The AccessKey secret of your account. For more information about how to obtain the AccessKey secret, see Obtain an AccessKey pair. |
bucket | The name of the OSS bucket. For more information about how to obtain the name, see Create a bucket. |
compress | Specifies whether to enable the data compression feature for snapshots. Valid values: true and false. |
chunk_size | If you want to upload large volumes of data to the OSS bucket, you can upload the data in multiple parts. In this case, you can use this parameter to set the size of each part. If the size of a part reaches the value of this parameter, the excess data is distributed to another part. |
base_path | The start location of the repository. The default value is the root directory. You can specify the directory where specific snapshots are stored. Example: snapshot/myindex/. |
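The request body for a cluster in the cloud can also be assembled by a script. The following is a minimal sketch in Python, not a definitive implementation. The helper name build_oss_repository_body and the environment variable names OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET are assumptions made for illustration. The sketch only builds the JSON body and does not send it to a cluster.

```python
import json
import os


def build_oss_repository_body(endpoint, bucket, access_key_id, secret_access_key,
                              base_path="snapshot/", compress=True,
                              chunk_size="500mb"):
    """Return the body for PUT _snapshot/<repository> (hypothetical helper)."""
    return {
        "type": "oss",
        "settings": {
            "endpoint": endpoint,
            "access_key_id": access_key_id,
            "secret_access_key": secret_access_key,
            "bucket": bucket,
            "compress": compress,
            "chunk_size": chunk_size,
            "base_path": base_path,
        },
    }


body = build_oss_repository_body(
    endpoint="http://oss-cn-hangzhou-internal.aliyuncs.com",
    bucket="xxxxxx",
    # Assumed variable names: read the AccessKey pair from the environment
    # instead of hard-coding it; the placeholders are used if they are unset.
    access_key_id=os.environ.get("OSS_ACCESS_KEY_ID", "xxxx"),
    secret_access_key=os.environ.get("OSS_ACCESS_KEY_SECRET", "xxxxxx"),
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after the PUT _snapshot/my_backup/ line in the Kibana console.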
Query repository information
Query information about all repositories
GET _snapshot
Query information about a specific repository
GET _snapshot/my_backup
Create a snapshot
Create a snapshot for all enabled indexes
PUT _snapshot/my_backup/snapshot_1
The preceding command creates the snapshot_1 snapshot for all enabled indexes and stores the snapshot in the my_backup repository. The system returns a response immediately and creates the snapshot in the background. If you want the response to be returned only after the snapshot is created, specify the wait_for_completion parameter in the command. With this parameter, the request blocks until the snapshot is complete. If the total size of the indexes is large, the response may take a long time to return.
PUT _snapshot/my_backup/snapshot_1?wait_for_completion=true
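As an alternative to a blocking request, a script can start the snapshot without wait_for_completion and then poll the snapshot state. The following Python sketch shows one possible polling loop. The function wait_for_snapshot and its fetch_state argument are hypothetical: fetch_state stands for any caller-supplied function that returns the state field of GET _snapshot/my_backup/snapshot_1. The demonstration uses canned states, so no cluster is contacted.

```python
import time

# Snapshot states that indicate the snapshot is no longer running.
TERMINAL_STATES = {"SUCCESS", "PARTIAL", "FAILED"}


def wait_for_snapshot(fetch_state, interval_seconds=5.0, timeout_seconds=3600.0,
                      sleep=time.sleep):
    """Poll fetch_state() until it returns a terminal state, then return it.

    fetch_state is a caller-supplied function (hypothetical) that returns the
    "state" field of GET _snapshot/<repository>/<snapshot>.
    """
    waited = 0.0
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError("snapshot still in state %s" % state)
        sleep(interval_seconds)
        waited += interval_seconds


# Offline demonstration with canned states instead of a real cluster.
canned = iter(["IN_PROGRESS", "IN_PROGRESS", "SUCCESS"])
final_state = wait_for_snapshot(lambda: next(canned), sleep=lambda seconds: None)
print(final_state)
```

Passing the sleep function as an argument keeps the loop testable without real delays.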
A repository stores multiple snapshots. Each snapshot is a copy of all indexes, specific indexes, or a single index in a cluster.
The first snapshot is a full copy of the data in a cluster. Subsequent snapshots store only incremental data. If you create a subsequent snapshot, the system only adds data to or removes data from the previous snapshot. Therefore, less time is required to create a subsequent snapshot than the first snapshot.
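The incremental behavior can be pictured with a deliberately simplified model in Python. This is a sketch for illustration only and not the actual snapshot implementation. The function files_to_upload and the file names are hypothetical. The model assumes that a snapshot copies only the files that the repository does not contain yet.

```python
def files_to_upload(shard_files, repository_files):
    """Return the files that are not yet stored in the repository."""
    return sorted(set(shard_files) - set(repository_files))


repository = set()

# First snapshot: the repository is empty, so every file is copied.
first = files_to_upload(["_0.cfs", "_1.cfs"], repository)
repository.update(first)

# Second snapshot: one new file was written since the first snapshot.
second = files_to_upload(["_0.cfs", "_1.cfs", "_2.cfs"], repository)
repository.update(second)

print(first)   # ['_0.cfs', '_1.cfs']
print(second)  # ['_2.cfs']
```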
Create a snapshot for specific indexes
By default, a snapshot contains all enabled indexes. If you use Kibana, you may want to exclude the diagnostic indexes (the .kibana indexes) from a snapshot to save disk space. In this case, you can run the following command to create a snapshot only for specific indexes:
PUT _snapshot/my_backup/snapshot_2
{
"indices": "index_1,index_2"
}
The preceding command creates a snapshot only for the index_1 and index_2 indexes.
Query snapshot information
Query information about all snapshots
GET _snapshot/my_backup/_all
If the command is successfully run, the following result is returned:
{
"snapshots": [
{
"snapshot": "snapshot_1",
"uuid": "vIdSCkthTeGa0nSj4D****",
"version_id": 5050399,
"version": "5.5.3",
"indices": [
".kibana"
],
"state": "SUCCESS",
"start_time": "2018-06-28T01:22:39.609Z",
"start_time_in_millis": 1530148959609,
"end_time": "2018-06-28T01:22:39.923Z",
"end_time_in_millis": 1530148959923,
"duration_in_millis": 314,
"failures": [],
"shards": {
"total": 1,
"failed": 0,
"successful": 1
}
},
{
"snapshot": "snapshot_3",
"uuid": "XKO_Uwz_Qu6mZrU3Am****",
"version_id": 5050399,
"version": "5.5.3",
"indices": [
".kibana"
],
"state": "SUCCESS",
"start_time": "2018-06-28T01:25:00.764Z",
"start_time_in_millis": 1530149100764,
"end_time": "2018-06-28T01:25:01.482Z",
"end_time_in_millis": 1530149101482,
"duration_in_millis": 718,
"failures": [],
"shards": {
"total": 1,
"failed": 0,
"successful": 1
}
}
]
}
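If a script processes this response, the fields of interest can be extracted with a few lines of Python. The following sketch assumes a hypothetical helper named summarize_snapshots and uses a trimmed copy of the preceding response as input. It does not contact a cluster.

```python
import json

# Trimmed copy of the response shown above.
response_text = """
{
  "snapshots": [
    {"snapshot": "snapshot_1", "state": "SUCCESS", "duration_in_millis": 314,
     "shards": {"total": 1, "failed": 0, "successful": 1}},
    {"snapshot": "snapshot_3", "state": "SUCCESS", "duration_in_millis": 718,
     "shards": {"total": 1, "failed": 0, "successful": 1}}
  ]
}
"""


def summarize_snapshots(response):
    """Return (name, state, duration in ms, failed shards) for each snapshot."""
    return [
        (s["snapshot"], s["state"], s["duration_in_millis"], s["shards"]["failed"])
        for s in response["snapshots"]
    ]


summary = summarize_snapshots(json.loads(response_text))
for name, state, duration_ms, failed in summary:
    print(f"{name}: {state}, {duration_ms} ms, {failed} failed shard(s)")
```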
Query information about a specific snapshot based on the snapshot name
GET _snapshot/my_backup/snapshot_3
If the command is successfully run, the following result is returned:
{
"snapshots": [
{
"snapshot": "snapshot_3",
"uuid": "XKO_Uwz_Qu6mZrU3Am****",
"version_id": 5050399,
"version": "5.5.3",
"indices": [
".kibana"
],
"state": "SUCCESS",
"start_time": "2018-06-28T01:25:00.764Z",
"start_time_in_millis": 1530149100764,
"end_time": "2018-06-28T01:25:01.482Z",
"end_time_in_millis": 1530149101482,
"duration_in_millis": 718,
"failures": [],
"shards": {
"total": 1,
"failed": 0,
"successful": 1
}
}
]
}
Call the _status API to query information about a specific snapshot
GET _snapshot/my_backup/snapshot_3/_status
The _status API allows you to query the detailed information about a snapshot. The information includes both the status of the snapshot and the statistics about each index and shard. If the command is successfully run, the following result is returned:
{
"snapshots": [
{
"snapshot": "snapshot_3",
"repository": "my_backup",
"state": "IN_PROGRESS",
"shards_stats": {
"initializing": 0,
"started": 1,
"finalizing": 0,
"done": 4,
"failed": 0,
"total": 5
},
"stats": {
"number_of_files": 5,
"processed_files": 5,
"total_size_in_bytes": 1792,
"processed_size_in_bytes": 1792,
"start_time_in_millis": 1409663054859,
"time_in_millis": 64
},
"indices": {
"index_3": {
"shards_stats": {
"initializing": 0,
"started": 0,
"finalizing": 0,
"done": 5,
"failed": 0,
"total": 5
},
"stats": {
"number_of_files": 5,
"processed_files": 5,
"total_size_in_bytes": 1792,
"processed_size_in_bytes": 1792,
"start_time_in_millis": 1409663054859,
"time_in_millis": 64
},
"shards": {
"0": {
"stage": "DONE",
"stats": {
"number_of_files": 1,
"processed_files": 1,
"total_size_in_bytes": 514,
"processed_size_in_bytes": 514,
"start_time_in_millis": 1409663054862,
"time_in_millis": 22
}
}
}
}
}
}
]
}
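The shards_stats object in this response is enough to estimate how far a running snapshot has progressed. The following Python sketch assumes a hypothetical helper named shard_progress and a trimmed copy of the preceding response. It reports the share of shards that have reached the done stage.

```python
def shard_progress(status_response):
    """Map each snapshot name to the percentage of shards in the done stage."""
    progress = {}
    for snap in status_response["snapshots"]:
        stats = snap["shards_stats"]
        total = stats["total"]
        # Report 0.0 when the snapshot has no shards, to avoid dividing by zero.
        progress[snap["snapshot"]] = 100.0 * stats["done"] / total if total else 0.0
    return progress


# Trimmed copy of the response shown above.
sample = {
    "snapshots": [
        {
            "snapshot": "snapshot_3",
            "state": "IN_PROGRESS",
            "shards_stats": {"initializing": 0, "started": 1, "finalizing": 0,
                             "done": 4, "failed": 0, "total": 5},
        }
    ]
}
progress = shard_progress(sample)
print(progress)  # {'snapshot_3': 80.0}
```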
Delete a snapshot
You can run the following command to delete a specific snapshot. If the snapshot is being created, the system stops the creation and deletes the snapshot from the repository.
DELETE _snapshot/my_backup/snapshot_3
Delete snapshots only by calling the DELETE API. Do not manually delete snapshot files from the OSS bucket: a snapshot may share data with other snapshots, and manually deleting its files may damage those snapshots. The DELETE API identifies the data that other snapshots still use and deletes only the data that no other snapshot references.
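The reason that only unshared data is removed can be illustrated with a simplified model in Python. This is a sketch under stated assumptions and not the actual implementation of the DELETE API. The function files_safe_to_delete and the file names are hypothetical, and each snapshot is modeled as a plain list of the files that it references.

```python
def files_safe_to_delete(snapshots, name):
    """Return the files used by snapshot `name` and by no other snapshot."""
    still_needed = set()
    for other, files in snapshots.items():
        if other != name:
            still_needed.update(files)
    return sorted(set(snapshots[name]) - still_needed)


# Hypothetical file names; each snapshot lists the files it references.
snapshots = {
    "snapshot_1": ["_0.cfs", "_1.cfs"],
    "snapshot_3": ["_1.cfs", "_2.cfs"],
}
removable = files_safe_to_delete(snapshots, "snapshot_3")
print(removable)  # ['_2.cfs']
```

In this model, _1.cfs is kept because snapshot_1 still references it. Removing that file by hand would damage snapshot_1.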
Restore indexes from a snapshot
We recommend that you do not restore indexes whose names start with a period (.). If you restore these indexes, you may fail to access the Kibana console.
If an index with the same name as the index to be restored exists in your cluster, you must delete or disable the index in the cluster first. Otherwise, the restoration fails.
If you want to restore data in a snapshot across regions, you must migrate the data to an OSS bucket in the destination region and restore the data to the desired Elasticsearch cluster in the destination region. For more information, see Migrate data.
Create a shared OSS repository in the destination cluster
Before you restore data from a snapshot to a cluster, you must create a shared OSS repository in the cluster and point it to the same OSS location (endpoint, bucket, and base path) in which the snapshot is stored. For more information, see Create a repository.
PUT _snapshot/my_backup_restore/
{
"type": "oss",
"settings": {
"endpoint": "http://oss-cn-hangzhou-internal.aliyuncs.com",
"access_key_id": "xxxx",
"secret_access_key": "xxxxxx",
"bucket": "xxxxxx",
"compress": true,
"chunk_size": "500mb",
"base_path": "snapshot/"
}
}
Restore a specific index
If you only want to verify or process the data in an index and do not want to overwrite the data in the Elasticsearch cluster, use this method to restore the index.
POST /_snapshot/my_backup_restore/snapshot_1/_restore
{
"indices": "index_1",
"rename_pattern": "index_(.+)",
"rename_replacement": "restored_index_$1"
}
Parameter | Description |
indices | The name of the index that you want to restore. In this example, the system restores only the index_1 index from the specified snapshot. |
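rename_pattern | The regular expression that is matched against the names of the indexes to be restored. Indexes whose names match the expression are renamed. In this example, index_(.+) matches index_1 and captures 1. |
rename_replacement | The replacement string that is used to rename the matched indexes. $1 refers to the text captured by the first group in rename_pattern. In this example, index_1 is restored as restored_index_1. |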
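To check how a rename_pattern and rename_replacement pair behaves before you run a restoration, the rename can be approximated offline. The following Python sketch uses the standard re module. The helper name renamed_index is hypothetical, and the sketch assumes that $1-style group references can be translated to the \g<1> syntax that Python uses. Regular expression dialects differ in details, so treat the result as an approximation.

```python
import re


def renamed_index(name, rename_pattern, rename_replacement):
    """Approximate the rename step of a restoration with Python's re module."""
    # The restore request uses $1-style group references; Python uses \g<1>.
    python_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, python_replacement, name)


print(renamed_index("index_1", "index_(.+)", "restored_index_$1"))  # restored_index_1
print(renamed_index("logs", "index_(.+)", "restored_index_$1"))     # logs
```

A name that does not match the pattern, such as logs in this sketch, is returned unchanged.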
Restore all indexes other than indexes whose names start with a period (.)
POST _snapshot/my_backup_restore/snapshot_1/_restore
{
"indices": "*,-.monitoring*,-.security*,-.kibana*",
"ignore_unavailable": true
}
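The indices expression in the preceding command combines a wildcard with exclusions that start with a hyphen (-). The following Python sketch models how such an expression selects index names. It is a simplified model built on the standard fnmatch module, not the cluster's own resolution logic, and the helper name select_indices and the sample index names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_indices(names, expression):
    """Resolve a comma-separated expression with wildcards and "-" exclusions."""
    selected = []
    for part in expression.split(","):
        if part.startswith("-"):
            # Exclusion: drop the already selected names that match the pattern.
            selected = [n for n in selected if not fnmatchcase(n, part[1:])]
        else:
            # Inclusion: add the matching names that are not selected yet.
            selected += [n for n in names
                         if fnmatchcase(n, part) and n not in selected]
    return selected


names = ["index_1", "index_2", ".kibana", ".monitoring-es-6-2018.06.28", ".security"]
restored = select_indices(names, "*,-.monitoring*,-.security*,-.kibana*")
print(restored)  # ['index_1', 'index_2']
```

Under this model, the expression excludes only names that start with the three listed prefixes. A name with a different period prefix, such as a hypothetical .tasks index, would still be selected, so extend the expression if the snapshot contains other indexes of this kind.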
Restore all indexes, including indexes whose names start with a period (.)
POST _snapshot/my_backup_restore/snapshot_1/_restore
For example, if the snapshot_1 snapshot contains five indexes, the preceding command restores all these indexes to the Elasticsearch cluster.
After you call the _restore API, the system returns a response immediately and restores the indexes in the background. If you want the request to block until the restoration is complete, specify the wait_for_completion parameter in the command.
POST _snapshot/my_backup_restore/snapshot_1/_restore?wait_for_completion=true
Query restoration information
You can call the _recovery API to query information about an index restoration task, such as the status and progress of the task.
Query information about the restoration of a specific index
GET restored_index_3/_recovery
Query information about the restoration of all indexes
The information may include information about shards that are not involved in the restoration process.
GET /_recovery/
If the command is successfully run, the following result is returned:
{
"restored_index_3" : {
"shards" : [ {
"id" : 0,
"type" : "snapshot",
"stage" : "index",
"primary" : true,
"start_time" : "2014-02-24T12:15:59.716",
"stop_time" : 0,
"total_time_in_millis" : 175576,
"source" : {
"repository" : "my_backup",
"snapshot" : "snapshot_3",
"index" : "restored_index_3"
},
"target" : {
"id" : "ryqJ5lO5S4-lSFbGnt****",
"hostname" : "my.fqdn",
"ip" : "10.0.**.**",
"name" : "my_es_node"
},
"index" : {
"files" : {
"total" : 73,
"reused" : 0,
"recovered" : 69,
"percent" : "94.5%"
},
"bytes" : {
"total" : 79063092,
"reused" : 0,
"recovered" : 68891939,
"percent" : "87.1%"
},
"total_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total_time_in_millis" : 0
},
"start" : {
"check_index_time" : 0,
"total_time_in_millis" : 0
}
} ]
}
}
The returned result lists all indexes that are being restored and all shards of these indexes. The result contains the following information for each shard: restoration start time, restoration end time, restoration duration, restoration progress, and transmitted bytes. The following table describes some parameters in the preceding result.
Parameter | Description |
type | The type of the restoration. The value snapshot indicates that the shard is being restored from a snapshot. |
source | The snapshot and repository from which the shard is restored. |
percent | The progress of the restoration. The value 94.5% indicates that 94.5% of the data in the shard is restored. |
Delete an index that is being restored from a snapshot
You can use the DELETE API to delete an index that is being restored to cancel the restoration of the index.
DELETE /restored_index_3
If the restored_index_3 index is being restored, the preceding command stops the restoration and deletes the data that is restored to the Elasticsearch cluster.