This topic describes how to use the solr-to-es tool to migrate documents from a Solr cluster to an Alibaba Cloud Elasticsearch cluster. The tool is provided by a third-party community.
Prepare the environment
- Create an Alibaba Cloud Elasticsearch V6.X cluster. In this example, an Elasticsearch V6.3.2 cluster is used. For more information, see Create an Alibaba Cloud Elasticsearch cluster. Important The solr-to-es tool used in this example supports only Elasticsearch V6.X clusters. If you want to use an Elasticsearch cluster of another version, first perform a compatibility test.
- Enable the Auto Indexing feature for the Elasticsearch cluster. For more information, see Configure the YML file.
- Create an Alibaba Cloud Elastic Compute Service (ECS) instance. For more information, see Step 1: Create an ECS instance. In this example, an ECS instance that runs CentOS 7.3 is used. Important The ECS instance must reside in the same region, zone, and virtual private cloud (VPC) as the Elasticsearch cluster.
- Install Solr on the ECS instance. In this example, Solr 5.0.0 is used. For more information, see Official Solr documentation.
- Install Python on the ECS instance. The version must be 3.0 or later. In this example, Python 3.6.2 is used.
- Install pysolr on the ECS instance. The version must be 3.3.3 or later but earlier than 4.0.
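The version constraints above (Python 3.0 or later, pysolr 3.3.3 or later but earlier than 4.0) can be checked before you install anything else. The following is a minimal sketch, not part of solr-to-es; the helper names parse_version, pysolr_version_ok, and python_version_ok are hypothetical and exist only for illustration.

```python
import sys


def parse_version(text):
    """Turn a dotted version string such as '3.8.1' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    so a suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def pysolr_version_ok(text):
    """Return True if the version is 3.3.3 or later but earlier than 4.0."""
    return (3, 3, 3) <= parse_version(text) < (4, 0)


def python_version_ok(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.0 or later."""
    return tuple(version_info[:2]) >= (3, 0)


if __name__ == "__main__":
    # The pysolr version string is passed in by hand here (hypothetical value);
    # replace it with the version that is installed on your ECS instance.
    print("Python OK:", python_version_ok())
    print("pysolr 3.9.0 OK:", pysolr_version_ok("3.9.0"))
```

The check takes the pysolr version as a plain string so that it also runs on interpreters that lack newer package-metadata modules.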
Install solr-to-es
- Connect to the ECS instance and download solr-to-es. For more information about how to connect to an ECS instance, see Connect to a Linux instance by using a password or key. Note In this example, a regular (non-root) user is used.
- Navigate to the directory where setup.py is stored and run the following command to install solr-to-es:
sudo python setup.py install
- After solr-to-es is installed, run the following command to migrate documents:
sudo python __main__.py <solr_url>:8983/solr/<my_core>/select http://<username>:<password>@<elasticsearch_url>:9200 <elasticsearch_index> <doc_type>
Table 1. Parameters

| Parameter | Description |
| --- | --- |
| <solr_url> | The endpoint of your Solr cluster. Example: http://116.62.**.**. |
| <my_core> | The name of the Solr Core that contains the documents you want to migrate. |
| <username> | The username that is used to access your Elasticsearch cluster. The default username is elastic. |
| <password> | The password that is used to access your Elasticsearch cluster. The password is specified when you create the cluster. |
| <elasticsearch_url> | The internal or public endpoint of your Elasticsearch cluster. You can obtain the public endpoint of the cluster from the Basic Information page of the cluster. For more information, see View the basic information of a cluster. |
| <elasticsearch_index> | The name of the index to which documents will be migrated. |
| <doc_type> | The type of the index. |

Important If you are using solr-to-es of a version that is different from the one described in this topic, you can try the following command to migrate documents. For more information, see solr-to-es.
sudo solr-to-es [-h] [--solr-query SOLR_QUERY] [--solr-fields COMMA_SEP_FIELDS] [--rows-per-page ROWS_PER_PAGE] [--es-timeout ES_TIMEOUT] solr_url elasticsearch_url elasticsearch_index doc_type
If you use the preceding command in the environment described in this topic, the following error is returned:
-bash: solr-to-es.py: command not found
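To make the roles of the parameters in Table 1 concrete, the following sketch assembles the argument list of the __main__.py command from them. The function build_migration_command and the example.com hostnames are hypothetical. Percent-encoding the username and password is an assumption made so that characters such as @ do not break the URL; confirm that your Elasticsearch client decodes percent-encoded credentials before you rely on it.

```python
import shlex
from urllib.parse import quote


def build_migration_command(solr_url, core, username, password,
                            elasticsearch_url, index, doc_type):
    """Return the migration command from this topic as a list of arguments."""
    # Solr select endpoint: <solr_url>:8983/solr/<my_core>/select
    solr_select = "{}:8983/solr/{}/select".format(solr_url.rstrip("/"), core)
    # Assumption: credentials are percent-encoded so that reserved characters
    # (for example '@' or ':') cannot be confused with URL delimiters.
    es_target = "http://{}:{}@{}:9200".format(
        quote(username, safe=""), quote(password, safe=""), elasticsearch_url)
    return ["sudo", "python", "__main__.py",
            solr_select, es_target, index, doc_type]


if __name__ == "__main__":
    command = build_migration_command(
        "http://solr.example.com", "my_core", "elastic", "p@ss",
        "es.example.com", "elasticsearch_index", "doc_type")
    # shlex.quote keeps each argument intact when the line is pasted in a shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list rather than as one string keeps each parameter a separate argument, which avoids quoting mistakes if the list is later passed to a process launcher.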
Example
Query all documents in the my_core Solr Core and write these documents to the index in your Elasticsearch cluster. The name of the index is elasticsearch_index, and the type of the index is doc_type.
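The command in this example passes a Solr select URL whose query string carries the q, wt, and indent parameters. As a rough sketch of how such a URL is put together, the hypothetical helper build_select_url below encodes the parameters with the Python standard library; the hostname solr.example.com is a placeholder, and the asterisk is deliberately left unencoded so that the result matches the *%3A* form used in this topic.

```python
from urllib.parse import urlencode


def build_select_url(solr_url, core, query="*:*", wt="json", indent=True):
    """Build a Solr select URL such as .../select?q=*%3A*&wt=json&indent=true."""
    params = [
        ("q", query),                               # Solr query; *:* matches all documents
        ("wt", wt),                                 # response format
        ("indent", "true" if indent else "false"),  # pretty-print the response
    ]
    # safe="*" leaves the asterisk as-is; the colon is still encoded as %3A.
    query_string = urlencode(params, safe="*")
    return "{}:8983/solr/{}/select?{}".format(
        solr_url.rstrip("/"), core, query_string)


if __name__ == "__main__":
    print(build_select_url("http://solr.example.com", "my_core"))
```

When the URL is passed on a command line, keep it in single quotes as the example command does, because & and * are otherwise interpreted by the shell.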
- In the Solr environment, navigate to the solr-to-es-master/solr_to_es directory.
- Run the following command:
sudo python __main__.py 'http://116.62.**.**:8983/solr/my_core/select?q=*%3A*&wt=json&indent=true' 'http://elastic:Your password@es-cn-so4lwf40ubsrf****.public.elasticsearch.aliyuncs.com:9200' elasticsearch_index doc_type
| Parameter | Description |
| --- | --- |
| q | Required. This parameter defines a query that uses the standard query syntax in Solr. Operators are supported. The value *%3A* indicates that all documents will be queried. |
| wt | The format of the data to return. Supported formats include JSON, XML, Python, Ruby, and CSV. |
| indent | Specifies whether to use indentations to ensure that the returned data is easier to read. Default value: false. |

For more information about other parameters, see Parameters.
- Log on to the Kibana console of your Elasticsearch cluster and go to the homepage of the Kibana console as prompted. For more information, see Log on to the Kibana console. Note In this example, the Kibana console of an Elasticsearch V6.7 cluster is used. The operations for clusters of other versions may differ. Follow what the Kibana console actually displays.
- In the left-side navigation pane, click Dev Tools. On the page that appears, click Go to work.
- On the Console tab of the page that appears, run the following command to check whether the elasticsearch_index index is created in the Elasticsearch cluster:
GET _cat/indices?v
- On the Console tab, run the following command to view the details about the migrated documents:
GET /elasticsearch_index/doc_type/_search
If the command is successfully run, the following result is returned:
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "elasticsearch_index",
        "_type" : "doc_type",
        "_id" : "Tz8WNW4BwRjcQciJ****",
        "_score" : 1.0,
        "_source" : {
          "id" : "2",
          "title" : [ "test" ],
          "_version_" : 1648195017403006976
        }
      },
      {
        "_index" : "elasticsearch_index",
        "_type" : "doc_type",
        "_id" : "Tj8WNW4BwRjcQciJ****",
        "_score" : 1.0,
        "_source" : {
          "id" : "1",
          "title" : [ "change.me" ],
          "_version_" : 1648195007391203328
        }
      }
    ]
  }
}
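If you save the search response instead of reading it in the Kibana console, a short script can confirm that no shard failed and that the expected documents arrived. The sketch below is illustrative only: summarize_search_response is a hypothetical helper, and SAMPLE_RESPONSE is a trimmed copy of the result shown above. In Elasticsearch 7.0 and later, hits.total is an object with a value field rather than a number, so the helper accepts both shapes.

```python
import json

# Trimmed copy of the response shown in this topic (scores and timing omitted).
SAMPLE_RESPONSE = """
{
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": 2,
    "hits": [
      {"_index": "elasticsearch_index", "_type": "doc_type",
       "_source": {"id": "2", "title": ["test"]}},
      {"_index": "elasticsearch_index", "_type": "doc_type",
       "_source": {"id": "1", "title": ["change.me"]}}
    ]
  }
}
"""


def summarize_search_response(response):
    """Extract the hit count, failed shard count, and Solr ids from a _search response."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # Elasticsearch 7.0 and later: {"value": N, "relation": "eq"}
        total = total["value"]
    return {
        "total": total,
        "failed_shards": response["_shards"]["failed"],
        "ids": [hit["_source"].get("id") for hit in response["hits"]["hits"]],
    }


if __name__ == "__main__":
    summary = summarize_search_response(json.loads(SAMPLE_RESPONSE))
    print(summary)
```

Comparing the total value with the number of documents in the Solr Core is a quick way to spot an incomplete migration.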