
Mastering Elasticsearch on Alibaba Cloud: Configuration and Troubleshooting Tips

This guide covers essential configurations, handling OOM errors, managing shards, and more. Enhance your Elasticsearch experience with code examples and best practices.

Configuring Thread Pool Size for Indexes

To configure the write queue size of the thread pool in your Elasticsearch cluster, set the thread_pool.write.queue_size parameter in the YML configuration file:

thread_pool.write.queue_size: 200

For clusters running a version earlier than 6.X, use the thread_pool.index.queue_size parameter instead.
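
Before raising the queue size, it can help to confirm that write requests are actually being rejected. The sketch below assumes you have already fetched GET _cat/thread_pool/write?format=json&h=node_name,name,active,queue,rejected and parsed the response into a list of dictionaries. The helper name and the sample node data are hypothetical.

```python
def nodes_with_rejections(rows):
    """Return {node_name: rejected_count} for nodes that rejected write requests.

    `rows` is the parsed JSON from the _cat/thread_pool API, where every
    value arrives as a string.
    """
    rejected = {}
    for row in rows:
        count = int(row["rejected"])
        if count > 0:
            rejected[row["node_name"]] = count
    return rejected


# Hypothetical sample response, for illustration only.
sample = [
    {"node_name": "node1", "name": "write", "active": "2", "queue": "200", "rejected": "37"},
    {"node_name": "node2", "name": "write", "active": "0", "queue": "0", "rejected": "0"},
]
print(nodes_with_rejections(sample))
```

A steadily growing rejected count on a node suggests that the queue is filling up faster than the write threads can drain it.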

Handling OutOfMemory (OOM) Issues

When encountering OOM issues, clear the cache and analyze the cause. To clear the cache, run:

curl -u elastic:<password> -XPOST "localhost:9200/<index_name>/_cache/clear?pretty"
  • password: Your Elasticsearch cluster password.
  • index_name: The name of the index.

Upgrade your cluster configuration or adjust your business logic if necessary. For more details, see Upgrade the configuration of a cluster.
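
When analyzing the cause, JVM heap usage per node is a useful first signal. The sketch below assumes you have parsed the response of GET _nodes/stats/jvm into a dictionary. The helper name, the 75% threshold, and the sample data are assumptions for illustration, not recommended values.

```python
def nodes_over_heap_threshold(stats, threshold_percent=75):
    """Return {node_name: heap_used_percent} for nodes at or above the threshold."""
    hot = {}
    for node in stats["nodes"].values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            hot[node["name"]] = used
    return hot


# Hypothetical, trimmed-down _nodes/stats/jvm response.
sample_stats = {
    "nodes": {
        "a1": {"name": "node1", "jvm": {"mem": {"heap_used_percent": 91}}},
        "b2": {"name": "node2", "jvm": {"mem": {"heap_used_percent": 48}}},
    }
}
print(nodes_over_heap_threshold(sample_stats))
```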

Manually Managing Shards

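To manage shards manually, use the reroute API. Elasticsearch 6.0 and later reject request bodies that are sent without a Content-Type header, and since version 5.0 the allocate command has been split into allocate_replica, allocate_stale_primary, and allocate_empty_primary. The following example moves one shard and allocates an unassigned replica shard:

curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
   "commands" : [ {
       "move" : {
           "index" : "test", "shard" : 0,
           "from_node" : "node1", "to_node" : "node2"
       }
   },
   {
       "allocate_replica" : {
           "index" : "test", "shard" : 1, "node" : "node3"
       }
   }]
}'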

Alternatively, you can use Cerebro for managing shards.
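
If you script shard moves, building the reroute body programmatically avoids hand-editing JSON. This is a minimal sketch: the function names are hypothetical, and it uses allocate_replica, the command that replaced allocate for replica shards in Elasticsearch 5.0 and later. The resulting string would be sent as the body of POST _cluster/reroute.

```python
import json


def move_command(index, shard, from_node, to_node):
    """Build a reroute command that moves a started shard between nodes."""
    return {"move": {"index": index, "shard": shard,
                     "from_node": from_node, "to_node": to_node}}


def allocate_replica_command(index, shard, node):
    """Build a reroute command that allocates an unassigned replica to a node."""
    return {"allocate_replica": {"index": index, "shard": shard, "node": node}}


def reroute_body(*commands):
    """Serialize the commands into the JSON body of a reroute request."""
    return json.dumps({"commands": list(commands)})


body = reroute_body(
    move_command("test", 0, "node1", "node2"),
    allocate_replica_command("test", 1, "node3"),
)
print(body)
```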

Clearing Cache Policies in Elasticsearch

Elasticsearch can clear caches at several scopes. The clear cache API accepts only POST requests, and quoting the URL keeps the shell from interpreting the ? character:

  • Clear the caches of all indexes:
curl -XPOST "localhost:9200/_cache/clear?pretty"
  • Clear the cache of a specific index:
curl -XPOST "localhost:9200/<index_name>/_cache/clear?pretty"
  • Clear the caches of multiple indexes:
curl -XPOST "localhost:9200/<index_name1>,<index_name2>,<index_name3>/_cache/clear?pretty"
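
The three forms above differ only in the index part of the path, which a small helper can capture. This is a sketch with a hypothetical function name. It only builds the path, which you would then send with a POST request.

```python
def cache_clear_path(index_names=None):
    """Return the clear-cache path for all indexes or for the given ones."""
    if not index_names:
        # No index given: clear the caches of every index.
        return "/_cache/clear"
    # Multiple indexes are joined with commas, as in the curl examples.
    return "/" + ",".join(index_names) + "/_cache/clear"


print(cache_clear_path())
print(cache_clear_path(["orders"]))
print(cache_clear_path(["orders", "users", "logs"]))
```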

Rerouting Index Shards

To reroute index shards when they are lost or inappropriately allocated:

curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
    "commands": [{
        "move": {
            "index": "test", "shard": 0,
            "from_node": "node1", "to_node": "node2"
        }},
        {
            "allocate_replica": {
                "index": "test", "shard": 1, "node": "node3"
            }
        }]
}'

Handling statusCode: 500 Errors

If a query against an index returns a statusCode: 500 error:

  • Use a third-party plugin such as Cerebro to inspect the cluster and troubleshoot.
  • If the index name is invalid, change it so that it contains only letters, underscores (_), and digits.
  • Confirm that the cluster is running normally and that the index actually exists in it.
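
The naming rule above is easy to check before an index is created. The sketch below uses a hypothetical helper name and is stricter than the rule as stated: it also requires lowercase letters and rejects a leading underscore, because Elasticsearch itself refuses index names that contain uppercase characters or start with an underscore.

```python
import re

# Lowercase letters, digits, and underscores, not starting with an underscore.
_SIMPLE_INDEX_NAME = re.compile(r"[a-z0-9][a-z0-9_]*")


def is_simple_index_name(name):
    """Return True if the name uses only the conservative character set."""
    return _SIMPLE_INDEX_NAME.fullmatch(name) is not None


for candidate in ["logs_2024", "Logs", "_hidden", "my-index"]:
    print(candidate, is_simple_index_name(candidate))
```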

Changing auto_create_index Parameter

To change the auto_create_index parameter:

PUT /_cluster/settings
{
    "persistent": {
        "action": {
            "auto_create_index": "false"
        }
    }
}

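Note: On Alibaba Cloud Elasticsearch, the default value of auto_create_index is false, so indexes are not created automatically when a document is written to an index that does not exist. Self-managed open source Elasticsearch defaults to true.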

Snapshot Creation Time

Creating a snapshot of 80 GB of index data typically takes around 30 minutes, assuming that the shard count, memory usage, disk usage, and CPU utilization are all within normal ranges.

Specifying Number of Shards for Index

Calculate the number of shards by dividing the total data size by the target size of each shard. The recommended shard size is 30 GB, and a shard should not exceed 50 GB, to avoid performance degradation.
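
The calculation can be sketched as follows. The function name is hypothetical, the defaults mirror the 30 GB target and 50 GB ceiling above, and rounding up is an assumption made so that no shard ends up larger than the target.

```python
import math


def primary_shard_count(total_data_gb, target_shard_gb=30, max_shard_gb=50):
    """Return the number of primary shards needed for the given data size."""
    if total_data_gb <= 0:
        raise ValueError("total_data_gb must be positive")
    if not 0 < target_shard_gb <= max_shard_gb:
        raise ValueError("target_shard_gb must be between 0 and max_shard_gb")
    # Round up so that each shard holds at most target_shard_gb of data.
    return math.ceil(total_data_gb / target_shard_gb)


print(primary_shard_count(80))
print(primary_shard_count(300))
```

For 80 GB of data this gives 3 shards of roughly 27 GB each. Remember to size for expected growth, not only for the current data volume.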

Migrating Data Using elasticsearch-repository-oss

If you encounter errors when installing the elasticsearch-repository-oss plugin, rename the plugin's ZIP file from elasticsearch to elasticsearch-repository-oss and copy it to the plugins directory.

Changing Time Zone in Kibana Console

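You can change the time zone that the Kibana console uses for data visualization. In Kibana 6.7.0, which is used as the example here, the time zone is controlled by the dateFormat:tz option under Management > Advanced Settings. Adjust it as needed to match the time zone you want displayed.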

Performing Term Queries

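Term queries match exact values and work on structured data such as numbers, dates, and keywords. They are not suitable for text fields: the query term is not analyzed, whereas text is split into individual words (tokens) at index time, so the exact value you search for usually does not match any indexed token.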
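
The difference shows up in the request bodies. The sketch below builds a term query for a keyword field and a match query for a text field. The function names and the field names status and title are hypothetical.

```python
def term_query(field, value):
    """Exact-value lookup; suited to keyword, numeric, and date fields."""
    return {"query": {"term": {field: {"value": value}}}}


def match_query(field, text):
    """Analyzed full-text lookup; suited to text fields."""
    return {"query": {"match": {field: {"query": text}}}}


print(term_query("status", "published"))
print(match_query("title", "Quick Brown Fox"))
```

Either body would be sent to GET /<index_name>/_search.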

Using Aliases with Elasticsearch

Ensure the total number of shards for indexes with the same alias is less than 1,024.
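
A quick way to check this before adding another index to an alias is to sum the shard counts. The sketch below assumes you already know the number of shards of each index (for example from GET _cat/shards). The function name and the sample figures are hypothetical.

```python
ALIAS_SHARD_LIMIT = 1024  # limit stated above


def alias_within_limit(shards_per_index, limit=ALIAS_SHARD_LIMIT):
    """Return True if the total shard count behind the alias is below the limit."""
    return sum(shards_per_index.values()) < limit


# Hypothetical indexes that share one alias.
print(alias_within_limit({"logs-2024.01": 10, "logs-2024.02": 10}))
```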

Resolving too_many_buckets_exception

Change the size parameter for bucket aggregations to resolve the too_many_buckets_exception error. For more information, refer to the documentation on limiting buckets in aggregations.
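
As an illustration, the sketch below builds a terms aggregation with an explicit size, which caps how many buckets that aggregation returns. The function name, the aggregation name by_value, and the field name are hypothetical.

```python
def terms_aggregation(field, size, agg_name="by_value"):
    """Build a search body that returns at most `size` term buckets."""
    if size <= 0:
        raise ValueError("size must be positive")
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {agg_name: {"terms": {"field": field, "size": size}}},
    }


print(terms_aggregation("user_id", 100))
```

Nested bucket aggregations multiply their bucket counts, so reducing the size of the outer aggregation usually has the largest effect.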

Deleting Multiple Indexes

Enable the deletion of multiple indexes with a wildcard:

PUT /_cluster/settings
{
  "persistent": {
    "action.destructive_requires_name": false
  }
}
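
Because a wildcard delete cannot be undone, it is worth previewing which indexes a pattern would match. The sketch below is a local approximation using Python's fnmatch, with a hypothetical function name. It assumes the pattern uses only the * wildcard and that the list of index names was fetched beforehand, for example from GET _cat/indices.

```python
from fnmatch import fnmatchcase


def indexes_matching(index_names, pattern):
    """Return the sorted index names that the wildcard pattern would match."""
    return sorted(name for name in index_names if fnmatchcase(name, pattern))


# Hypothetical index names.
existing = ["logs-2024.01", "metrics-1", "logs-2024.02"]
print(indexes_matching(existing, "logs-*"))
```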

Modifying script.painless.regex.enabled Parameter

By default, the script.painless.regex.enabled parameter is set to false. You can set it to true in elasticsearch.yml, but enable it only when necessary, because regular expressions in scripts can consume significant resources.

Adjusting Mapping Configurations and Shards

To modify the mapping of an existing field, create a new index with the desired mapping and reindex the data into it. The number of primary shards cannot be changed after an index is created, so plan it in advance. The number of replica shards can be changed at any time:

PUT test/_settings
{
  "number_of_replicas": 0
}
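
For the reindexing step, the request body of POST _reindex names a source and a destination index. This is a minimal sketch with a hypothetical function name and index names. It assumes the destination index has already been created with the new mapping.

```python
def reindex_body(source_index, dest_index):
    """Build the body of a POST _reindex request."""
    if source_index == dest_index:
        raise ValueError("source and destination must be different indexes")
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


print(reindex_body("test", "test_v2"))
```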

Separate Storage for Field Values

To separately store values of a field:

PUT /my_index
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "text",
        "store": true
      }
    }
  }
}
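
Stored fields are returned only when a search asks for them with the stored_fields parameter. The sketch below builds such a search body. The function name is hypothetical and the match_all query is just a placeholder.

```python
def stored_fields_search(fields, query=None):
    """Build a search body that returns the given stored fields instead of _source."""
    return {
        "stored_fields": list(fields),
        "query": query if query is not None else {"match_all": {}},
    }


print(stored_fields_search(["my_field"]))
```

The body would be sent to GET /my_index/_search, and each hit then carries the requested values under its fields key.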

Aggregating Specific Fields

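Numeric, date, and keyword fields can be aggregated without extra configuration because they use doc_values, which are enabled by default. Text fields do not support doc_values. To aggregate on a text field, enable fielddata, which loads the field's terms into heap memory and can consume a significant amount of it: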

PUT /my_index
{
  "mappings": {
    "properties": {
      "my_text_field": {
        "type": "text",
        "fielddata": true
      }
    }
  }
}

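To prevent a field from being aggregated, set doc_values to false in its mapping (for numeric, date, and keyword fields) or leave fielddata disabled (for text fields). The enabled property applies only to object fields: setting it to false stops Elasticsearch from parsing and indexing the object, so its contents cannot be searched or aggregated. A field that is left out of a document simply contributes nothing to the aggregation.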
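
A commonly used alternative to fielddata, which is a different technique from the one shown above, is to give the text field a keyword sub-field and aggregate on that sub-field instead. The sketch below builds such a mapping and the field path to aggregate on. The function names and the sub-field name keyword are assumptions.

```python
def text_with_keyword_mapping(field, subfield="keyword"):
    """Build an index body whose text field also has a keyword sub-field."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "fields": {subfield: {"type": "keyword"}},
                }
            }
        }
    }


def aggregatable_path(field, subfield="keyword"):
    """Return the field path to use in aggregations, such as my_text_field.keyword."""
    return f"{field}.{subfield}"


print(text_with_keyword_mapping("my_text_field"))
print(aggregatable_path("my_text_field"))
```

The sub-field aggregates on the whole original string instead of individual analyzed words, which is often what a terms aggregation is meant to do.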



Data Geek
