Released by ELK Geek
Elasticsearch Transforms allows you to retrieve information from an Elasticsearch index, transform the information, and store it in another index. It enables you to pivot data and create entity-centric indexes that summarize the behaviors of entities. This organizes data into an easy-to-analyze format. This tutorial uses sample Kibana data to demonstrate how to use Transforms to pivot and summarize data.
The exercise in this tutorial uses sample eCommerce orders as an example.
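First, prepare an Alibaba Cloud Elasticsearch environment. Transforms are not available in version 6.7; the _transform API used later in this tutorial requires Elasticsearch 7.5 or later. Then use the user name and password you created to log on to Kibana, and import the sample eCommerce data into Elasticsearch.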
Click Add data.
You have now imported the eCommerce data. If you are not familiar with the kibana_sample_data_ecommerce index, use the Revenue dashboard in Kibana to browse the data. Consider the insights you want to gain from the eCommerce data.
The data that is pivoted must be grouped by at least one field, with at least one aggregation applied to it. You can preview the transformed data before proceeding to subsequent operations.
For example, you might want to group the data by product ID and calculate the total quantity sold of each product and the average product price. In addition, you may want to look at the behavior of a customer and calculate how much each customer has spent in total and how many different categories of products they have purchased. In other cases, you may need to consider currencies or geographic locations. What is the most interesting way to transform and interpret the data?
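As a rough illustration of the customer-centric idea above, the following request sketches how such a pivot could be previewed without creating a transform. It assumes Elasticsearch 7.5 or later (where the _transform/_preview endpoint is available) and that the sample data set contains the customer_id, taxful_total_price, and category.keyword fields. The aggregation names total_spent and distinct_categories are made up for this example.

```console
POST _transform/_preview
{
  "source": {
    "index": "kibana_sample_data_ecommerce"
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": { "field": "customer_id" }
      }
    },
    "aggregations": {
      "total_spent": {
        "sum": { "field": "taxful_total_price" }
      },
      "distinct_categories": {
        "cardinality": { "field": "category.keyword" }
      }
    }
  }
}
```

If the field names match your data, the response should contain a preview array with one summarized document per customer, which is the same kind of output the Kibana wizard shows in its preview table.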
Next, let's do an exercise. Go to Kibana.
Click Create your first transform.
Select [eCommerce] Orders.
Select the items that you are interested in.
As you can see, a Transform pivot preview table is displayed on the right of the page. The table provides data that is not included in the raw data, such as the sum of products.quantity.
Click Next.
Click Next.
Click Create and start.
The Progress bar indicates that the transform is completed. Click Transforms in the red box to return to the transform management page, as shown in the preceding figure.
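As shown in the preceding figure, the transform was completed but the Status is stopped. This is expected: a batch transform runs once and stops after it has processed all the source data, which happens quickly with a data set this small. Click the down arrow to display more information.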
You can now see all the details of the transform.
Next, go to the Discover page to view the latest index: ecommerce-customer-sales.
Select the ecommerce-customer-sales index.
A total of 3,321 documents are found, and each of them contains information like that shown in the preceding figure. Each document summarizes the spending of one customer. You can search this index like any other index. Entity-centric indexes of this kind are useful in many situations, for example as input for machine learning or for further data analysis.
In the preceding example, we created a transform on the Graphical User Interface (GUI). However, you can also use API operations to create a transform.
To get started, run the following statement to define the pipeline to be used:
PUT _ingest/pipeline/add_timestamp_pipeline
{
  "description": "Adds timestamp to documents",
  "processors": [
    {
      "script": {
        "source": "ctx['@timestamp'] = new Date().getTime();"
      }
    }
  ]
}
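Before wiring the pipeline into a transform, you may want to sanity-check it with the simulate pipeline API. The request below is a minimal sketch that assumes the add_timestamp_pipeline pipeline has already been created. The sample document and its field values are hypothetical.

```console
POST _ingest/pipeline/add_timestamp_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "customer_id": "example-customer",
        "max_price": 100.0
      }
    }
  ]
}
```

In the response, the simulated document should carry an additional @timestamp field holding the current time in epoch milliseconds, which is what the script processor writes.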
Then, run the following statement:
PUT _transform/ecommerce_transform
{
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "term": {
        "geoip.continent_name": {
          "value": "Asia"
        }
      }
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "max_price": {
        "max": {
          "field": "taxful_total_price"
        }
      }
    }
  },
  "description": "Maximum priced ecommerce data by customer_id in Asia",
  "dest": {
    "index": "kibana_sample_data_ecommerce_transform",
    "pipeline": "add_timestamp_pipeline"
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "order_date",
      "delay": "60s"
    }
  }
}
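Because it includes a sync block, this statement defines a continuous transform. Every 5 minutes, as specified by the frequency field, the transform checks the source index for new data and transforms it. It uses the order_date field to detect new documents and allows a delay of 60 seconds for them to be indexed. Then, go to the transform management page.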
You can see the newly created continuous transform on the page. Click Start or run the following statement to start this transform:
POST _transform/ecommerce_transform/_start
After you run the preceding statement, go to the transform management page again.
As shown in the preceding figure, the transform is running. You can click Stop to terminate the transform as needed.
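The state of the transform can also be checked from the console rather than from the management page. The request below is a small sketch that assumes the transform ID ecommerce_transform used earlier.

```console
GET _transform/ecommerce_transform/_stats
```

The response reports the current state of the transform together with processing statistics and checkpoint information, which is helpful for confirming that a continuous transform is still picking up new data.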
Creating the transform through the API does not create an index pattern for the new index kibana_sample_data_ecommerce_transform. Therefore, you need to create one manually. After you create the index pattern, go to the Discover page to view the new transform index.
A total of 13 documents are found because Asia is used as a filter condition. The data is grouped by customer_id, so each document shows the maximum order price for one customer, together with the timestamp added by the pipeline.
You can write a new document that meets the Asia condition to the kibana_sample_data_ecommerce index and then check whether the kibana_sample_data_ecommerce_transform index contains one more document.
You can use the following API operations to delete the transform:
POST _transform/ecommerce_transform/_stop
DELETE _transform/ecommerce_transform
Declaration: This article is reproduced with authorization from Liu Xiaoguo, the original author and an advocate of the China Elasticsearch Community. The author reserves the right to hold users legally liable in the case of unauthorized use.