This topic uses website access logs as an example to walk you through the data transformation feature and related operations.
Prerequisites
A project named web-project is created. For more information, see Create a project.
A Logstore named website_log is created in the web-project project, and the Logstore is used as the source Logstore. For more information, see Create a Logstore.
Website access logs are collected and stored in the website_log Logstore. For more information, see Data collection overview.
Destination Logstores are created in the web-project project. The following table describes the destination Logstores.
Destination Logstore
Description
website-success
Logs for successful access are stored in the website-success Logstore, which is configured in the target-success storage destination.
website-fail
Logs for failed access are stored in the website-fail Logstore, which is configured in the target-fail storage destination.
website-etl
Other access logs are stored in the website-etl Logstore, which is configured in the target0 storage destination.
If you use a Resource Access Management (RAM) user, make sure that the user is granted the permissions on data transformation. For more information, see Grant a RAM user the permissions to manage a data transformation job.
Indexes are configured for the source and destination Logstores. For more information, see Create indexes.
Important: Data transformation does not require indexes. However, if you do not configure indexes, you cannot perform query or analysis operations.
Background information
All access logs of a website are stored in a Logstore. You need to specify different topics for the logs to distinguish between logs for successful access and logs for failed access. In addition, you need to distribute the two types of logs to different Logstores for analysis. Sample log:
body_bytes_sent:1061
http_user_agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
remote_addr:192.0.2.2
remote_user:vd_yw
request_method:DELETE
request_uri:/request/path-1/file-5
status:207
time_local:10/Jun/2021:19:10:59
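The sample log above consists of key:value lines. As a minimal, illustrative Python sketch (Simple Log Service parses collected logs for you; this only models the field structure, and the `parse_log` helper is hypothetical), the sample can be turned into a field dictionary like this:

```python
# Hypothetical helper that models the sample log's key:value structure.
# Splitting on the FIRST colon only keeps values such as time_local intact.
sample = """body_bytes_sent:1061
http_user_agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
remote_addr:192.0.2.2
remote_user:vd_yw
request_method:DELETE
request_uri:/request/path-1/file-5
status:207
time_local:10/Jun/2021:19:10:59"""

def parse_log(text):
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")  # split at the first colon only
        fields[key] = value
    return fields
```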
Step 1: Create a data transformation job
Log on to the Simple Log Service console.
Go to the data transformation page.
In the Projects section, click the project that you want to manage.
On the Logstores tab, click the Logstore that you want to manage.
On the query and analysis page, click Data Transformation.
In the upper-right corner of the page that appears, specify the time range of the data that you want to manage.
After you specify the time range, verify that logs appear on the Raw Logs tab.
In the code editor, enter transformation statements.
e_if(e_search("status:[200,299]"), e_compose(e_set("__topic__", "access_success_log"), e_output(name="target-success")))
e_if(e_search("status:[400,499]"), e_compose(e_set("__topic__", "access_fail_log"), e_output(name="target-fail")))
The e_if function indicates that the specified operations are performed if the condition is met. For more information, see e_if.
Condition:
e_search("status:[200,299]")
If the value of the status field meets the condition, operations 1 and 2 are performed. For more information, see e_search.
Operation 1:
e_set("__topic__","access_success_log")
The function adds the __topic__ field and assigns the value access_success_log to the field. For more information, see e_set.
Operation 2:
e_output(name="target-success", project="web-project", logstore="website-success")
The function stores the transformed data in the website-success Logstore. For more information, see e_output.
Preview transformation results.
Select Quick.
You can select Quick or Advanced. For more information, see Preview mode overview.
Click Preview Data.
View the transformation results.
Important: During the preview, logs are written to a Logstore named internal-etl-log instead of the destination Logstores. The first time that you preview transformation results, Simple Log Service automatically creates the internal-etl-log Logstore in the current project. This Logstore is dedicated to previews. You cannot modify the configurations of this Logstore or write other data to this Logstore. You are not charged for this Logstore.
Create a data transformation job.
Click Save as Transformation Job.
In the Create Data Transformation Job panel, configure the following parameters.
Parameter
Description
Job Name
The name of the data transformation job.
Authorization Method
The method used to authorize the data transformation job to read data from the source Logstore. Valid values:
Default Role: The data transformation job assumes the AliyunLogETLRole system role to read data from the source Logstore.
Custom Role: The data transformation job assumes a custom role to read data from the source Logstore.
You must grant the custom role the permissions to read from the source Logstore. Then, you must enter the Alibaba Cloud Resource Name (ARN) of the custom role in the Role ARN field. For more information, see Access data by using a custom role.
AccessKey Pair: The data transformation job uses the AccessKey pair of an Alibaba Cloud account or a RAM user to read data from the source Logstore.
Alibaba Cloud account: The AccessKey pair of an Alibaba Cloud account has permissions to read from the source Logstore. You can directly enter the AccessKey ID and AccessKey secret of the Alibaba Cloud account in the AccessKey ID and AccessKey Secret fields. For more information about how to obtain an AccessKey pair, see AccessKey pair.
RAM user: You must grant the RAM user the permissions to read from the source Logstore. Then, you can enter the AccessKey ID and AccessKey secret of the RAM user in the AccessKey ID and AccessKey Secret fields. For more information, see Access data by using AccessKey pairs.
Storage Target
Target Name
The name of the storage destination. Storage Target includes Target Project and Target Store.
Make sure that the value of this parameter is the same as the value of the name parameter that is specified in the e_output function of the transformation statements.
Note: By default, Simple Log Service uses the storage destination that is numbered 1 to store the logs that do not meet the specified conditions. In this example, the target0 storage destination is used.
Target Region
The region of the project to which the destination Logstore belongs.
If you want to perform data transformation across regions, we recommend that you use HTTPS for data transmission. This ensures the privacy of log data.
For cross-region data transformation, the data is transmitted over the Internet. If Internet connections are unstable, data transformation may be delayed. You can select DCDN Acceleration to accelerate the cross-region data transmission. Before you can select DCDN Acceleration, make sure that the global acceleration feature is enabled for the project. For more information, see Log collection acceleration.
Important: If data is pulled over a public Simple Log Service endpoint, you are charged for read traffic over the Internet. The traffic is calculated based on the size of data after compression. For more information, see Billable items of pay-by-feature.
Target Project
The name of the project to which the destination Logstore belongs.
Target Store
The name of the destination Logstore.
Authorization Method
The method used to authorize the data transformation job to write transformed data to the destination Logstore. Valid values:
Default Role: The data transformation job assumes the AliyunLogETLRole system role to write transformed data to the destination Logstore.
Custom Role: The data transformation job assumes a custom role to write transformed data to the destination Logstore.
You must grant the custom role the permissions to write to the destination Logstore. Then, you must enter the ARN of the custom role in the Role ARN field. For more information, see Access data by using a custom role.
AccessKey Pair: The data transformation job uses the AccessKey pair of an Alibaba Cloud account or a RAM user to write transformed data to the destination Logstore.
Alibaba Cloud account: The AccessKey pair of an Alibaba Cloud account has permissions to write to the destination Logstore. You can directly enter the AccessKey ID and AccessKey secret of the Alibaba Cloud account in the AccessKey ID and AccessKey Secret fields. For more information about how to obtain an AccessKey pair, see AccessKey pair.
RAM user: You must grant the RAM user the permissions to write to the destination Logstore. Then, you can enter the AccessKey ID and AccessKey secret of the RAM user in the AccessKey ID and AccessKey Secret fields. For more information, see Access data by using AccessKey pairs.
Processing Range
Time Range
The time range within which the data is transformed. Valid values:
Note: The value of Time Range is based on the time when logs are received.
All: transforms data in the source Logstore from the first log until the job is manually stopped.
From Specific Time: transforms data in the source Logstore from the log that is received at the specified start time until the job is manually stopped.
Within Specific Period: transforms data in the source Logstore from the log that is received at the specified start time to the log that is received at the specified end time.
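The three Time Range options can be modeled as a simple filter on log receive time. This is an illustrative sketch under the assumption that receive times are Unix timestamps; the `in_processing_range` helper is hypothetical and not part of any Simple Log Service SDK:

```python
# Hypothetical sketch of the three Time Range options, keyed on receive time.
#   All:                    start=None, end=None
#   From Specific Time:     start set,  end=None
#   Within Specific Period: start set,  end set
def in_processing_range(receive_time, start=None, end=None):
    if start is not None and receive_time < start:
        return False
    if end is not None and receive_time > end:
        return False
    return True
```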
Click OK.
After logs are distributed to the destination Logstores, you can perform query and analysis operations in the destination Logstores. For more information, see Query and analyze logs.
Step 2: View the data transformation job
In the left-side navigation pane, choose Data Transformation. In the list of data transformation jobs, find and click the job that you created.
On the Data Transformation Overview page, view the details of the job.
You can view the details and status of the job. You can also modify, start, stop, or delete the job. For more information, see Manage a data transformation job.