In big data scenarios that require high concurrency, effective analysis of Java error logs can reduce the O&M costs of Java applications. You can use Simple Log Service to collect Java error logs from Alibaba Cloud services and use the data transformation feature to parse the collected logs.
Prerequisites
The Java error logs of Simple Log Service, Object Storage Service (OSS), Server Load Balancer (SLB), and ApsaraDB RDS are collected and stored in a Logstore named cloud_product_error_log. For more information, see Use Logtail to collect logs.
Scenarios
For example, you have developed a Java application named Application A by using multiple Alibaba Cloud services, such as OSS and Simple Log Service. You have created a Logstore named cloud_product_error_log in the China (Hangzhou) region to store the Java error logs that are generated when you call the API operations of the Alibaba Cloud services. To fix Java errors efficiently, you need to use Simple Log Service to analyze the Java error logs at regular intervals.
To meet the preceding requirements, you must parse the log time, error code, status code, service name, error message, request method, and error line number from the collected logs, and then send the parsed logs to the Logstore of each cloud service for error analysis.
The following example shows a raw log:
__source__:192.0.2.10
__tag__:__client_ip__:203.0.113.10
__tag__:__receive_time__:1591957901
__topic__:
message: 2021-05-15 16:43:35 ParameterInvalid 400
com.aliyun.openservices.log.exception.LogException:The body is not valid json string.
at com.aliyun.openservices.log.Client.ErrorCheck(Client.java:2161)
at com.aliyun.openservices.log.Client.SendData(Client.java:2312)
at com.aliyun.openservices.log.Client.PullLogs(Client.java:1397)
at com.aliyun.openservices.log.Client.SendData(Client.java:2265)
at com.aliyun.openservices.log.Client.GetCursor(Client.java:1123)
at com.aliyun.openservices.log.Client.PullLogs(Client.java:2161)
at com.aliyun.openservices.log.Client.ErrorCheck(Client.java:2426)
at transformEvent.main(transformEvent.java:2559)
Procedure
The error logs of Application A are collected by using Logtail and are stored in the cloud_product_error_log Logstore. Then, the error logs are transformed and the transformed logs are sent to the Logstore of each cloud service for error analysis. The procedure consists of the following steps:
Design a data transformation statement: In this step, analyze the transformation logic and write a transformation statement.
Create a data transformation job: In this step, send logs to different Logstores of cloud services for error analysis.
Query and analyze data: In this step, analyze error logs in the Logstore of each cloud service.
Step 1: Design a data transformation statement
Transformation procedure
To analyze error logs in a convenient manner, you must complete the following operations:
Extract the log time, error code, status code, service name, error message, request method, and error line number from the message field.
Send error logs to the Logstore of each cloud service.
Transformation logic
In this case, you must identify where the log time, error code, status code, service name, error message, request method, and error line number appear in the message field of the raw log, and then design a regular expression that extracts each of these fields.
Syntax description
Use the regex_match function to match logs that contain LogException. For more information, see regex_match.
Use the e_switch function to select a transformation rule for each log. If a log contains LogException, the log is transformed based on the transformation rule of Simple Log Service error logs. If a log contains OSSException, the log is transformed based on the transformation rule of OSS error logs. For more information, see e_switch.
Use the e_regex function to parse error logs for each cloud service. For more information, see e_regex.
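Use the e_drop_fields function to delete the message field, and use the e_output function to send error logs to the Logstore of the corresponding cloud service. For more information, see e_drop_fields, e_output, and e_coutput.
The regular expressions use named capturing groups, such as (?P<error_code>[a-zA-Z]+), to name the fields that are extracted. For more information, see the Group section in Regular expressions.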
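To make the role of named capturing groups concrete, the following is a minimal sketch in plain Python (not the data transformation DSL). It applies a shortened version of the Simple Log Service pattern to the first two lines of the sample message field. The name HEADER and the shortened pattern are assumptions made for illustration; the pattern covers only the header and the exception line, not the stack frames.

```python
import re

# First two lines of the sample message field from this topic.
message = (
    "2021-05-15 16:43:35 ParameterInvalid 400\n"
    "com.aliyun.openservices.log.exception.LogException:"
    "The body is not valid json string.\n"
)

# Shortened pattern for illustration only: it stops after the error
# message and does not parse the stack frames.
HEADER = re.compile(
    r"(?P<data_time>\S+\s\S+)\s"
    r"(?P<error_code>[a-zA-Z]+)\s"
    r"(?P<status>[0-9]+)\s"
    r"com\.aliyun\.openservices\.log\.exception\."
    r"(?P<product_exception>[a-zA-Z]+):"
    r"(?P<error_message>[a-zA-Z0-9:,\-\s]+)\."
)

match = HEADER.search(message)
# Each named group becomes one field name in the resulting dictionary.
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

Each group name in the pattern becomes a field name, which is the same idea that e_regex relies on when it fills fields from the message field. Testing a pattern locally in this way can catch mismatches before you create a job.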
Transformation statement syntax
The following example shows the specific syntax of a data transformation statement:
e_switch(
    regex_match(v("message"), r"LogException"),
    e_compose(
        e_regex(
            "message",
            "(?P<data_time>\S+\s\S+)\s(?P<error_code>[a-zA-Z]+)\s(?P<status>[0-9]+)\scom\.aliyun\.openservices\.log\.exception\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9:,\-\s]+)\.(\s+\S+\s\S+){5}\s+\S+\scom\.aliyun\.openservices\.log\.Client\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java\:(?P<error_line>[0-9]+)\)",
        ),
        e_drop_fields("message"),
        e_output("sls-error"),
    ),
    regex_match(v("message"), r"OSSException"),
    e_compose(
        e_regex(
            "message",
            "(?P<data_time>\S+\s\S+)\scom\.aliyun\.oss\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9,\s]+)\.\n\[ErrorCode\]\:\s(?P<error_code>[a-zA-Z]+)\n\[RequestId\]\:\s(?P<request_id>[a-zA-Z0-9]+)\n\[HostId\]\:\s(?P<host_id>[a-zA-Z-.]+)\n\S+\n\S+(\s\S+){3}\n\s+\S+\s+(.+)(\s+\S+){24}\scom\.aliyun\.oss\.OSSClient\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java:(?P<error_line>[0-9]+)\)",
        ),
        e_drop_fields("message"),
        e_output("oss-error"),
    ),
)
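The branch selection in the preceding statement can be pictured with a small plain-Python sketch. It is not the data transformation DSL: ROUTES and route are hypothetical names, and the sketch assumes that the conditions are checked in order, that the first condition that matches decides the target, and that a condition matches when the keyword appears anywhere in the message field. What the service does with logs that match no branch is not modeled.

```python
import re

# Hypothetical routing table that mirrors the two branches of the
# statement: (condition pattern, storage target name).
ROUTES = [
    (re.compile(r"LogException"), "sls-error"),
    (re.compile(r"OSSException"), "oss-error"),
]


def route(message):
    """Return the target of the first matching branch, or None."""
    for pattern, target in ROUTES:
        # search() matches anywhere in the message (assumed behavior).
        if pattern.search(message):
            return target
    return None


sls_log = "com.aliyun.openservices.log.exception.LogException:The body is not valid json string."
oss_log = "com.aliyun.oss.OSSException:example message."
print(route(sls_log))
print(route(oss_log))
print(route("java.lang.NullPointerException"))
```

The returned names correspond to the values that you later enter for Target Name when you create the data transformation job.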
Step 2: Create a data transformation job
Go to the data transformation page.
In the Projects section, click the project that you want to manage.
On the Log Storage > Logstores tab, click the Logstore that you want to manage.
On the query and analysis page, click Data Transformation.
In the upper-right corner of the page, specify a time range for the required log data.
Make sure that log data exists on the Raw Logs tab.
In the code editor, enter the following data transformation statement:
e_switch(
    regex_match(v("message"), r"LogException"),
    e_compose(
        e_regex(
            "message",
            "(?P<data_time>\S+\s\S+)\s(?P<error_code>[a-zA-Z]+)\s(?P<status>[0-9]+)\scom\.aliyun\.openservices\.log\.exception\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9:,\-\s]+)\.(\s+\S+\s\S+){5}\s+\S+\scom\.aliyun\.openservices\.log\.Client\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java\:(?P<error_line>[0-9]+)\)",
        ),
        e_drop_fields("message"),
        e_output("sls-error"),
    ),
    regex_match(v("message"), r"OSSException"),
    e_compose(
        e_regex(
            "message",
            "(?P<data_time>\S+\s\S+)\scom\.aliyun\.oss\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9,\s]+)\.\n\[ErrorCode\]\:\s(?P<error_code>[a-zA-Z]+)\n\[RequestId\]\:\s(?P<request_id>[a-zA-Z0-9]+)\n\[HostId\]\:\s(?P<host_id>[a-zA-Z-.]+)\n\S+\n\S+(\s\S+){3}\n\s+\S+\s+(.+)(\s+\S+){24}\scom\.aliyun\.oss\.OSSClient\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java:(?P<error_line>[0-9]+)\)",
        ),
        e_drop_fields("message"),
        e_output("oss-error"),
    ),
)
Click Preview Data.
Create a data transformation job.
Click Save as Transformation Job.
In the Create Data Transformation Job panel, configure the parameters and click OK. The following table describes the parameters.
Parameter | Description
Job Name | The name of the data transformation job. Example: test.
Authorization Method | Select Default Role to read data from the source Logstore.
Storage Target > Target Name | The name of the storage destination. Example: sls-error or oss-error.
Storage Target > Target Region | The region where the destination project resides. Example: China (Hangzhou).
Storage Target > Target Project | The name of the project to which the destination Logstore belongs.
Storage Target > Target Store | The name of the destination Logstore. Example: sls-error or oss-error.
Storage Target > Authorization Method | Select Default Role to write transformation results to the destination Logstore.
Processing Range > Time Range | Select All.
After you create a data transformation job, Simple Log Service creates a dashboard for the job by default. You can view the metrics of the job on the dashboard.
On the Exception detail chart, you can view the logs that failed to be parsed, and then modify the regular expression.
If a log fails to be parsed, you can specify the severity of the log as WARNING to report the log. The data transformation job continues running.
If you specify the severity of the log as ERROR to report the log, the data transformation job stops running. In this case, you must identify the cause of the error and modify the regular expression until the data transformation job can parse all required types of error logs.
Step 3: Analyze error logs
After raw error logs are transformed, you can analyze the error logs. In this example, only the Java error logs of Simple Log Service are analyzed.
In the Projects section, click the project that you want to manage.
In the left-side navigation pane, click Log Storage. In the Logstores list, click the Logstore that you want to manage.
Enter a query statement in the search box.
To calculate the number of errors for each request method, execute the following query statement:
* | SELECT COUNT(method) as m_ct, method GROUP BY method
To calculate the number of occurrences of each error message for the PutLogs API operation, execute the following query statement:
* | SELECT error_message,COUNT(error_message) as ct_msg, method WHERE method LIKE 'PutLogs' GROUP BY error_message,method
To calculate the number of occurrences for each error code, execute the following query statement:
* | SELECT error_code,COUNT(error_code) as count_code GROUP BY error_code
To query the error information of each request method by log time, execute the following query statement:
* | SELECT date_format(data_time, '%Y-%m-%d %H:%i:%s') as date_time,status,product_exception,error_line, error_message,method ORDER BY date_time desc
Click 15 Minutes(Relative) to specify a time range.
You can select a relative time or a time frame. You can also specify a custom time range.
Note: The query results may contain logs that are generated 1 minute earlier or later than the specified time range.
Click Search & Analyze to view the query and analysis results.
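The GROUP BY queries in this step count how often each value of a field occurs. As a rough offline illustration of what the first and third queries compute, the following plain-Python sketch runs the same counts over a few hypothetical transformed logs. The records are made up for illustration; only the field names come from the transformation statement.

```python
from collections import Counter

# Hypothetical transformed logs. Field names follow the regular
# expression in the transformation statement; values are invented.
records = [
    {"method": "PutLogs", "error_code": "ParameterInvalid"},
    {"method": "PutLogs", "error_code": "Unauthorized"},
    {"method": "GetCursor", "error_code": "ParameterInvalid"},
]

# Same idea as: SELECT COUNT(method) as m_ct, method GROUP BY method
errors_per_method = Counter(r["method"] for r in records)

# Same idea as: SELECT error_code, COUNT(error_code) as count_code GROUP BY error_code
errors_per_code = Counter(r["error_code"] for r in records)

print(errors_per_method)
print(errors_per_code)
```

A check of this kind is useful only for small samples; for real workloads, run the queries in Simple Log Service as described in this step.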