This topic describes how to import data from Elasticsearch to Simple Log Service. After you import data to Simple Log Service, you can query, analyze, and transform the data in the Simple Log Service console.
Prerequisites
An Elasticsearch cluster is available, and the Elasticsearch version is 6.3 or later.
A project and a Logstore are created. For more information, see Create a project and Create a Logstore.
Create a data import configuration
Log on to the Simple Log Service console.
Click Import Data in the Quick Data Import card. In the dialog box that appears, click the Data Import tab. Then, click Elasticsearch - Data Import.
Select the project and Logstore. Then, click Next.
In the Import Configuration step, configure the parameters that are described in the following table.
| Parameter | Description |
| --- | --- |
| Job Name | The name of the import job. |
| Display Name | The display name of the import job. |
| Job Description | The description of the import job. |
| Service Instance URL | The URL of the Elasticsearch server. Format: http://host:port/. You can specify multiple URLs. Separate multiple URLs with commas (,). Example: http://host1:port1/,http://host2:port2/. In most cases, the service port of an Elasticsearch server is 9200. For a quick way to check that an endpoint is reachable, see the connectivity-check sketch after this table. Important: If you configure the VPC-based Instance ID parameter, you must set the host variable to the IPv4 address of the Elastic Compute Service (ECS) instance that is involved. |
| Elasticsearch Index List | The names of the indexes that you want to import. Separate multiple index names with commas (,). Example: index1,index2,index3. |
| Elasticsearch User Name | The username that is used to access the Elasticsearch cluster. This parameter is required only if user authentication is required to access the Elasticsearch cluster. |
| Elasticsearch User Password | The password that is used to access the Elasticsearch cluster. |
| Time Field | The time field that is used to record the log time. Enter the name of the column that represents time in the Elasticsearch indexes. If you do not specify a time field, Simple Log Service uses the system time at which the data is imported. Important: If you want to import incremental data, you must configure the Time Field parameter. |
| Time Field Format | The time format that is used to parse the value of the time field. You can specify a time format that is supported by Java SimpleDateFormat, such as yyyy-MM-dd HH:mm:ss. For more information about the time format syntax, see Class SimpleDateFormat. For more information about common time formats, see Time formats. You can also specify an epoch time format. Valid values: epoch, epochMillis, epochMacro, and epochNano. Important: Java SimpleDateFormat does not support UNIX timestamps. If you want to use UNIX timestamps, you must set the Time Field Format parameter to an epoch time format. For a parsing example, see the time-parsing sketch after this table. |
| Time Zone | The time zone of the time field. If you set the Time Field Format parameter to an epoch time format, you do not need to configure the Time Zone parameter. |
| Elasticsearch Query String | The query statement that is used to filter data. The query statement must conform to the Elasticsearch query_string syntax. Example: gender:male AND city:Shanghai. For more information, see Query string query. For a request example, see the query sketch after this table. |
| Import Method | The mode that is used to import data. Valid values: Import Only Historical Data and Automatically Import Incremental Data. If you select Import Only Historical Data, the import job automatically ends after the data is imported. If you select Automatically Import Incremental Data, the import job runs continuously. Important: If you select Automatically Import Incremental Data, you must configure the Time Field parameter. |
| Start At | The start time. After you specify a start time, data is imported to Simple Log Service only if the value of the time field is greater than or equal to the start time. Important: This parameter takes effect only if you configure the Time Field parameter. |
| End Time | The end time. After you specify an end time, data is imported to Simple Log Service only if the value of the time field is less than or equal to the end time. Important: This parameter takes effect only if you configure the Time Field parameter and set the Import Method parameter to Import Only Historical Data. |
| Maximum Latency in Seconds | The maximum latency that is allowed between the time when data is generated and the time when the data is written to Elasticsearch. Important: If you specify a value that is less than the actual latency, some data cannot be imported from Elasticsearch to Simple Log Service. This parameter takes effect only if you configure the Time Field parameter and set the Import Method parameter to Automatically Import Incremental Data. |
| Incremental Data Check Interval (Seconds) | The interval at which Simple Log Service checks for incremental data in Elasticsearch. Unit: seconds. Default value: 300. Minimum value: 60. |
| VPC-based Instance ID | If the Elasticsearch cluster is an Alibaba Cloud Elasticsearch cluster in a virtual private cloud (VPC) or a self-managed Elasticsearch cluster on an ECS instance, you can configure this parameter to allow Simple Log Service to read data from the cluster over the Alibaba Cloud internal network, which provides higher security and network stability. Important: The Elasticsearch cluster must allow access from the CIDR block 100.104.0.0/16. |
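If you are unsure whether a Service Instance URL is reachable, you can send an HTTP GET request to the cluster root path, which returns basic cluster information such as the version. The following Java sketch is a minimal connectivity check, assuming a placeholder endpoint of http://localhost:9200/ and no authentication; replace the URL and add credentials as needed.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class EsEndpointCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint; replace with your Service Instance URL.
        URL url = new URL("http://localhost:9200/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);

        // An HTTP 200 response with a JSON body that includes the cluster
        // name and version indicates that the endpoint is reachable.
        System.out.println("HTTP status: " + conn.getResponseCode());
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```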
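The Time Field Format parameter follows Java SimpleDateFormat semantics for pattern-based formats. The following minimal sketch shows how a pattern such as yyyy-MM-dd HH:mm:ss parses a sample time value, and why an epoch-style value such as epochMillis needs no pattern or time zone. The sample timestamp and the Asia/Shanghai time zone are illustrative.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimeFieldFormatDemo {
    public static void main(String[] args) throws Exception {
        // Pattern-based parsing, as configured in Time Field Format.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        // The Time Zone parameter matters for pattern-based formats,
        // because the pattern alone does not identify an absolute instant.
        fmt.setTimeZone(TimeZone.getTimeZone("Asia/Shanghai"));
        Date parsed = fmt.parse("2024-01-15 08:30:00");
        System.out.println("Parsed instant (epochMillis): " + parsed.getTime());

        // An epochMillis value needs no pattern or time zone: it already
        // identifies an absolute instant.
        long epochMillis = parsed.getTime();
        System.out.println("Same instant: " + new Date(epochMillis));
    }
}
```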
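The Elasticsearch Query String parameter uses the same query_string syntax that an Elasticsearch search request accepts. The following sketch sends the example filter as a query_string query over the Elasticsearch REST API; the endpoint (http://localhost:9200), index name (index1), and fields (gender, city) are placeholder assumptions.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class QueryStringDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint and index; replace with your own values.
        URL url = new URL("http://localhost:9200/index1/_search");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        // The same filter that you would enter in Elasticsearch Query String.
        String body = "{\"query\":{\"query_string\":"
                + "{\"query\":\"gender:male AND city:Shanghai\"}}}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```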
Click Preview to preview the import result.
After you confirm the result, click Next.
Preview the data, configure indexes, and then click Next. By default, full-text indexing is enabled in Simple Log Service. You can also manually create field indexes for the collected logs, or click Automatic Index Generation to have Simple Log Service generate field indexes for you. For more information, see Create indexes.
Important: If you want to query all fields in logs, we recommend that you use full-text indexes. If you want to query only specific fields, we recommend that you use field indexes, which helps reduce index traffic. If you want to analyze fields, you must create field indexes and include a SELECT statement in your query statement.
Click Query Log. Then, you are redirected to the query and analysis page of your Logstore.
You must wait approximately 1 minute for the indexes to take effect. Then, you can view the collected logs on the Raw Logs tab. For more information about how to query and analyze logs, see Query and analyze logs.
View a data import configuration
After you create a data import configuration, you can view the configuration details and related reports in the Simple Log Service console.
In the Projects section, click the project to which the data import configuration belongs.
Find and click the Logstore to which the data import configuration belongs, choose Data Collection > Data Import, and then click the name of the data import configuration.
On the Import Configuration Overview page, view the basic information about the data import configuration and the related reports.
What to do next
Delete a data import configuration
On the Import Configuration Overview page, you can click Delete Configuration to delete the data import configuration.
Warning: After a data import configuration is deleted, it cannot be restored. Proceed with caution.
Stop and restart the import job of a data import configuration
After you create a data import configuration, Simple Log Service creates an import job. On the Import Configuration Overview page, you can click Stop to stop the import job. After the import job is stopped, you can also restart the import job.
Important: After an import job is stopped, the job is in the stopped state for up to 24 hours. If the import job is not restarted during this period, the job becomes unavailable. If you restart an unavailable import job, errors may occur.
FAQ
| Issue | Possible cause | Solution |
| --- | --- | --- |
| An Elasticsearch connection error occurs during the preview. Error code: failed to connect. | The Service Instance URL is incorrect, or Simple Log Service cannot reach the Elasticsearch cluster over the network. | Check that the Service Instance URL is in the http://host:port/ format and that the port (9200 in most cases) is correct. If you configure the VPC-based Instance ID parameter, make sure that the Elasticsearch cluster allows access from the CIDR block 100.104.0.0/16. |
| A timeout error occurs during the preview. Error code: preview request timed out. | The Elasticsearch index that you want to import contains no data or contains no data that meets the specified filter conditions. | Make sure that the specified indexes contain data that meets the filter conditions. Then, try again. |
| The log time displayed in Simple Log Service is different from the actual time of the imported data. | No time field is specified in the data import configuration, or the specified time format or time zone is invalid. | Specify a time field, or specify a valid time format and time zone. For more information, see Create a data import configuration. |
| After data is imported, the data cannot be queried or analyzed. | The indexes have not taken effect, or no field indexes are created for the fields that you want to analyze. | Wait approximately 1 minute after the indexes are created and try again. If you want to analyze fields, create field indexes. For more information, see Create indexes. |
| The number of imported data entries is less than expected. | Data entries whose size is larger than 3 MB exist in Elasticsearch. You can view the data entries on the Data Processing Insight dashboard. | Make sure that the size of each data entry does not exceed 3 MB. |
| After incremental import is enabled, a large latency exists when new data is imported. | The Incremental Data Check Interval (Seconds) or Maximum Latency in Seconds parameter is set to a large value, and new data is imported only after the configured interval and latency elapse. | Decrease the values of the Incremental Data Check Interval (Seconds) and Maximum Latency in Seconds parameters. Do not set Maximum Latency in Seconds to a value that is less than the actual latency, because data may be skipped. |
Error handling
| Error | Description |
| --- | --- |
| Communication with the Elasticsearch cluster is abnormal. | The import job pulls Elasticsearch data in scroll mode, with a default scroll keep-alive of 24 hours. If a network connection error, a user authentication error, or another error prevents normal communication with the Elasticsearch cluster, the import job automatically retries. If communication is not restored within 24 hours, the scroll session information on the Elasticsearch cluster is deleted. As a result, the import job cannot be resumed even when the job is retried, and the system reports the "No search context found" error. In this case, you can only re-create the import job. |
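For illustration, the scroll mechanism that the import job relies on works as follows: an initial search request opens a scroll context with a keep-alive, and each follow-up request passes the returned _scroll_id to fetch the next batch. The following Java sketch demonstrates the mechanism against a placeholder cluster and index; it is not the import job's actual implementation, and real code would use a JSON parser instead of substring extraction.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ScrollDemo {
    // Minimal HTTP POST helper; returns the response body as a string.
    static String post(String urlStr, String body) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(urlStr).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // 1. Open a scroll context with a 1-minute keep-alive. (The import
        //    job uses a much longer keep-alive: 24 hours by default.)
        String first = post(
                "http://localhost:9200/index1/_search?scroll=1m",
                "{\"size\":100,\"query\":{\"match_all\":{}}}");

        // 2. Crude _scroll_id extraction for the sketch only.
        String key = "\"_scroll_id\":\"";
        int start = first.indexOf(key) + key.length();
        String scrollId = first.substring(start, first.indexOf('"', start));

        // 3. Fetch the next batch. If the keep-alive lapses, Elasticsearch
        //    deletes the context and this call fails with "No search context
        //    found" - which is why an import job that stays disconnected for
        //    longer than the keep-alive cannot resume.
        String next = post(
                "http://localhost:9200/_search/scroll",
                "{\"scroll\":\"1m\",\"scroll_id\":\"" + scrollId + "\"}");
        System.out.println(next);
    }
}
```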