When you create indexes, you can set the data type of a field to text, long, double, or JSON. This topic describes the data types of fields and provides some examples.
Text type
To query and analyze log data of the string type, you must set the data type of the related fields to text when you configure indexes. You must also turn on Enable Analytics for these fields.
By default, if you turn on Full Text Index, the data types of all fields except the __time__ field are set to text.
Sample log
Index configurations
Query statements
To query the logs that do not contain GET requests, execute the following search statement:
not request_method : GET
To query the logs that contain a word starting with cn, execute the following search statement:
cn*
To calculate the distribution of clients by province, execute the following query statement:
* | SELECT ip_to_province(client_ip) as province, count(*) AS pv GROUP BY province ORDER BY pv
Long and double types
You can query the value of a field by using a numeric range only after you set the data type of the field to long or double.
If the value of a field is an integer, we recommend that you set the data type of the field to long when you configure indexes.
If the value of a field is a floating-point number, we recommend that you set the data type of the field to double when you configure indexes.
If you set the data type of a field to long but the value of the field is a floating-point number, you cannot query the field.
If you set the data type of a field to long or double but the value of the field is a string, you cannot query the field.
If you set the data type of a field to long or double, you cannot query the field by using an asterisk (*) or a question mark (?). These wildcard characters are used only for fuzzy matches.
If the value of a field is an invalid numeric value, you can query the field by using the not key > -1000000 search statement. The statement returns the logs in which the value of the field is an invalid numeric value. You can replace -1000000 with any numeric value that is less than the smallest valid value of the field in your logs. A log whose value equals the bound also matches the statement, so the bound must be smaller than every valid value.
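The type recommendations above can be sketched as a small helper. This is an illustrative sketch only, not part of Simple Log Service: the function names suggest_index_type and invalid_value_statement are hypothetical, and the regular expressions that decide what counts as an integer or a floating-point number are assumptions, because this topic does not specify the exact numeric formats that are accepted.

```python
import re

# Assumed patterns for this sketch; the numeric formats that
# Simple Log Service accepts are not specified in this topic.
_INTEGER = re.compile(r"[+-]?\d+")
_FLOAT = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def suggest_index_type(raw_value: str) -> str:
    """Suggest an index data type for a raw log value (hypothetical helper)."""
    value = raw_value.strip()
    if _INTEGER.fullmatch(value):
        return "long"    # integer values: long is recommended
    if _FLOAT.fullmatch(value):
        return "double"  # floating-point values: double is recommended
    return "text"        # any other string is indexed as text


def invalid_value_statement(field: str, lower_bound: int) -> str:
    """Build the search statement that returns logs with invalid numeric values.

    lower_bound must be smaller than every valid value of the field.
    """
    return f"not {field} > {lower_bound}"
```

For example, suggest_index_type("200") returns long, suggest_index_type("0.25") returns double, and suggest_index_type("GET") returns text.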
Sample log
Index configurations
Query statements
To query the logs whose request duration is greater than 60 seconds, execute the following search statement:
request_time > 60
To query the logs whose request duration is greater than or equal to 60 seconds and less than 200 seconds, execute one of the following search statements:
request_time in [60 200)
request_time >= 60 and request_time < 200
To query the logs whose response status code is 200, execute the following search statement:
status = 200
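The two equivalent range forms above can be generated from the same bounds. The following sketch uses a hypothetical helper named range_statements. Only the [60 200) form appears in this topic. The other bracket combinations that the helper can produce are an assumption based on standard interval notation.

```python
def range_statements(field, low, high, include_low=True, include_high=False):
    """Return the interval form and the comparison form of a numeric range search."""
    # Square brackets include the bound, parentheses exclude it.
    left = "[" if include_low else "("
    right = "]" if include_high else ")"
    interval = f"{field} in {left}{low} {high}{right}"

    low_op = ">=" if include_low else ">"
    high_op = "<=" if include_high else "<"
    comparison = f"{field} {low_op} {low} and {field} {high_op} {high}"
    return interval, comparison


interval, comparison = range_statements("request_time", 60, 200)
print(interval)    # request_time in [60 200)
print(comparison)  # request_time >= 60 and request_time < 200
```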
JSON type
If the value of a field is in the JSON format, you can set the data type of the field to JSON when you configure indexes.
You can set the data type of a field in JSON objects to long, double, or text based on the field value, and turn on Enable Analytics for the field. After you turn on Enable Analytics, Simple Log Service allows you to query and analyze the field.
If you select Automatically Index All Text Fields in JSON Field, indexes are automatically created for all fields of the text type in JSON objects. After indexes are created, index traffic is generated.
For partially valid JSON-formatted data, Simple Log Service can parse only the valid part of the data.
The following example shows an incomplete JSON log. Simple Log Service can parse the content.remote_addr, content.request.request_length, and content.request.request_method fields.
content: {
    remote_addr:"192.0.2.0"
    request: {
        request_length:"73"
        request_method:"GE
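To illustrate why the three fields in the incomplete log above are still recoverable, the following sketch reads the log from left to right and keeps every key-value pair that it encounters before the input ends. This is not how Simple Log Service is implemented. The parse_partial function and its token pattern are hypothetical, they target only the simplified notation of the example (unquoted keys, quoted values), and keeping a truncated value as far as it was read is an assumption of the sketch.

```python
import re

# One token: `key: {`, `key:"value"` (closing quotation mark optional), or `}`.
_TOKEN = re.compile(r'[\s,]*(?:(\w+)\s*:\s*(?:(\{)|"([^"]*)"?)|(\}))')


def parse_partial(text: str) -> dict:
    """Collect dotted-path -> value pairs until the input can no longer be read."""
    fields, path, pos = {}, [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            break  # unreadable remainder: keep what was read so far
        key, opens_object, value, closes_object = match.groups()
        if closes_object:
            if path:
                path.pop()        # leave the current nested object
        elif opens_object:
            path.append(key)      # enter a nested object
        else:
            fields[".".join(path + [key])] = value
        pos = match.end()
    return fields


log = 'content: { remote_addr:"192.0.2.0" request: { request_length:"73" request_method:"GE'
for name in parse_partial(log):
    print(name)
```

The loop prints content.remote_addr, content.request.request_length, and content.request.request_method, which are the fields listed above.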
References
For usage scenarios and frequently asked questions about querying and analyzing JSON logs, see FAQ about the query and analysis of JSON logs. The FAQ describes how to configure indexes, query and analyze indexed JSON fields, use JSON functions, and analyze JSON arrays.
For the configurations and procedure that are required to query and analyze JSON logs, see Query and analyze JSON logs.
If the volume of JSON logs that you want to query and analyze is small, you do not need to configure indexes for JSON leaf nodes. In this case, you can use JSON functions to query and analyze the logs. In some special scenarios, you can use only JSON functions for query and analysis. For more information about these scenarios, see When do I need to use JSON functions? For descriptions and examples of JSON functions, see JSON functions.
You can configure indexes for leaf nodes in JSON objects. You cannot configure indexes for child nodes that contain leaf nodes.
You cannot configure indexes for fields whose values are JSON arrays or indexes for fields in a JSON array.
If the value of a field is of the Boolean type, you can set the data type of the field to text when you configure indexes.
A query statement is in the Search statement | Analytic statement format. In an analytic statement, you must enclose a field name in double quotation marks ("") and a string in single quotation marks ('').
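The quoting rule for analytic statements can be sketched with a few helpers. The function names quote_field, quote_string, and build_query are hypothetical. Escaping an embedded quotation mark by doubling it follows the usual SQL convention, and the FROM log clause in the example is included for clarity. Both are assumptions that this topic does not confirm.

```python
def quote_field(name: str) -> str:
    """Enclose a field name in double quotation marks for an analytic statement."""
    return '"' + name.replace('"', '""') + '"'


def quote_string(value: str) -> str:
    """Enclose a string in single quotation marks for an analytic statement."""
    return "'" + value.replace("'", "''") + "'"


def build_query(search: str, analytic: str) -> str:
    """Join both parts in the Search statement | Analytic statement format."""
    return f"{search} | {analytic}"


analytic = (
    f"SELECT avg({quote_field('info.usedTime')}) AS avg_time FROM log "
    f"WHERE {quote_field('info.param.projectName')} = {quote_string('project01')}"
)
print(build_query("*", analytic))
# * | SELECT avg("info.usedTime") AS avg_time FROM log WHERE "info.param.projectName" = 'project01'
```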
Sample log
The following figure provides an example of a JSON log. The log contains the class, latency, status, and info fields in addition to the reserved fields of Simple Log Service. The value of the info field is a JSON object that contains multiple layers.
Index configurations
The following list provides more details:
The values of the IP and data fields are JSON arrays. Therefore, you cannot configure indexes for the IP or data field. You cannot query or analyze data by using the fields.
The region and CreateTime fields are in a JSON array. Therefore, you cannot configure indexes for the region or CreateTime field. You cannot query or analyze data by using the fields.
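The indexing rules for JSON objects can be sketched as a walk over the object that reports only the leaf nodes outside JSON arrays. The indexable_leaf_paths function is hypothetical, and the info value below is made-up sample data that only reuses field names mentioned in this topic. It is not the sample log from the figure, and its nesting is an assumption.

```python
def indexable_leaf_paths(value, prefix=""):
    """Yield (path, suggested_type) for leaf nodes that are not inside a JSON array."""
    if isinstance(value, dict):
        # Objects themselves are not indexed; only their leaf nodes are.
        for key, child in value.items():
            child_path = f"{prefix}.{key}" if prefix else key
            yield from indexable_leaf_paths(child, child_path)
    elif isinstance(value, list):
        return  # arrays and the fields inside them cannot be indexed
    elif isinstance(value, bool):
        yield prefix, "text"    # Boolean values are indexed as text
    elif isinstance(value, int):
        yield prefix, "long"
    elif isinstance(value, float):
        yield prefix, "double"
    else:
        yield prefix, "text"


# Hypothetical sample data for illustration only.
info = {
    "usedTime": 89,
    "success": True,
    "IP": ["192.0.2.0", "192.0.2.1"],
    "param": {"projectName": "project01"},
    "result": {"data": [{"region": "cn", "CreateTime": 1}]},
}

for path, index_type in indexable_leaf_paths(info, "info"):
    print(path, index_type)
```

With this sample data, the loop reports info.usedTime as long and info.success and info.param.projectName as text. The IP, data, region, and CreateTime fields are skipped because they are arrays or are located in an array.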
Query statements
To query the logs in which the value of the usedTime field is greater than 60 seconds, execute the following search statement:
info.usedTime > 60
To query the logs in which the value of the success field is true, execute the following search statement:
info.success : true
To query the logs in which the value of the usedTime field is greater than 60 seconds and the value of the projectName field is not project01, execute the following search statement:
info.usedTime > 60 not info.param.projectName : project01
To calculate the average duration that is required to obtain the project information, execute the following query statement:
methodName = getProjectInfo | SELECT avg("info.usedTime") AS avg_time