A Logtail configuration is a set of policies that are used by Logtail to collect logs. You can configure parameters such as a data source and a collection mode to customize a Logtail configuration. This topic describes how to configure parameters for a Logtail configuration when you use the Simple Log Service API to collect logs.
Basic parameters of a Logtail configuration
Parameter | Type | Required | Example | Description |
configName | string | Yes | config-sample | The name of the Logtail configuration. The name must be unique in the project to which the Logtail configuration belongs. After the Logtail configuration is created, you cannot change the name of the Logtail configuration. The name must meet the following requirements:
|
inputType | string | Yes | file | The type of the data source. Valid values:
|
inputDetail | JSON object | Yes | None | The detailed configuration of the data source. For more information, see inputDetail. |
outputType | string | Yes | LogService | The output type of collected logs. The value is fixed as LogService. Collected logs can be uploaded only to Simple Log Service. |
outputDetail | JSON object | Yes | None | The detailed configuration of collected logs. For more information, see outputDetail. |
logSample | string | No | None | The sample log. |
InputDetail
Basic parameters
Parameter | Type | Required | Example | Description |
filterKey | array | No | ["ip"] | The fields that are used to filter logs. A log is collected only when the values of the fields in the log match the regular expressions that are specified in the filterRegex parameter. |
filterRegex | array | No | ["^10.*"] | The regular expressions that are used to match the values of the fields. The fields are specified in the filterKey parameter. The number of elements in the value of the filterRegex parameter must be the same as the number of elements in the value of the filterKey parameter. |
shardHashKey | array | No | ["__source__"] | The mode that is used to write data. By default, data is written to Simple Log Service in load balancing mode. For more information, see Load balancing mode. If you configure this parameter, data is written to Simple Log Service in shard mode. For more information, see Shard mode. The __source__ field is supported. |
enableRawLog | boolean | No | false | Specifies whether to upload raw logs. Valid values:
|
sensitive_keys | array | No | None | The configuration that is used to mask data. For more information, see sensitive_keys. |
mergeType | string | No | topic | The method that is used to aggregate data. Valid values:
|
delayAlarmBytes | int | No | 209715200 | The alert threshold of log collection progress. Default value: 209715200. This value specifies that an alert is triggered if 200 MB of data is not collected within a specified period of time. |
adjustTimezone | boolean | No | false | Specifies whether to change the time zone of logs. This parameter is valid only if the time parsing feature is enabled. For example, if you configure the timeFormat parameter, you can also configure the adjustTimezone parameter. |
logTimezone | string | No | GMT+08:00 | The offset of the time zone. Format: GMT+HH:MM or GMT-HH:MM. For example, if you want to collect logs whose time zone is UTC+8, set the value to GMT+08:00. |
advanced | JSON object | No | None | The advanced features. For more information, see advanced. |
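The filterKey and filterRegex parameters act as paired, per-field filters: a log is collected only if every listed field exists and its value matches the corresponding regular expression. A minimal sketch of that AND semantics in Python (the helper name is ours, and whether Logtail anchors the match or requires a full match is an assumption here, modeled with re.match):

```python
import re

def log_passes_filter(log, filter_key, filter_regex):
    """Return True only if every filter field exists in the log and its
    value matches the regular expression at the same index (AND semantics,
    mirroring filterKey/filterRegex)."""
    for key, pattern in zip(filter_key, filter_regex):
        value = log.get(key)
        if value is None or re.match(pattern, value) is None:
            return False
    return True

filter_key = ["ip"]
filter_regex = ["^10.*"]

print(log_passes_filter({"ip": "10.0.0.8"}, filter_key, filter_regex))     # collected
print(log_passes_filter({"ip": "192.168.1.1"}, filter_key, filter_regex))  # dropped
```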
sensitive_keys
Parameters
Parameter | Type | Required | Example | Description |
key | string | Yes | content | The name of the log field. |
type | string | Yes | const | The method that is used to mask the content of the log field. Valid values:
const: The sensitive content is replaced with the value of the const parameter.
md5: The sensitive content is replaced with the MD5 hash value that is generated for the content. |
regex_begin | string | Yes | 'password':' | The regular expression that is used to match the prefix of sensitive content. The regular expression is used to locate sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, see RE2 syntax. |
regex_content | string | Yes | [^']* | The regular expression that is used to match sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, see RE2 syntax. |
all | boolean | Yes | true | Specifies whether to replace all sensitive content in the log field. Valid values:
true: replaces all sensitive content in the log field. This is the recommended value.
false: replaces only the sensitive content that the specified regular expressions match for the first time in the log field. |
const | string | No | "********" | The string that is used to replace sensitive content. This parameter is required if you set the type parameter to const. |
Configuration example
If you want to mask the password values in the content field whose value is [{'account':'1812213231432969','password':'04a23f38'}, {'account':'1812213685634','password':'123a'}] and replace the values of password with ********, use the following settings for the sensitive_keys parameter:
sensitive_keys = [{"all": true, "const": "********", "regex_content": "[^']*", "regex_begin": "'password':'", "type": "const", "key": "content"}]
Sample log
[{'account':'1812213231432969','password':'********'}, {'account':'1812213685634','password':'********'}]
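The masking behavior in this example can be sketched with Python's re module: regex_begin locates the prefix, regex_content matches the sensitive span that follows it, and const is substituted. This is an illustrative reimplementation, not Logtail's internal code (Logtail uses RE2, so supported regex syntax differs slightly):

```python
import re

def mask_field(value, regex_begin, regex_content, const, replace_all=True):
    """Keep the prefix matched by regex_begin and replace the following
    regex_content match with const. replace_all=False mimics all=false
    (only the first occurrence is masked)."""
    pattern = "(" + regex_begin + ")" + regex_content
    count = 0 if replace_all else 1  # re.sub: count=0 replaces all matches
    return re.sub(pattern, lambda m: m.group(1) + const, value, count=count)

content = ("[{'account':'1812213231432969','password':'04a23f38'}, "
           "{'account':'1812213685634','password':'123a'}]")
masked = mask_field(content, r"'password':'", r"[^']*", "********")
print(masked)
```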
advanced
Parameter | Type | Required | Example | Description |
enable_root_path_collection | boolean | No | false | Specifies whether to allow data collection from Windows root directories, such as D:\log*. Valid values:
Important
|
exactly_once_concurrency | int | No | 1 | Specifies whether to enable the ExactlyOnce write feature. The ExactlyOnce write feature allows you to specify the maximum number of log groups that can be concurrently sent when Logtail collects data from a file. Valid values: 0 to 512. For more information, see Additional information: ExactlyOnce write feature. Valid values:
Important
|
enable_log_position_meta | boolean | No | true | Specifies whether to add the metadata information of a source log file to each log that is collected from the file. The metadata information includes the __tag__:__inode__ and __file_offset__ fields. Valid values:
Note Only Logtail V1.0.21 and later support this parameter. |
specified_year | uint | No | 0 | The year that is used to supplement the log time if the time of a raw log does not contain the year information. You can specify the current year or a different year. Valid values:
Note Only Logtail V1.0.21 and later support this parameter. |
force_multiconfig | boolean | No | false | Specifies whether Logtail can use the Logtail configuration to collect data from the files that are matched based on other Logtail configurations. Default value: false. If you want Logtail to collect data from a file by using different Logtail configurations, you can configure this parameter. For example, you can configure this parameter to collect data from a file to two Logstores by using two Logtail configurations. |
raw_log_tag | string | No | __raw__ | The field that is used to store raw logs that are uploaded. Default value: __raw__. |
blacklist | object | No | None | The blacklist configuration. For more information, see Parameters of blacklist. |
tail_size_kb | int | No | 1024 | The size of data that Logtail reads from a file the first time Logtail reads the file. The value determines the start position from which Logtail collects data. By default, Logtail reads up to 1,024 KB of data on the first read. Valid values: 0 to 10485760. Unit: KB. |
batch_send_interval | int | No | 3 | The interval at which aggregated data is sent. Default value: 3. Unit: seconds. |
max_rotate_queue_size | int | No | 20 | The maximum length of the queue in which a file is rotated. Default value: 20. |
enable_precise_timestamp | boolean | No | false | Specifies whether to extract time values with high precision. If you do not include this parameter in the configuration, the system uses the value false by default, which specifies that time values with high precision are not extracted. If you set this parameter to true, Logtail automatically parses the specified time values into timestamps with millisecond precision and stores the timestamps in the field that is specified by the precise_timestamp_key parameter. |
precise_timestamp_key | string | No | "precise_timestamp" | The field that stores timestamps with high precision. If you do not include this parameter in the configuration, the system uses the precise_timestamp field by default. |
precise_timestamp_unit | string | No | "ms" | The unit of the timestamps with high precision. If you do not include this parameter in the configuration, the system uses ms by default. Valid values: ms, us, and ns. |
The following table describes the parameters of blacklist.
Parameter | Type | Required | Example | Description |
dir_blacklist | array | No | ["/home/admin/dir1", "/home/admin/dir2*"] | The blacklist of directories, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple directories. For example, if you specify /home/admin/dir1, all files in the /home/admin/dir1 directory are skipped during log collection. |
filename_blacklist | array | No | ["app*.log", "password"] | The blacklist of file names. The files that match a name specified in this parameter are skipped during log collection regardless of the directories to which the files belong. You can use asterisks (*) as wildcard characters to match multiple file names. |
filepath_blacklist | array | No | ["/home/admin/private*.log"] | The blacklist of file paths, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple files. For example, if you specify /home/admin/private*.log, all files whose name starts with private and ends with .log in the /home/admin/ directory are skipped during log collection. |
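The wildcard semantics of filename_blacklist and filepath_blacklist can be approximated with Python's fnmatch module (an illustration only; Logtail's matcher and its case sensitivity may differ by platform, and dir_blacklist is omitted here for brevity):

```python
from fnmatch import fnmatch

def is_blacklisted(file_path, filename_blacklist, filepath_blacklist):
    """Return True if the file should be skipped: file names are matched
    regardless of directory; file paths are matched as absolute paths.
    Asterisks (*) act as wildcards in both lists."""
    name = file_path.rsplit("/", 1)[-1]
    if any(fnmatch(name, pat) for pat in filename_blacklist):
        return True
    return any(fnmatch(file_path, pat) for pat in filepath_blacklist)

print(is_blacklisted("/home/admin/app1.log", ["app*.log"], []))
print(is_blacklisted("/home/admin/private_x.log", [], ["/home/admin/private*.log"]))
print(is_blacklisted("/home/admin/access.log", ["app*.log"], ["/home/admin/private*.log"]))
```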
Configurations that are specific to Logtail for text log collection
Basic parameters
Parameter | Type | Required | Example | Description |
logType | string | Yes | common_reg_log | The mode in which logs are collected. Valid values:
|
logPath | string | Yes | /var/log/http/ | The log file path. |
filePattern | string | Yes | access*.log | The log file name. |
topicFormat | string | Yes | none | The method that is used to generate a topic. Valid values:
For more information, see Log topics. |
timeFormat | string | No | %Y/%m/%d %H:%M:%S | The format of the log time. For more information, see Time formats. |
preserve | boolean | No | true | Specifies whether to use the timeout mechanism on log files. If a log file is not updated within a specified period of time, Logtail considers the file to be timed out. Valid values:
|
preserveDepth | integer | No | 1 | The maximum levels of directories whose files are monitored by using the timeout mechanism. If you set the preserve parameter to false, you must configure this parameter. Valid values: 1 to 3. |
fileEncoding | string | No | utf8 | The encoding format of log files. Valid values: utf8 and gbk. |
discardUnmatch | boolean | No | true | Specifies whether to discard the logs that fail to be matched. Valid values:
|
maxDepth | int | No | 100 | The maximum levels of directories that are monitored. Valid values: 0 to 1000. The value 0 specifies that only the log file directory that you specify is monitored. |
delaySkipBytes | int | No | 0 | The threshold that is used to determine whether to discard data if the data is not collected within a specified period of time. Valid values:
|
dockerFile | boolean | No | false | Specifies whether to collect logs from container files. Default value: false. |
dockerIncludeLabel | JSON object | No | None | The container label whitelist. The whitelist specifies the containers whose data you want to collect. By default, this parameter is empty, which indicates that you want to collect logs or stdout and stderr from all containers. When you configure the container label whitelist, the LabelKey parameter is required, and the LabelValue parameter is optional.
Note
|
dockerExcludeLabel | JSON object | No | None | The container label blacklist. The blacklist specifies the containers whose data you want to exclude. By default, this parameter is empty, which indicates that you want to collect data from all containers. When you configure the container label blacklist, the LabelKey parameter is required, and the LabelValue parameter is optional.
Note
|
dockerIncludeEnv | JSON object | No | None | The environment variable whitelist. The whitelist specifies the containers whose data you want to collect. By default, this parameter is empty, which indicates that you want to collect logs or stdout and stderr from all containers. When you configure the environment variable whitelist, the EnvKey parameter is required, and the EnvValue parameter is optional.
Note Key-value pairs are in the logical OR relation. If the environment variables of a container match one of the specified key-value pairs, the container is matched. |
dockerExcludeEnv | JSON object | No | None | The environment variable blacklist. The blacklist specifies the containers whose data you want to exclude. By default, this parameter is empty, which indicates that you want to collect data from all containers. When you configure the environment variable blacklist, the EnvKey parameter is required, and the EnvValue parameter is optional.
Note Key-value pairs are in the logical OR relation. If the environment variables of a container match one of the specified key-value pairs, the container is filtered out. |
Configurations that are specific to log collection in full regex mode and simple mode
Parameters
Parameter | Type | Required | Example | Description |
key | array | Yes | ["content"] | The fields that are specified for raw logs. |
logBeginRegex | string | No | .* | The regular expression that is used to match the beginning of the first line of a log. |
regex | string | No | (.*) | The regular expression that is used to extract the value of a field. |
Configuration example
{ "configName": "logConfigName", "outputType": "LogService", "inputType": "file", "inputDetail": { "logPath": "/logPath", "filePattern": "*", "logType": "common_reg_log", "topicFormat": "default", "discardUnmatch": false, "enableRawLog": true, "fileEncoding": "utf8", "maxDepth": 10, "key": [ "content" ], "logBeginRegex": ".*", "regex": "(.*)" }, "outputDetail": { "projectName": "test-project", "logstoreName": "test-logstore" } }
Configurations that are specific to log collection in JSON mode
Parameter | Type | Required | Example | Description |
timeKey | string | No | time | The key that is used to specify the time field. |
Configurations that are specific to log collection in delimiter mode
Parameters
Parameter | Type | Required | Example | Description |
separator | string | No | , | The delimiter. You must select a delimiter based on the format of the logs that you want to collect. For more information, see Collect logs in delimiter mode. |
quote | string | Yes | \ | The quote. If a log field contains delimiters, you must specify a quote to enclose the field. Simple Log Service parses the content that is enclosed in a pair of quotes into a complete field. You must select a quote based on the format of the logs that you want to collect. For more information, see Collect logs in delimiter mode. |
key | array | Yes | ["ip", "time"] | The fields that are specified for raw logs. |
timeKey | string | Yes | time | The time field. You must specify a field in the value of the key parameter as the time field. |
autoExtend | boolean | No | true | Specifies whether to upload a log if the number of fields parsed from the log is less than the number of specified keys. For example, if you specify a vertical bar (|) as the delimiter, the log 11|22|33|44|55 is parsed into the fields 11, 22, 33, 44, and 55, which you can map to the keys A, B, C, D, and E. Valid values:
true: The log 11|22|33|55 is uploaded to Simple Log Service, and 55 is uploaded as the value of the D key.
false: The log 11|22|33|55 is discarded because the number of parsed fields does not match the number of specified keys. |
Configuration example
{ "configName": "logConfigName", "logSample": "testlog", "inputType": "file", "outputType": "LogService", "inputDetail": { "logPath": "/logPath", "filePattern": "*", "logType": "delimiter_log", "topicFormat": "default", "discardUnmatch": true, "enableRawLog": true, "fileEncoding": "utf8", "maxDepth": 999, "separator": ",", "quote": "\"", "key": [ "ip", "time" ], "autoExtend": true }, "outputDetail": { "projectName": "test-project", "logstoreName": "test-logstore" } }
Configurations that are specific to Logtail plug-ins
Parameters
The following table describes the configurations that are specific to log collection by using plug-ins.
Parameter | Type | Required | Example | Description |
plugin | JSON object | Yes | None | If you use a Logtail plug-in to collect logs, you must configure this parameter. For more information, see Use Logtail plug-ins to collect data. |
Configuration example
{ "configName": "logConfigName", "outputType": "LogService", "inputType": "plugin", "inputDetail": { "plugin": { "inputs": [ { "detail": { "ExcludeEnv": null, "ExcludeLabel": null, "IncludeEnv": null, "IncludeLabel": null, "Stderr": true, "Stdout": true }, "type": "service_docker_stdout" } ] } }, "outputDetail": { "projectName": "test-project", "logstoreName": "test-logstore" } }
outputDetail
outputDetail is used to specify the project and Logstore that store the collected logs.
Parameter | Type | Required | Example | Description |
projectName | string | Yes | my-project | The name of the project. The name must be the same as the name of the project that you specify in the API request. |
logstoreName | string | Yes | my-logstore | The name of the Logstore. |
Additional information: ExactlyOnce write feature
After you enable the ExactlyOnce write feature, Logtail records fine-grained checkpoints by file to the disk of the server on which Logtail is installed. If exceptions such as a process error or a server restart occur during log collection, Logtail uses the checkpoints to determine the scope of data that must be processed in each file when log collection is resumed, and then uses the incremental sequence numbers that are provided by Simple Log Service to prevent duplicate data from being sent. However, the ExactlyOnce write feature consumes disk write resources. Limits:
Checkpoints are stored in the disk of the server on which Logtail is installed. If checkpoints are lost because the disk has no available space or becomes faulty, the checkpoints cannot be recovered.
Checkpoints record only the metadata information of a file. Checkpoints do not record the data of a file. If the file is deleted or modified, the checkpoints may not be recovered.
The ExactlyOnce write feature is based on the current write sequence numbers that are recorded by Simple Log Service. Each shard supports only 10,000 records. If the limit is exceeded, the previous records are replaced. To ensure reliability, make sure that the value that is calculated by using the following formula does not exceed 9500: Value = Number of active files that are written to the same Logstore × Number of Logtail instances. We recommend that you reserve a gap between the value and 9500.
Number of active files: the number of files that are being read and sent. Files that are generated during log file rotation and have the same logical file name are sent in serial mode. These files are considered as one active file.
Number of Logtail instances: the number of Logtail processes. By default, each server hosts one Logtail instance. The number of Logtail instances is the same as the number of servers.
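The reliability bound above is a simple product check; under hypothetical numbers:

```python
# Value = active files written to the same Logstore x Logtail instances;
# the result should stay under 9500 (shards keep only 10,000 records).
active_files_per_logstore = 40  # hypothetical
logtail_instances = 200         # hypothetical: one instance per server

value = active_files_per_logstore * logtail_instances
print(value, value <= 9500)
```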
By default, the sync command is not run when Logtail writes checkpoints to disk. This helps ensure performance. However, if buffered data fails to be written to disk when the server restarts, the checkpoints may be lost. To enable the sync-based write feature, add "enable_checkpoint_sync_write": true to the startup configuration file /usr/local/ilogtail/ilogtail_config.json of Logtail. For more information, see Configure the startup parameters of Logtail.