Simple Log Service: What is Logtail?

Last Updated: Dec 06, 2024

Logtail is a log collection agent that is provided by Simple Log Service. You can use Logtail to collect logs from multiple data sources, including Alibaba Cloud Elastic Compute Service (ECS) instances, servers in data centers, and servers from third-party cloud service providers. This topic describes the log collection process, features, benefits, limits, and configuration process of Logtail.

Collection process

Monitor logs

After you install Logtail on servers and create a Logtail configuration in the Simple Log Service console, the configuration is synchronized to the servers in real time. Logtail monitors logs in the log files of the servers based on the configuration. Logtail scans log directories and files based on the log file path and the maximum directory depth that you specify for monitoring in the configuration.

If the log files of the servers in a machine group are not updated after the Logtail configuration is applied to the machine group, the log files are considered historical log files. Logtail does not collect logs from historical log files. If log files are updated, Logtail reads and collects logs from the files, and then sends the logs to Simple Log Service. For more information about how to collect logs from historical log files, see Import historical logs from log files.

Logtail registers event listeners to monitor the directories from which logs are collected, and also polls the log files in those directories on a regular basis. Combining the two mechanisms helps ensure that logs are collected promptly and reliably. On Linux servers, the event listeners are based on inotify.
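The scan described above, which is bounded by a log file path and a maximum directory depth, can be sketched as follows. This is a minimal illustration, not Logtail's implementation: the function name `find_log_files`, the wildcard-style file pattern, and the convention that depth 0 means "the root directory only" are all assumptions made for the example.

```python
import fnmatch
import os


def find_log_files(root, file_pattern, max_depth):
    """Return files under root whose names match file_pattern.

    Hypothetical helper: descends at most max_depth directory levels
    below root (0 means only files directly in root).
    """
    root = os.path.abspath(root)
    matches = []
    for current, dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirs[:] = []  # do not descend below the depth limit
        for name in files:
            if fnmatch.fnmatch(name, file_pattern):
                matches.append(os.path.join(current, name))
    return sorted(matches)
```

For example, `find_log_files("/var/log/app", "*.log", 1)` would return matching files in `/var/log/app` and in its immediate subdirectories, but not deeper.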

Read logs

After Logtail detects updated log files, Logtail reads data in the log files.

  • The first time Logtail reads data in a log file, Logtail can read up to 1,024 KB of data in the log file by default.

    • If the file size is less than 1,024 KB, Logtail reads data from the beginning of the file.

    • If the file size is greater than 1,024 KB, Logtail reads the last 1,024 KB of data in the file.

    Note

    Simple Log Service allows you to specify the data size that Logtail can read in a log file the first time Logtail reads the file.

    • Console mode: Modify the First Collection Size parameter in the Advanced Options section on the Logtail Config page. For more information, see Advanced settings.

    • API mode: Modify the tail_size_kb parameter in the Logtail configuration. For more information, see advanced.

  • If the data in a log file is previously read, Logtail reads data in the file from the previous checkpoint.

  • Logtail can read up to 512 KB of data at a time. Make sure that the size of each log in a log file does not exceed 512 KB. Otherwise, Logtail cannot read data as expected.

Note

If you change the system time on a server, you must restart Logtail. Otherwise, the log time becomes incorrect and logs are dropped.
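The read rules in this section (resume from a checkpoint, otherwise start at most 1,024 KB before the end of the file, and read at most 512 KB at a time) can be expressed as a short sketch. The function names are hypothetical; only the sizes and the `tail_size_kb` parameter name come from the text above.

```python
FIRST_COLLECTION_SIZE_KB = 1024   # default first-read size described above
MAX_READ_BYTES = 512 * 1024       # upper bound for a single read


def start_offset(file_size, checkpoint=None, tail_size_kb=FIRST_COLLECTION_SIZE_KB):
    """Return the byte offset at which reading should begin."""
    if checkpoint is not None:
        return checkpoint             # file was read before: resume
    limit = tail_size_kb * 1024
    if file_size <= limit:
        return 0                      # small file: read from the beginning
    return file_size - limit          # large file: read only the tail


def read_chunks(file_size, offset):
    """Yield (offset, length) pairs, each covering at most 512 KB."""
    while offset < file_size:
        length = min(MAX_READ_BYTES, file_size - offset)
        yield offset, length
        offset += length
```

With the defaults, a 3 MB file that has never been read starts at the 2 MB mark, and the remaining 1 MB is read in two 512 KB chunks.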

Process logs

When Logtail reads logs in a log file, Logtail splits the data into individual logs, parses each log, and then sets the time field for the log.

  • Split data into logs

    If you specify a regular expression that matches the beginning of the first line of a log, Logtail uses the regular expression to identify where each log starts, so a single log can span multiple lines. If you do not specify a regular expression, each line is processed as a separate log.

  • Parse logs

    Logtail parses each log based on the collection mode that you specify in the Logtail configuration.

    Note

    If you specify complex regular expressions, Logtail may consume an excessive amount of CPU resources. We recommend that you specify regular expressions that allow Logtail to parse logs in an efficient manner.

    If Logtail fails to parse a log, Logtail handles the failure based on the setting of the Drop Failed to Parse Logs parameter in the Logtail configuration.

    • If you turn on Drop Failed to Parse Logs, Logtail drops the log and reports an error.

    • If you turn off Drop Failed to Parse Logs, Logtail uploads the log. The key of the log is set to raw_log and the value is set to the log content.

  • Configure the time field for a log

    • If you do not configure the time field for a log, the log time is the time when the log is parsed.

    • If you configure the time field for a log, the manner in which the log is processed varies in the following scenarios:

      • If the difference between the time when the log is generated and the current time is within 12 hours, the log time is extracted from the parsed log fields.

      • If the difference between the time when the log is generated and the current time is greater than 12 hours, the log is dropped and an error is reported.
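The splitting rule and the 12-hour time rule described above can be sketched as follows. This is an illustrative approximation only: the function names, the `fields` dictionary, and the use of a `strptime`-style time format are assumptions for the example, not Logtail's internal API.

```python
import re
from datetime import datetime, timedelta

MAX_TIME_SKEW = timedelta(hours=12)  # window described above


def split_logs(lines, first_line_regex=None):
    """Group raw lines into logs.

    Without a regex, every line is one log. With a regex, a new log
    starts at each line whose beginning matches the regex.
    """
    if first_line_regex is None:
        return list(lines)
    pattern = re.compile(first_line_regex)
    logs, current = [], []
    for line in lines:
        if pattern.match(line) and current:
            logs.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        logs.append("\n".join(current))
    return logs


def resolve_log_time(fields, now, time_key=None, time_format=None):
    """Return the log time, or None if the log should be dropped."""
    if time_key is None:
        return now  # no time field configured: use the parse time
    log_time = datetime.strptime(fields[time_key], time_format)
    if abs(now - log_time) > MAX_TIME_SKEW:
        return None  # more than 12 hours from the current time
    return log_time
```

For instance, a stack trace whose first line starts with a date and whose continuation lines are indented is kept as one log when the regex `\d{4}-\d{2}-\d{2} ` is supplied.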

Filter logs

After logs are processed, Logtail filters the logs based on the specified filter conditions.

  • If you do not specify filter conditions in the Filter Configuration field, the logs are not filtered.

  • If you specify filter conditions in the Filter Configuration field, the fields in each log are traversed.

    Logtail collects only the logs that meet the filter conditions.
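A minimal sketch of this filtering step is shown below. It assumes that filter conditions are pairs of a field name and a regular expression, that a log must satisfy every condition, and that the regular expression must match the whole field value. These semantics and the function name `passes_filter` are assumptions for illustration; check the Filter Configuration documentation for the exact behavior.

```python
import re


def passes_filter(log, conditions):
    """Return True if the log should be collected.

    log: dict mapping field names to string values.
    conditions: dict mapping field names to regular expressions
    (assumed semantics: all conditions must match in full).
    """
    if not conditions:
        return True  # no filter conditions: nothing is filtered out
    for key, pattern in conditions.items():
        value = log.get(key)
        if value is None or re.fullmatch(pattern, value) is None:
            return False
    return True
```

For example, with `conditions = {"level": "WARNING|ERROR"}`, a log whose `level` field is `INFO` would be skipped.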

Aggregate logs

To reduce the number of network requests, Logtail caches the processed and filtered logs for a specified period of time. Then, Logtail aggregates the logs and sends the logs to Simple Log Service. If one of the following conditions is met when data is cached, Logtail sends aggregated logs to Simple Log Service.

  • The aggregation duration exceeds 3 seconds.

  • The number of aggregated logs exceeds 4,000.

  • The total size of aggregated logs exceeds 512 KB.
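The three flush conditions above can be illustrated with a small buffer class. This is a sketch under stated assumptions: the class name `Aggregator`, the `send` callback, and the injectable `clock` are hypothetical, and only the thresholds (3 seconds, 4,000 logs, 512 KB) come from the text.

```python
import time


class Aggregator:
    """Buffers logs and flushes when any threshold is exceeded."""

    MAX_AGE_SECONDS = 3
    MAX_LOGS = 4000
    MAX_BYTES = 512 * 1024

    def __init__(self, send, clock=time.monotonic):
        self._send = send      # callback that receives a list of logs
        self._clock = clock    # injectable for testing
        self._logs = []
        self._bytes = 0
        self._started = None

    def add(self, log):
        if not self._logs:
            self._started = self._clock()  # aggregation window starts
        self._logs.append(log)
        self._bytes += len(log.encode("utf-8"))
        self.flush_if_due()

    def flush_if_due(self):
        """Send the buffered logs if any condition is met."""
        if not self._logs:
            return False
        due = (
            self._clock() - self._started > self.MAX_AGE_SECONDS
            or len(self._logs) > self.MAX_LOGS
            or self._bytes > self.MAX_BYTES
        )
        if due:
            self._send(self._logs)
            self._logs, self._bytes, self._started = [], 0, None
        return due
```

In this sketch a caller would invoke `flush_if_due()` periodically so that a quiet buffer is still sent once the 3-second window has passed.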

Send logs

Logtail sends aggregated logs to Simple Log Service. If a log packet fails to be sent, Logtail either retries or drops the packet, depending on the HTTP status code that is returned.

| HTTP status code | Description | Handling method of Logtail |
| --- | --- | --- |
| 401 | The current account does not have the permissions to collect data. You must grant the account the permissions to access data. For more information, see Configure the permission assistant feature. | Logtail drops the log packet. |
| 404 | The project or Logstore that is specified in the Logtail configuration does not exist. | Logtail drops the log packet. |
| 403 | The shard quota is exhausted. | Logtail retries after 3 seconds. |
| 500 | A server exception occurs. | Logtail retries after 3 seconds. |
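The handling rules for the four status codes listed above can be sketched as a lookup plus a retry loop. Several details are assumptions made for the example and are not stated in this topic: that status code 200 indicates success, that retries are capped by `max_attempts`, and that codes not listed are not retried. The function names are hypothetical.

```python
import time

DROP, RETRY = "drop", "retry"
RETRY_DELAY_SECONDS = 3

# Status codes and actions taken from the list above.
ACTIONS = {401: DROP, 404: DROP, 403: RETRY, 500: RETRY}


def action_for(status_code):
    """Return DROP, RETRY, or None for a code not covered above."""
    return ACTIONS.get(status_code)


def send_with_retry(send, packet, max_attempts=5, sleep=time.sleep):
    """Call send(packet) until it succeeds or must be given up.

    send returns an HTTP status code. Returns True if the packet
    was delivered (assumed: status 200), False otherwise.
    """
    for _ in range(max_attempts):
        status = send(packet)
        if status == 200:
            return True
        if action_for(status) != RETRY:
            return False  # drop the packet
        sleep(RETRY_DELAY_SECONDS)
    return False
```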

Note

If you want to change the data transmission rate and the maximum number of concurrent connections, you can modify the max_bytes_per_sec and send_request_concurrency parameters in the Logtail startup configuration file. For more information, see Configure the startup parameters of Logtail.
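As a rough illustration of the note above, the following sketch sets the two parameters in the JSON text of a startup configuration file. The helper name `with_send_limits` and the values in the usage example are hypothetical; refer to Configure the startup parameters of Logtail for the valid ranges and for how to apply the change.

```python
import json


def with_send_limits(config_text, max_bytes_per_sec, send_request_concurrency):
    """Return the configuration JSON with both send limits updated.

    Other keys in the configuration are left unchanged.
    """
    config = json.loads(config_text)
    config["max_bytes_per_sec"] = max_bytes_per_sec
    config["send_request_concurrency"] = send_request_concurrency
    return json.dumps(config, indent=4)
```

For example, `with_send_limits(text, 10485760, 10)` would set the transmission rate parameter to 10485760 and the concurrency parameter to 10 (illustrative values only).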

Benefits

  • Logtail supports non-intrusive log collection based on log files. You do not need to modify your application code, and log collection does not affect the operation of your applications.

  • Logtail can collect text logs, binary logs, HTTP logs, and container logs.

  • Logtail can collect logs from various container clusters, such as Docker and Kubernetes clusters.

  • Logtail can handle exceptions that occur in the log collection process. If issues such as network or server exceptions occur, Logtail retries log collection and caches data locally to ensure data security.

  • Logtail provides centralized management based on Simple Log Service. After you install Logtail on a server from which you want to collect logs and create a machine group and Logtail configuration, Logtail collects logs from the server.

  • Logtail provides a comprehensive self-protection mechanism. To ensure that Logtail does not significantly affect the performance of other services that run on the same server, Simple Log Service limits the CPU, memory, and network resources that Logtail can use.

Configuration process

  1. Install Logtail.

    If the Simple Log Service project and an ECS instance belong to the same Alibaba Cloud account and reside in the same region, you can install Logtail on the ECS instance. For more information, see Install Logtail on ECS instances. For more information about how to install Logtail on other servers, see Install Logtail on a Linux server and Install Logtail on a Windows server.

  2. Configure a user identifier.

    If you want to collect logs from an ECS instance that belongs to a different Alibaba Cloud account, a server in a data center, or a server from a third-party cloud service provider, you must configure a user identifier for your server.

  3. Create a machine group.

    You can create an IP address-based machine group or a custom identifier-based machine group for a Simple Log Service project.

  4. Create a Logtail configuration.

    You can perform the preceding operations in the Simple Log Service console. For more information, see Collect text logs and Collect container logs.

After you perform the preceding operations, Logtail collects logs from your server and sends the logs to a specified Logstore. You can query the logs by using the console, API, SDK, or CLI of Simple Log Service.

Terms

  • Machine group: A machine group contains one or more servers from which logs of a specific type are collected. After you apply a Logtail configuration to a machine group, Simple Log Service collects logs from all servers in the machine group based on the Logtail configuration.

    Simple Log Service uses machine groups to manage all servers from which you want to collect logs by using Logtail. You can define a machine group based on an IP address or a custom identifier. You can manage machine groups in the Simple Log Service console. For example, you can create or delete a machine group and add a server to or remove a server from a machine group. For more information, see Overview.

  • Logtail: Logtail is a log collection agent that is provided by Simple Log Service. Logtail runs on servers from which you want to collect logs.

    • Linux: In Linux, Logtail is installed in the /usr/local/ilogtail directory and initiates two independent processes whose names start with ilogtail. One is a collection process and the other is a daemon. The program operational logs are stored in the /usr/local/ilogtail/ilogtail.LOG file. For more information, see Install Logtail on a Linux server.

    • Windows:

      • Logtail (32-bit)

        • In 32-bit Windows, Logtail is installed in the C:\Program Files\Alibaba\Logtail directory.

        • In 64-bit Windows, Logtail is installed in the C:\Program Files (x86)\Alibaba\Logtail directory.

          Note

          You can run 32-bit and 64-bit applications in a 64-bit Windows operating system. The operating system stores 32-bit applications in a separate x86 directory to ensure compatibility.

      • Logtail (64-bit)

        You can install Logtail (64-bit) only in 64-bit Windows. The installation directory is C:\Program Files\Alibaba\Logtail.

      To check the status of Logtail, you can perform the following operations: Choose Control Panel > Administrative Tools > Services. If you install Logtail V1.0.0.0 or later, view the LogtailDaemon service. If you install Logtail V0.x.x.x, view the LogtailWorker service. The program operational logs are stored in the ilogtail.LOG file of the installation directory. For more information, see Install Logtail on a Windows server.

  • Logtail configurations for log collection: Logtail configurations for log collection are a set of policies that Logtail uses to collect logs. You can specify the data source and collection mode to create custom Logtail configurations for log collection. A Logtail configuration is used to collect a specific type of logs from servers, parse the collected logs, and send the logs to a specified Logstore of Simple Log Service.

Basic features

Feature

Description

Real-time log collection

Logtail dynamically monitors log files and reads and parses incremental logs in real time. In most cases, logs are sent to Simple Log Service within 3 seconds after the logs are generated. For more information, see Log collection process of Logtail.

Note

Logtail does not collect historical logs. Logs that are read 12 hours or later after the logs are generated are discarded. For more information about how to collect logs from historical log files, see Import historical logs from log files.

Automatic log rotation

Many applications rotate log files based on the file size or date. In the rotation process, original log files are renamed and new empty log files are created. For example, files such as app.LOG.1 and app.LOG.2 are generated for the app.LOG file after log rotation. You need to specify only the file to which logs are actively written, for example, app.LOG. Logtail automatically monitors the log rotation process and ensures that no logs are lost during this process.

Support for multiple data sources

Logtail can collect text logs, syslogs, HTTP logs, and MySQL binary logs. For more information, see Data collection overview.

Compatibility with an open-source collection agent

Logtail can receive data that is collected by open-source software such as Logstash and Beats and send the data to Simple Log Service. For more information, see Data collection overview.

Automatic handling of collection exceptions

If data transmission fails due to an exception such as Simple Log Service errors, network errors, or quota exhaustion, Logtail actively retries log collection based on the specific scenario. If the retry fails, Logtail writes the data to its local cache and sends the data again after 3 seconds. For more information, see How do I use the automatic diagnostic tool of Logtail?

Flexible collection configuration

You can collect logs in a flexible manner based on Logtail configurations. You can specify the directories and files from which logs are collected. Exact match and wildcard match are supported. You can specify the log collection mode and the fields that you want to extract. You can use a regular expression to extract logs.

The log data models of Simple Log Service require that each log have a precise timestamp. Logtail supports custom log time formats, which allows you to extract the required timestamp information from log data of different formats.

Automatic synchronization of Logtail configurations

After you create or update a Logtail configuration in the Simple Log Service console, Logtail automatically receives and applies the configuration within 3 minutes in most cases. No logs are lost during the Logtail update process.

Status monitoring

Logtail monitors its CPU and memory consumption in real time. This helps prevent Logtail from consuming excessive resources. The overconsumption of resources may affect other services that run on the same server as Logtail. If the resource usage of Logtail exceeds the limit, Logtail automatically restarts. If the network bandwidth usage exceeds the limit, Logtail triggers throttling. For more information, see Startup configuration file (ilogtail_config.json).

Transmission of signed data

To prevent data from being tampered with during data transmission, Logtail obtains a private token from Simple Log Service over a trusted channel and signs all log data packets that are sent.

Note

Logtail obtains a private token over HTTPS to ensure the security of your token.

Data collection reliability

During data collection, Logtail periodically saves checkpoint information locally on the server. If an exception occurs, such as an unexpected server shutdown or process exit, Logtail resumes collection from the last recorded checkpoint after it is restarted. This prevents data loss. Logtail runs based on the startup parameters that are specified in the startup configuration file. If the resource usage of Logtail exceeds a limit for more than 5 minutes, Logtail is forcefully restarted. Duplicate data may be collected after the restart.

Logtail uses internal mechanisms to improve log collection reliability. However, logs may fail to be collected in the following scenarios:

  • Logtail is not running, but log files are rotated multiple times.

  • The rotation rate of log files is extremely high, such as one rotation per second.

  • The log collection rate is lower than the log generation rate for a long period of time.
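The checkpoint mechanism described under Data collection reliability can be sketched as follows. This is a simplified illustration, not Logtail's checkpoint format: the JSON layout, the function names, and the single-file scope are assumptions. It also shows why duplicates are possible: if the process exits after data is read but before the checkpoint is saved, the same data is read again after a restart.

```python
import json
import os


def save_checkpoint(path, log_file, offset):
    """Persist the read offset; write-then-rename avoids torn files."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"file": log_file, "offset": offset}, f)
    os.replace(tmp, path)


def load_checkpoint(path, log_file):
    """Return the saved offset for log_file, or 0 if none exists."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return 0
    return data["offset"] if data.get("file") == log_file else 0


def read_new_data(log_file, checkpoint_path):
    """Read data appended since the last checkpoint, then advance it."""
    offset = load_checkpoint(checkpoint_path, log_file)
    with open(log_file, "rb") as f:
        f.seek(offset)
        data = f.read()
    save_checkpoint(checkpoint_path, log_file, offset + len(data))
    return data
```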

References