Client evaluation in log collection scenarios
In the data technology (DT) era, hundreds of millions of servers, mobile terminals, and network devices generate large volumes of logs every day. A centralized log processing solution effectively meets log consumption requirements throughout the log data lifecycle. Before logs can be consumed, they must first be collected from devices and synchronized to the cloud.
Three log collection tools
Logstash
As a part of the ELK Stack, Logstash is active in the open-source community. It can work with extensive plug-ins in the ecosystem.
Logstash is coded in JRuby and can run across platforms on Java virtual machines (JVMs).
With a modular design, Logstash features high scalability and interoperability.
Fluentd
Fluentd is a popular log collection tool in the open-source community. Its commercial distribution, td-agent, is maintained by Treasure Data and is the version selected for evaluation in this topic.
Fluentd is coded in CRuby. Some key components related to its performance are re-coded in C. The overall performance of Fluentd is excellent.
Fluentd features a simple design and provides reliable data transmission in pipelines.
Compared with Logstash, Fluentd has fewer plug-ins.
Logtail
As the log collection agent of Alibaba Cloud Log Service, Logtail has been tested in big data scenarios within Alibaba Group for many years.
Logtail, which is coded in C++, delivers excellent performance in stability, resource control, and management.
Compared with Logstash and Fluentd, Logtail receives less support from the open-source community and focuses more narrowly on log collection.
Feature comparison
| Feature | Logstash | Fluentd | Logtail |
| --- | --- | --- | --- |
| Log data read | Polling | Polling | Event-triggered |
| File rotation | Supported | Supported | Supported |
| Failover processing based on local checkpoints | Supported | Supported | Supported |
| General log parsing | Parsing by using Grok based on regular expressions | Parsing by using regular expressions | Parsing by using regular expressions |
| Specific log types | Mainstream formats such as delimiter, key-value, and JSON | Mainstream formats such as delimiter, key-value, and JSON | Mainstream formats such as delimiter, key-value, and JSON |
| Data compression for transmission | Supported by plug-ins | Supported by plug-ins | LZ4 |
| Data filtering | Supported | Supported | Supported |
| Data buffer for transmission | Supported by plug-ins | Supported by plug-ins | Supported |
| Transmission exception handling | Supported by plug-ins | Supported by plug-ins | Supported |
| Runtime environment | Coded in JRuby; requires a JVM | Coded in CRuby and C; requires a Ruby environment | Coded in C++; no special runtime requirements |
| Thread support | Multithreading | Multithreading restricted by the global interpreter lock (GIL) | Multithreading |
| Hot upgrade | Not supported | Not supported | Supported |
| Centralized configuration management | Not supported | Not supported | Supported |
| Running status self-check | Not supported | Not supported | CPU and memory threshold protection supported |
Performance comparison in log collection scenarios
The test uses an Nginx access log entry of 365 bytes, from which 14 fields can be extracted.
In the simulated test, this log entry is written repeatedly. The time field of each entry is set to the current system time at write time; the other 13 fields are identical across entries. Log parsing is therefore the same as in a real scenario. The only difference is that the highly repetitive data yields a higher compression ratio, which reduces network traffic when the data is written.
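The effect of this repetition on compression can be sketched in Python. The template below is a hypothetical access-log line with 13 fixed fields (shorter than the 365-byte log used in the actual test); each line is stamped with the current time, and the ratio of raw to zlib-compressed size is measured for batches of near-identical lines.

```python
import time
import zlib

# Hypothetical access-log template: 13 fixed fields plus a per-line
# timestamp, mimicking the simulated workload described above.
FIXED = ('42.120.74.96 0.005 - [{ts}] "GET /item/12345.html HTTP/1.1" '
         '200 5317 "-" "Mozilla/5.0" "123456" "abcdef" "trace0001" '
         'cell01 ups01 54321')

def mock_line() -> str:
    """One simulated log line stamped with the current system time."""
    ts = time.strftime("%d/%b/%Y:%H:%M:%S +0800")
    return FIXED.format(ts=ts)

def compression_ratio(n: int) -> float:
    """Ratio of raw size to zlib-compressed size for n near-identical lines."""
    data = "\n".join(mock_line() for _ in range(n)).encode()
    return len(data) / len(zlib.compress(data))
```

Because only the timestamp varies, large batches compress far better than small ones, which is what reduces write traffic in the simulated scenario relative to real-world data.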
Logstash
Logstash 2.0.0 is tested. It parses logs by using Grok and writes them to Kafka by using the built-in plug-in with GZIP compression enabled.
Log parsing configuration:
grok {
    patterns_dir => "/home/admin/workspace/survey/logstash/patterns"
    match => { "message" => "%{IPORHOST:ip} %{USERNAME:rt} - \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url}\" %{NUMBER:status} %{NUMBER:size} \"%{DATA:ref}\" \"%{DATA:agent}\" \"%{DATA:cookie_unb}\" \"%{DATA:cookie_cookie2}\" \"%{DATA:monitor_traceid}\" %{WORD:cell} %{WORD:ups} %{BASE10NUM:remote_port}" }
    remove_field => ["message"]
}
The following table lists test results.
| Write transactions per second (TPS) | Write traffic (KB/s) | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- |
| 500 | 178.22 | 22.4 | 427 |
| 1,000 | 356.45 | 46.6 | 431 |
| 5,000 | 1,782.23 | 221.1 | 440 |
| 10,000 | 3,564.45 | 483.7 | 450 |
Fluentd
td-agent 2.2.1 is tested. Fluentd parses logs by using regular expressions and writes them to Kafka by using the third-party plug-in fluent-plugin-kafka with GZIP compression enabled.
Log parsing configuration:
<source>
  type tail
  format /^(?<ip>\S+)\s(?<rt>\d+)\s-\s\[(?<time>[^\]]*)\]\s"(?<url>[^\"]+)"\s(?<status>\d+)\s(?<size>\d+)\s"(?<ref>[^\"]+)"\s"(?<agent>[^\"]+)"\s"(?<cookie_unb>\d+)"\s"(?<cookie_cookie2>\w+)"\s"(?<monitor_traceid>\w+)"\s(?<cell>\w+)\s(?<ups>\w+)\s(?<remote_port>\d+).*$/
  time_format %d/%b/%Y:%H:%M:%S %z
  path /home/admin/workspace/temp/mock_log/access.log
  pos_file /home/admin/workspace/temp/mock_log/nginx_access.pos
  tag nginx.access
</source>
The following table lists test results.
| Write TPS | Write traffic (KB/s) | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- |
| 500 | 178.22 | 13.5 | 61 |
| 1,000 | 356.45 | 23.4 | 61 |
| 5,000 | 1,782.23 | 94.3 | 103 |
Due to the restrictions of the GIL, a single Fluentd process can use only one CPU core. You can install the multiprocess plug-in to run multiple processes and achieve higher log throughput.
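CPython has the same GIL constraint as CRuby, so the same workaround can be illustrated in Python: spread CPU-bound regex parsing across worker processes, which is the idea behind Fluentd's multiprocess plug-in. The simplified pattern below covers only the first three fields, and the sample line format is hypothetical.

```python
import re
from multiprocessing import Pool

# Simplified stand-in for the Fluentd pattern above: only the first
# three fields (ip, rt, time) are captured.
PATTERN = re.compile(r'^(?P<ip>\S+)\s(?P<rt>\d+)\s-\s\[(?P<time>[^\]]*)\]')

def parse(line):
    """Parse one line; returns the named groups, or None on mismatch."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None

def parse_all(lines, workers=4):
    # Each worker is a separate process with its own interpreter,
    # so CPU-bound parsing is not serialized by a single interpreter lock.
    with Pool(workers) as pool:
        return pool.map(parse, lines)
```

In the single-process case, `map(parse, lines)` would be limited to one core no matter how many threads are used.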
Logtail
Logtail 0.9.4 is tested. It extracts log fields by using regular expressions, compresses the data by using the LZ4 algorithm, and then writes the data to Alibaba Cloud Log Service over HTTP. The batch_size parameter is set to 4000.
Log parsing configuration:
logRegex : (\S+)\s(\d+)\s-\s\[([^]]+)]\s"([^"]+)"\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s"(\d+)"\s"(\w+)"\s"(\w+)"\s(\w+)\s(\w+)\s(\d+).*
keys : ip,rt,time,url,status,size,ref,agent,cookie_unb,cookie_cookie2,monitor_traceid,cell,ups,remote_port
timeformat : %d/%b/%Y:%H:%M:%S
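As a sanity check on the configuration above, the following Python sketch applies the same logRegex and key list to a sample line. The sample values are invented for illustration; only the regular expression and the 14 key names come from the configuration.

```python
import re

# logRegex and keys copied from the Logtail configuration above.
LOG_REGEX = re.compile(
    r'(\S+)\s(\d+)\s-\s\[([^]]+)]\s"([^"]+)"\s(\d+)\s(\d+)\s"([^"]+)"\s'
    r'"([^"]+)"\s"(\d+)"\s"(\w+)"\s"(\w+)"\s(\w+)\s(\w+)\s(\d+).*'
)
KEYS = ("ip,rt,time,url,status,size,ref,agent,cookie_unb,"
        "cookie_cookie2,monitor_traceid,cell,ups,remote_port").split(",")

def extract(line: str) -> dict:
    """Map the 14 captured groups onto the configured key names."""
    m = LOG_REGEX.match(line)
    return dict(zip(KEYS, m.groups())) if m else {}

# Hypothetical access-log line shaped to match the pattern.
sample = ('42.120.74.96 123 - [01/Jan/2024:12:00:00 +0800] '
          '"GET /index.html HTTP/1.1" 200 5317 "http://example.com" '
          '"Mozilla/5.0" "123456" "cookie2val" "traceid01" '
          'cell01 ups01 54321')
fields = extract(sample)
```

The regular expression captures exactly 14 groups, one per configured key.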
The following table lists test results.
| Write TPS | Write traffic (KB/s) | CPU usage (%) | Memory usage (MB) |
| --- | --- | --- | --- |
| 500 | 178.22 | 1.7 | 13 |
| 1,000 | 356.45 | 3.0 | 15 |
| 5,000 | 1,782.23 | 15.3 | 23 |
| 10,000 | 3,564.45 | 31.6 | 25 |
Comparison of single-core CPU processing capabilities
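The single-core comparison can be derived from the test tables above by normalizing each tool's highest measured load to write TPS per 100% of one CPU core. A quick sketch using those figures:

```python
# Highest measured load per tool, from the test tables above:
# (write TPS, CPU usage in %).
results = {
    "logstash": (10_000, 483.7),
    "fluentd":  (5_000,  94.3),   # single process, GIL-bound
    "logtail":  (10_000, 31.6),
}

# Normalize to write TPS per 100% of a single CPU core.
per_core_tps = {name: tps / cpu * 100 for name, (tps, cpu) in results.items()}
```

By this measure, Logtail handles roughly 6 times the per-core throughput of Fluentd and about 15 times that of Logstash in this test.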
Summary
The three log collection tools have their own advantages and disadvantages:
Logstash supports all mainstream log formats, has the richest plug-in ecosystem, and allows flexible customization. However, its log collection performance is relatively poor, and it consumes a large amount of memory because it runs on the JVM.
Fluentd supports all mainstream log formats and many plug-ins. Its log collection performance is excellent.
Logtail consumes the fewest CPU and memory resources, achieves the highest throughput, and provides comprehensive support for common log collection scenarios. However, it lacks plug-in support, so it is less flexible and extensible than the preceding two tools.