By Lei Jing
ELK (Elasticsearch/ES, Logstash, and Kibana) is currently a mainstream open-source logging solution widely used in observability scenarios.
As digitalization accelerates and the volume of machine-generated log data grows, self-built ELK deployments face numerous challenges with large-scale data and query performance. Keeping observability data both highly available and cost-effective has become a new focus.
Simple Log Service (SLS) is a serverless, cloud-based observability service launched by Alibaba Cloud. In terms of features, it aligns with ELK, providing high availability, high performance, and cost-effectiveness. Furthermore, SLS now offers open-source compatibility with ES and Kafka, enabling a smooth transition from self-built ELK deployments to SLS. This allows users to enjoy the convenience and cost-effectiveness of cloud logging while keeping their open-source habits.
Elasticsearch began its journey in 2010, with its initial code written in Java, and officially formed a company in 2012. It is built on the Lucene full-text search engine. Initially, ES was mainly used for enterprise search, including document and commodity search. Over the past few years, with the surge in observability data, Elasticsearch has also entered the observability market.
SLS has been developed within Alibaba Cloud since 2012, tailored for observability scenarios and written in C++ on top of the robust Alibaba Cloud Apsara Distributed File System. Thanks to its high performance and reliability, SLS has garnered considerable acclaim from an extensive internal customer base. In 2017, SLS began offering its services to the public on Alibaba Cloud.
As it stands, both Elasticsearch and SLS boast a legacy spanning over a decade. SLS, in particular, has continued to refine its observability services, consistently enhancing quality through underlying technological advancements.
SLS uses the Alibaba Cloud Apsara Distributed File System as the underlying storage layer and supports storage formats for various types of observability data such as logs, metrics, and traces. By default, multi-replica backup is used to ensure high availability. SLS also supports various storage classes, including hot storage, cold storage, and archive. On top of the storage layer, it provides various query and computing capabilities, including indexed queries, index-free scan queries (SPL), and SQL analysis.
In addition to the basic storage and computing capabilities, SDKs in various languages are provided to facilitate business integration. SLS also provides out-of-the-box features for vertical scenarios, including AIOps (anomaly detection and root cause analysis), Copilot (support for querying data in natural language), alerting, mobile monitoring, and consumer libraries for Flink and Spark. In addition, SLS provides open-source compatibility and can be easily integrated with existing open-source ecosystems, including ES and Kafka. By using this compatibility, you can easily migrate self-built systems to SLS.
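For example, here is a minimal sketch of writing a log entry with the SLS Python SDK (aliyun-log-python-sdk); the endpoint, project, and Logstore names below are placeholders to replace with your own.

```python
# Minimal sketch: write one log entry to SLS with the Python SDK.
# pip install aliyun-log-python-sdk
import time
from aliyun.log import LogClient, LogItem, PutLogsRequest

endpoint = "cn-hangzhou.log.aliyuncs.com"      # region endpoint (adjust to your region)
access_key_id = "<your-access-key-id>"
access_key_secret = "<your-access-key-secret>"
project = "my-project"                          # hypothetical project name
logstore = "my-logstore"                        # hypothetical Logstore name

client = LogClient(endpoint, access_key_id, access_key_secret)

# Build one log entry with two key-value fields.
item = LogItem()
item.set_time(int(time.time()))
item.set_contents([("level", "INFO"), ("message", "hello from the SDK")])

# Send the entry; topic and source are optional metadata.
request = PutLogsRequest(project, logstore, topic="", source="", logitems=[item])
client.put_logs(request)
```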
Item | SLS | Open-source self-built ELK |
---|---|---|
Collection | iLogtail (implemented in C++, high-performance, and open source) | Beats series and Logstash (low performance)
Storage | A single Logstore supports petabytes of data | A single index with hundreds of GB of data needs to be split
Query | Supported | Supported
Query without indexes | Supported through SPL-based scan queries | Not supported
SQL analysis | Supports standard SQL-92 syntax | Incomplete SQL support
Streaming consumption | Supports Flink and Spark consumption (via both the Kafka protocol and the SLS native protocol) | Not supported
Alerting | Native support for alerting | Requires X-Pack to enable Watcher, or third-party alerting (Grafana alerts, ElastAlert)
Visualization | SLS native console, Grafana, and Kibana | Kibana and Grafana
DevOps platform integration | SLS console pages can be directly embedded into a DevOps platform | Depends on the limited embedding capability of Kibana and mainly relies on SDK/API-based secondary development
AIOps | Native support for AIOps | Requires X-Pack to enable AIOps
SLS natively provides a wide range of features. Thanks to its serverless architecture, they can be enabled with one click in the cloud.
Item | SLS | Open-source self-built ELK |
---|---|---|
Capacity planning | Serverless; no attention required. | You need to watch capacity closely: a full disk directly affects availability, and because ES write performance is limited, sufficient resources must be reserved for peaks.
Machine O&M | Serverless; no attention required. | You need to watch machine availability; if a batch of machines goes down, availability is affected.
Performance tuning | You only need to scale out the Logstore shards. | Specialized ES expertise is required, and community support may be needed.
Version upgrade | Serverless; no attention required (SLS continues to iterate in the background to improve performance). | Open-source ELK does not guarantee version compatibility, and upgrades may cause unavailability.
Data reliability | The underlying layer uses the industry-leading Alibaba Cloud Apsara Distributed File System with three replicas by default. | You need to set the number of replicas yourself; with a single replica, the chance of recovery after accidental data damage is low.
Service SLA | Guaranteed by SLS. | Guaranteed by a dedicated person or team; the cluster may become unavailable due to large queries.
Because SLS is a serverless cloud service, you do not need to purchase instances to use it, which eliminates O&M problems. With self-built ELK, by contrast, you need to handle many O&M issues yourself. For heavy usage, such as data volumes above 10 TB, dedicated specialists are often required to maintain and tune ES.
Here is a simple test of query and analysis capabilities in a laboratory environment. When querying and analyzing data at the 1-billion-row level, the response time of SLS stays within seconds. As concurrency increases, however, the response time of ES grows significantly, and its overall latency is higher than that of SLS. Write performance is also worth mentioning: in our measurements, ES writes about 2 MB/s per core, while a single SLS shard supports up to 10 MB/s. You can therefore easily improve write performance by increasing the number of shards in a Logstore.
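As a sketch of that scaling knob, the snippet below uses the aliyun-log-python-sdk to create a Logstore with more shards. The shard count and retention values are illustrative assumptions; check the exact per-shard write limits and Logstore parameters in the SLS documentation.

```python
# Minimal sketch: provision a Logstore with more shards to raise write throughput.
from aliyun.log import LogClient

client = LogClient("cn-hangzhou.log.aliyuncs.com",      # region endpoint (adjust as needed)
                   "<your-access-key-id>", "<your-access-key-secret>")

# With roughly 10 MB/s of write capacity per shard, 8 shards target ~80 MB/s.
client.create_logstore("my-project", "high-throughput-logstore",
                       ttl=30,            # retention in days (illustrative)
                       shard_count=8)     # scale write capacity by shard count
```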
The SLS compatibility with ES and Kafka is built on top of the underlying SLS storage and computing capabilities. In essence, ES and Kafka requests are converted into SLS protocol requests. Therefore, no matter how a piece of data is written to SLS, it can be queried in an ES-compatible way or consumed in a Kafka-compatible way.
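As an illustration of the ES-compatible path, the sketch below issues a standard `_search` request against a Logstore with a plain HTTP client. The URL path (`/es/`), the Basic-Auth-with-AccessKey scheme, and the index-name-equals-Logstore mapping are assumptions to verify against the SLS ES-compatibility documentation linked below.

```python
# Minimal sketch: query a Logstore through the ES-compatible interface over HTTP.
import requests

project = "my-project"                       # hypothetical SLS project
endpoint = "cn-hangzhou.log.aliyuncs.com"    # region endpoint (adjust as needed)
access_key_id = "<your-access-key-id>"
access_key_secret = "<your-access-key-secret>"
logstore = "my-logstore"                     # exposed as an ES index (assumption)

resp = requests.post(
    f"https://{project}.{endpoint}/es/{logstore}/_search",  # assumed URL layout
    auth=(access_key_id, access_key_secret),                # assumed Basic Auth mapping
    json={"query": {"match": {"level": "INFO"}}, "size": 10},
    timeout=10,
)
print(resp.status_code)
print(resp.json())
```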
Previously, the Kafka + ELK architecture often required many machines to synchronize data (for example, Logstash and HangOut). Now, no data synchronization is required at all, because SLS alone exposes the data through different protocols. Simply put, one piece of data is available over multiple protocols: data written over the Kafka protocol can be immediately queried over the ES protocol, and data written over the ES protocol can be immediately consumed over Kafka. With the open-source compatibility of SLS, you effectively get a serverless Kafka and a serverless ES at the same time. Billing is pay-as-you-go, so you do not need to purchase instances.
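The following is a minimal sketch of the Kafka side of that picture, using the kafka-python client to write a log entry over the SLS Kafka-compatible interface. The bootstrap address, port 10012, and the SASL credential format are assumptions based on the "Use the Kafka Protocol to Upload Logs" guide referenced below and should be checked against it.

```python
# Minimal sketch: write one record to SLS over the Kafka protocol.
# pip install kafka-python
import json
from kafka import KafkaProducer

project = "my-project"                       # hypothetical SLS project
endpoint = "cn-hangzhou.log.aliyuncs.com"    # region endpoint (adjust as needed)
access_key_id = "<your-access-key-id>"
access_key_secret = "<your-access-key-secret>"

producer = KafkaProducer(
    bootstrap_servers=f"{project}.{endpoint}:10012",             # assumed Kafka-compatible port
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username=project,                                 # assumed: project name
    sasl_plain_password=f"{access_key_id}#{access_key_secret}",  # assumed: "id#secret"
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# The topic maps to the Logstore name; each record becomes one log entry.
producer.send("my-logstore", {"level": "INFO", "message": "hello over Kafka protocol"})
producer.flush()
```

Once this record lands in the Logstore, it is immediately available to the ES-compatible `_search` request sketched above, without any extra synchronization.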
Kibana requires three components to access SLS. Because Kibana stores and updates its metadata in ES, while SLS currently provides append-only (non-modifiable) storage, a small ES instance is still required to carry that metadata. This ES only processes metadata requests, so its load and data volume are very low and can be handled by a small-specification ECS instance.
For more information about how to use Kibana to access SLS, see Connect to Kibana [1].
In addition to using Kibana for log visualization, you can also use the Grafana ES plug-in to access SLS. Accessing the SLS ES-compatible interfaces through the Grafana ES plug-in provides several benefits.
For more information about how to use the ES plug-in provided by Grafana to access SLS, see Use the Grafana ES plug-in to access SLS [2].
You can use the official Kafka SDK to connect to the SLS Kafka-compatible interfaces, which support both writing and consuming data over the Kafka protocol.
It is recommended that you use the official Kafka SDK for consumption. For more information, see Use Kafka SDK to Consume SLS [3] and Use Various Agents to Write SLS Kafka-compatible Interfaces [4].
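For example, a minimal consumption sketch with kafka-python might look like the following; as above, the port, credential format, and the topic-equals-Logstore mapping are assumptions to verify against the linked guides.

```python
# Minimal sketch: consume a Logstore through the SLS Kafka-compatible interface.
# pip install kafka-python
from kafka import KafkaConsumer

project = "my-project"                       # hypothetical SLS project
endpoint = "cn-hangzhou.log.aliyuncs.com"    # region endpoint (adjust as needed)
access_key_id = "<your-access-key-id>"
access_key_secret = "<your-access-key-secret>"

consumer = KafkaConsumer(
    "my-logstore",                                               # topic = Logstore name (assumption)
    bootstrap_servers=f"{project}.{endpoint}:10012",             # assumed Kafka-compatible port
    group_id="demo-consumer-group",
    auto_offset_reset="earliest",
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username=project,                                 # assumed: project name
    sasl_plain_password=f"{access_key_id}#{access_key_secret}",  # assumed: "id#secret"
)

# Print each consumed record; message.value is the raw log payload in bytes.
for message in consumer:
    print(message.topic, message.offset, message.value)
```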
You can deploy the SLS iLogtail collection agent on the original machines and use iLogtail to collect business logs to SLS (the same log can be collected by multiple agents without conflicts), then use the ES-compatible and Kafka-compatible capabilities to connect the original applications. With this solution, performance and data integrity can be easily verified. After full verification, you can remove the Filebeat agent from the machines to complete the switchover of the collection path.
For a new business or an application that wants to try SLS without historical burden, and where you do not want to install iLogtail on the machines, you can reuse the original collection agent and write its logs to SLS over the Kafka protocol. For more information, see Use the Kafka Protocol to Upload Logs [5]. After logs are written to SLS, you can use the SLS-compatible interfaces to connect to visualization tools such as Kibana and Grafana if you want to keep your open-source usage habits.
If you do not want to change the original collection path and want to keep the original Kafka cluster (this is usually hard to change because many legacy programs rely on Kafka), you can use the Kafka data import feature of SLS to import Kafka data into SLS without deploying any additional instances. The import is configured in the console, and continuous import is supported. For more information, see SLS Kafka Import [6]. After Kafka data is imported into SLS, you can use the open-source compatibility of SLS to retain your open-source usage habits.
For scenarios where you want to import historical data from ES into SLS for retention, you can use the ES import feature of SLS. For more information, see ES Import [7].
This article has outlined the fundamental capabilities of SLS and conducted a comparison with the self-hosted, open-source ELK stack, demonstrating that SLS offers significant benefits. The serverless service provided by SLS effectively reduces the operational and maintenance pressure and costs for the operations teams, enhancing the overall log management experience. SLS now presents an extensive array of open-source compatibility features, allowing you to enjoy various SLS functionalities while maintaining your open-source habits. It facilitates an easy transition and smooth migration between SLS and ELK log systems.
You are welcome to try SLS. If you have any questions, please contact us by submitting a ticket.
[1] Connect to Kibana
https://www.alibabacloud.com/help/en/sls/developer-reference/connect-log-service-to-kibana
[2] Use the Grafana ES plug-in to access SLS
[3] Use Kafka SDK to Consume SLS
https://www.alibabacloud.com/help/en/sls/user-guide/overview-of-kafka-consumption
[4] Use Various Agents to Write SLS Kafka-compatible Interfaces
https://www.alibabacloud.com/help/en/sls/user-guide/use-the-kafka-protocol-to-upload-logs
[5] Use the Kafka Protocol to Upload Logs
https://www.alibabacloud.com/help/en/sls/user-guide/use-the-kafka-protocol-to-upload-logs
[6] SLS Kafka Import
https://www.alibabacloud.com/help/en/sls/user-guide/import-data-from-kafka-to-log-service
[7] ES Import
https://www.alibabacloud.com/help/en/sls/user-guide/import-data-from-elasticsearch-to-log-service