Better Performance and Cost-effectiveness: Migrating the Self-built ELK to SLS

The article outlines the basic capabilities of SLS and compares them with the self-built open-source ELK, highlighting the significant advantages of SLS over open-source ELK.

By Lei Jing

Background

ELK (Elasticsearch/ES, Logstash, and Kibana) is currently a mainstream open-source logging solution widely used in observability scenarios.

As digitalization accelerates and the volume of machine-generated logs grows, self-built ELK faces numerous challenges in handling large-scale data and maintaining query performance. Making observability data both highly available and cost-effective has become a new focus.

Simple Log Service (SLS) is a serverless, cloud-based observability service launched by Alibaba Cloud. It matches ELK in features while providing high availability, high performance, and cost-effectiveness. SLS now also offers open-source compatibility with ES and Kafka, enabling a smooth transition from self-built ELK to SLS. This lets users enjoy the convenience and cost-effectiveness of cloud logging while keeping their open-source habits.

The Past and Present of SLS and ES

Elasticsearch began in 2010, with its initial code written in Java, and the company behind it was officially formed in 2012. It is built on the Lucene full-text search engine. Initially, ES was mainly used for enterprise search, including document and commodity search. Over the past few years, with the surge in observability data, Elasticsearch has entered the observability market.

SLS has been built for observability scenarios since 2012. It was developed within Alibaba Cloud in C++ on top of the Alibaba Cloud Apsara Distributed File System. Thanks to its high performance and reliability, SLS has earned considerable acclaim from an extensive internal customer base. In 2017, SLS began offering its services to the public on Alibaba Cloud.

As it stands, both Elasticsearch and SLS boast a legacy spanning over a decade. SLS, in particular, has continued to refine its observability services, consistently enhancing quality through underlying technological advancements.

Alibaba Cloud SLS Core Architecture

SLS uses the Alibaba Cloud Apsara Distributed File System as its underlying storage layer and supports storage formats for various types of observability data, such as logs, metrics, and traces. Multi-replica backup is used by default to ensure high availability. SLS also supports several storage classes, including hot storage, cold storage, and archive. On top of the storage layer, it provides various query and computing capabilities, including:

  • SQL analysis: supports the standard SQL-92 syntax
  • Index query and SPL: index query provides query capabilities similar to those of Lucene
  • Data processing: facilitates the secondary processing of reported logs
  • Data pipeline: provides consumption and writing capabilities similar to those of Kafka

In addition to the basic storage and computing capabilities, SDKs in various languages are provided to facilitate business integration. SLS also provides out-of-the-box features for vertical scenarios, including AIOps (anomaly detection and root cause analysis), Copilot (querying data in natural language), alerting, mobile monitoring, and consumer libraries for Flink and Spark. In addition, SLS provides open-source compatibility and can be easily integrated with existing open-source ecosystems, including ES and Kafka. By using this compatibility, you can easily migrate self-built systems to SLS.

Comparison of SLS and ES Functions

| Item | SLS | Open-source self-built ELK |
|---|---|---|
| Collection | iLogtail (implemented in C++, high-performance, and open source) | Beats series and Logstash (low performance) |
| Storage | A single Logstore supports petabytes of data | When a single index grows to hundreds of GB, the index needs to be split |
| Query | Supported | Supported |
| Query without indexes | Supported in SPL mode | Not supported |
| SQL analysis | Standard SQL-92 syntax supported | Incomplete SQL support |
| Streaming consumption | Flink and Spark consumption supported (through the Kafka protocol or the SLS native protocol) | Not supported |
| Alerting | Native support for alerts | Requires X-Pack to enable Kibana Watcher, or third-party alerting (Grafana alerts and ElastAlert) |
| Visualization | SLS native console, Grafana, and Kibana | Kibana and Grafana |
| DevOps platform integration | The SLS console pages can be embedded directly into the DevOps platform | Depends on the limited embedding capability of Kibana and mainly relies on SDKs and APIs for secondary development |
| AIOps | Native support for AIOps | Requires X-Pack to enable AIOps |

SLS natively provides a wide range of features. Because the service is serverless, each feature can be enabled in the cloud with one click.

Comparison of SLS and ES Maintenance

| Item | SLS | Open-source self-built ELK |
|---|---|---|
| Capacity planning | Serverless; no attention required. | You need to pay attention to capacity. If the disk is full, availability is directly affected. ES write performance is poor, so sufficient resources need to be reserved for peaks. |
| Machine O&M | Serverless; no attention required. | You need to pay attention to the availability of machines. If a batch of machines goes down, availability is affected. |
| Performance tuning | You only need to add shards to the Logstore. | Specialized ES expertise is required, and community support may be needed. |
| Version update | Serverless; no attention required (SLS continues to iterate in the background to improve performance). | Open-source ELK does not guarantee version compatibility, and the service may become unavailable due to upgrades. |
| Data reliability | The underlying layer uses the Alibaba Cloud Apsara Distributed File System with three replicas by default. | You need to set the number of replicas yourself. With a single replica, the chance of recovery after accidental data damage is low. |
| Service SLA | Guaranteed by SLS. | Guaranteed by a dedicated person or team. The cluster may become unavailable due to large queries. |

Because SLS is a serverless cloud service, you do not need to purchase instances to use it, which eliminates O&M problems. With self-built ELK, however, you have to deal with many O&M issues. For large deployments, such as data volumes of more than 10 TB, specialists are often needed to maintain and tune ES.

Comparison of SLS and Elasticsearch Performance

Here is a simple test of query and analysis capabilities in a laboratory environment. When querying and analyzing data at the scale of 1 billion entries, SLS responds within seconds. As concurrency increases, however, the response time of ES rises significantly, and its overall latency is higher than that of SLS. Write performance is also worth mentioning. In our measurements, ES writes about 2 MB/s per core, while a single SLS shard supports writes of up to 10 MB/s. With SLS, you can therefore improve write performance simply by increasing the number of shards in the Logstore.
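The write-capacity arithmetic above can be sketched in a few lines. This is a minimal illustration that assumes the throughput figures quoted in this article (about 2 MB/s per ES core and up to 10 MB/s per SLS shard). The function name `units_needed` and the `headroom` parameter are hypothetical and are not part of any SLS API.

```python
import math

# Figures quoted in this article's lab test; real limits depend on the workload.
SLS_MB_PER_SEC_PER_SHARD = 10.0
ES_MB_PER_SEC_PER_CORE = 2.0


def units_needed(peak_mb_per_sec: float, mb_per_sec_per_unit: float,
                 headroom: float = 0.0) -> int:
    """Return how many shards (or cores) are needed to cover a peak write rate.

    headroom is an extra fraction reserved on top of the peak (0.25 means 25%).
    """
    if peak_mb_per_sec < 0 or headroom < 0:
        raise ValueError("peak rate and headroom must be non-negative")
    if mb_per_sec_per_unit <= 0:
        raise ValueError("per-unit capacity must be positive")
    required = peak_mb_per_sec * (1.0 + headroom)
    # Always keep at least one unit, even for a nearly idle stream.
    return max(1, math.ceil(required / mb_per_sec_per_unit))


if __name__ == "__main__":
    peak = 100.0  # hypothetical peak write rate in MB/s
    print("SLS shards:", units_needed(peak, SLS_MB_PER_SEC_PER_SHARD, headroom=0.25))
    print("ES cores:", units_needed(peak, ES_MB_PER_SEC_PER_CORE, headroom=0.25))
```

Under these assumptions, a 100 MB/s peak with 25% headroom needs 13 SLS shards, compared with 63 ES cores.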

SLS Open-source Compatibility

The SLS compatibility with ES and Kafka is built on the underlying SLS storage and computing capabilities. In essence, ES and Kafka requests are converted into SLS protocol requests. Therefore, no matter how a piece of data is written to SLS, it can be queried in an ES-compatible way or consumed in a Kafka-compatible way.

Previously, a Kafka + ELK architecture often required many machines just to synchronize data (running tools such as Logstash and Hangout). Now no data synchronization is needed, because SLS alone serves the same data over different protocols. Simply put, one copy of the data is available through multiple protocols. Data written by using the Kafka protocol can be immediately queried by using the ES protocol. Similarly, data written by using the ES protocol can be immediately consumed by using the Kafka protocol. With the open-source compatibility of SLS, you effectively get a serverless Kafka and a serverless ES at the same time. Billing is pay-as-you-go, so you do not need to purchase instances.
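To make the idea of one Logstore being reachable through two protocols concrete, here is a small sketch that derives both sets of addresses from the same project and Logstore. The naming scheme is an assumption based on my reading of the SLS documentation (Kafka bootstrap address `project.region.log.aliyuncs.com:10012`, Kafka topic equal to the Logstore name, ES endpoint under `/es/`, ES index named `project.logstore`) and should be checked against the references at the end of this article. The `Logstore` class itself is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Logstore:
    """Hypothetical helper that derives protocol addresses for one Logstore."""
    project: str
    region: str
    name: str

    @property
    def host(self) -> str:
        # Assumed public endpoint pattern; verify it for your region.
        return f"{self.project}.{self.region}.log.aliyuncs.com"

    def kafka_bootstrap(self, port: int = 10012) -> str:
        # Assumed port for public access over the Kafka protocol.
        return f"{self.host}:{port}"

    @property
    def kafka_topic(self) -> str:
        # Assumption: the Kafka topic is the Logstore name.
        return self.name

    @property
    def es_endpoint(self) -> str:
        # Assumption: the ES-compatible API is served under the /es/ path.
        return f"https://{self.host}/es/"

    @property
    def es_index(self) -> str:
        # Assumption: the ES index is named "<project>.<logstore>".
        return f"{self.project}.{self.name}"


if __name__ == "__main__":
    store = Logstore(project="my-project", region="cn-hangzhou", name="app-log")
    print(store.kafka_bootstrap(), store.kafka_topic)
    print(store.es_endpoint, store.es_index)
```

Both views point at the same stored data, which is why no synchronization job sits between the two protocols.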

Use Kibana to Access SLS

Kibana requires three components to access SLS:

  • Kibana
  • Proxy: distinguishes between the metadata requests and the log data requests of Kibana.
  • ES: only stores the metadata of Kibana. It occupies few resources, so a small ECS instance is sufficient.

Kibana stores its metadata in ES and updates it in place. SLS storage currently does not support modification, so a small ES is required to hold the metadata. Because this ES only processes metadata requests, its load and storage requirements are very low and can be met by a small ECS instance.
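The routing decision made by the proxy can be illustrated with a short sketch. This is not the actual proxy implementation. It is a hypothetical classifier that assumes Kibana metadata lives in dot-prefixed indices such as `.kibana`, and that cluster-level API calls (paths starting with `_`) are also sent to the small ES. The function name `route_kibana_request` is made up for this example.

```python
def route_kibana_request(path: str) -> str:
    """Return "es" for Kibana metadata requests and "sls" for log data requests."""
    # Drop the query string, then take the first path segment (the index target).
    target = path.split("?", 1)[0].lstrip("/").split("/", 1)[0]
    if target == "" or target.startswith("_"):
        # Assumption: cluster-level APIs such as /_nodes go to the small ES.
        return "es"
    indices = target.split(",")
    if any(index.startswith(".") for index in indices):
        # Assumption: dot-prefixed indices (for example .kibana) hold Kibana metadata.
        return "es"
    # Everything else is treated as a log query against the ES-compatible API of SLS.
    return "sls"


if __name__ == "__main__":
    for request_path in ("/.kibana/_search",
                         "/my-project.my-logstore/_search?size=10",
                         "/_nodes"):
        print(request_path, "->", route_kibana_request(request_path))
```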

For more information about how to use Kibana to access SLS, see Connect to Kibana [1].

Use the Grafana ES plug-in to Access SLS

In addition to using Kibana for log visualization, you can also use the Grafana ES plug-in to access SLS. Using the Grafana ES plug-in to access SLS ES-compatible interfaces provides the following benefits:

  • You do not need to write SQL statements. You can build charts through operations in the Grafana interface.
  • You do not need to install additional plug-ins in Grafana.

For more information about how to use the ES plug-in provided by Grafana to access SLS, see Use the Grafana ES plug-in to access SLS [2].

Use the Kafka SDK to Write to or Consume from SLS

You can use the official Kafka SDK to connect to the Kafka-compatible interfaces of SLS. Both writing and consuming data are supported.

It is recommended that you use the official Kafka SDK for consumption. For more information, see Use Kafka SDK to Consume SLS [3], and Use Various Agents to Write SLS Kafka-compatible Interfaces [4].
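As a rough sketch of what such a connection involves, the helper below builds a client configuration using librdkafka-style property names, as accepted by clients such as confluent-kafka. The authentication scheme shown (SASL_SSL with the PLAIN mechanism, the project name as the username, and `AccessKeyId#AccessKeySecret` as the password) is an assumption based on my reading of the SLS documentation, so confirm it in [3] and [4] before use. The function name `sls_kafka_client_config` is hypothetical.

```python
from typing import Dict, Optional


def sls_kafka_client_config(bootstrap_servers: str, project: str,
                            access_key_id: str, access_key_secret: str,
                            group_id: Optional[str] = None) -> Dict[str, str]:
    """Build a librdkafka-style config dict for the Kafka-compatible interface."""
    config = {
        "bootstrap.servers": bootstrap_servers,
        # Assumed authentication scheme; verify against the SLS documentation.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": project,
        "sasl.password": f"{access_key_id}#{access_key_secret}",
    }
    if group_id is not None:
        # Consumers additionally need a consumer group.
        config["group.id"] = group_id
        config["auto.offset.reset"] = "earliest"
    return config


if __name__ == "__main__":
    producer_config = sls_kafka_client_config(
        "my-project.cn-hangzhou.log.aliyuncs.com:10012",  # hypothetical address
        "my-project", "example-key-id", "example-key-secret")
    # With confluent-kafka installed, this dict could be passed to
    # confluent_kafka.Producer(producer_config).
    print(sorted(producer_config))
```

Passing a `group_id` returns a consumer configuration instead. Java clients express the same credentials through `sasl.jaas.config` rather than `sasl.username` and `sasl.password`.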

Smooth Migration Solution for Open-source ELK

Use the Dual-collection Solution for Migration

You can deploy iLogtail, the collection agent of SLS, on the original machines and use it to collect business logs to SLS (the same log can be collected by multiple agents without conflicts). Then, use the ES-compatible and Kafka-compatible capabilities to connect SLS to the original applications. This solution makes it easy to verify performance and data integrity. After full verification, you can remove the Filebeat agent from the machines to complete the switchover of the collection pipeline.
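One simple way to check data integrity during dual collection is to compare log counts per time bucket between the old pipeline and SLS. The sketch below assumes you have already obtained such counts from both systems (for example, one count per minute). The function name `compare_counts` and the `tolerance` parameter are hypothetical and only illustrate the comparison step.

```python
from typing import Dict, List, Tuple


def compare_counts(reference: Dict[str, int], candidate: Dict[str, int],
                   tolerance: float = 0.0) -> List[Tuple[str, int, int]]:
    """Return (bucket, reference_count, candidate_count) for mismatching buckets.

    tolerance is the allowed relative difference (0.01 means 1%).
    """
    mismatches = []
    for bucket in sorted(set(reference) | set(candidate)):
        ref_count = reference.get(bucket, 0)
        cand_count = candidate.get(bucket, 0)
        allowed = tolerance * max(ref_count, cand_count)
        if abs(ref_count - cand_count) > allowed:
            mismatches.append((bucket, ref_count, cand_count))
    return mismatches


if __name__ == "__main__":
    # Hypothetical per-minute counts from the existing ELK and from SLS.
    elk_counts = {"10:00": 1200, "10:01": 1180, "10:02": 1210}
    sls_counts = {"10:00": 1200, "10:01": 1179, "10:02": 1210}
    print(compare_counts(elk_counts, sls_counts))
    print(compare_counts(elk_counts, sls_counts, tolerance=0.01))
```

A small tolerance is useful because the two agents may flush at slightly different times, so counts near bucket boundaries can differ temporarily.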

Use the Open-source Agent for Direct Write Migration

If you have a new business or an application without historical burden that wants to try SLS, but you do not want to install iLogtail on the machine, you can reuse the original collection agent and write its logs to SLS by using the Kafka protocol. For more information, see Use the Kafka Protocol to Upload Logs [5]. After logs are written to SLS, you can use the open-source-compatible interfaces of SLS to connect to visualization tools such as Kibana and Grafana if you want to keep your open-source usage habits.

Use the Kafka Import Feature for Migration

If you do not want to change the original collection pipeline and want to keep the original Kafka (which is usually hard to replace because many legacy programs rely on it), you can use this solution. The Kafka import feature of SLS imports Kafka data to SLS without requiring you to deploy an instance. You configure the import on the console page, and continuous import is also supported. For more information, see SLS Kafka Import [6]. After you import Kafka data to SLS, you can use the open-source compatible capabilities of SLS to retain your open-source usage habits.

Use the ES Import Feature for Existing Data Migration

To import historical data from ES to SLS for retention, you can use the ES import feature of SLS. For more information, see ES Import [7].

Summary

This article has outlined the fundamental capabilities of SLS and conducted a comparison with the self-hosted, open-source ELK stack, demonstrating that SLS offers significant benefits. The serverless service provided by SLS effectively reduces the operational and maintenance pressure and costs for the operations teams, enhancing the overall log management experience. SLS now presents an extensive array of open-source compatibility features, allowing you to enjoy various SLS functionalities while maintaining your open-source habits. It facilitates an easy transition and smooth migration between SLS and ELK log systems.

You are welcome to use SLS. If you have any questions, please contact us by submitting a ticket.

References

[1] Connect to Kibana

https://www.alibabacloud.com/help/en/sls/developer-reference/connect-log-service-to-kibana

[2] Use the Grafana ES plug-in to access SLS

https://www.alibabacloud.com/help/en/sls/user-guide/use-grafana-to-access-the-elasticsearch-compatible-api-of-log-service

[3] Use Kafka SDK to Consume SLS

https://www.alibabacloud.com/help/en/sls/user-guide/overview-of-kafka-consumption

[4] Use Various Agents to Write SLS Kafka-compatible Interfaces

https://www.alibabacloud.com/help/en/sls/user-guide/use-the-kafka-protocol-to-upload-logs

[5] Use the Kafka Protocol to Upload Logs

https://www.alibabacloud.com/help/en/sls/user-guide/use-the-kafka-protocol-to-upload-logs

[6] SLS Kafka Import

https://www.alibabacloud.com/help/en/sls/user-guide/import-data-from-kafka-to-log-service

[7] ES Import

https://www.alibabacloud.com/help/en/sls/user-guide/import-data-from-elasticsearch-to-log-service
