Simple Log Service: Overview of rule-based data consumption

Last Updated: Sep 24, 2024

Simple Log Service provides the rule-based data consumption feature. You can use Simple Log Service Processing Language (SPL) to process data in Simple Log Service before you consume the data. This topic describes the concept, benefits, scenarios, billing rules, and consumers of the rule-based data consumption feature.

How it works

Rule-based data consumption means that consumers use SPL to consume log data in real time. Consumers include third-party software, applications in various programming languages, cloud services, and stream computing frameworks. SPL is a high-performance processing language that supports semi-structured log data. Simple Log Service uses SPL to preprocess and cleanse semi-structured log data in Simple Log Service. For example, Simple Log Service can filter data by row, prune data by column, extract data by using regular expressions, and extract JSON fields. This way, your client obtains and consumes structured data. For more information about the SPL syntax, see SPL overview.
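
The four preprocessing operations named above can be illustrated with a minimal local sketch in Python. This sketch only emulates the effect of the operations on the client side; it is not SPL and does not call Simple Log Service. The sample logs, field names (level, content, detail, debug), and the preprocess function are hypothetical and exist only for illustration.

```python
import json
import re

# Hypothetical raw logs; the field names and values are made up for illustration.
RAW_LOGS = [
    {
        "level": "ERROR",
        "content": "10.0.0.1 GET /api/orders 500",
        "detail": '{"user": "alice", "cost_ms": 120}',
        "debug": "trace-1",
    },
    {
        "level": "INFO",
        "content": "10.0.0.2 GET /health 200",
        "detail": '{"user": "bob", "cost_ms": 3}',
        "debug": "trace-2",
    },
]

# Assumed layout of the "content" field: "<ip> <method> <path> <status>".
ACCESS_PATTERN = re.compile(
    r"(?P<ip>\S+) (?P<method>\S+) (?P<path>\S+) (?P<status>\d+)"
)


def preprocess(logs):
    """Emulate row filtering, column pruning, regex and JSON extraction."""
    structured = []
    for log in logs:
        # 1. Filter by row: keep only ERROR logs.
        if log.get("level") != "ERROR":
            continue
        # 2. Prune by column: drop the "debug" field.
        row = {key: value for key, value in log.items() if key != "debug"}
        # 3. Extract fields by using a regular expression.
        match = ACCESS_PATTERN.match(row.get("content", ""))
        if match:
            row.update(match.groupdict())
        # 4. Extract JSON fields; leave the row unchanged if parsing fails.
        try:
            parsed = json.loads(row.get("detail", "{}"))
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            row.update(parsed)
        structured.append(row)
    return structured


if __name__ == "__main__":
    for row in preprocess(RAW_LOGS):
        print(row)
```

With rule-based data consumption, equivalent processing runs in Simple Log Service, so the client receives only the structured rows instead of performing these steps itself.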

Note

The rule-based data consumption and query and analysis features are both used to read data. For more information about the differences between the two features, see What are the differences between LogHub and LogSearch?

Scenarios

The rule-based data consumption feature is suitable for stream computing and real-time computing scenarios in which you need to preprocess data before you consume it. Typical preprocessing includes filtering data by row, pruning data by column, extracting data by using regular expressions, and extracting JSON fields. The feature is time-sensitive and delivers data for consumption within seconds. You can configure a custom data retention period.

Benefits

  • You can reduce traffic costs when you consume data over the Internet.

    • For example, you write log data to Simple Log Service, consume the log data over the Internet, filter the log data, and then distribute the filtered log data to an internal system. With the rule-based data consumption feature, you can use SPL to filter the log data in Simple Log Service. This way, only valid log data is sent to the consumers, and traffic costs are reduced.

  • You can compute data in Simple Log Service. This reduces local CPU consumption and accelerates the computing process.

    • For example, you want to write log data to Simple Log Service, consume the log data by using an on-premises machine, and then compute the log data on the on-premises machine. The rule-based data consumption feature allows you to use SPL to compute log data in Simple Log Service. This reduces local resource consumption.
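
The traffic benefit described above can be estimated with simple arithmetic. The following sketch is an illustration under stated assumptions, not a billing formula: the estimate_read_traffic function, the keep_ratio parameter (fraction of data that remains after SPL filtering), and the compression_ratio parameter (raw size divided by compressed size) are hypothetical, and the sketch assumes that the compression ratio is the same before and after filtering.

```python
from typing import Tuple


def estimate_read_traffic(
    raw_bytes: int, keep_ratio: float, compression_ratio: float
) -> Tuple[float, float]:
    """Return (unfiltered, filtered) read traffic in compressed bytes.

    keep_ratio: assumed fraction of data that remains after filtering (0 to 1).
    compression_ratio: assumed raw size divided by compressed size (> 0).
    """
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be between 0 and 1")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    unfiltered = raw_bytes / compression_ratio
    filtered = unfiltered * keep_ratio
    return unfiltered, filtered


if __name__ == "__main__":
    # Hypothetical workload: 100 GiB of raw logs, 25% kept, 5x compression.
    unfiltered, filtered = estimate_read_traffic(100 * 1024**3, 0.25, 5.0)
    saved = unfiltered - filtered
    print(f"without filtering: {unfiltered / 1024**3:.1f} GiB")
    print(f"with filtering:    {filtered / 1024**3:.1f} GiB")
    print(f"saved:             {saved / 1024**3:.1f} GiB")
```

The actual savings depend on how selective your SPL statement is and on how well the remaining data compresses.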

Billing rules

  • If your Logstores use the pay-by-ingested-data billing mode, you are not charged for rule-based data consumption. However, if data is pulled over a public Simple Log Service endpoint, you are charged for read traffic over the Internet. The traffic is calculated based on the size of data after compression. For more information, see Billable items of pay-by-ingested-data.

  • If your Logstores use the pay-by-feature billing mode, you are charged for computing in Simple Log Service. If you access a public Simple Log Service endpoint, you may be charged for Internet traffic. For more information, see Billable items of pay-by-feature.

Consumers

The following table describes the consumers that are supported by the rule-based data consumption feature.

| Type | Consumer | Description |
| --- | --- | --- |
| Applications in various programming languages | Applications in various programming languages | Applications that are developed in programming languages, such as Java, Python, and Go, can consume data in Simple Log Service as consumer groups. For more information, see Consume log data and Use consumer groups to consume data. |
| Cloud services | Realtime Compute for Apache Flink | You can use Realtime Compute for Apache Flink to consume data in Simple Log Service in real time. For more information, see Simple Log Service connector. |
| Stream computing frameworks | Apache Kafka | You can use the stream computing framework Apache Kafka to consume data in Simple Log Service in real time. For more information, see Use Kafka to consume data based on SPL statements. |

Usage notes

  • Rule-based data consumption involves complex computing in Simple Log Service. Depending on the complexity of the SPL statement and the characteristics of the data, read latency increases slightly. For example, when Simple Log Service reads 5 MB of data, the read latency increases by 10 to 100 milliseconds. In most cases, the total time from the data read in Simple Log Service to the completion of computing on your on-premises machine still decreases, even though the read latency increases.

  • If your SPL statement contains invalid syntax or references source fields that do not exist, data may be lost or data consumption may fail. For more information, see Handle errors.

  • If you use the rule-based data consumption feature, you can specify an SPL statement that is up to 4 KB in size.
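
The size limit in the last note can be checked on the client before a statement is submitted. The following sketch is a minimal illustration and makes two assumptions that the source does not confirm: that 4 KB means 4,096 bytes, and that the size is measured on the UTF-8 encoding of the statement. The spl_size_ok function and the sample statement are hypothetical.

```python
# Assumption: "4 KB" is interpreted as 4,096 bytes.
MAX_SPL_BYTES = 4 * 1024


def spl_size_ok(statement: str) -> bool:
    """Return True if the UTF-8 encoded statement fits within the assumed limit."""
    return len(statement.encode("utf-8")) <= MAX_SPL_BYTES


if __name__ == "__main__":
    # Illustrative statement only; see SPL overview for the actual syntax.
    statement = "* | where level = 'ERROR'"
    print(spl_size_ok(statement))
```

Counting bytes instead of characters matters when a statement contains non-ASCII text, because a single character can occupy several bytes in UTF-8.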