Simple Log Service lets you preprocess log data with Simple Log Service Processing Language (SPL) before consumers receive it. Rule-based data consumption filters, transforms, and structures log data in real time, reducing Internet traffic costs, offloading compute from local machines, and delivering only the data your applications need.
How it works
Rule-based data consumption applies SPL statements to log data as it is read from Simple Log Service, before the data reaches your consumer. Consuming entities include:
Third-party software
Applications in various programming languages
Cloud services
Stream computing frameworks
SPL is a high-performance processing language designed for semi-structured log data. When a consumer reads data from a Logstore, Simple Log Service uses SPL to preprocess and cleanse the data. The consumer then receives structured data as output.
| Operation | Description | SPL example |
|---|---|---|
| Filter data by row | Keep only rows that match a condition | * \| where status = 200 |
| Prune data by column | Return only specified fields | * \| project method, url, status |
| Extract data by using regular expressions | Parse unstructured text into fields | * \| parse-regexp content, '(\S+) (\S+)' as method, url |
| Extract JSON fields | Extract first-layer JSON key-value pairs into individual fields | * \| parse-json content |
For the full SPL syntax reference, see SPL syntax.
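The operations in the table can be illustrated with a local analogue. The following Python sketch is an illustration only, not the Simple Log Service API: it applies the same four transformations (filter, regex extraction, projection, JSON expansion) to in-memory records, using made-up sample data.

```python
import json
import re

# Sample log entries; "content" holds a raw request line (synthetic data).
logs = [
    {"content": "GET /index.html", "status": "200"},
    {"content": "POST /login",     "status": "500"},
]

# Analogue of: * | where status = 200  -- keep only matching rows
filtered = [r for r in logs if r["status"] == "200"]

# Analogue of: * | parse-regexp content, '(\S+) (\S+)' as method, url
for r in filtered:
    m = re.match(r"(\S+) (\S+)", r["content"])
    if m:
        r["method"], r["url"] = m.groups()

# Analogue of: * | project method, url, status  -- keep only listed fields
projected = [{k: r[k] for k in ("method", "url", "status")} for r in filtered]
print(projected)  # [{'method': 'GET', 'url': '/index.html', 'status': '200'}]

# Analogue of: * | parse-json content  -- expand first-layer JSON keys
json_log = {"content": json.dumps({"level": "ERROR", "msg": "timeout"})}
json_log.update(json.loads(json_log["content"]))
print(json_log["level"])  # ERROR
```

In the service itself, these steps run inside Simple Log Service before the data is sent, so the consumer receives only the projected result.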

Rule-based data consumption and the query and analysis feature both read data, but they serve different purposes. For details, see What are the differences between LogHub and LogSearch?
Scenarios
Rule-based data consumption is designed for stream computing and real-time computing scenarios where you need to preprocess data before consuming it. Typical use cases:
Row filtering -- Discard irrelevant log entries at the source so consumers process only the data they need.
Column pruning -- Remove unnecessary fields to reduce the volume of data transferred to consumers.
Regex extraction -- Parse semi-structured log lines into structured fields before consumption.
JSON field extraction -- Extract first-layer JSON key-value pairs into individual fields for downstream processing.
Rule-based data consumption is latency-sensitive: data reaches consumers within seconds of being written. You can configure a custom data retention period for your Logstore.
Benefits
Reduce Internet traffic costs
Without rule-based consumption, consumers pull all raw log data over the Internet and filter it locally. With rule-based consumption, SPL filters data inside Simple Log Service before it leaves, so only the data you need crosses the network.
Example: You write log data to Simple Log Service and consume it over the Internet. By applying an SPL statement such as * | where level = 'ERROR', only matching rows are sent to the consumer. This reduces the volume of data transferred over the Internet and lowers traffic costs.
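A rough local analogue of this saving, using synthetic data (the 5% ERROR ratio and record shape are assumptions for illustration, and this is not the service API): filtering before transfer shrinks the payload roughly in proportion to the match rate.

```python
import json
import random

random.seed(0)
# Synthetic log stream: ~5% ERROR rows, the rest INFO (assumed ratio).
logs = [{"level": "ERROR" if random.random() < 0.05 else "INFO",
         "msg": "request handled"} for _ in range(10_000)]

raw_bytes = len(json.dumps(logs).encode())

# Local analogue of applying * | where level = 'ERROR' inside the service:
errors = [r for r in logs if r["level"] == "ERROR"]
sent_bytes = len(json.dumps(errors).encode())

print(f"raw: {raw_bytes} B, sent: {sent_bytes} B, "
      f"saved: {1 - sent_bytes / raw_bytes:.0%}")
```

Because Internet traffic is billed on what crosses the network, the saving in bytes translates directly into a saving in traffic cost.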
Reduce local CPU consumption
Without rule-based consumption, your on-premises machine must download and compute all raw data. With rule-based consumption, SPL runs the compute inside Simple Log Service, so your machine receives pre-processed results.
Example: You write log data to Simple Log Service and consume it on an on-premises machine. By using SPL to compute log data in Simple Log Service, you offload processing from your local machine and accelerate the overall pipeline.
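The offload can be sketched locally as well (synthetic data, not the service API): without rule-based consumption the consumer inspects every raw row; with it, the consumer only iterates over the pre-filtered result.

```python
# Synthetic stream: the consumer needs only ERROR rows (1% of the data).
stream = [{"level": "INFO"}] * 9_900 + [{"level": "ERROR"}] * 100

# Without rule-based consumption: the local machine inspects every raw row.
inspected = 0
kept = []
for row in stream:
    inspected += 1
    if row["level"] == "ERROR":
        kept.append(row)

# With rule-based consumption: SPL filters server-side, so the local
# machine only handles the rows it actually needs.
server_filtered = [r for r in stream if r["level"] == "ERROR"]
received = len(server_filtered)

print(inspected, received)  # 10000 100
```

The 100x difference in rows handled locally is what "offloading compute" means in practice: the filtering CPU cost moves into Simple Log Service.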
Supported consumers
| Type | Consumer | Description |
|---|---|---|
| Applications in various programming languages | Applications written in languages such as Java, Python, and Go | Applications consume data in Simple Log Service as consumer groups. See Consume log data by using Simple Log Service SDK and Use consumer groups to consume logs. Best practices: Use Simple Log Service SDK to perform SPL-based consumption |
| Cloud services | Realtime Compute for Apache Flink | Consume data from Simple Log Service in real time. See Simple Log Service connector. Best practices: Implement row filtering and column pruning by Flink SQL by using SPL statements and Semi-structured analysis based on SPL in Flink SQL |
| Stream computing frameworks | Apache Kafka | To request support, submit a ticket. |
Billing
How rule-based data consumption is billed depends on the billing mode of your Logstore.
| Billing mode | Rule-based data consumption charges | Internet traffic charges | Details |
|---|---|---|---|
| Pay-by-ingested-data | Not charged | Charged for read traffic over the Internet if data is pulled over a public Simple Log Service endpoint. Traffic is calculated based on the size of data after compression. | Billable items of pay-by-ingested-data |
| Pay-by-feature | Charged for computing in Simple Log Service | You may be charged for Internet traffic if you access a public Simple Log Service endpoint. | Billable items of pay-by-feature |
Limits and constraints
Rule-based data consumption performs complex computing in Simple Log Service. The complexity of the SPL statement and the characteristics of the data affect latency and throughput.
| Item | Limit |
|---|---|
| Maximum SPL statement size | 4 KB |
| Shard read limits | Same as regular real-time data consumption. See Data read and write. |
| Read traffic calculation | Based on the size of raw data before SPL-based data processing, not the filtered output. |
| Latency impact | Reading 5 MB of data adds approximately 10 to 100 milliseconds of latency for SPL processing. The actual latency depends on the complexity of the SPL statement and the characteristics of the data. |
Although SPL processing adds slight read latency, the total end-to-end time, from reading data in Simple Log Service to computing results on your on-premises machine, usually decreases because the consumer receives less data and performs less local computation.
Error handling: If your SPL statement contains invalid syntax or references missing source fields, data may be lost or data consumption may fail. For details, see Handle errors.
FAQ
What do I do if the ShardReadQuotaExceed error occurs?
This error occurs when read traffic exceeds the read capacity of a shard. To resolve it:
Wait and retry. Allow your consumer to wait and try again.
Split a shard. After the split, new data is distributed across multiple shards, which reduces the read traffic on each shard.
What is the throttling policy for rule-based data consumption?
The throttling policy for rule-based data consumption is the same as the throttling policy for regular data consumption. For details, see Data read and write.
The traffic used for throttling is calculated based on the size of raw data before SPL-based data processing. For example:
Compressed raw data size: 100 MB
SPL statement: * | where method = 'POST'
Compressed data returned to the consumer: 20 MB
Read traffic counted for throttling: 100 MB (the raw data size, not the filtered output)
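The calculation above can be written out as a small sketch. The per-shard quota value below is a made-up placeholder for illustration; see Data read and write for the real limits.

```python
# Hypothetical per-shard read quota, for illustration only.
SHARD_READ_QUOTA_MB = 30

raw_compressed_mb = 100  # compressed raw data read from the shard
returned_mb = 20         # compressed size after * | where method = 'POST'

# Throttling counts the raw (pre-SPL) size, not the filtered output.
counted_mb = raw_compressed_mb

if counted_mb > SHARD_READ_QUOTA_MB:
    print("ShardReadQuotaExceed: retry later or split the shard")
```

The key point: shrinking the output with SPL does not reduce the traffic counted for throttling, because the quota is charged against what is read from storage.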
Why is the outflow traffic in the Traffic/min chart low after I enable rule-based data consumption?
The outflow traffic in the Traffic/min chart on the Project Monitoring tab represents the size of data after SPL-based data processing. If your SPL statement includes operations that reduce data size, such as row filtering and column pruning, the outflow traffic will be lower than the inflow traffic. For more information, see Project Monitoring.
What's next
Learn SPL syntax: SPL syntax
Get started with SDK-based consumption: Use Simple Log Service SDK to perform SPL-based consumption
Set up Flink integration: Simple Log Service connector
Understand the data model: What are the differences between LogHub and LogSearch?
Review read/write limits: Data read and write