Simple Log Service provides the rule-based data consumption feature. You can use Simple Log Service Processing Language (SPL) to process data in Simple Log Service before you consume the data. This topic describes the concept, benefits, scenarios, billing rules, and consumers of the rule-based data consumption feature.
How it works
Rule-based data consumption lets consumers read log data in real time after Simple Log Service processes the data by using SPL. Supported consumers include third-party software, applications in various programming languages, cloud services, and stream computing frameworks. SPL is a high-performance processing language for semi-structured log data. Simple Log Service uses SPL to preprocess and cleanse semi-structured log data before the data is returned to the consumer. For example, Simple Log Service can filter data by row, prune data by column, extract data by using regular expressions, and extract JSON fields. This way, your client obtains and consumes structured data. For more information about the SPL syntax, see SPL overview.
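To make the four preprocessing operations concrete, the following Python sketch applies them locally to a few sample records. It is only an illustration of what the operations do to the data, not the Simple Log Service API and not SPL syntax. The field names (level, content, extra, debug_id), the sample values, and the preprocess function are all hypothetical.

```python
import json
import re

# Hypothetical raw log entries; every field name and value is made up.
RAW_LOGS = [
    {"level": "INFO", "content": "GET /index.html 200",
     "extra": '{"user": "alice", "region": "cn-hangzhou"}', "debug_id": "a1"},
    {"level": "DEBUG", "content": "GET /health 200",
     "extra": '{"user": "probe", "region": "cn-hangzhou"}', "debug_id": "a2"},
    {"level": "ERROR", "content": "POST /api/order 500",
     "extra": '{"user": "bob", "region": "cn-beijing"}', "debug_id": "a3"},
]

# Assumed layout of the "content" field: "<method> <path> <status>".
REQUEST_PATTERN = re.compile(r"(?P<method>\S+) (?P<path>\S+) (?P<status>\d{3})")


def preprocess(logs):
    """Apply the four preprocessing steps locally and return structured records."""
    result = []
    for entry in logs:
        # 1. Filter by row: drop DEBUG entries.
        if entry["level"] == "DEBUG":
            continue
        # 2. Prune by column: remove a field the consumer does not need.
        record = {k: v for k, v in entry.items() if k != "debug_id"}
        # 3. Extract by regular expression: split "content" into named fields.
        match = REQUEST_PATTERN.fullmatch(record.pop("content"))
        if match is None:
            continue  # This sketch skips entries that do not match.
        record.update(match.groupdict())
        # 4. Extract JSON fields: expand the JSON string into top-level fields.
        record.update(json.loads(record.pop("extra")))
        result.append(record)
    return result


structured = preprocess(RAW_LOGS)
for record in structured:
    print(record)
```

With rule-based data consumption, this kind of work runs in Simple Log Service instead of in the consumer, so the client receives records that are already in the structured form.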
Both the rule-based data consumption feature and the query and analysis feature read data. For more information about the differences between the two features, see What are the differences between LogHub and LogSearch?
Scenarios
The rule-based data consumption feature is suitable for stream computing and real-time computing scenarios in which you need to preprocess data before you consume it, for example, by filtering data by row, pruning data by column, extracting data by using regular expressions, or extracting JSON fields. The feature is designed for time-sensitive workloads and delivers data to consumers within seconds. You can configure a custom data retention period.
Benefits
You can reduce traffic costs when you consume data over the Internet.
For example, you want to write log data to Simple Log Service, consume the log data over the Internet, filter the log data, and then distribute the log data that is obtained after filtering to an internal system. The rule-based data consumption feature allows you to use SPL to filter log data in Simple Log Service. This way, only valid log data is sent to the related consumers and traffic costs are reduced.
You can compute data in Simple Log Service. This reduces local CPU consumption and accelerates the computing process.
For example, you want to write log data to Simple Log Service, consume the log data by using an on-premises machine, and then compute the log data on the on-premises machine. The rule-based data consumption feature allows you to use SPL to compute log data in Simple Log Service. This reduces local resource consumption.
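The traffic benefit can be estimated with simple arithmetic. The following Python sketch compares the payload size of a batch before and after filtering. It is a rough illustration under stated assumptions: the sample records and the payload_size helper are hypothetical, and sizes are measured on uncompressed JSON, whereas billed traffic is calculated on compressed data.

```python
import json


def payload_size(logs):
    """Return the total size in bytes of the logs as compact UTF-8 JSON."""
    return sum(
        len(json.dumps(entry, separators=(",", ":")).encode("utf-8"))
        for entry in logs
    )


# Hypothetical batch: 90 successful requests and 10 failed requests.
logs = (
    [{"status": "200", "msg": "request finished"}] * 90
    + [{"status": "500", "msg": "request finished"}] * 10
)

# Suppose the consumer only needs the failed requests.
filtered = [entry for entry in logs if entry["status"] != "200"]

total_bytes = payload_size(logs)
filtered_bytes = payload_size(filtered)
saved_fraction = 1 - filtered_bytes / total_bytes

print(f"without filtering: {total_bytes} bytes")
print(f"with filtering:    {filtered_bytes} bytes")
print(f"traffic saved:     {saved_fraction:.0%}")
```

If the filter runs in Simple Log Service, only the smaller payload crosses the network. If the filter runs on the consumer, the full payload is transferred first and most of it is discarded.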
Billing rules
If your Logstores use the pay-by-ingested-data billing mode, you are not charged for rule-based data consumption. However, if data is pulled over a public Simple Log Service endpoint, you are charged for read traffic over the Internet. The traffic is calculated based on the size of data after compression. For more information, see Billable items of pay-by-ingested-data.
If your Logstores use the pay-by-feature billing mode, you are charged for computing in Simple Log Service. If you access a public Simple Log Service endpoint, you may be charged for Internet traffic. For more information, see Billable items of pay-by-feature.
Consumers
The following table describes the consumers that are supported by the rule-based data consumption feature.
Type | Consumer | Description |
Applications in various programming languages | Applications in various programming languages | Applications that are developed in programming languages, such as Java, Python, and Go, can consume data in Simple Log Service as consumer groups. For more information, see Consume log data and Use consumer groups to consume data. |
Cloud services | Realtime Compute for Apache Flink | You can use Realtime Compute for Apache Flink to consume data in Simple Log Service in real time. For more information, see Simple Log Service connector. |
Stream computing frameworks | Apache Kafka | You can use the stream computing framework Apache Kafka to consume data in Simple Log Service in real time. For more information, see Use Kafka to consume data based on SPL statements. |
Usage notes
Rule-based data consumption involves complex computing in Simple Log Service. Depending on the complexity of the SPL statement and the characteristics of the data, read latency increases slightly. For example, when Simple Log Service reads 5 MB of data, the read latency increases by 10 to 100 milliseconds. In most cases, the total time from the data read in Simple Log Service to the completion of computing on your on-premises machine still decreases, even though read latency increases.
If the SPL statement contains invalid syntax or references source fields that do not exist, data may be lost or data consumption may fail. For more information, see Handle errors.
If you use the rule-based data consumption feature, you can specify an SPL statement that is up to 4 KB in size.
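A client can check the statement size before it submits the statement. The following Python sketch is one way to do that. It assumes that 4 KB means 4,096 bytes and that the size is measured on the UTF-8 encoding of the statement; neither detail is stated in this topic. The check_spl_size function and the MAX_SPL_BYTES constant are hypothetical names, and the sample statement is only illustrative.

```python
# Assumption: "4 KB" is interpreted as 4,096 bytes of UTF-8 encoded text.
MAX_SPL_BYTES = 4 * 1024


def check_spl_size(statement: str) -> int:
    """Return the statement size in bytes, or raise ValueError if it is too large."""
    size = len(statement.encode("utf-8"))
    if size > MAX_SPL_BYTES:
        raise ValueError(
            f"SPL statement is {size} bytes; the limit is {MAX_SPL_BYTES} bytes"
        )
    return size


# Illustrative statement only; see SPL overview for the actual syntax.
statement = "* | where status = '500'"
print(check_spl_size(statement))
```

Measuring bytes instead of characters matters when the statement contains non-ASCII text, because one character can occupy several bytes.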