
Simple Log Service: Limits on data import from Kafka to Simple Log Service

Last Updated: Aug 29, 2023

This topic describes the limits on data import from Kafka to Simple Log Service.

Limits on collection

Compression format

The import feature supports data that a Kafka producer compressed in one of the following formats: gzip, zstd, lz4, and snappy. Data that is compressed in other formats is discarded.

You can view the number of data entries that are discarded in the Deliver Failed chart on the Data Processing Insight dashboard. For more information, see View the data import configuration.
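
A Kafka producer selects its compression format with the compression.type setting. The following minimal Java sketch shows a producer that writes lz4-compressed data, which the import feature can read; the broker address and topic name are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CompressedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Use one of the supported formats: gzip, zstd, lz4, or snappy.
        props.put("compression.type", "lz4");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "a sample log line"));
        }
    }
}
```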

Maximum number of topics

A maximum of 10,000 topics can be specified in a data import configuration.

Size of a single log

A single log can be up to 3 MB in size. Logs that exceed this limit are discarded.

You can view the number of logs that are discarded in the Deliver Failed chart on the Data Processing Insight dashboard. For more information, see View the data import configuration.
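
Because oversized logs are dropped during import, it can be useful to check the serialized size on the producer side before sending. A minimal sketch, assuming UTF-8 string payloads; the class and method names are illustrative:

```java
import java.nio.charset.StandardCharsets;

public class LogSizeGuard {
    // Mirrors the 3 MB single-log limit described above.
    private static final int MAX_LOG_BYTES = 3 * 1024 * 1024;

    /** Returns true if the log is small enough to be imported. */
    public static boolean fitsImportLimit(String log) {
        return log.getBytes(StandardCharsets.UTF_8).length <= MAX_LOG_BYTES;
    }
}
```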

Starting position

When you configure the Starting Position parameter for a data import configuration, you can select only Earliest or Latest. You cannot specify a point in time as the starting position for data import.
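
For readers familiar with Kafka clients, Earliest and Latest behave like the standard auto.offset.reset values of a Kafka consumer. A minimal sketch of the same choice on the consumer side; the broker address and group ID are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class StartingPositionDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group"); // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // "earliest" reads from the oldest retained offset; "latest" reads only new data.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe and poll as usual.
        }
    }
}
```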

Limits on configuration

Number of data import configurations

You can create a maximum of 100 data import configurations in a single project, regardless of configuration type. If you want to increase the limit, submit a ticket.

Bandwidth

When a data import task reads data from an Alibaba Cloud Kafka cluster over a virtual private cloud (VPC), the maximum network bandwidth allowed for the task is 128 MB/s by default. If you require a higher bandwidth, submit a ticket.

Limits on performance

Number of concurrent subtasks

Simple Log Service creates multiple subtasks in the backend based on the number of Kafka topics and runs them concurrently to import data. Each subtask can process decompressed data at a maximum rate of 50 MB/s. The default scaling rule is as follows (see the sketch after this list):

  • If the number of topics exceeds 2,000, Simple Log Service creates 16 subtasks.

  • If the number of topics exceeds 1,000, Simple Log Service creates 8 subtasks.

  • If the number of topics exceeds 500, Simple Log Service creates 4 subtasks.

  • If the number of topics is less than or equal to 500, Simple Log Service creates 2 subtasks.

If you want to increase the limits, submit a ticket.
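
Expressed in code, the default scaling rule above might look like the following illustration. This is not the actual backend implementation; it only restates the documented thresholds:

```java
public class SubtaskEstimator {
    /** Default number of concurrent import subtasks for a given topic count. */
    public static int defaultSubtasks(int topicCount) {
        if (topicCount > 2000) {
            return 16;
        } else if (topicCount > 1000) {
            return 8;
        } else if (topicCount > 500) {
            return 4;
        } else {
            return 2;
        }
    }
}
```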

Number of partitions for a topic

If a Kafka topic has a large number of partitions, additional subtasks can be created to improve the throughput of data import.

If a Kafka topic has a large amount of data, you can increase the number of partitions for the topic. We recommend that the number of partitions for a topic be no less than 16.
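
If you operate the Kafka cluster yourself, you can raise the partition count of an existing topic with the Kafka AdminClient API. A minimal sketch, assuming a topic named my-topic and a placeholder broker address:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitions;

public class IncreasePartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // Raise the topic to 16 partitions, matching the recommendation above.
            admin.createPartitions(
                    Collections.singletonMap("my-topic", NewPartitions.increaseTo(16)))
                 .all()
                 .get();
        }
    }
}
```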

Number of shards in a Logstore

The write performance of Simple Log Service varies based on the number of shards in a Logstore. A single shard supports a write speed of 5 MB/s. If an import task writes a large amount of data to Simple Log Service, we recommend that you increase the number of shards in the Logstore. For more information, see Manage shards.
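
For example, at 5 MB/s per shard, an import task that writes 40 MB/s to a Logstore needs at least eight shards.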

Data compression

If you want to import a large amount of data from Kafka to Simple Log Service, we recommend that you compress the data when you write it to Kafka. This significantly reduces the amount of data that is transferred over the network.

Network transmission is more time-consuming than data decompression, especially when data is imported over the Internet.

Network

If your Alibaba Cloud Kafka cluster is deployed in a VPC, you can read data from the cluster over the VPC. This reduces Internet traffic and accelerates data transmission. In this scenario, the data read bandwidth can reach more than 100 MB/s.

When you import data over the Internet, the performance and bandwidth of the network cannot be guaranteed. This may cause import latency.

Other limits

Metadata synchronization latency

An import task synchronizes the metadata of your Kafka cluster to Simple Log Service at 10-minute intervals. If a topic or partition is newly created, the import task detects it after a delay of approximately 10 minutes.

Note

If you set the Starting Position parameter to Latest in a data import configuration, data written to a new topic before the topic is detected (a window of up to 10 minutes) may be skipped.

Validity period of an offset for a topic

An offset for a Kafka topic is valid for up to seven days. If no data is read from a topic within seven days, the offset is discarded. If new data is written to the topic after seven days, Simple Log Service determines which offset to use based on the starting position specified in a data import configuration.
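
For reference, open source Kafka retains committed consumer offsets for seven days by default (the broker setting offsets.retention.minutes defaults to 10080 minutes in Kafka 2.0 and later), which is consistent with this validity period; your cluster may override that setting.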