Realtime Compute for Apache Flink: Streaming lakehouse with Paimon

Last Updated: Mar 09, 2026

Apache Paimon (Paimon) provides a unified storage format for different data types. Paimon works with Apache Flink and Apache Spark to implement a real-time lakehouse architecture that supports both streaming and batch operations. Paimon combines the lake format with the log-structured merge-tree (LSM) structure, which brings real-time streaming updates to the lake architecture. You can use Paimon tables in Realtime Compute for Apache Flink to quickly build a data lake on cloud storage services, such as Object Storage Service (OSS).

Paimon provides the following capabilities:

  • Enhanced real-time data ingestion: Together with Realtime Compute for Apache Flink, Paimon ingests data from various database systems, such as MySQL, into a data lake. Ingestion supports automatic synchronization of schema changes and real-time updates. Tens of millions of data records can be ingested efficiently with low latency.

  • Unified stream and batch processing: Paimon works with Apache Flink for stream processing and with Apache Spark for batch processing. Paimon provides a unified format for data lake storage, which improves ease of use and reduces costs.

  • Extensive ecosystem integration: Paimon can seamlessly integrate with a variety of Alibaba Cloud compute services, such as Realtime Compute for Apache Flink, E-MapReduce (Spark, StarRocks, Hive, and Trino), and MaxCompute.

  • Innovative lakehouse storage: Paimon uses deletion vectors and indexes to deliver minute-level latency for streaming, batch, and online analytical processing (OLAP) queries.

For more information, see Apache Paimon.

Usage

Familiarize yourself with Paimon

Create a Paimon catalog

A Paimon catalog provides access to Paimon tables stored in external systems. It allows you to manage Paimon tables in a centralized manner and can be accessed by other Alibaba Cloud services. You can use Paimon catalogs in the following ways:

Create a Paimon table
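
As an illustration of what creating a catalog and a table involves, the following minimal sketch renders the two Flink SQL statements as text. The catalog name `my_paimon`, the database `my_db`, the table `orders`, its columns, and the OSS path are hypothetical placeholders. The options shown (`type`, `metastore`, `warehouse`, `bucket`, `changelog-producer`) are Apache Paimon options, but the exact set that your workspace requires, such as OSS endpoint or credential options, is an assumption to verify against the linked topics.

```python
def with_clause(options):
    """Render a Flink SQL WITH (...) clause from a dict of string options."""
    body = ",\n".join(f"  '{key}' = '{value}'" for key, value in options.items())
    return f"WITH (\n{body}\n)"


# Hypothetical catalog name and OSS warehouse path; replace with your own.
create_catalog = "CREATE CATALOG `my_paimon` " + with_clause({
    "type": "paimon",
    "metastore": "filesystem",
    "warehouse": "oss://my-bucket/warehouse",
}) + ";"

# Hypothetical primary key table. The partition column is part of the key.
# 'changelog-producer' matters if the table is later consumed in streaming mode.
create_table = (
    "CREATE TABLE `my_paimon`.`my_db`.`orders` (\n"
    "  order_id BIGINT,\n"
    "  amount DECIMAL(10, 2),\n"
    "  dt STRING,\n"
    "  PRIMARY KEY (dt, order_id) NOT ENFORCED\n"
    ") PARTITIONED BY (dt) "
    + with_clause({"bucket": "4", "changelog-producer": "lookup"})
    + ";"
)

print(create_catalog)
print(create_table)
```

The printed statements can be pasted into a SQL draft. The sketch assumes that the database `my_db` already exists in the catalog.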

Write data to a Paimon table

  • Insert new data into a Paimon table or update existing data in it. For more information, see Write data.

  • Join a Paimon table with other tables and apply aggregate functions. For more information, see Merge engine.

  • Partially or completely overwrite a Paimon table. For more information, see Overwrite data (INSERT OVERWRITE).

  • Delete data from a Paimon table. For more information, see Delete data (DELETE).

  • Delete partitions from a Paimon table. For more information, see Modify a table schema.
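
The write operations listed above map to a small set of SQL statements. The following sketch, which is illustrative rather than a reference implementation, builds an `INSERT INTO`, a partition-level `INSERT OVERWRITE`, and a `DELETE` statement as text. The table path, the source table `source_orders`, and the column names are hypothetical. In open-source Flink, `INSERT OVERWRITE` and `DELETE` run in batch mode; check the linked topics for the requirements that apply to your deployment.

```python
def insert_statement(target, select_sql, overwrite=False, partition=None):
    """Build INSERT INTO or INSERT OVERWRITE, optionally for one static partition."""
    verb = "INSERT OVERWRITE" if overwrite else "INSERT INTO"
    partition_spec = ""
    if partition:
        pairs = ", ".join(f"{col} = '{val}'" for col, val in partition.items())
        partition_spec = f" PARTITION ({pairs})"
    return f"{verb} {target}{partition_spec}\n{select_sql};"


TARGET = "`my_paimon`.`my_db`.`orders`"  # hypothetical table path

# Insert rows; on a primary key table, rows with an existing key update that key.
upsert = insert_statement(TARGET, "SELECT order_id, amount, dt FROM source_orders")

# Replace only the dt = '2026-03-01' partition. The static partition column
# is omitted from the SELECT list.
overwrite_partition = insert_statement(
    TARGET,
    "SELECT order_id, amount FROM source_orders WHERE dt = '2026-03-01'",
    overwrite=True,
    partition={"dt": "2026-03-01"},
)

# Delete rows that match a condition (hypothetical predicate).
delete_rows = f"DELETE FROM {TARGET} WHERE amount < 0;"

for statement in (upsert, overwrite_partition, delete_rows):
    print(statement, end="\n\n")
```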

Consume data from a Paimon table

  • Query or consume data from a Paimon table. For more information, see Consume data. To consume data from a primary key table in streaming mode, make sure that a changelog producer is configured for the table.

  • Configure the consumer offset of a Paimon table. For more information, see Consume data from a specified offset.

  • Save the consumer offset of a Paimon table or retain expired snapshot files that are still in use. For more information, see Save consumption progress with consumer ID.

  • Run a batch deployment to read the historical states of a Paimon table. For more information, see Time travel.
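
The consumption options above are commonly passed as dynamic table options in a SQL hint. The following sketch renders three such queries as text. The table path, the consumer ID `orders_reader`, and the snapshot ID `3` are hypothetical values. `scan.mode`, `consumer-id`, and `scan.snapshot-id` are Apache Paimon table options; which values suit a given job is an assumption to check against the linked topics.

```python
def select_with_hints(table, options):
    """Build a SELECT that passes Paimon table options through a SQL hint."""
    hints = ", ".join(f"'{key}' = '{value}'" for key, value in options.items())
    return f"SELECT * FROM {table} /*+ OPTIONS({hints}) */;"


TABLE = "`my_paimon`.`my_db`.`orders`"  # hypothetical table path

# Streaming read that starts from the latest changes only.
from_latest = select_with_hints(TABLE, {"scan.mode": "latest"})

# Streaming read whose progress is saved under a consumer ID.
with_consumer = select_with_hints(TABLE, {"consumer-id": "orders_reader"})

# Batch read of a historical snapshot (time travel).
time_travel = select_with_hints(TABLE, {"scan.snapshot-id": "3"})

for query in (from_latest, with_consumer, time_travel):
    print(query)
```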

Maintain a Paimon table

  • Learn how to address common issues related to Paimon. For more information, see Connector FAQ.

  • Optimize the read and write performance of Paimon tables. For more information, see Performance optimization.

  • Query the metadata of a Paimon table, such as the partitions and the total size of files in each partition. For more information, see Paimon system tables.

  • Modify the schema of a table in a Paimon catalog. For more information, see Modify a table schema.

  • Delete a table from a Paimon catalog. For more information, see Drop a table.

  • Change the number of buckets for a Paimon table that uses fixed bucket mode. For more information, see Change the number of buckets in fixed bucket mode.

  • Clean up obsolete files in the directory of a Paimon table. For more information, see Clean up expired data.
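
Metadata queries address a system table by appending `$` and the system table name to the table name. The following sketch builds such queries as text under that convention. The table path `my_paimon.my_db.orders` is hypothetical, and the set of system tables available in your engine version should be confirmed in Paimon system tables.

```python
def system_table(table_path, name):
    """Return the backtick-quoted path of a Paimon system table.

    table_path is 'catalog.database.table'; name is, for example,
    'snapshots' or 'partitions'.
    """
    catalog, database, table = table_path.split(".")
    # Backticks keep the '$' in the table name from confusing the SQL parser.
    return f"`{catalog}`.`{database}`.`{table}${name}`"


ORDERS = "my_paimon.my_db.orders"  # hypothetical table path

partitions_query = f"SELECT * FROM {system_table(ORDERS, 'partitions')};"
snapshots_query = f"SELECT * FROM {system_table(ORDERS, 'snapshots')};"

print(partitions_query)
print(snapshots_query)
```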