This version is rolling out gradually via canary release across the network. The rollout is expected to complete within 3–4 weeks. To check the current upgrade plan, see the latest announcement on the right side of the Realtime Compute for Apache Flink console homepage. If the new features are not yet available in your account, submit a ticket to request early access.
Ververica Runtime (VVR) 8.0.6 was released on April 1, 2024. This version is built on Apache Flink 1.17.2 and brings improvements across real-time lakehouse integration, connectors, and SQL window functions.
After the canary release completes, upgrade your deployments to this version. For instructions, see Upgrade the engine version of deployments.
What's new
Real-time lakehouse
Apache Paimon writes to OSS-HDFS
Apache Paimon data can now be written to OSS-HDFS, giving you a cost-effective storage option for lakehouse workloads. When you run CREATE TABLE AS or CREATE DATABASE AS to write to Apache Paimon, the resulting table is created in dynamic bucket mode automatically. This release also incorporates all Apache Paimon community features and fixes from the master branch up to March 15, 2024. For more information, visit Apache Paimon.
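As a minimal sketch of the behavior described above (catalog, database, and table names are illustrative; the Paimon catalog is assumed to have been created with a warehouse path on OSS-HDFS, and a MySQL catalog named `mysql` is assumed as the source):

```sql
-- Synchronize a source table into a Paimon table stored on OSS-HDFS.
-- The resulting Paimon table is created in dynamic bucket mode automatically.
CREATE TABLE IF NOT EXISTS `paimon`.`lakehouse_db`.`orders`
AS TABLE `mysql`.`sales_db`.`orders`;
```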
Hive data backed by OSS-HDFS via Hive catalogs
After configuring a Hive catalog, you can write Hive data directly to OSS-HDFS. This lets you build a Hive data warehouse on OSS-HDFS without changing your catalog-based workflow.
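For example, once the Hive catalog's warehouse directory points to OSS-HDFS, a plain INSERT is enough to land Hive data there (catalog, database, and table names below are placeholders):

```sql
-- Assumes a Hive catalog named `my_hive` is already configured in the console
-- with its warehouse directory on OSS-HDFS; the table's files land there.
INSERT INTO `my_hive`.`dw`.`page_views`
SELECT user_id, url, ts FROM source_table;
```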
Non-Hive tables in DLF-based Hive catalogs
When Data Lake Formation (DLF) is your Hive catalog's metadata management center, you can now create non-Hive tables through the Hive catalog. This makes it easier to manage different table types from a single catalog.
Connectors
OceanBase CDC connector — Read from OceanBase source tables (public preview)
You can now read data from an OceanBase source table using the Change Data Capture (CDC) connector, ingesting OceanBase data directly into Flink pipelines and building tiered real-time data warehouses on OceanBase. For details, see OceanBase connector (public preview).
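A minimal source-table sketch, assuming the option names of the open-source Flink CDC oceanbase-cdc connector; hosts, credentials, and table names are placeholders, and the option set is abbreviated (for example, log-proxy settings needed for incremental reads are omitted):

```sql
CREATE TEMPORARY TABLE ob_orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'oceanbase-cdc',
  'hostname' = '<host>',
  'port' = '2881',
  'username' = '<user>',
  'password' = '<password>',
  'tenant-name' = '<tenant>',
  'database-name' = 'sales_db',
  'table-name' = 'orders'
);
```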
MongoDB CDC connector — CREATE TABLE AS and CREATE DATABASE AS (public preview)
Run CREATE TABLE AS or CREATE DATABASE AS to synchronize both data and schema changes from a MongoDB database to downstream tables in real time. For details, see Manage MongoDB catalogs (public preview), CREATE TABLE AS statement, CREATE DATABASE AS statement, and MongoDB connector (public preview).
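A hedged sketch of both statements, assuming a configured MongoDB catalog named `mongodb` and a target catalog named `target` (all names are illustrative):

```sql
-- Synchronize one collection, including subsequent schema changes:
CREATE TABLE IF NOT EXISTS `target`.`ods`.`orders`
AS TABLE `mongodb`.`sales_db`.`orders`;

-- Or synchronize an entire database:
CREATE DATABASE IF NOT EXISTS `target`.`ods`
AS DATABASE `mongodb`.`sales_db` INCLUDING ALL TABLES;
```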
PostgreSQL CDC connector — Concurrent full data reading (public preview)
Full data is now read concurrently from PostgreSQL CDC source tables, significantly reducing the time needed to complete initial full-data synchronization. For details, see PostgreSQL CDC connector (public preview).
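A source-table sketch under the assumption that concurrent full reading is driven by the incremental-snapshot framework of the open-source Flink CDC connector; the exact option name may differ in your VVR version, and all connection values are placeholders:

```sql
CREATE TEMPORARY TABLE pg_orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'postgres-cdc',
  'hostname' = '<host>',
  'port' = '5432',
  'username' = '<user>',
  'password' = '<password>',
  'database-name' = 'sales_db',
  'schema-name' = 'public',
  'table-name' = 'orders',
  -- splits the initial full scan into chunks read in parallel across subtasks
  'scan.incremental.snapshot.enabled' = 'true'
);
```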
Hologres connector — TIMESTAMP_LTZ data type support
The Hologres connector now supports the TIMESTAMP_LTZ data type, making it straightforward to process and analyze time-zone-aware data and improving data accuracy. This release also fixes the time-offset issue that occurred when data was synchronized from a MySQL CDC source table to Hologres. For details, see Hologres connector.
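A minimal sink sketch using the new type (endpoint, database, and credential values are placeholders):

```sql
-- event_time uses TIMESTAMP_LTZ, so the instant is preserved across
-- time zones when the row is written to Hologres.
CREATE TEMPORARY TABLE holo_sink (
  id         BIGINT,
  event_time TIMESTAMP_LTZ(3)
) WITH (
  'connector' = 'hologres',
  'endpoint' = '<endpoint>',
  'dbname' = '<db>',
  'tablename' = 'public.events',
  'username' = '<access-id>',
  'password' = '<access-key>'
);
```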
MaxCompute connector — Upsert Tunnel and schema support
Two enhancements are included in this release:
Write data to Transaction Table 2.0 tables using MaxCompute Upsert Tunnel.
Specify a schema so the connector can read from and write to tables in a MaxCompute project with the schema feature enabled.
For details, see MaxCompute connector.
Elasticsearch — Column-based routing keys
Specify any column as a routing key for real-time Elasticsearch indexing. This gives you finer control over how documents are distributed across shards. For details, see Elasticsearch.
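A sketch only: the routing option name below is an assumption, not a confirmed parameter; check the Elasticsearch connector documentation for the exact name in your VVR version. Hosts, index, and column names are placeholders.

```sql
CREATE TEMPORARY TABLE es_sink (
  doc_id  STRING,
  tenant  STRING,
  payload STRING
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = '<host>:9200',
  'index' = 'events',
  -- hypothetical option: route documents by the tenant column so that all
  -- documents for one tenant land on the same shard
  'routing.field-name' = 'tenant'
);
```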
Apache Kafka connector — Null handling and header-based filtering
Two improvements reduce unwanted data and improve distribution:
Empty column values are no longer written as null values to JSON strings, reducing unnecessary storage consumption.
Kafka data can be filtered based on headers during writing, making it easier to route data to the right destinations.
For details, see Apache Kafka connector and JSON.
OSS connector — Enhanced bucket authentication
After specifying a file system path, you must configure Object Storage Service (OSS) bucket authentication to read from or write to that path. This ensures your jobs have the right credentials before accessing OSS data. For details, see OSS connector.
StarRocks connector — JSON type support
The StarRocks connector can now write JSON type data to StarRocks, supporting workloads that involve semi-structured data.
Simple Log Service connector — Null values as empty strings
Null values are now written as empty strings to logs instead of being dropped, making it easier to handle fields that contain null values. For details, see Simple Log Service connector.
SQL enhancements
CUMULATE function — Update stream support for WindowAggregate
The WindowAggregate operator of the CUMULATE function now supports update streams. With this release, all four window functions — TUMBLE, HOP, CUMULATE, and SESSION — support window aggregation for update streams such as CDC data streams. Window functions in Apache Flink 1.18 and earlier do not support window aggregation for update streams. For details, see Queries.
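The cumulative window above can be sketched with the standard windowing table-valued function syntax; table and column names are illustrative, and the input table may be backed by a CDC source, so the aggregate consumes retractions as well as inserts:

```sql
SELECT
  window_start,
  window_end,
  SUM(amount) AS cumulative_amount
FROM TABLE(
  CUMULATE(TABLE orders, DESCRIPTOR(ts), INTERVAL '10' MINUTE, INTERVAL '1' HOUR)
)
GROUP BY window_start, window_end;
```

Each row belongs to a series of windows that share a start time and grow by the 10-minute step until the 1-hour maximum size is reached, which yields early, cumulative results within each hour.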
Bug fixes
The following issues are fixed in this release:
The shardWrite parameter configuration for ClickHouse result tables did not take effect.
Savepoints of deployments could not be generated in extreme cases.
All issues included in the Apache Flink 1.17.2 community release. For the full list, see Apache Flink 1.17.2 Release Announcement.