Realtime Compute for Apache Flink: April 1, 2024

Last Updated: Nov 08, 2024

This topic provides the release notes for the version of Realtime Compute for Apache Flink that was released on April 1, 2024, describes the major updates and bug fixes in that version, and provides links to relevant references.

Important

The upgrade is performed as a canary release across the entire network and is planned to be complete within three to four weeks. To learn about the upgrade plan, view the most recent announcement on the right side of the homepage of the Realtime Compute for Apache Flink console. If you cannot use the new features of Realtime Compute for Apache Flink, this version has not been made available for your account yet. If you want to perform the upgrade at the earliest opportunity, submit a ticket to apply for an upgrade.

Overview

Ververica Runtime (VVR) 8.0.6, the new engine version of Realtime Compute for Apache Flink, was officially released on April 1, 2024. This version is an enterprise-level Flink engine based on Apache Flink 1.17.2 and includes changes in the following aspects:

  • Real-time lakehouse: Apache Paimon data can be written to OSS-HDFS. When you execute the CREATE TABLE AS or CREATE DATABASE AS statement to write data to Apache Paimon, an Apache Paimon table that uses the dynamic bucket mode can be created.

  • Connectors: The issue that a time difference exists when data is synchronized from a MySQL CDC source table to Hologres is fixed. The Hologres connector supports the TIMESTAMP_LTZ data type. The MongoDB CDC connector supports the CREATE TABLE AS and CREATE DATABASE AS statements, and the capability to synchronize data from all tables in a MongoDB database is enhanced. The MaxCompute connector allows you to use MaxCompute Upsert Tunnel to write data to Transaction Table 2.0 tables. You can specify a column as a routing key for real-time Elasticsearch indexing. The Kafka connector no longer writes an empty column value as a null value to a JSON string, which reduces Kafka storage usage, and Kafka data can be filtered based on headers during data writing to facilitate data distribution. You can use a Hive catalog to write Hive data to OSS-HDFS. The OceanBase CDC connector can read data from an OceanBase source table, which helps you build real-time data warehouses based on OceanBase.

  • SQL enhancements: The new aggregate operator WindowAggregate supports update streams for the CUMULATE function. In this VVR version, the TUMBLE, HOP, CUMULATE, and SESSION window functions all support window aggregation for update streams, which Apache Flink 1.18 and earlier versions do not support.

  • Bug fixes: The following issues are fixed: the configuration of the shardWrite parameter for ClickHouse result tables does not take effect, and savepoints of deployments cannot be generated in extreme cases. These fixes improve system stability and reliability.

The following section describes the main features of this version. The canary release is being gradually completed across the entire network. After the canary release is complete, you can upgrade the engine that is used by your deployment to this version. For more information, see Upgrade the engine version of deployments. We look forward to your feedback.

Features

Feature: Support for the TIMESTAMP_LTZ data type by the Hologres connector
Description: The TIMESTAMP_LTZ data type is supported by the Hologres connector. This facilitates the processing and analysis of time-related data and improves data accuracy.
References: Hologres connector
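
For illustration, the following Flink SQL sketch declares a Hologres result table with a TIMESTAMP_LTZ column. This is a minimal sketch, not a complete configuration: the connection values are placeholders, and you should check the Hologres connector documentation for the options that apply to your deployment.

    -- A minimal sketch of a Hologres result table that uses TIMESTAMP_LTZ.
    -- All WITH option values are placeholders.
    CREATE TEMPORARY TABLE holo_sink (
      id BIGINT,
      updated_at TIMESTAMP_LTZ(6),  -- timestamp with local time zone semantics
      PRIMARY KEY (id) NOT ENFORCED
    ) WITH (
      'connector' = 'hologres',
      'endpoint' = '<endpoint>',
      'dbname' = '<database>',
      'tablename' = '<table>',
      'username' = '<access-key-id>',
      'password' = '<access-key-secret>'
    );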

Feature: Enhancement of the MaxCompute connector
Description:
  • MaxCompute Upsert Tunnel can be used to write data to Transaction Table 2.0 tables of MaxCompute.
  • You are allowed to specify a schema. After you specify a schema, the MaxCompute connector can read data from and write data to tables in a MaxCompute project for which the schema feature is enabled.
References: MaxCompute connector

Feature: Specified columns used as routing keys for Elasticsearch result tables
Description: You are allowed to specify a column as a routing key to help you use Elasticsearch more efficiently.
References: Elasticsearch

Feature: Update streams supported by the new aggregate operator WindowAggregate of the CUMULATE function
Description: The window aggregation capability for CDC data streams is enhanced.
References: Queries
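
As an illustration, the following sketch applies a CUMULATE window aggregation, which this version can now run on an update stream. It assumes a changelog source table named orders (for example, a table read by a CDC connector) with an amount column and a watermark defined on event_time.

    -- A minimal sketch of CUMULATE window aggregation over an update stream.
    -- Assumes a changelog source `orders` with a watermark on `event_time`.
    SELECT window_start, window_end, SUM(amount) AS total_amount
    FROM TABLE(
      CUMULATE(TABLE orders, DESCRIPTOR(event_time), INTERVAL '10' MINUTE, INTERVAL '1' HOUR)
    )
    GROUP BY window_start, window_end;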

Feature: Empty column values not written as null values to a JSON string by the Kafka connector, and data filtering based on headers
Description: Kafka storage usage is optimized and data distribution is improved.

Feature: Data reading from a source table by using the OceanBase CDC connector
Description: Tiered real-time data warehouses can be built based on OceanBase.
References: OceanBase connector (public preview)

Feature: OSS-HDFS used as storage for Hive data based on Hive catalogs
Description: After a Hive catalog is configured, you can use the Hive catalog to write Hive data to OSS-HDFS. This helps you build a Hive data warehouse based on OSS-HDFS.
References: Manage Hive catalogs
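
For illustration, the following sketch writes data to a Hive table through a configured Hive catalog. The catalog, database, and table names are hypothetical, and the sketch assumes that the table location registered in the Hive metastore points to an OSS-HDFS path.

    -- A minimal sketch, assuming a configured Hive catalog named `my-hive-catalog`
    -- whose table location points to an OSS-HDFS path, for example
    -- oss://<bucket>.<region>.oss-dls.aliyuncs.com/warehouse (placeholder).
    INSERT INTO `my-hive-catalog`.`warehouse_db`.`events`
    SELECT id, payload, event_time
    FROM source_table;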

Feature: Creation of non-Hive tables by using DLF-based Hive catalogs
Description: When Data Lake Formation (DLF) is used as the metadata management center for Hive catalogs, you can use Hive catalogs to create non-Hive tables. This helps you use Hive catalogs to manage different types of tables.

Feature: Enhanced capabilities of Apache Paimon
Description:
  • Apache Paimon data can be written to OSS-HDFS.
  • When you execute the CREATE TABLE AS or CREATE DATABASE AS statement to write data to Apache Paimon, an Apache Paimon table that uses the dynamic bucket mode can be created.
  • All features and bug fixes that were released on the master branch of the Apache Paimon community before March 15, 2024 are supported. For more information, visit Apache Paimon.
References: N/A
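
For illustration, the following CREATE TABLE AS sketch synchronizes a table into Apache Paimon with the dynamic bucket mode. The catalog, database, and table names are hypothetical; 'bucket' = '-1' is the Apache Paimon table option that selects the dynamic bucket mode.

    -- A minimal sketch of CREATE TABLE AS into Apache Paimon with dynamic buckets.
    -- Catalog and table names are hypothetical.
    CREATE TABLE IF NOT EXISTS `paimon-catalog`.`my_db`.`orders`
    WITH ('bucket' = '-1')  -- dynamic bucket mode
    AS TABLE `mysql-catalog`.`my_db`.`orders`;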

Feature: CREATE TABLE AS and CREATE DATABASE AS statements supported by the MongoDB CDC connector
Description: The CREATE TABLE AS or CREATE DATABASE AS statement can be executed to synchronize data and schema changes from a MongoDB database to downstream tables in real time by using the MongoDB Change Data Capture (CDC) connector.
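 
For illustration, the following CREATE DATABASE AS sketch synchronizes all collections of a MongoDB database to a downstream catalog in real time. Both catalog names and the database name are hypothetical.

    -- A minimal sketch of CREATE DATABASE AS with a MongoDB CDC source.
    -- Catalog and database names are hypothetical.
    CREATE DATABASE IF NOT EXISTS `target-catalog`.`app_db`
    AS DATABASE `mongodb-catalog`.`app_db` INCLUDING ALL TABLES;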

Feature: Concurrent reading of full data from a Postgres CDC source table
Description: Full data can be concurrently read from a Postgres CDC source table. This accelerates the synchronization of full data.
References: PostgreSQL CDC connector (public preview)
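
For illustration, the following sketch declares a Postgres CDC source table. The connection values are placeholders, and 'scan.incremental.snapshot.enabled' follows the open source Flink CDC option name for enabling the incremental snapshot mechanism, which reads full data concurrently.

    -- A minimal sketch of a Postgres CDC source table with concurrent full reads.
    -- All connection values are placeholders.
    CREATE TEMPORARY TABLE pg_source (
      id BIGINT,
      name STRING,
      PRIMARY KEY (id) NOT ENFORCED
    ) WITH (
      'connector' = 'postgres-cdc',
      'hostname' = '<host>',
      'port' = '5432',
      'username' = '<user>',
      'password' = '<password>',
      'database-name' = '<database>',
      'schema-name' = 'public',
      'table-name' = 'orders',
      'slot.name' = 'flink_slot',
      'scan.incremental.snapshot.enabled' = 'true'  -- enables concurrent full reads
    );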

Feature: Enhanced authentication for access to OSS buckets
Description: After you specify a file system path, you must configure the authentication information of the Object Storage Service (OSS) bucket so that data can be read from and written to the specified path.
References: OSS connector

Feature: JSON type supported by the StarRocks connector
Description: Data of the JSON type can be written to StarRocks by using the StarRocks connector to meet specific business requirements.
References: N/A

Feature: Null values written as empty strings to logs by using the Simple Log Service connector
Description: The Simple Log Service connector can be used to write null values as empty strings to logs. This helps you easily process fields that contain null values.
References: Simple Log Service connector

Bug fixes

  • The issue that the configuration of the shardWrite parameter for ClickHouse result tables does not take effect is fixed.

  • The issue that savepoints of deployments cannot be generated in extreme cases is fixed.

  • All issues that are fixed in Apache Flink 1.17.2 are also fixed in this version. For more information, visit the Apache Flink 1.17.2 Release Announcement.