This topic describes the types of tables and the connectors that are supported by Realtime Compute for Apache Flink.
Table types
Alibaba Cloud Realtime Compute for Apache Flink allows you to use Flink SQL to define a table that maps to upstream or downstream storage, or to use the DataStream API to read from and write to that storage directly. Realtime Compute for Apache Flink supports the following types of tables defined by using Flink SQL:
Source table
A source table serves as the data input of a Realtime Compute for Apache Flink deployment and triggers stream processing.
A source table cannot be used as a dimension table. It is always a driving table: each record that arrives in a source table drives subsequent computing and triggers the downstream computing chain.
In most cases, a source table carries large volumes of data for conversion and computing, up to tens of millions or even hundreds of millions of records.
A source table provides the streaming input that triggers and pushes the data processing of Realtime Compute for Apache Flink. The continuous stream of new data can come from storage such as a message queue service or database change logs.
A source table contains key fields on which it can be joined with other tables, such as a user ID or an order ID used as the primary key.
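For illustration, a source table can be declared in Flink SQL as follows. This is a hedged sketch: the table name orders_source, the Kafka topic, and all connector option values are placeholder assumptions, not values prescribed by this topic.

```sql
-- Hypothetical source table: an order stream read from a Kafka topic.
-- All names and option values below are placeholder assumptions.
CREATE TEMPORARY TABLE orders_source (
  user_id BIGINT,
  order_id BIGINT,
  order_amount DECIMAL(18, 2),
  order_time TIMESTAMP(3),
  -- Event-time watermark so that the stream can drive time-based computing.
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = '<your-kafka-brokers>',
  'format' = 'json',
  'scan.startup.mode' = 'latest-offset'
);
```

Every record that arrives on the topic flows through this table and drives the downstream computing chain.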
Dimension table
A dimension table is an auxiliary table that extends the data of a source table.
A dimension table cannot be used as a driving table. It only supplements a source table and does not drive computing.
In most cases, a dimension table stores a relatively small amount of data, on the order of gigabytes to terabytes. A dimension table can be a static table or a low-throughput streaming table.
A dimension table provides supplemental business information, such as user names, product details, and region information.
A dimension table can be joined with a source table to enrich the data of the source table into a more detailed wide table.
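As a sketch, a dimension table can be declared over a MySQL table and then looked up per stream record with a temporal join. The names and connection options below are assumptions for illustration:

```sql
-- Hypothetical dimension table backed by a MySQL user table.
-- All names and option values below are placeholder assumptions.
CREATE TEMPORARY TABLE user_dim (
  user_id BIGINT,
  user_name VARCHAR,
  region VARCHAR,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql',
  'hostname' = '<your-mysql-host>',
  'port' = '3306',
  'username' = '<username>',
  'password' = '<password>',
  'database-name' = '<database>',
  'table-name' = 'users'
);
```

In a query, such a table is joined with the `FOR SYSTEM_TIME AS OF` clause on a processing-time attribute of the stream, so that each source record performs a lookup rather than the dimension table driving computing.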
Sink table
A sink table is the output data table of a Realtime Compute for Apache Flink deployment.
A sink table stores the final result data after computing and conversion, such as aggregation results or data obtained after filtering.
The final result data that is stored in a sink table can be exported to external systems, such as databases, message queues, and file systems, for subsequent analysis.
A sink table is the final output of the entire processing chain of a deployment.
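A sink table is declared in the same way as the other table types; an INSERT INTO statement in the query then writes results to it. The following sketch assumes a MySQL result table with hypothetical names:

```sql
-- Hypothetical sink table that receives aggregated results.
-- Because region is the primary key, updated results for the same
-- region can overwrite earlier rows (an upsert-style sink).
-- All names and option values below are placeholder assumptions.
CREATE TEMPORARY TABLE region_amount_sink (
  region VARCHAR,
  total_amount DECIMAL(18, 2),
  PRIMARY KEY (region) NOT ENFORCED
) WITH (
  'connector' = 'mysql',
  'hostname' = '<your-mysql-host>',
  'port' = '3306',
  'username' = '<username>',
  'password' = '<password>',
  'database-name' = '<database>',
  'table-name' = 'region_amount'
);
```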
For example, the following source table and dimension table are provided for data analysis in a deployment:
Source table: an order table that contains columns such as user ID, order ID, and order amount.
Dimension table: a user information table that contains columns such as user ID, user name, and address. This is a static table.
When the deployment runs, Realtime Compute for Apache Flink reads real-time order data from the source table and joins the order data stream with the user information in the dimension table. Realtime Compute for Apache Flink aggregates data to obtain the total order amount by region and writes the aggregation results to the sink table.
In this deployment, the order table is the source table, the user information table is the dimension table, and the statistical table is the sink table that holds the final output of the deployment. The order table cannot be used as a dimension table: it must be the driving table that supplies the data input. The user information table cannot be used as a driving table: it can only be used as a dimension table that supplements the order data with user information.
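The deployment described above can be sketched as a single Flink SQL statement. The table and column names (orders_source, user_dim, region_amount_sink, and a processing-time column proctime declared on the source) are assumptions for illustration:

```sql
-- Hypothetical query: enrich each order with the user's region via a
-- lookup join on the dimension table, then aggregate per region and
-- write the running totals to the sink table.
INSERT INTO region_amount_sink
SELECT
  d.region,
  SUM(o.order_amount) AS total_amount
FROM orders_source AS o
  JOIN user_dim FOR SYSTEM_TIME AS OF o.proctime AS d
    ON o.user_id = d.user_id
GROUP BY d.region;
```

The source table drives the query, the dimension table is consulted once per order record, and the sink table receives the updated per-region totals.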
Supported connectors
| Connector | Source table | Dimension table | Sink table | Running mode | API type | Data update or deletion in a sink table |
| --- | --- | --- | --- | --- | --- | --- |
|  | √ | × | √ | Streaming mode | SQL API, DataStream API, and data ingestion YAML API | Not supported. Data can only be inserted into a sink table. |
|  | √ | √ | √ | Streaming mode and batch mode | SQL API, DataStream API, and data ingestion YAML API | Supported |
|  | √ | × | √ | Streaming mode | SQL API and DataStream API | Not supported. Data can only be inserted into a sink table. |
| MySQL connector. Note: The MySQL connector supports ApsaraDB RDS for MySQL, PolarDB for MySQL, and self-managed MySQL databases. | √ | √ | √ | Streaming mode | SQL API, DataStream API, and data ingestion YAML API | Supported |
| ApsaraDB RDS for MySQL connector. Note: This connector will not be supported in the future. We recommend that you use the MySQL connector instead. | × | √ | √ | Streaming mode and batch mode | SQL API | Supported |
|  | √ | √ | √ | Streaming mode and batch mode | SQL API and DataStream API | Not supported. Data can only be inserted into a sink table. |
|  | × | √ | √ | Streaming mode | SQL API | Supported |
|  | √ | × | √ | Streaming mode | SQL API and data ingestion YAML API | Supported |
|  | √ | √ | √ | Streaming mode and batch mode | SQL API and DataStream API | Supported |
|  | × | √ | √ | Streaming mode and batch mode | SQL API | Supported |
|  | × | × | √ | Streaming mode and batch mode | SQL API | Supported |
|  | × | × | √ | Streaming mode and batch mode | SQL API and data ingestion YAML API | Supported |
|  | × | × | √ | Streaming mode and batch mode | SQL API | Supported |
|  | × | √ | √ | Streaming mode | SQL API | Supported |
|  | √ | × | × | Streaming mode and batch mode | SQL API | N/A |
|  | √ | × | √ | Streaming mode | SQL API and DataStream API | Not supported. Data can only be inserted into a sink table. |
|  | √ | √ | √ | Streaming mode | SQL API | Supported |
|  | √ | √ | √ | Streaming mode and batch mode | SQL API | Supported |
|  | √ | × | √ | Streaming mode and batch mode | SQL API, DataStream API, and data ingestion YAML API | Supported |
|  | √ | × | × | Streaming mode | SQL API | N/A |
|  | × | √ | √ | Streaming mode and batch mode | SQL API | Supported |
|  | × | √ | √ | Streaming mode | SQL API | Supported |
|  | √ | × | √ | Streaming mode and batch mode | SQL API and DataStream API | Not supported. Data can only be inserted into a sink table. |
|  | √ | √ | × | Streaming mode and batch mode | SQL API | N/A |
|  | √ | × | √ | Streaming mode and batch mode | SQL API | Supported |
|  | × | × | √ | Streaming mode | SQL API | Not supported |
|  | √ | √ | √ | Streaming mode and batch mode | SQL API and data ingestion YAML API | Supported |
|  | √ | × | √ | Streaming mode and batch mode | SQL API and DataStream API | Supported |
|  | × | × | √ | Streaming mode | SQL API | Supported |
|  | √ | √ | √ | Streaming mode and batch mode | SQL API | Supported |
|  | √ | √ | √ | Streaming mode | SQL API and DataStream API | Supported |
|  | × | × | √ | Streaming mode and batch mode | SQL API | Supported |