This topic describes how to use DataStream connectors to read and write data.
Background information
To read and write data with a DataStream connector, you must use the corresponding type of DataStream connector to connect to Realtime Compute for Apache Flink. The Ververica Runtime (VVR) DataStream connectors are available in the Maven central repository for draft development. You can use a connector in one of the following ways:
Package the connector as a project dependency into the JAR file of your draft
Important: Interfaces and parameters may change in the future. We recommend that you use only the connectors that are documented as providing DataStream APIs in Supported connectors.
DataStream connectors are protected by commercial encryption. As a result, an error is reported if you run or debug a deployment that includes a DataStream connector in an on-premises environment. For more information about how to resolve this issue, see Run or debug a Flink deployment that includes a connector in an on-premises environment.
(Recommended) Upload the JAR file of the connector to the development console of Realtime Compute for Apache Flink and configure parameters
Log on to the Realtime Compute for Apache Flink console.
Find the workspace that you want to manage and click Console in the Actions column.
In the left-side navigation pane, click Artifacts.
In the upper-right corner of the page, click Upload Artifact and select the JAR file of the connector that you want to upload.
You can upload the JAR file of the connector that you develop or the connector provided by Realtime Compute for Apache Flink. For the download links of the official JAR files provided by Realtime Compute for Apache Flink, see Connectors.
On the SQL Editor page of the desired draft, select the JAR file of the connector that you want to use in the Additional Dependencies section.
Package the connector as a project dependency into the JAR file of your draft
Step 1: Prepare the development environment for a DataStream draft
Add the following configurations to the pom.xml file of the Maven project to reference SNAPSHOT repositories:
<repositories>
    <repository>
        <id>oss.sonatype.org-snapshot</id>
        <name>OSS Sonatype Snapshot Repository</name>
        <url>http://oss.sonatype.org/content/repositories/snapshots</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>apache.snapshots</id>
        <name>Apache Development Snapshot Repository</name>
        <url>https://repository.apache.org/content/repositories/snapshots/</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>
Check whether the settings.xml configuration file contains the <mirrorOf>*</mirrorOf> configuration. If it does, the configured mirror covers all repositories, and Maven does not contact the two SNAPSHOT repositories specified above. As a result, the Maven project cannot download SNAPSHOT dependencies from these repositories. To prevent this issue, modify the configuration based on your scenario:
If the configuration file contains <mirrorOf>*</mirrorOf>, change it to <mirrorOf>*,!oss.sonatype.org-snapshot,!apache.snapshots</mirrorOf>.
If the configuration file contains <mirrorOf>external:*</mirrorOf>, change it to <mirrorOf>external:*,!oss.sonatype.org-snapshot,!apache.snapshots</mirrorOf>.
If the configuration file contains <mirrorOf>external:http:*</mirrorOf>, change it to <mirrorOf>external:http:*,!oss.sonatype.org-snapshot,!apache.snapshots</mirrorOf>.
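For reference, a mirror entry in settings.xml that excludes the two SNAPSHOT repositories might look like the following sketch. The mirror id, name, and URL are placeholders for your own mirror configuration:

```xml
<mirrors>
  <mirror>
    <!-- Placeholder id and URL; replace with your own mirror settings -->
    <id>my-central-mirror</id>
    <name>Central mirror</name>
    <url>https://repo.example.com/maven2</url>
    <!-- Exclude the two SNAPSHOT repositories so Maven contacts them directly -->
    <mirrorOf>*,!oss.sonatype.org-snapshot,!apache.snapshots</mirrorOf>
  </mirror>
</mirrors>
```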
Add the required connectors as project dependencies to the pom.xml file of the Maven project for your draft.
The connector name may vary with the connector version. We recommend that you use the latest available version of the connector that you use. For the complete dependency information, see the pom.xml file in the example of MaxCompute-Demo, DataHub-Demo, Kafka-Demo, or RocketMQ-Demo. The following example shows the project dependency code for a MaxCompute incremental source table.
<dependency>
    <groupId>com.alibaba.ververica</groupId>
    <artifactId>ververica-connector-continuous-odps</artifactId>
    <version>${connector.version}</version>
</dependency>
Add the public package flink-connector-base of the connectors together with the connectors as project dependencies:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-base</artifactId>
    <version>${flink.version}</version>
</dependency>
In the preceding file, ${flink.version} indicates the Flink version that corresponds to the runtime environment of the draft. For example, if the engine version of your draft is 1.15-vvr-6.0.7, the Flink version is 1.15.0.
Important: Connector versions that contain the SNAPSHOT keyword are available only in the SNAPSHOT repository oss.sonatype.org. You cannot find these versions in the Maven central repository search.maven.org.
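As a sketch, the ${flink.version} and ${connector.version} placeholders used above can be defined in the properties section of the pom.xml file. The values below are illustrative, based on the 1.15-vvr-6.0.7 example; replace them with the versions that match your draft's engine version:

```xml
<properties>
  <!-- Illustrative values; use the versions that match your engine version -->
  <flink.version>1.15.0</flink.version>
  <connector.version>1.15-vvr-6.0.7</connector.version>
</properties>
```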
If you use multiple connectors, you must merge the files in the META-INF directory. To merge the files, add the following code to the pom.xml file:
<transformers>
    <!-- The service transformer is needed to merge META-INF/services files -->
    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer">
        <projectName>Apache Flink</projectName>
        <encoding>UTF-8</encoding>
    </transformer>
</transformers>
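For context, the transformers element goes inside the configuration of the maven-shade-plugin. A minimal sketch of the surrounding plugin declaration might look like the following; the plugin version shown is illustrative, so pick the version that suits your project:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Illustrative plugin version -->
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- Merge META-INF/services files from all bundled connectors -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
              <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer">
                <projectName>Apache Flink</projectName>
                <encoding>UTF-8</encoding>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```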
Step 2: Develop a DataStream draft
For the configuration details and sample code of the DataStream connectors, see the corresponding topics in the DataStream connector documentation.
Step 3: Package the program and publish a DataStream draft
Use Maven to package the program and upload the generated JAR file to the development console of Realtime Compute for Apache Flink. For more information, see Create a JAR deployment.