You can run or debug a deployment that includes a connector of Alibaba Cloud Realtime Compute for Apache Flink in an on-premises environment. This helps you quickly verify that the code is correct, identify and resolve issues early, and avoid the cost of testing deployments in the cloud. This topic describes how to run or debug such a deployment in an on-premises environment.
Background information
When you run or debug a deployment that includes a connector of Alibaba Cloud Realtime Compute for Apache Flink in IntelliJ IDEA, an error may occur indicating that a connector-related class cannot be found. For example, when you run a deployment that includes a MaxCompute connector, the following error occurs:
Caused by: java.lang.ClassNotFoundException: com.alibaba.ververica.connectors.odps.newsource.split.OdpsSourceSplitSerializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
This issue occurs because specific run classes are missing from the default JAR package of the connector. You can perform the following steps to add the classes to the package. This way, you can use the classes to run or debug Flink deployments in IntelliJ IDEA.
Step 1: Add dependencies to the deployment configuration
Download the uber JAR package that contains the run classes from the Maven central repository. For example, if the version of the ververica-connector-odps dependency for MaxCompute is 1.17-vvr-8.0.4-1, you can find the ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar package in the corresponding directory of the Maven central repository and download it to your on-premises directory.
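Uber JAR packages published to the Maven central repository follow the standard repository layout. Assuming the groupId is com.alibaba.ververica (verify this in the repository), the package in this example would be available at a URL similar to the following:
https://repo1.maven.org/maven2/com/alibaba/ververica/ververica-connector-odps/1.17-vvr-8.0.4-1/ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar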
When you write code to create an environment, set the pipeline.classpaths parameter to the path of the uber JAR package. If multiple connector dependencies exist, separate the paths of the packages with semicolons (;). For example, you can set this parameter to file:///path/to/a-uber.jar;file:///path/to/b-uber.jar. In Windows, you must include the drive letter in each path, such as file:///D:/path/to/a-uber.jar;file:///E:/path/to/b-uber.jar.
. The following sample code shows the configuration for a DataStream API deployment:
Configuration conf = new Configuration();
conf.setString("pipeline.classpaths", "file://" + "Absolute path of the uber JAR package");
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment(conf);
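For reference, the following is a minimal runnable sketch of a complete DataStream deployment that uses this configuration. The uber JAR path, the class name, and the trivial source and sink are placeholders for illustration; replace them with your actual connector logic:
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalDebugJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder path: point this at the uber JAR package that you downloaded.
        conf.setString("pipeline.classpaths",
                "file:///path/to/ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar");
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // Trivial pipeline that only verifies the environment starts locally;
        // replace it with your MaxCompute source or sink logic.
        env.fromElements("hello", "flink").print();
        env.execute("local-debug-job");
    }
}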
The following sample code shows the configuration for a Table API deployment:
Configuration conf = new Configuration();
conf.setString("pipeline.classpaths", "file://" + "Absolute path of the uber JAR package");
EnvironmentSettings envSettings =
    EnvironmentSettings.newInstance().withConfiguration(conf).build();
TableEnvironment tEnv = TableEnvironment.create(envSettings);
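Similarly, the following is a minimal runnable sketch of a complete Table API deployment. The datagen source and print sink are standard Flink connectors used here only as placeholders; in a real deployment, declare your connector tables instead:
import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class LocalDebugTableJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder path: point this at the uber JAR package that you downloaded.
        conf.setString("pipeline.classpaths",
                "file:///path/to/ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar");
        EnvironmentSettings envSettings =
                EnvironmentSettings.newInstance().withConfiguration(conf).build();
        TableEnvironment tEnv = TableEnvironment.create(envSettings);
        // Placeholder source and sink tables; replace them with your connector tables.
        tEnv.executeSql("CREATE TEMPORARY TABLE src (id BIGINT) WITH ('connector' = 'datagen', 'number-of-rows' = '10')");
        tEnv.executeSql("CREATE TEMPORARY TABLE snk (id BIGINT) WITH ('connector' = 'print')");
        tEnv.executeSql("INSERT INTO snk SELECT id FROM src").await();
    }
}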
Before you upload the JAR package of the deployment to Realtime Compute for Apache Flink, you must delete the added pipeline.classpaths configuration from the deployment configuration.
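To reduce the risk of forgetting this cleanup step, you can set the parameter only when an explicit local-run switch is enabled. The following is a minimal sketch of this pattern; the local.debug system property is a hypothetical name, not part of the Flink API:
Configuration conf = new Configuration();
// Hypothetical switch, for example -Dlocal.debug=true in the IDE run configuration,
// so that the on-premises classpath setting never reaches the cloud deployment.
if (Boolean.getBoolean("local.debug")) {
    conf.setString("pipeline.classpaths", "file:///path/to/a-uber.jar");
}
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment(conf);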
Step 2: Configure the ClassLoader JAR package that is required to run the deployment
To enable Flink to load the run classes of the connector, you must add the ClassLoader JAR package to the deployment configuration. Download the ververica-classloader-1.15-vvr-6.0-SNAPSHOT.jar package to your on-premises environment.
The following example shows how to modify the on-premises run configuration of the deployment in IntelliJ IDEA. Click the green icon to the left of the entry class to expand the menu, and select Modify Run Configuration.
In the window that appears, click Modify options and select Modify classpath in the Java section. In the Build and run section, click the plus sign (+) below Modify classpath, select the ClassLoader JAR package, and then save the run configuration.
Step 3: Run or debug the deployment
In the upper-right corner of IntelliJ IDEA, click the name of the run configuration to switch to the run configuration that you saved for on-premises running or debugging.
If an error message appears indicating that common Flink classes such as org.apache.flink.configuration.Configuration cannot be found, click Modify options and select Add dependencies with "provided" scope to classpath.
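This error typically occurs because core Flink dependencies are declared with the provided scope in Maven projects, which excludes them from the runtime classpath of IDE run configurations by default. The following is a sketch of such a declaration; the artifact and version shown are examples and must match your project:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>1.17.2</version>
    <!-- provided: available at compile time, but not added to the runtime
         classpath unless the IDE adds provided-scope dependencies back -->
    <scope>provided</scope>
</dependency>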
References
If you want to read and write data in DataStream mode, you must use a DataStream connector of the related type to connect to Realtime Compute for Apache Flink. For more information about how to use DataStream connectors and the related precautions, see Develop a JAR draft.
For more information about how to develop and debug a Python API draft, see Develop a Python API draft.