You can run or debug a deployment that includes a connector of Alibaba Cloud Realtime Compute for Apache Flink in an on-premises environment. This helps you quickly verify that the code is correct, identify and resolve issues early, and avoid the cost of testing deployments in the cloud. This topic describes how to run or debug such a deployment in an on-premises environment.
Background information
When you run or debug a deployment that includes a connector of Alibaba Cloud Realtime Compute for Apache Flink in IntelliJ IDEA, an error may occur indicating that a connector-related class cannot be found. For example, when you run a deployment that includes a MaxCompute connector, the following error occurs:
Caused by: java.lang.ClassNotFoundException: com.alibaba.ververica.connectors.odps.newsource.split.OdpsSourceSplitSerializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
This issue occurs because specific run classes are missing from the default JAR package of the connector. You can perform the following steps to add the classes to the package. This way, you can use the classes to run or debug Flink deployments in IntelliJ IDEA.
Step 1: Add dependencies to the deployment configuration
Download the uber JAR package that contains the run classes from the Maven central repository. For example, if the version of the ververica-connector-odps dependency for MaxCompute is 1.17-vvr-8.0.4-1, you can find the ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar package in the corresponding directory of the Maven central repository and download it to your on-premises directory.
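Uber JAR packages published to the Maven central repository follow the standard repository layout. Assuming the groupId is com.alibaba.ververica (verify this in the repository), the package in this example would be available at a URL similar to the following:
https://repo1.maven.org/maven2/com/alibaba/ververica/ververica-connector-odps/1.17-vvr-8.0.4-1/ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar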
When you write code to create an environment, set the pipeline.classpaths parameter to the path of the uber JAR package. If multiple connector dependencies exist, separate the paths of the packages with semicolons (;). For example, you can set this parameter to file:///path/to/a-uber.jar;file:///path/to/b-uber.jar. In Windows, you must include the drive letter in each path, such as file:///D:/path/to/a-uber.jar;file:///E:/path/to/b-uber.jar.
. The following sample code shows the configuration for a DataStream API deployment:
Configuration conf = new Configuration();
conf.setString("pipeline.classpaths", "file://" + "Absolute path of the uber JAR package");
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment(conf);
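For reference, the following is a minimal runnable sketch of a complete DataStream deployment that uses this configuration. The uber JAR path, the class name, and the trivial source and sink are placeholders for illustration; replace them with your actual connector logic:
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalDebugJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder path: point this at the uber JAR package that you downloaded.
        conf.setString("pipeline.classpaths",
                "file:///path/to/ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar");
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // Trivial pipeline that only verifies the environment starts locally;
        // replace it with your MaxCompute source or sink logic.
        env.fromElements("hello", "flink").print();
        env.execute("local-debug-job");
    }
}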
The following sample code shows the configuration for a Table API deployment:
Configuration conf = new Configuration();
conf.setString("pipeline.classpaths", "file://" + "Absolute path of the uber JAR package");
EnvironmentSettings envSettings =
    EnvironmentSettings.newInstance().withConfiguration(conf).build();
TableEnvironment tEnv = TableEnvironment.create(envSettings);
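Similarly, the following is a minimal runnable sketch of a complete Table API deployment. The datagen source and print sink are standard Flink connectors used here only as placeholders; in a real deployment, declare your connector tables instead:
import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class LocalDebugTableJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder path: point this at the uber JAR package that you downloaded.
        conf.setString("pipeline.classpaths",
                "file:///path/to/ververica-connector-odps-1.17-vvr-8.0.4-1-uber.jar");
        EnvironmentSettings envSettings =
                EnvironmentSettings.newInstance().withConfiguration(conf).build();
        TableEnvironment tEnv = TableEnvironment.create(envSettings);
        // Placeholder source and sink tables; replace them with your connector tables.
        tEnv.executeSql("CREATE TEMPORARY TABLE src (id BIGINT) WITH ('connector' = 'datagen', 'number-of-rows' = '10')");
        tEnv.executeSql("CREATE TEMPORARY TABLE snk (id BIGINT) WITH ('connector' = 'print')");
        tEnv.executeSql("INSERT INTO snk SELECT id FROM src").await();
    }
}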
Before you upload the JAR package of the deployment to Realtime Compute for Apache Flink, you must delete the added pipeline.classpaths configuration from the deployment configuration.
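To reduce the risk of forgetting this cleanup step, you can set the parameter only when an explicit local-run switch is enabled. The following is a minimal sketch of this pattern; the local.debug system property is a hypothetical name, not part of the Flink API:
Configuration conf = new Configuration();
// Hypothetical switch, for example -Dlocal.debug=true in the IDE run configuration,
// so that the on-premises classpath setting never reaches the cloud deployment.
if (Boolean.getBoolean("local.debug")) {
    conf.setString("pipeline.classpaths", "file:///path/to/a-uber.jar");
}
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment(conf);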
Step 2: Configure the ClassLoader JAR package that is required to run the deployment
To enable Flink to load the run classes of the connector, you must add the ClassLoader JAR package to the deployment configuration. Download the ververica-classloader-1.15-vvr-6.0-SNAPSHOT.jar package to your on-premises environment.
The following example shows how to modify the on-premises run configuration of the deployment in IntelliJ IDEA. Click the green icon to the left of the entry class to expand the menu, and select Modify Run Configuration.
In the window that appears, click Modify options and select Modify classpath in the Java section. In the Build and run section, click the plus sign (+) below Modify classpath, select the ClassLoader JAR package, and then save the run configuration.
Step 3: Run or debug the deployment
In the upper-right corner of IntelliJ IDEA, click the name of the run configuration to switch to the run configuration that you saved for on-premises running or debugging.
If an error message appears indicating that common Flink classes such as org.apache.flink.configuration.Configuration cannot be found, click Modify options and select Add dependencies with "provided" scope to classpath.
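This error typically occurs because core Flink dependencies are declared with the provided scope in Maven projects, which excludes them from the runtime classpath of IDE run configurations by default. The following is a sketch of such a declaration; the artifact and version shown are examples and must match your project:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>1.17.2</version>
    <!-- provided: available at compile time, but not added to the runtime
         classpath unless the IDE adds provided-scope dependencies back -->
    <scope>provided</scope>
</dependency>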
References
If you want to read and write data in DataStream mode, you must use a DataStream connector of the related type to connect to Realtime Compute for Apache Flink. For more information about how to use DataStream connectors and the related precautions, see Develop a JAR draft.
For more information about how to develop and debug a Python API draft, see Develop a Python API draft.