This topic describes how to create and start a JAR streaming deployment and a JAR batch deployment in the development console of Realtime Compute for Apache Flink.
Prerequisites
If you want to use a RAM user or RAM role to access the development console of Realtime Compute for Apache Flink, the RAM user or RAM role must have the required permissions. For more information, see Permission management.
A Realtime Compute for Apache Flink workspace is created. For more information, see Activate Realtime Compute for Apache Flink.
Step 1: Develop a JAR package
JAR packages cannot be developed in the management console of Realtime Compute for Apache Flink. Therefore, you must develop, compile, and package JAR files in your on-premises environment. For more information about how to configure environment dependencies, use connectors, and read data from an additional dependency file stored in Object Storage Service (OSS), see Develop a JAR draft.
The Flink version that you use to develop the JAR package must be the same as the Flink version of the engine version that you select in Step 3: Create a JAR deployment. Also check the scope of each dependency: dependencies that the Flink runtime already provides should use the provided scope so that they are not bundled into the JAR.
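A minimal sketch of what the provided scope looks like in a Maven pom.xml, assuming your project is built with Maven (the artifact ID and version shown here are illustrative and should match the engine version you select):

```xml
<!-- Flink core APIs are supplied by the Realtime Compute runtime at execution
     time, so they are declared with provided scope and excluded from the JAR. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>1.17.2</version>
    <scope>provided</scope>
</dependency>
```

Dependencies that the runtime does not provide, such as connectors, keep the default compile scope so that they are packaged into the JAR.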
To help you quickly get started with a JAR streaming deployment and a JAR batch deployment in the development console of Realtime Compute for Apache Flink, a test JAR package and an input data file are provided for subsequent operations. This test JAR package is used to collect the number of times a word appears in the input data file.
Download the test JAR package FlinkQuickStart-1.0-SNAPSHOT.jar.
If you want to analyze the source code, download the package FlinkQuickStart.zip and compile the code.
Download the input data file Shakespeare.
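The core logic of the test JAR package can be sketched in plain Java, without the Flink APIs, as follows. The class and method names here are illustrative and are not the actual source code of FlinkQuickStart; the real job applies the same word-counting idea on Flink:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the word-count logic that the test JAR applies
// to the input data file. The real job runs this idea as a Flink program.
public class WordCountSketch {
    // Splits a line into lowercase words and tallies each occurrence.
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {to=2, be=2, or=1, not=1}
        System.out.println(count("To be or not to be"));
    }
}
```

The streaming entry point emits such counts continuously as input arrives, while the batch entry point computes them once over the whole file and writes the result to a file.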
Step 2: Upload the test JAR package and input data file
Log on to the Realtime Compute for Apache Flink console.
Find the workspace that you want to manage and click Console in the Actions column.
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, click Artifacts.
In the upper-left corner of the Artifacts page, click Upload Artifact and upload the test JAR package and input data file.
In this topic, the test JAR package FlinkQuickStart-1.0-SNAPSHOT.jar and the input data file Shakespeare that are downloaded in Step 1 are uploaded. For more information about the directories of the files, see Artifacts.
Step 3: Create a JAR deployment
Streaming deployment
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, go to the Deployments page. In the upper-left corner of the Deployments page, choose Create Deployment > JAR Deployment.
In the Create Jar Deployment dialog box, configure the parameters. The following list describes the parameters and gives example values.

Deployment Mode
The mode in which the deployment runs. Select Stream Mode.
Example: Stream Mode

Deployment Name
The name of the JAR deployment.
Example: flink-streaming-test-jar

Engine Version
The engine version that is used by the deployment. We recommend that you use an engine version that has the RECOMMENDED or STABLE label. Versions with these labels provide higher reliability and performance. For more information, see Release notes and Engine version.
Example: vvr-8.0.9-flink-1.17

JAR URI
The JAR package. You can use the test JAR package FlinkQuickStart-1.0-SNAPSHOT.jar that you uploaded in Step 2, or click the icon on the right side of the JAR URI field to select and upload your own JAR package.

Entry Point Class
The entry point class of the JAR application. If the JAR manifest does not specify a main class, enter the fully qualified name of the main class in this field. In this example, the test JAR package contains both streaming and batch deployment code, so you must configure this parameter to specify the program entry point for the streaming deployment.
Example: org.example.WordCountStreaming

Entry Point Main Arguments
The arguments that are passed to the main method. In this example, specify the path of the input data file Shakespeare:
--input oss://<Name of the associated OSS bucket>/artifacts/namespaces/<Name of the workspace>/Shakespeare
You can go to the Artifacts page and click the name of the input data file Shakespeare to copy the complete path.

Deployment Target
The queue or session cluster in which the deployment runs. Select the desired queue or session cluster from the drop-down list. For more information, see Manage queues and Step 1: Create a session cluster.
Important: Monitoring metrics cannot be displayed for deployments that run in session clusters, and session clusters do not support the monitoring and alerting feature or the Autopilot feature. Session clusters are suitable for development and test environments. We recommend that you do not use session clusters in the production environment. For more information, see Debug a deployment.
Example: default-queue
For more information about other deployment parameters, see Create a deployment.
Click Deploy.
Batch deployment
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, go to the Deployments page. In the upper-left corner of the Deployments page, choose Create Deployment > JAR Deployment.
In the Create Jar Deployment dialog box, configure the parameters. The following list describes the parameters and gives example values.

Deployment Mode
The mode in which the deployment runs. Select Batch Mode.
Example: Batch Mode

Deployment Name
The name of the JAR deployment.
Example: flink-batch-test-jar

Engine Version
The engine version that is used by the deployment. We recommend that you use an engine version that has the RECOMMENDED or STABLE label. Versions with these labels provide higher reliability and performance. For more information, see Release notes and Engine version.
Example: vvr-8.0.9-flink-1.17

JAR URI
The JAR package. You can use the test JAR package FlinkQuickStart-1.0-SNAPSHOT.jar that you uploaded in Step 2, or click the icon on the right side of the JAR URI field to select and upload your own JAR package.

Entry Point Class
The entry point class of the program. If the JAR manifest does not specify a main class, enter the fully qualified name of the main class in this field. In this example, the test JAR package contains both streaming and batch deployment code, so you must configure this parameter to specify the program entry point for the batch deployment.
Example: org.example.WordCountBatch

Entry Point Main Arguments
The arguments that are passed to the main method. In this example, specify the paths of the input data file Shakespeare and the output data file batch-quickstart-test-output.txt:
--input oss://<Name of the associated OSS bucket>/artifacts/namespaces/<Name of the workspace>/Shakespeare
--output oss://<Name of the associated OSS bucket>/artifacts/namespaces/<Name of the workspace>/batch-quickstart-test-output.txt
Note: You only need to specify the path of the output data file; you do not need to create the file in advance. The output data file must reside in the same directory as the input data file.
You can go to the Artifacts page and click the name of the input data file Shakespeare to copy the complete path.

Deployment Target
The queue or session cluster in which the deployment runs. Select the desired queue or session cluster from the drop-down list. For more information, see Manage queues and Step 1: Create a session cluster.
Important: Monitoring metrics cannot be displayed for deployments that run in session clusters, and session clusters do not support the monitoring and alerting feature or the Autopilot feature. Session clusters are suitable for development and test environments. We recommend that you do not use session clusters in the production environment. For more information, see Debug a deployment.
Example: default-queue
For more information about other deployment parameters, see Create a deployment.
Click Deploy.
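The --input and --output values above are passed to the job through the Entry Point Main Arguments parameter. As a minimal sketch, a main method can read such "--key value" pairs as follows. The real test JAR may parse its arguments differently, for example with Flink's ParameterTool; the helper class below is illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative parser for "--key value" style main arguments such as
// "--input oss://bucket/... --output oss://bucket/...".
public class MainArgs {
    public static Map<String, String> parse(String[] args) {
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (args[i].startsWith("--")) {
                params.put(args[i].substring(2), args[i + 1]);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse(new String[] {
            "--input", "oss://my-bucket/Shakespeare",
            "--output", "oss://my-bucket/out.txt"});
        System.out.println(params.get("input"));  // oss://my-bucket/Shakespeare
    }
}
```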
Step 4: Start the deployment and view the computing result
Streaming deployment
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, go to the Deployments page. Find the desired deployment and click Start in the Actions column. In the Start Job panel, select Initial Mode and click Start. For more information about how to start a deployment, see Start a deployment.
After the deployment enters the RUNNING state, view the computing result of the streaming deployment.
On the Deployments page, click the name of the desired deployment. On the deployment details page, click Logs. On the Running Task Managers tab, click the value in the Path, ID column. On the page that appears, click the Log List tab. In the Log Name column, find the log file whose name ends with .out and click the file name. Then, search for the shakespeare keyword in the log file to view the computing result.
Batch deployment
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, go to the Deployments page. Find the desired deployment and click Start in the Actions column. In the Start Job panel, click Start. For more information about how to start a deployment, see Start a deployment.
After the deployment enters the FINISHED state, view the computing result of the batch deployment.
Log on to the OSS console and view the computing result in the oss://<Name of the associated OSS bucket>/artifacts/namespaces/<Name of the workspace>/batch-quickstart-test-output.txt directory.
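If you want to post-process the result file, a small helper can split each line into a word and its count. This sketch assumes the batch job writes its results in the "(word,count)" text form that Flink's classic WordCount example produces when a Tuple2 is written as text; verify the actual format of your output file first:

```java
// Illustrative: split a line such as "(shakespeare,23)" into word and count,
// assuming the Tuple2 text format of Flink's classic WordCount example.
public class OutputLine {
    public static String[] parse(String line) {
        String body = line.substring(1, line.length() - 1); // strip parentheses
        int comma = body.lastIndexOf(',');
        return new String[] { body.substring(0, comma), body.substring(comma + 1) };
    }

    public static void main(String[] args) {
        String[] parts = parse("(shakespeare,23)");
        System.out.println(parts[0] + " appears " + parts[1] + " times");
    }
}
```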
The Taskmanager.out log file contains a maximum of 2,000 data records. Therefore, the number of data records in the computing result of a streaming deployment is different from the number of data records in the computing result of a batch deployment. For more information about the limits on the number of data records that the Taskmanager.out log file contains, see Print connector.
Step 5: (Optional) Cancel the deployment
If you modify the SQL code of a deployment, add parameters to or remove parameters from the WITH clause, or change the version of a deployment, you must deploy the draft again, cancel the deployment, and then restart the deployment for the changes to take effect. You must also cancel and then restart a deployment if the deployment fails and cannot recover from its state data, or if you want to update parameter settings that do not take effect dynamically. For more information about how to cancel a deployment, see Cancel a deployment.
References
You can configure resources for a deployment before you start the deployment. You can also modify the resource configurations of a deployment after you publish the draft for the deployment. Realtime Compute for Apache Flink provides the following resource configuration modes: basic mode (coarse-grained) and expert mode (fine-grained). For more information, see Configure resources for a deployment.
You can dynamically update the parameter configuration of a Realtime Compute for Apache Flink deployment. This makes the parameter configuration take effect more quickly and helps reduce the service interruption time caused by deployment startup and cancellation. For more information, see Dynamically update the parameter configuration for dynamic scaling.
You can configure parameters to export logs of a deployment to an external storage and specify the level of the logs that you want to export. For more information, see Configure parameters to export logs of a deployment.
For more information about how to create an SQL deployment for Realtime Compute for Apache Flink, see Getting started with an SQL deployment.
You can build a real-time data warehouse by using Realtime Compute for Apache Flink and Hologres. For more information, see Build a real-time data warehouse by using Realtime Compute for Apache Flink and Hologres.
You can build an OpenLake-based streaming data lakehouse by using Realtime Compute for Apache Flink. For more information, see Build an OpenLake-based streaming data lakehouse by using Realtime Compute for Apache Flink.