This topic describes how to develop and deploy a simple SQL draft, start the deployment for the draft, and cancel the deployment if required. This helps you get started with SQL deployments in Realtime Compute for Apache Flink.
Prerequisites
If you want to use a RAM user or RAM role to access the development console of Realtime Compute for Apache Flink, make sure that the RAM user or RAM role has the required permissions. For more information, see Permission management.
A Realtime Compute for Apache Flink workspace is created. For more information, see Activate Realtime Compute for Apache Flink.
Step 1: Create an SQL draft
Go to the page on which you can create an SQL draft.
Log on to the Realtime Compute for Apache Flink console.
Find the workspace that you want to manage and click Console in the Actions column.
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, choose .
In the upper-left corner of the Drafts page, click New. On the SQL Scripts tab of the New Draft dialog box, select Blank Stream Draft and click Next.
Realtime Compute for Apache Flink provides various code templates and supports data synchronization. Each code template is applicable to specific scenarios and provides code samples and instructions for you. You can click a template to learn about the features and the related syntax of Realtime Compute for Apache Flink and implement your business logic. For more information, see Code templates and Data synchronization templates.
Configure the parameters for the draft.
Parameter
Description
Example
Name
The name of the draft that you want to create.
Note: The draft name must be unique in the current namespace.
flink-test
Location
The folder in which the code file of the draft is stored.
You can also click the icon to the right of an existing folder to create a subfolder.
Draft
Engine Version
The engine version that is used by the current draft.
We recommend that you use an engine version that has the RECOMMENDED or STABLE label. Versions with these labels provide higher reliability and performance. For more information about engine versions, see Release notes and Engine version.
vvr-8.0.8-flink-1.17
Click Create.
Step 2: Write code for the draft
Copy the following SQL statements to the SQL editor. In this example, the Datagen connector is used to generate a random data stream and the Print connector is used to display the computing result in the development console of Realtime Compute for Apache Flink. For more information about supported connectors, see Supported connectors.
-- Create a temporary source table named datagen_source.
CREATE TEMPORARY TABLE datagen_source(
randstr VARCHAR
) WITH (
'connector'='datagen' -- Use the Datagen connector.
);
-- Create a temporary result table named print_table.
CREATE TEMPORARY TABLE print_table(
randstr VARCHAR
) WITH (
'connector' = 'print', -- Use the Print connector.
'logger' = 'true' -- Display the computing result in the development console of Realtime Compute for Apache Flink.
);
-- Display the data of the randstr field in the print_table table.
INSERT INTO print_table
SELECT SUBSTRING(randstr, 0, 8) FROM datagen_source;
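By default, the Datagen connector generates data as fast as possible, which can flood the logs during testing. The open source Flink Datagen connector supports optional WITH parameters that throttle the rate and control field generation; the following variant is a sketch, and the parameter values are illustrative:

```sql
-- A hypothetical variant of datagen_source that limits the generation
-- rate and fixes the length of the generated strings. Option names
-- follow the open source Flink Datagen connector.
CREATE TEMPORARY TABLE datagen_source_throttled (
    randstr VARCHAR
) WITH (
    'connector' = 'datagen',
    'rows-per-second' = '5',         -- Generate at most 5 rows per second.
    'fields.randstr.length' = '16'   -- Generate 16-character random strings.
);
```

Throttling the source makes the printed output in Step 7 easier to inspect.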
In this example, the INSERT INTO statement is used to write data to a sink. You can also use INSERT INTO statements to write data to multiple sinks. For more information, see INSERT INTO statement.
When you create a draft, we recommend that you use tables that are registered in catalogs to reduce the use of temporary tables. For more information, see Manage catalogs.
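To write data to multiple sinks in a single deployment, Flink SQL provides the STATEMENT SET syntax. The following sketch reuses the source and sink tables defined above; the second sink table, print_table_2, is a hypothetical table added for illustration:

```sql
-- Sketch: fan out datagen_source to two Print sinks in one deployment.
-- print_table is defined above; print_table_2 is hypothetical.
CREATE TEMPORARY TABLE print_table_2 (
    randstr VARCHAR
) WITH (
    'connector' = 'print'
);

BEGIN STATEMENT SET;
INSERT INTO print_table SELECT SUBSTRING(randstr, 0, 8) FROM datagen_source;
INSERT INTO print_table_2 SELECT randstr FROM datagen_source;
END;
```

Grouping the INSERT statements in a statement set lets Flink plan them as one job, so shared sources such as datagen_source are read only once.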
Step 3: View the configuration information
On the right-side tab of the SQL editor, you can view the configurations or configure the parameters.
| Tab name | Configuration description |
| --- | --- |
| Configurations | |
| Structure | |
| Versions | You can view the engine version of the deployment. For more information about the operations that you can perform in the Actions column in the Draft Versions panel, see Manage deployment versions. |
Step 4: (Optional) Perform a syntax check
Validate the SQL semantics of the draft, check network connectivity, and verify the metadata of the tables that the draft uses. You can also click SQL Advice in the validation results to view SQL risks and related optimization suggestions.
In the upper-right corner of the SQL editor, click Validate.
In the Validate dialog box, click Confirm.
Step 5: (Optional) Debug the draft
You can use the debugging feature to simulate the running of a deployment, check outputs, and verify the business logic of the SELECT and INSERT statements. This feature improves development efficiency and reduces the risk of poor data quality.
In the upper-right corner of the SQL editor, click Debug.
In the Debug dialog box, select the cluster that you want to debug and click Next.
If no cluster is available, create a session cluster. Make sure that the session cluster uses the same engine version as that of the SQL draft and that the session cluster is running. For more information, see Step 1: Create a session cluster.
Configure debugging data and click Confirm.
For more information, see Step 2: Debug a deployment.
Step 6: Deploy the draft
In the upper-right corner of the SQL editor, click Deploy. In the Deploy draft dialog box, configure the related parameters and click Confirm.
Session clusters are suitable for non-production environments, such as development and test environments. Deploying or debugging drafts in a session cluster improves the resource utilization of the JobManager and accelerates deployment startup. However, we recommend that you do not deploy production drafts in session clusters, because stability issues may occur.
Step 7: Start the deployment for the draft and view the startup result
In the left-side navigation pane, choose .
Find the deployment that you want to start and click Start in the Actions column.
In the Start Job panel, select Initial Mode and click Start. When the deployment enters the RUNNING state, the deployment is running as expected. For more information about the parameters that you must configure when you start a deployment, see Start a deployment.
On the Deployments page, view the computing result.
In the left-side navigation pane of the development console of Realtime Compute for Apache Flink, choose . Find the desired deployment and click the name of the deployment.
Click the Logs tab. Then, click the Running Task Managers tab and view the jobs in the Path, ID column.
In the left-side navigation pane of the Running Task Managers tab, click Logs. Then, click the Logs tab and search for logs related to PrintSinkOutputWriter.
Step 8: (Optional) Cancel the deployment
If you modify the SQL code of a deployment, add parameters to or remove parameters from the WITH clause, or change the version of a deployment, you must re-deploy the draft, cancel the deployment, and then start it again for the changes to take effect. You must also cancel and then restart a deployment if the deployment fails and cannot recover from its state data, or if you want to update parameter settings that do not take effect dynamically. For more information about how to cancel a deployment, see Cancel a deployment.
References
You can configure resources for a deployment before you start the deployment. You can also modify the resource configurations of a deployment after you publish the draft for the deployment. Realtime Compute for Apache Flink provides the following resource configuration modes: basic mode (coarse-grained) and expert mode (fine-grained). For more information, see Configure resources for a deployment.
You can configure parameters to export logs of a deployment to an external storage and specify the level of the logs that you want to export. For more information, see Configure parameters to export logs of a deployment.
For more information about how to create a JAR deployment for Realtime Compute for Apache Flink, see Getting started with a JAR deployment.
For more information about how to create a Python deployment for Realtime Compute for Apache Flink, see Getting started with a Python deployment.
For more information about how to ingest data into data warehouses in real time, see Ingest data into data warehouses in real time.
You can build a real-time data warehouse by using Realtime Compute for Apache Flink and Hologres. For more information, see Build a real-time data warehouse by using Realtime Compute for Apache Flink and Hologres.
You can build an OpenLake-based streaming data lakehouse by using Realtime Compute for Apache Flink. For more information, see Build an OpenLake-based streaming data lakehouse by using Realtime Compute for Apache Flink.