
Realtime Compute for Apache Flink:Create a deployment

Last Updated:Oct 28, 2024

After you develop a draft, you must publish the draft as a deployment. This way, development is isolated from production. The deployments that are running are not affected. A draft is officially published only after the deployment for the draft is started or restarted. This topic describes how to create an SQL deployment, a JAR deployment, a YAML deployment, and a Python deployment.

Prerequisites

A draft is developed.

  • An SQL draft is developed before you create an SQL deployment. For more information, see Develop an SQL draft.

  • A YAML draft is developed before you create a YAML deployment.

  • A Python package is developed before you create a Python deployment. For more information, see Develop a Python API draft.

  • A JAR package is developed before you create a JAR deployment. For more information, see Develop a JAR draft.

Limits

  • You can create a Python deployment only in Realtime Compute for Apache Flink that uses Ververica Runtime (VVR) 4.0.0 or later.

  • You can create a YAML deployment only in Realtime Compute for Apache Flink that uses VVR 8.0.9 or later.

Upload resources

Before you create a deployment, upload a JAR package, a Python deployment file, or a Python dependency file to the development console of Realtime Compute for Apache Flink based on your business requirements.

  1. Log on to the Realtime Compute for Apache Flink console.

  2. On the Fully Managed Flink tab, find the workspace that you want to manage and click Console in the Actions column.

  3. In the left-side navigation pane, click Artifacts.

  4. On the Artifacts page, click Upload Artifact. Select the JAR file, Python deployment file, or Python dependency file that you want to upload.

Note

If your deployment is a Python API deployment, upload the official JAR file of PyFlink. To download the official JAR file of the required version, click PyFlink V1.11 or PyFlink V1.12.

Procedure

  1. Log on to the Realtime Compute for Apache Flink console.

  2. On the Fully Managed Flink tab, find the workspace that you want to manage and click Console in the Actions column. You can perform the following steps based on the type of deployment that you want to create.

    Create an SQL deployment

    1. In the left-side navigation pane, choose Development > ETL. On the page that appears, develop an SQL draft. For more information, see Develop an SQL draft.

    2. After the SQL draft is developed, click Deploy in the upper-right corner of the editor.

    3. In the Deploy draft dialog box, configure the following parameters:

      Comment: Optional. Enter a description for the deployment.

      Label: After you specify labels for a deployment, you can search for the deployment by label key and label value on the Deployments page. You can specify up to three labels for a deployment.

      Deployment Target: Select the desired queue or session cluster from the drop-down list. For more information, see Manage queues and the "Step 1: Create a session cluster" section of the Debug a deployment topic.

      Note: Metrics of deployments that run in session clusters cannot be displayed, and session clusters do not support the monitoring and alerting feature or the Autopilot feature. Session clusters are suitable only for development and test environments. We recommend that you do not use session clusters in the production environment. For more information, see Debug a deployment.

      Skip validation on draft before deploying: If you select this option, the syntax check is skipped before the draft is deployed.

    4. In the Deploy draft dialog box, click Confirm.

      You can view the deployment on the Deployments page and start the deployment based on your business requirements.

    Create a YAML deployment

    1. In the left-side navigation pane, choose Development > Data Ingestion. On the page that appears, develop a YAML draft.

    2. After the YAML draft is developed, click Deploy in the upper-right corner of the editor.

    3. In the Deploy draft dialog box, configure the following parameters:

      Comment: Optional. Enter a description for the deployment.

      Label: After you specify labels for a deployment, you can search for the deployment by label key and label value on the Deployments page. You can specify up to three labels for a deployment.

      Deployment Target: Select the desired queue from the drop-down list. For more information, see Manage queues.

      Skip validation on draft before deploying: If you select this option, the syntax check is skipped before the draft is deployed.

    4. In the Deploy draft dialog box, click Confirm.

      You can view the deployment on the Deployments page and start the deployment based on your business requirements.
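      For orientation, a YAML draft of the kind deployed in the steps above typically describes a source, a sink, and pipeline options, following the open-source Flink CDC pipeline format. The following is only a sketch: the connector types, host names, credentials, and table patterns are placeholders, not a definitive template.

```yaml
# Sketch of a data ingestion YAML draft (Flink CDC pipeline style).
# All connection values and table patterns below are placeholders.
source:
  type: mysql
  hostname: mysql.example.internal
  port: 3306
  username: flink_user
  password: ${secret_values.mysql_password}
  tables: app_db.orders

sink:
  type: hologres
  endpoint: hologres.example.internal:80
  username: sink_user
  password: ${secret_values.holo_password}

pipeline:
  name: Sync orders from MySQL
  parallelism: 2
```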

    Create a JAR deployment

    1. In the left-side navigation pane, choose O&M > Deployments. On the Deployments page, choose Create Deployment > JAR Deployment in the upper-left corner.

    2. In the Create Jar Deployment dialog box, configure the following parameters:

      Deployment Mode: Select Stream Mode or Batch Mode.

      Deployment Name: Enter a name for the JAR deployment.

      Engine Version: Select the engine version. For more information about engine versions, see Engine version and Lifecycle policies. We recommend that you use a recommended version or a stable version.

      • Recommended: the latest minor version of the latest major version.

      • Stable: the latest minor version of a major version that is still in the service period of the product. Defects in previous versions are fixed in such a version.

      • Normal: other minor versions that are still in the service period of the product.

      • Deprecated: the versions that exceed the service period of the product.

      Note

      In VVR 3.0.3 and later, Ververica Platform (VVP) allows you to run JAR deployments that use different engine versions at the same time. VVR 3.0.3 uses Flink 1.12. If the engine version of your deployment is Flink 1.12 or earlier, perform the following operations based on the engine version that your deployment uses:

      • Flink 1.12: Stop and then restart your deployment. Then, the system automatically updates the engine version of your deployment to vvr-3.0.3-flink-1.12.

      • Flink 1.11 or Flink 1.10: Manually update the engine version of your deployment to vvr-3.0.3-flink-1.12 or vvr-4.0.8-flink-1.13, and then restart the deployment. Otherwise, a timeout error occurs when you start the deployment.

      JAR URI: Select an existing file or upload a new file. You can drag the file that you want to upload to this field, or click the upload icon on the right to select the file.

      Note

      If your deployment is a Python API deployment, upload the official JAR file of PyFlink. To download the official JAR file of the required version, click PyFlink V1.11 or PyFlink V1.12.

      Entry Point Class: Specify the entry point class of the JAR application. If the JAR file does not specify a main class in its manifest, enter the fully qualified name of the main class in this field.

      Note

      If your deployment is a Python API deployment, enter org.apache.flink.client.python.PythonDriver in the Entry Point Class field.

      Entry Point Main Arguments: The arguments that are passed to and read in the main method.

      Note
      • The parameter information cannot exceed 1,024 characters in length. We recommend that you do not use complex parameters, such as parameters that include line breaks, spaces, or other special characters. If you want to pass complex parameters, use a dependency file.

      • If your deployment is a Python API deployment, upload the Python deployment file first. By default, after you upload the Python deployment file, the file is uploaded to the /flink/usrlib/ directory of the node that runs the deployment.

        If the Python deployment file is named word_count.py, enter -py /flink/usrlib/word_count.py in the Entry Point Main Arguments field.

        Enter the full path of the Python deployment file. You cannot omit or change /flink/usrlib/.
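        As a sketch of how such arguments reach a Python API deployment: the values after -py /flink/usrlib/word_count.py in Entry Point Main Arguments are passed to the Python script and can be read from sys.argv as usual. The --input and --output flags below are hypothetical, not part of any Flink API.

```python
import argparse

# Hypothetical word_count.py entry script. Entry Point Main Arguments that
# follow the -py flag arrive in sys.argv and can be parsed with argparse.
def parse_args(argv):
    parser = argparse.ArgumentParser(description="word_count deployment arguments")
    parser.add_argument("--input", default="/flink/usrlib/input.txt")    # hypothetical flag
    parser.add_argument("--output", default="/flink/usrlib/output.txt")  # hypothetical flag
    return parser.parse_args(argv)

# In a real deployment this would be sys.argv[1:]; a fixed list keeps the
# sketch self-contained.
args = parse_args(["--input", "/flink/usrlib/data.txt"])
print(args.input)
print(args.output)
```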

      Additional Dependencies: Use one of the following methods to specify dependency files:

      • (Recommended) Select a dependency file that you uploaded in advance.

        To upload a dependency file, click Upload Artifacts in the upper-left corner of the Artifacts page in the development console of Realtime Compute for Apache Flink. Alternatively, click the upload icon on the right side of Additional Dependencies in the Create Jar Deployment dialog box. The dependency file that you upload is stored in the artifacts directory of the Object Storage Service (OSS) bucket that you associated with your Realtime Compute for Apache Flink workspace when you activated the workspace. The directory is in the format of oss://<Name of the OSS bucket that is associated with your workspace>/artifacts/namespaces/<Name of the namespace>.

      • Enter the OSS bucket where the required dependency file is stored.

        The OSS bucket to which the dependency file is uploaded must be the bucket that you selected when you activated the current Realtime Compute for Apache Flink workspace.

      • Enter the URL of the required dependency file.

        You must enter the URL of an external storage system that Realtime Compute for Apache Flink is allowed to access. The access control list (ACL) of the external storage system must be public-read, or Realtime Compute for Apache Flink must have the permission to access the external storage system. Only URLs that end with file names are supported, such as http://xxxxxx/<file>.

      Note
      • The dependency file that you specify by using one of the preceding methods is downloaded to the destination machine. When the deployment is running, the dependency file is loaded to the /flink/usrlib directory of the pods in which the JobManager and TaskManager reside.

      • If you select a session cluster for Deployment Target, you cannot configure the dependency file for the deployment.

      Deployment Target: Select the desired queue or session cluster from the drop-down list. For more information, see Manage queues and the "Step 1: Create a session cluster" section of the Debug a deployment topic.

      Note: Metrics of deployments that run in session clusters cannot be displayed, and session clusters do not support the monitoring and alerting feature or the Autopilot feature. Session clusters are suitable only for development and test environments. We recommend that you do not use session clusters in the production environment. For more information, see Debug a deployment.

      Description: Optional. Enter a description for the deployment.

      Label: After you specify labels for a deployment, you can search for the deployment by label key and label value on the Deployments page. You can specify up to three labels for a deployment.

      More Setting: If you turn on the switch, you must configure the following parameters:

      • Kerberos Name: Select a Hive cluster that supports Kerberos authentication from the drop-down list. For more information about how to create a Hive cluster that supports Kerberos authentication, see Register a Hive cluster that supports Kerberos authentication.

      • principal: A Kerberos principal, which can be a user or a service. A Kerberos principal is used to uniquely identify an identity in the Kerberos authentication system.

    3. Click Deploy.

      You can view the deployment on the Deployments page and start the deployment based on your business requirements.

    Create a Python deployment

    1. In the left-side navigation pane, choose O&M > Deployments. On the Deployments page, choose Create Deployment > Python Deployment in the upper-left corner.

    2. In the Create Python Deployment dialog box, configure the following parameters:

      Deployment Mode: Select Stream Mode or Batch Mode.

      Deployment Name: Enter a name for the deployment.

      Engine Version: Select the engine version. For more information about engine versions, see Engine version and Lifecycle policies. We recommend that you use a recommended version or a stable version.

      • Recommended: the latest minor version of the latest major version.

      • Stable: the latest minor version of a major version that is still in the service period of the product. Defects in previous versions are fixed in such a version.

      • Normal: other minor versions that are still in the service period of the product.

      • Deprecated: the versions that exceed the service period of the product.

      Note

      In VVR 3.0.3 and later, VVP allows you to run Python deployments that use different engine versions at the same time. VVR 3.0.3 uses Flink 1.12. If the engine version of your deployment is Flink 1.12 or earlier, perform the following operations based on the engine version that your deployment uses:

      • Flink 1.12: Stop and then restart your deployment. Then, the system automatically updates the engine version of your deployment to vvr-3.0.3-flink-1.12.

      • Flink 1.11 or Flink 1.10: Manually update the engine version of your deployment to vvr-3.0.3-flink-1.12 or vvr-4.0.8-flink-1.13, and then restart the deployment. Otherwise, a timeout error occurs when you start the deployment.

      Python Uri: Specify the Uniform Resource Identifier (URI) that is used to access the Python deployment file. Python deployment files can be .py files or .zip files.

      Entry Module: Specify the entry point module of the Python application. If the Python deployment file is a .py file, you do not need to configure this parameter. If the Python deployment file is a .zip file, you must configure this parameter. For example, you can enter example.word_count in the Entry Module field.

      Entry Point Main Arguments: Enter the parameters of the deployment.

      Python Libraries: Specify the third-party Python packages that you want to use. The packages that you upload are added to the PYTHONPATH of the Python worker process so that they can be directly accessed in Python user-defined functions (UDFs). For information about how to use third-party Python libraries, see the "Use a third-party Python package" section of the Use Python dependencies topic.

      Python Archives: Specify archive files. Only files in the ZIP format, such as .zip, .jar, .whl, and .egg files, are supported.

      Archive files are decompressed to the working directory of the Python worker process. For example, if the archive file is named mydata.zip, the following code can be used in a Python UDF to access its contents:

      def map():
          with open("mydata.zip/mydata/data.txt") as f:
              ...

      For more information, see the Use a custom Python virtual environment and Use data files sections of the "Use Python dependencies" topic.
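      For reference, an archive like the mydata.zip file in the example above can be assembled locally before you upload it. The following is a plain-Python sketch; the file names and contents follow the example in this topic.

```python
import os
import zipfile

# Create the directory layout that the UDF example above expects:
# mydata/data.txt packaged inside an archive named mydata.zip.
os.makedirs("mydata", exist_ok=True)
with open("mydata/data.txt", "w") as f:
    f.write("sample data\n")

# When the deployment runs, the archive is decompressed into the working
# directory of the Python worker process, so the UDF can open the file
# by using the path "mydata.zip/mydata/data.txt".
with zipfile.ZipFile("mydata.zip", "w") as zf:
    zf.write("mydata/data.txt")

print(zipfile.ZipFile("mydata.zip").namelist())
```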

      Additional Dependencies: Upload files that the deployment requires, such as a Python deployment file or a data file. For more information about Python dependencies, see Python Dependency. Use one of the following methods:

      • (Recommended) Select a dependency file that you uploaded in advance.

        To upload a dependency file, click Upload Artifacts in the upper-left corner of the Artifacts page in the development console of Realtime Compute for Apache Flink. Alternatively, click the upload icon on the right side of Additional Dependencies in the Create Python Deployment dialog box. The dependency file that you upload is stored in the artifacts directory of the Object Storage Service (OSS) bucket that you associated with your Realtime Compute for Apache Flink workspace when you activated the workspace. The directory is in the format of oss://<Name of the OSS bucket that is associated with your workspace>/artifacts/namespaces/<Name of the namespace>.

      • Enter the OSS bucket where the required dependency file is stored.

        The OSS bucket to which the dependency file is uploaded must be the bucket that you selected when you activated the current Realtime Compute for Apache Flink workspace.

      • Enter the URL of the required dependency file.

        You must enter the URL of an external storage system that Realtime Compute for Apache Flink is allowed to access. The access control list (ACL) of the external storage system must be public-read, or Realtime Compute for Apache Flink must have the permission to access the external storage system. Only URLs that end with file names are supported, such as http://xxxxxx/<file>.

      Note
      • The dependency file that you specify by using one of the preceding methods is downloaded to the destination machine. When the deployment is running, the dependency file is loaded to the /flink/usrlib directory of the pods in which the JobManager and TaskManager reside.

      • If you select a session cluster for Deployment Target, you cannot configure the dependency file for the deployment.

      Deployment Target: Select the desired queue or session cluster from the drop-down list. For more information, see Manage queues and the "Step 1: Create a session cluster" section of the Debug a deployment topic.

      Note: Metrics of deployments that run in session clusters cannot be displayed, and session clusters do not support the monitoring and alerting feature or the Autopilot feature. Session clusters are suitable only for development and test environments. We recommend that you do not use session clusters in the production environment. For more information, see Debug a deployment.

      Description: Optional. Enter a description for the deployment.

      Label: After you specify labels for a deployment, you can search for the deployment by label key and label value on the Deployments page. You can specify up to three labels for a deployment.

      More Setting: If you turn on the switch, you must configure the following parameters:

      • Kerberos Name: Select a Hive cluster that supports Kerberos authentication from the drop-down list. For more information about how to create a Hive cluster that supports Kerberos authentication, see Register a Hive cluster that supports Kerberos authentication.

      • principal: A Kerberos principal, which can be a user or a service. A Kerberos principal is used to uniquely identify an identity in the Kerberos authentication system.

    3. Click Deploy.

      You can view the deployment on the Deployments page and start the deployment based on your business requirements.
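      To make the Entry Module example above concrete, a .zip Python deployment file whose entry module is example.word_count could be laid out as follows. This is a plain-Python sketch of the package structure only; the job body is a placeholder.

```python
import zipfile

# Build a .zip deployment file with the package layout implied by the
# Entry Module value "example.word_count": a package named "example"
# that contains a module named "word_count".
with zipfile.ZipFile("deployment.zip", "w") as zf:
    zf.writestr("example/__init__.py", "")
    zf.writestr("example/word_count.py", "# PyFlink job code goes here\n")

print(sorted(zipfile.ZipFile("deployment.zip").namelist()))
```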

References

  • You can configure and modify a deployment and the resources of the deployment before you start the deployment or after you publish the draft for the deployment. For more information, see Configure a deployment and Configure resources for a deployment.

  • After you create a deployment, you can start the deployment on the Deployments page to run the deployment. For more information about how to start a deployment, see Start a deployment.