Debug a draft

Updated at: 2025-01-24 03:59

You can enable draft debugging to simulate a running deployment, check the output, and verify the business logic of SELECT and INSERT statements. This feature improves development efficiency and reduces the risk of poor data quality. This topic describes how to debug an SQL draft in Realtime Compute for Apache Flink.

Background information

The draft debugging feature allows you to verify the correctness of the draft logic in the console of fully managed Flink. During the debugging process, data is not written to the result table regardless of the type of the result table. When you use the draft debugging feature, you can use the upstream online data or specify debugging data. You can debug complex drafts that include multiple SELECT or INSERT statements, as well as UPSERT statements that contain update operations, such as count(*).

Limits

  • To use the draft debugging feature, you must create a session cluster.

  • You can debug only SQL drafts.

  • You cannot debug drafts that contain the CREATE TABLE AS or CREATE DATABASE AS statement.

  • You cannot debug data of MySQL CDC source tables in session clusters of Realtime Compute for Apache Flink that use VVR 4.0.8 or earlier. This is because MySQL CDC source tables are not written in append-only mode.

  • By default, Realtime Compute for Apache Flink reads a maximum of 1,000 data records, beyond which data reading is stopped.

Usage notes

  • When you create a session cluster, the cluster resources are consumed. The resource consumption is based on the configurations that you select when you create the cluster.

  • Session clusters are suitable for development and test environments. Do not use session clusters in the production environment. Debugging a draft in a session cluster increases the resource utilization of the JobManager. In the production environment, the reuse mechanism of the JobManager reduces the stability of the deployments in the cluster. The following stability issues may occur:

    • If the JobManager is faulty, all deployments in the cluster that run on the JobManager are affected.

    • If a TaskManager is faulty, the deployments that have tasks running on the TaskManager are affected.

    • If processes are not isolated for tasks that run on the same TaskManager, the tasks may be affected by each other.

  • If the session cluster uses the default configurations, take note of the following points:

    • For small deployments, we recommend that you run no more than 100 such deployments in a cluster.

    • For complex deployments, limit the parallelism of each deployment to 512. In a single cluster, limit the number of medium-sized deployments (each with a parallelism of 64) to 32. Otherwise, issues such as heartbeat timeouts may occur and affect the stability of the cluster. If such issues occur, you must increase the heartbeat interval and the heartbeat timeout period.

    • If you want to run more tasks in parallel, you must increase the resource configuration of the session cluster.
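
As a rough illustration of the default-configuration guidance above, the following Python sketch checks a planned set of deployments against those thresholds. The function name check_session_cluster_plan is hypothetical, and treating every deployment with a parallelism of 64 or more as medium-sized (and everything below as small) is an assumption made for this example. Neither is part of the product.

```python
from typing import List

# Thresholds taken from the usage notes for a session cluster with default configurations.
MAX_SMALL_DEPLOYMENTS = 100   # recommended cap on small deployments per cluster
MAX_PARALLELISM = 512         # cap on the parallelism of a single complex deployment
MEDIUM_PARALLELISM = 64       # parallelism of a medium-sized deployment
MAX_MEDIUM_DEPLOYMENTS = 32   # cap on medium-sized deployments per cluster


def check_session_cluster_plan(parallelisms: List[int]) -> List[str]:
    """Return warnings for a planned list of deployment parallelisms.

    Assumption (not from the product): parallelism >= 64 counts as
    medium-sized or larger; anything below counts as small.
    """
    warnings = []
    small = [p for p in parallelisms if p < MEDIUM_PARALLELISM]
    medium_or_larger = [p for p in parallelisms if p >= MEDIUM_PARALLELISM]

    if len(small) > MAX_SMALL_DEPLOYMENTS:
        warnings.append(
            f"{len(small)} small deployments exceed the recommended {MAX_SMALL_DEPLOYMENTS}"
        )
    if len(medium_or_larger) > MAX_MEDIUM_DEPLOYMENTS:
        warnings.append(
            f"{len(medium_or_larger)} medium-sized deployments exceed the limit of {MAX_MEDIUM_DEPLOYMENTS}"
        )
    for p in parallelisms:
        if p > MAX_PARALLELISM:
            warnings.append(f"parallelism {p} exceeds the limit of {MAX_PARALLELISM}")
    return warnings


# A plan with 33 medium-sized deployments triggers one warning.
for message in check_session_cluster_plan([64] * 33):
    print(message)
```

An empty result only means that the plan stays within the thresholds listed above. It does not guarantee stability, which also depends on the resource configuration of the session cluster.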

Procedure

Step 1: Create a session cluster

  1. Go to the Session Clusters page.

    1. Log on to the Realtime Compute for Apache Flink console.

    2. Find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, choose O&M > Session Clusters.

  2. In the upper-left corner of the Session Clusters page, click Create Session Cluster.

  3. Configure the parameters.

    The following table describes the parameters.

    Section

    Parameter

    Description

    Standard

    Name

    The name of the session cluster that you want to create.

    Deployment Target

    The queue in which the draft is deployed. For more information about how to create a queue, see Manage queues.

    State

    The desired state of the cluster. Valid values:

    • RUNNING: The cluster keeps running after it is configured.

    • STOPPED: The cluster is stopped after it is configured, and the deployments that are deployed in the cluster are also stopped.

    Label key

    You can configure labels for deployments in the Labels section. This allows you to find a deployment on the Overview page in an efficient manner.

    Label value

    N/A

    Configuration

    Engine Version

    The version of the Flink engine that is used by the current deployment. For more information about engine versions, see Engine version and Lifecycle policies. We recommend that you use a recommended version or a stable version. Engine versions are classified into the following types:

    • Recommended: the latest minor version of the latest major version.

    • Stable: the latest minor version of a major version that is still in the service period of the product. Defects in previous versions are fixed in such a version.

    • Normal: other minor versions that are still in the service period of the product.

    • Deprecated: the version that exceeds the service period of the product.

    Flink Restart Policy

    Valid values:

    • Failure Rate: The JobManager is restarted if the number of failures within the specified interval exceeds the upper limit.

      If you select this option, you must configure the Failure Rate Interval, Max failures per interval, and Delay Between Restart Attempts parameters.

    • Fixed Delay: The JobManager is restarted at a fixed interval.

      If you select this option, you must configure the Number of Restart Attempts and Delay Between Restart Attempts parameters.

    • No Restarts: The JobManager is not restarted if tasks fail.

    Important

    If you leave this parameter empty, the default Apache Flink restart policy is used. In this case, if a task fails and checkpointing is disabled, the JobManager is not restarted. If you enable checkpointing, the JobManager is restarted.

    Other Configuration

    Configure other Flink settings, such as taskmanager.numberOfTaskSlots: 1.

    Resources

    Number of TaskManagers

    By default, the value is the same as the parallelism.

    JobManager CPU Cores

    Default value: 1.

    JobManager Memory

    Minimum value: 1 GiB. Recommended value: 4 GiB. JobManager memory can also be measured in MiB. For example, you can set this parameter to 1024 MiB or 1.5 GiB.

    TaskManager CPU Cores

    Default value: 2.

    TaskManager Memory

    Minimum value: 1 GiB. Recommended value: 8 GiB. TaskManager memory can also be measured in MiB. For example, you can set this parameter to 1024 MiB or 1.5 GiB.

    We recommend that you specify the number of slots for each TaskManager and the amount of resources that are available for TaskManagers. The number of slots is specified by the taskmanager.numberOfTaskSlots parameter. When you configure this parameter, take note of the following points:

    • For a single small deployment, we recommend that you set the CPU-to-memory ratio of a single slot to 1:4 and configure at least 1 CPU core and 2 GiB of memory for each slot.

    • For a complex deployment, we recommend that you configure at least 1 CPU core and 4 GiB of memory for each slot. If you use the default resource configuration, you can configure two slots for each TaskManager.

    • We recommend that you use the default resource configuration for each TaskManager and set the number of slots to 2.

      Important
      • Insufficient TaskManager resources affect the stability of the deployments that run on the TaskManager. Additionally, configuring only a small number of slots underutilizes resources because the overhead of the TaskManager cannot be effectively spread across tasks.

      • Configuring a large number of resources for a TaskManager means a large number of deployments run on the TaskManager. If the TaskManager is faulty, all the deployments will be affected.

    Logging

    Root Log Level

    The following log levels are supported. They are listed in ascending order of severity.

    1. TRACE: records finer-grained information than DEBUG logs.

    2. DEBUG: records the status of the system.

    3. INFO: records important system information.

    4. WARN: records the information about potential errors.

    5. ERROR: records the information about errors and exceptions that occur.

    Log Levels

    The name and level of the log.

    Logging Profile

    The log template. You can choose a default or custom profile template.

    Note

    For more information about the options related to the integration between Flink and resource orchestration frameworks such as Kubernetes and YARN, see Resource Orchestration Frameworks.

  4. Click Create Session Cluster.

    After a session cluster is created, you can use it in draft debugging or deployment.
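
The slot guidance in the Resources section of step 3 can be sketched with a little arithmetic. The following Python example assumes that the CPU and memory of a TaskManager are divided evenly among its slots, which is a simplification. The function names are hypothetical and exist only for this illustration.

```python
from typing import Tuple


def per_slot_resources(tm_cpu_cores: float, tm_memory_gib: float, slots: int) -> Tuple[float, float]:
    """Split the CPU cores and memory of one TaskManager evenly across its slots."""
    if slots < 1:
        raise ValueError("slots must be at least 1")
    return tm_cpu_cores / slots, tm_memory_gib / slots


def meets_complex_minimum(tm_cpu_cores: float, tm_memory_gib: float, slots: int) -> bool:
    """Check the guidance for complex deployments: at least 1 CPU core and 4 GiB per slot."""
    cpu, memory = per_slot_resources(tm_cpu_cores, tm_memory_gib, slots)
    return cpu >= 1 and memory >= 4


# 2 CPU cores (the default) and 8 GiB of memory (the recommended value) with
# taskmanager.numberOfTaskSlots set to 2.
cpu, memory = per_slot_resources(2, 8, 2)
print(cpu, memory)                      # prints: 1.0 4.0
print(meets_complex_minimum(2, 8, 2))   # prints: True
print(meets_complex_minimum(2, 8, 4))   # prints: False
```

With two slots, each slot gets 1 CPU core and 4 GiB of memory, which matches the 1:4 CPU-to-memory ratio mentioned in step 3. Raising the slot count to 4 without adding resources drops each slot below the recommended minimum for complex deployments.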

Step 2: Debug a draft

  1. In the left-side navigation pane, choose Development > ETL. Select the SQL draft to debug.

  2. In the upper-right corner of the SQL editor, click Debug. The Debug dialog box appears. Select a session cluster from the Session Cluster drop-down list. Then, click Next.

  3. Configure debugging data.

    • If you use online data for debugging, click Confirm.

    • If you use mock data, click Download mock data template, enter your mock data in the template, and then click Upload mock data to upload the template.

      The following table describes the parameters in this step.

      Parameter

      Description

      Download mock data template

      The template is adapted to the schema of the source table.

      Upload mock data

      To debug a draft by using mock data, you can download the mock data template, edit the template, and upload the template. Then, select Use mock data.

      Limits on using a mock data file:

      • Only a CSV file is supported.

      • A CSV file must contain a table header, such as id(INT).

      • A CSV file can contain a maximum of 1,000 data records and cannot be greater than 1 MB.

      Data preview

      After you upload mock data, click the plus (+) icon on the left side of the source table name to preview the data and download the mock data.

      Code preview

      The debugging feature automatically modifies the DDL statements of source and result tables. However, it does not change the draft code. You can preview code details in the lower part of Code Preview.

  4. Click Confirm.

    After you click Confirm, the debugging result appears in the lower part of the SQL script editor.
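
Before you upload a mock data file, you may want to check it locally against the limits listed in step 3. The following Python sketch is one way to do that. It is not part of the console. The function name validate_mock_csv, the UTF-8 encoding, the interpretation of 1 MB as 1,048,576 bytes, and the header pattern (a column name followed by a type in parentheses, as in id(INT)) are assumptions made for this example.

```python
import csv
import io
import re
from typing import List

MAX_RECORDS = 1000          # maximum number of data records in the file
MAX_BYTES = 1024 * 1024     # 1 MB, interpreted here as 1,048,576 bytes (assumption)

# Assumed header cell shape: a column name followed by a type in parentheses, such as id(INT).
HEADER_CELL = re.compile(r"^\s*\w+\s*\(.+\)\s*$")


def validate_mock_csv(content: bytes) -> List[str]:
    """Return a list of problems found in the CSV content. An empty list means no problem was found."""
    problems = []
    if len(content) > MAX_BYTES:
        problems.append("file is larger than 1 MB")

    # Assumes that the file is UTF-8 encoded.
    rows = list(csv.reader(io.StringIO(content.decode("utf-8"))))
    if not rows:
        problems.append("file is empty; a table header is required")
        return problems

    header, records = rows[0], rows[1:]
    bad_cells = [cell for cell in header if not HEADER_CELL.match(cell)]
    if not header or bad_cells:
        problems.append(f"header cells must look like id(INT); invalid cells: {bad_cells}")
    if len(records) > MAX_RECORDS:
        problems.append(f"{len(records)} data records exceed the maximum of {MAX_RECORDS}")
    return problems


sample = b"id(INT),name(VARCHAR)\n1,alice\n2,bob\n"
print(validate_mock_csv(sample))  # prints: []
```

The check only covers the documented limits. Whether the column names and types match the schema of the source table is still determined by the template that you download from the console.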
