All Products
Search
Document Center

Hologres:Maintain refresh tasks for dynamic tables in Hologres

Last Updated:Oct 23, 2024

If data changes are made to a base table that serves as the data source of a dynamic table, a refresh task must be run on the dynamic table to synchronize data changes. The refresh task is automatically run in the background based on the configured refresh start time and refresh interval. This topic describes how to query and maintain refresh tasks for dynamic tables.

Query refresh tasks

Query running refresh tasks

Use the hologres.hg_dynamic_table_refresh_activity system table

You can query full and incremental data refresh tasks that are running and resource consumption of the refresh tasks from the hologres.hg_dynamic_table_refresh_activity system table.

-- Query running refresh tasks
SELECT
    pid,
    query_id,
    refresh_mode,
    'RUNNING' as status,
    refresh_start,
    extract(epoch from duration) as duration, -- seconds
    serverless_queue_time_ms::bigint / 1000 AS serverless_queue_time_sec,
    serverless_resource_used_time_ms::bigint / 1000 AS serverless_resource_used_time_sec,
    serverless_allocated_cores
FROM
    hologres.hg_dynamic_table_refresh_activity
WHERE datname = '${database}'
  AND table_write = quote_ident('${schema}') || '.' || quote_ident('${tableName}')
ORDER BY refresh_start DESC
limit 2000;

Use the hg_stat_activity system view

You can query running refresh tasks from the hg_stat_activity system view. The content that can be displayed varies based on the refresh mode that you configured.

  • Full data refresh: INSERT statements are displayed.

  • Incremental data refresh: Refresh tasks are displayed.

Use metrics to check the status of refresh tasks

You can use metrics such as the queries per second (QPS), records per second (RPS), and latency to check the status of refresh tasks. The value refresh of the Command Type parameter indicates a refresh task for a dynamic table. For more information about metrics, see Hologres metrics.

If you use serverless computing resources to run a refresh task for a dynamic table, you can also view serverless computing-related metrics to check the status of the task.

Note

You can configure alert rules in CloudMonitor for refresh tasks. For more information, see CloudMonitor.

Query historical refresh tasks

Use the hologres.hg_dynamic_table_refresh_history system table

The hologres.hg_dynamic_table_refresh_history system table records all historical refresh tasks within the most recent month, including full data refresh, incremental data refresh, and manual refresh tasks.

Note
  • By default, the historical refresh tasks of the most recent month are retained. You cannot query the data generated one month ago.

  • The owner of a dynamic table can query only the historical refresh tasks for the current dynamic table. The superuser can query the historical refresh tasks for all dynamic tables in the current Hologres instance.

  • Example 1: Query all incremental data refresh tasks in the past day.

    -- Query all incremental data refresh tasks in the past day.
    SELECT
        query_id,
        refresh_mode,
        status,
        refresh_start,
        duration,
        refresh_latency / 1000 AS refresh_latency_second,
        serverless_allocated_cores,
        queue_time_ms::bigint / 1000 AS queue_time_second,
        serverless_resource_used_time_ms::bigint / 1000 AS serverless_resource_used_time_second
    FROM
        hologres.hg_dynamic_table_refresh_history
    WHERE
        refresh_start >= CURRENT_DATE - INTERVAL '1 day'
        AND dynamic_table_name = '<dynamic_table>'
        AND refresh_mode = 'incremental'
    ORDER BY
        refresh_start DESC
    limit 100;
    
  • Example 2: Query all refresh tasks in the current instance in the past day.

    -- Query all refresh tasks in the current instance in the past day.
    SELECT 
    query_id,
        refresh_mode,
        status,
        refresh_start,
        duration,
        refresh_latency / 1000 AS refresh_latency_second,
        serverless_allocated_cores,
        queue_time_ms::bigint / 1000 AS queue_time_second,
        serverless_resource_used_time_ms::bigint / 1000 AS serverless_resource_used_time_second
    FROM hologres.hg_dynamic_table_refresh_history where refresh_start >= CURRENT_DATE - INTERVAL '1 day'
  • Example 3: Query the refresh tasks for a specific table in the past day.

    -- Query the refresh tasks for a specific table in the past day.
    SELECT 
    query_id,
        refresh_mode,
        status,
        refresh_start,
        duration,
        refresh_latency / 1000 AS refresh_latency_second,
         serverless_allocated_cores,
        queue_time_ms::bigint / 1000 AS queue_time_second,
        serverless_resource_used_time_ms::bigint / 1000 AS serverless_resource_used_time_second
    FROM hologres.hg_dynamic_table_refresh_history where schema_name='<scehma_name>' and dynamic_table_name='<dynamic_table>' and  refresh_start >= CURRENT_DATE - INTERVAL '1 day'

Use slow query logs

You can query refresh tasks of a dynamic table in slow query logs. The value refresh of the Command Type parameter indicates a refresh task of a dynamic table. For more information, see Query and analyze slow query logs.

Query the execution plan of a refresh task

Similar to common queries, you can execute the EXPLAIN and EXPLAIN ANALYZE statements to query the execution plan and running details of a refresh task. This helps you analyze the performance bottleneck of a refresh task and optimize queries. Sample code:

 explain refresh dynamic table hmtest.dt_order_lineitem;
                                                                                                  QUERY PLAN                                                 
                                                 
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------
 Gather  (cost=0.00..10.13 rows=1 width=16)
   ->  Insert  (cost=0.00..10.13 rows=1 width=16)
         ->  Redistribution  (cost=0.00..10.11 rows=1 width=16)
               ->  Final HashAggregate  (cost=0.00..10.11 rows=1 width=16)
                     Group Key: orders.o_orderpriority
                     ->  Redistribution  (cost=0.00..10.11 rows=10 width=16)
                           Hash Key: orders.o_orderpriority
                           ->  Partial HashAggregate  (cost=0.00..10.11 rows=10 width=16)
                                 Group Key: orders.o_orderpriority
                                 ->  Hash Left Semi Join  (cost=0.00..10.11 rows=1000 width=8)
                                       Hash Cond: (orders.o_orderkey = lineitem.l_orderkey)
                                       ->  Redistribution  (cost=0.00..5.03 rows=1000 width=16)
                                             Hash Key: orders.o_orderkey
                                             ->  Local Gather  (cost=0.00..5.01 rows=1000 width=16)
                                                   ->  Seq Scan on orders  (cost=0.00..5.01 rows=1000 width=16)
                                                         Filter: ((o_orderdate >= '1996-07-01 00:00:00+08'::timestamp with time zone) AND (o_orderdate < '199
6-10-01 00:00:00+08'::timestamp with time zone))
                                       ->  Hash  (cost=5.03..5.03 rows=1000 width=8)
                                             ->  Redistribution  (cost=0.00..5.03 rows=1000 width=8)
                                                   Hash Key: lineitem.l_orderkey
                                                   ->  Local Gather  (cost=0.00..5.03 rows=1000 width=8)
                                                         ->  Seq Scan on lineitem  (cost=0.00..5.03 rows=1000 width=8)
                                                               Filter: (l_commitdate < l_receiptdate)
 Optimizer: HQO version 2.1.0
(23 rows)

Manually refresh a dynamic table

You can execute the following statement to manually refresh a dynamic table:

refresh DYNAMIC TABLE <dynamic_schema_name.dynamic_table_name>;
Note

If the automatic refresh property is configured for a table, but you manually perform a refresh operation, both the automatic and manual refresh operations are performed. Both operations can be performed as expected. However, the system will keep only one piece of the latest data.

Configure the refresh timeout period

Similar to common queries, you can configure the timeout period for refresh tasks.

Table-level settings

If you configure a timeout period when you create a dynamic table,.the setting takes effect for all refresh tasks of the created table. In the following SQL statement, the public dataset tpch_10g is used as an example. Before you execute the statement, make sure that the data in the tpch_10g public dataset is imported to Hologres. For more information, see Create a public dataset import task.

-- Configure a timeout period for refresh tasks when you create a table. 
CREATE DYNAMIC TABLE tpch_q1_batch
  WITH (
    refresh_mode='full',
    auto_refresh_enable='true',
    full_auto_refresh_interval='1 hours',
    refresh_guc_statement_timeout='30 mins'-- The timeout period is 30 minutes.
       ) 
AS
  SELECT
        l_returnflag,
        l_linestatus,
        SUM(l_quantity) AS sum_qty,
        SUM(l_extendedprice) AS sum_base_price,
        SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
        SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
        AVG(l_quantity) AS avg_qty,
        AVG(l_extendedprice) AS avg_price,
        AVG(l_discount) AS avg_disc,
        COUNT(*) asAScount_order
FROM
        hologres_dataset_tpch_10.lineitem
WHERE
        l_shipdate <= DATE '1998-12-01' - INTERVAL '120' DAY
GROUP BY
        l_returnflag,
        l_linestatus;

Session-level settings

Manually configure a timeout period. Sample SQL code:

SET statement_timeout = <time>;
refresh DYNAMIC TABLE <dynamic_schema_name.dynamic_table_name>;

For more information, see Change the timeout period of active queries.

Cancel refresh tasks

Cancel running refresh tasks

If a refresh task has been running for a long time or is stuck, you can use the pg_cancel_backend function to cancel the task.

You can cancel a single running refresh task by executing the following statement:

// The pid parameter specifies the ID of the refresh task that you want to cancel.
SELECT pg_cancel_backend(<pid>);

The pid parameter specifies the ID of a refresh task. You can query the refresh task to obtain the ID (query_id). For more information, see the Query refresh tasks section in this topic.

Note

You can cancel multiple running refresh tasks at a time in the same way as common queries. For more information, see Cancel queries.

Cancel all refresh tasks of a dynamic table

If refresh tasks are configured for a dynamic table, you can cancel all subsequent refresh tasks at the table level.

ALTER DYNAMIC TABLE [IF EXISTS ] [<schema>.]<table_name> set (auto_refresh_enable=false);
Important

Exercise caution when you perform this operation. Otherwise, data may fail to be updated.