All Products
Search
Document Center

ApsaraDB RDS:AP acceleration engine (rds_duckdb)

Last Updated:Jan 14, 2026

You can use the AP acceleration engine (rds_duckdb) for ApsaraDB RDS for PostgreSQL to obtain fast results from complex queries. This engine provides column-oriented tables and vectorized execution to significantly speed up complex queries without changing the original query statements. This ensures that you can retrieve results easily and efficiently.

Join the ApsaraDB RDS for PostgreSQL extension DingTalk group (ID: 103525002795) to ask questions, share ideas, give feedback, and learn more about the extension.

Introduction

rds_duckdb integrates the efficient and resource-friendly DuckDB into ApsaraDB RDS for PostgreSQL to enhance analytical query capabilities. This extension lets you export local tables from ApsaraDB RDS for PostgreSQL into column-oriented tables to enable Analytical Processing (AP) query acceleration. This feature significantly improves the execution speed of complex queries to better meet the demands of analytical workloads.

Prerequisites

  • The major version of your instance is ApsaraDB RDS for PostgreSQL 12 or later.

  • The minor engine version of your instance is 20241030 or later.

  • You have added rds_duckdb to the Parameter Value of shared_preload_libraries.

    For more information about how to configure parameters, see Set instance parameters. For example, set Running Parameter Value to 'pg_stat_statements,auto_explain,rds_duckdb'.

Note

If the major version of your instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228, automatic incremental synchronization of column-oriented table data is enabled by default.

Create and delete the extension

Use a privileged account to create and delete the extension.

Create a plugin

CREATE EXTENSION rds_duckdb;

View the DuckDB version used by the extension

SELECT rds_duckdb.duckdb_version();

Delete the extension

DROP EXTENSION rds_duckdb;

Manage column-oriented tables

Create a column-oriented table

Use the following command to export a local ApsaraDB RDS for PostgreSQL table, such as a user table, materialized view, or foreign table, to a column-oriented table to accelerate analytical queries.

Note

If the major version of your instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228, automatic incremental synchronization of column-oriented table data is enabled by default. Before you create a column-oriented table, you must configure the target ApsaraDB RDS for PostgreSQL instance and local table to enable automatic incremental synchronization. For more information, see Set up automatic incremental synchronization for column-oriented tables.

SELECT rds_duckdb.create_duckdb_table('local_table_name');

Refresh the column-oriented table

Use the following command to refresh the exported column-oriented table with the latest data from the local ApsaraDB RDS for PostgreSQL table. This command updates both the table schema and the data.

SELECT rds_duckdb.refresh_duckdb_table('local_table_name');

View the size of a column-oriented table

SELECT rds_duckdb.duckdb_table_size('local_table_name');

View the total size of all exported tables in the current database

SELECT rds_duckdb.duckdb_database_size();

Delete a column-oriented table

SELECT rds_duckdb.drop_duckdb_table('local_table_name');

Manage AP acceleration

rds_duckdb currently accelerates only read-only queries. After you enable AP acceleration, if a query involves only tables that have corresponding DuckDB column-oriented tables, DuckDB executes the SQL statement. If the SQL statement involves unsupported DML or DDL operations, or tables without a corresponding column-oriented table, the query falls back to ApsaraDB RDS for PostgreSQL for execution.

For SQL statements that fall back to ApsaraDB RDS for PostgreSQL, the system issues a warning in the following format: WARNING: Trying to execute an operation with non-duckdb tables(test), fallback to PG. The table names in parentheses are the ApsaraDB RDS for PostgreSQL tables that do not have a corresponding DuckDB column-oriented table.

Non-read-only SQL queries also trigger a warning: WARNING: Modification operations on DuckDB tables are currently not supported, fallback to PG.

Enable AP acceleration

SET rds_duckdb.execution = on;

Set AP acceleration parameters

You can adjust parameter settings in a session to control AP acceleration performance. For example:

SET rds_duckdb.worker_threads = 32;
SET rds_duckdb.memory_limit = 16384;

Parameter name

Description

Recommended value

rds_duckdb.worker_threads

The number of worker threads used for AP acceleration.

Range: 1 to 255.

Default: 1. This means there is only one worker thread.

  • For significant performance improvement, set this value to the number of CPU cores.

  • This parameter is closely related to the hardware of the execution machine. A larger value results in higher CPU load during AP acceleration. Set this parameter based on your actual needs.

  • A higher value provides better performance but also increases CPU load. A lower value reduces the acceleration effect but also decreases the system's CPU load.

rds_duckdb.memory_limit

The memory limit for AP acceleration.

Unit: MB. Do not add the unit when you configure the parameter.

Range: 1 to INT32_MAX.

Default: 100. This means the upper limit is 100 MB.

  • For significant performance improvement, set this value as high as possible.

  • This parameter is closely related to the hardware of the execution machine. A larger value results in higher memory usage during AP acceleration. Set this parameter based on your actual needs.

  • The default value is conservative. Adjust it based on your device's actual conditions.

  • A value that is too small can affect the process of exporting large tables to column-oriented tables and can impact AP acceleration performance.

Note

For more information about DuckDB parameters, see DuckDB.

Disable AP acceleration

SET rds_duckdb.execution = off;

Set up automatic incremental synchronization for column-oriented tables

If the major version of your instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228, automatic incremental synchronization of column-oriented table data is enabled by default. Before you create a column-oriented table, you must perform the following steps to configure the target ApsaraDB RDS for PostgreSQL instance and local table.

  1. In Extension Management, check the version of rds_duckdb and upgrade it to 1.3.

  2. Set instance parameters to change the value of the wal_level parameter to logical.

  3. (Optional) If the target local table does not have a primary key, run the following command to set a REPLICA IDENTITY index as the replication key for the table.

    ALTER TABLE <local_table_name> REPLICA IDENTITY USING INDEX <index_name>;
  4. Use an account with replication permissions or a privileged account to create the column-oriented table.

View the synchronization status and offset progress of column-oriented tables

SELECT * FROM rds_duckdb.duckdb_sync_stat;

Sample response and parameter descriptions

Sample response:

 sync_table | sync_status_description |          sync_error_description          | confirmed_lsn 
------------+-------------------------+------------------------------------------+---------------
 test       | not syncing             | no primary key or replica identity index | 
 test2      | not syncing             | no primary key or replica identity index | 
 test3      | data syncing            | no errors                                | 0/250D1E8
 test4      | not syncing             | no primary key or replica identity index | 
 test5      | data syncing            | no errors                                | 0/250D1E8
 test6      | data syncing            | no errors                                | 0/250D1E8
 test7      | data syncing            | no errors                                | 0/250D1E8
 test8      | data syncing            | no errors                                | 0/250D1E8

Parameter descriptions:

Parameter

Description

sync_status_description

The synchronization status of the column-oriented table.

  • not syncing: Not performing incremental data synchronization.

  • data copying: Initial full data replication in progress.

  • data catchup: Applying incremental data generated during full data replication.

  • data syncing: Normal incremental synchronization in progress.

sync_error_description

The reason why a column-oriented table is not performing incremental synchronization.

  • no errors: No errors.

  • dml replay conflict: A DML operation caused an incremental replay conflict.

  • ddl replay conflict: A DDL operation caused an incremental replay conflict.

  • no primary key or replica identity index: The corresponding ApsaraDB RDS for PostgreSQL table for the column-oriented table lacks a primary key or a REPLICA IDENTITY index.

  • unsupported relation type: The table type is not supported for synchronization, such as partitioned tables, views, or materialized views.

  • rds_duckdb.enable_sync not set: The global synchronization GUC is not enabled.

    Note

    This is enabled by default if the major version of the instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228.

  • removing duckdb table: The column-oriented table is being deleted.

View SQL execution plans

Use the EXPLAIN statement to view the execution plan of an SQL statement with AP acceleration enabled and disabled. For example:

  • The following execution plan is for the SQL statement after AP acceleration is enabled.

    Enable AP acceleration

    tpch_10x=# SET rds_duckdb.execution = on;
    SET
    tpch_10x=# EXPLAIN SELECT
    tpch_10x-#     100.00 * sum(
    tpch_10x(#         CASE WHEN p_type LIKE 'PROMO%' THEN
    tpch_10x(#             l_extendedprice * (1 - l_discount)
    tpch_10x(#         ELSE
    tpch_10x(#             0
    tpch_10x(#         END) / sum(l_extendedprice * (1 - l_discount)) AS promo_revenue
    tpch_10x-# FROM
    tpch_10x-#     lineitem,
    tpch_10x-#     part
    tpch_10x-# WHERE
    tpch_10x-#     l_partkey = p_partkey
    tpch_10x-#     AND l_shipdate >= date '1995-09-01'
    tpch_10x-#     AND l_shipdate < CAST('1995-10-01' AS date);
                             QUERY PLAN
    ------------------------------------------------------------
     Custom Scan (DuckDBNode)  (cost=0.00..0.00 rows=0 width=0)
       DuckDB Plan:
    
     ┌───────────────────────────┐
     │         PROJECTION        │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │        Projections:       │
     │       promo_revenue       │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │   Estimated Cardinality:  │
     │             1             │
     └─────────────┬─────────────┘
     ┌─────────────┴─────────────┐
     │    UNGROUPED_AGGREGATE    │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │        Aggregates:        │
     │          sum(#0)          │
     │          sum(#1)          │
     └─────────────┬─────────────┘
     ┌─────────────┴─────────────┐
     │         PROJECTION        │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │        Projections:       │
     │ CASE  WHEN (prefix(p_type,│
     │    'PROMO')) THEN (CAST(  │
     │ (l_extendedprice * (1.000 │
     │    - CAST(l_discount AS   │
     │     DECIMAL(18,3)))) AS   │
     │   DECIMAL(20,5))) ELSE 0  │
     │         .00000 END        │
     │ (l_extendedprice * (1.000 │
     │    - CAST(l_discount AS   │
     │      DECIMAL(18,3))))     │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │   Estimated Cardinality:  │
     │          6600339          │
     └─────────────┬─────────────┘
     ┌─────────────┴─────────────┐
     │         HASH_JOIN         │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │         Join Type:        │
     │           INNER           │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │        Conditions:        ├──────────────┐
     │   l_partkey = p_partkey   │              │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │              │
     │   Estimated Cardinality:  │              │
     │          6600339          │              │
     └─────────────┬─────────────┘              │
     ┌─────────────┴─────────────┐┌─────────────┴─────────────┐
     │         SEQ_SCAN          ││         SEQ_SCAN          │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │        Stringified:       ││        Stringified:       │
     │          lineitem         ││            part           │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │        Projections:       ││        Projections:       │
     │         l_partkey         ││         p_partkey         │
     │      l_extendedprice      ││           p_type          │
     │         l_discount        ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   Estimated Cardinality:  │
     │          Filters:         ││          2000000          │
     │ l_shipdate>='1995-09-01': ││                           │
     │ :DATE AND l_shipdate<'1995││                           │
     │     -10-01'::DATE AND     ││                           │
     │   l_shipdate IS NOT NULL  ││                           │
     │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││                           │
     │   Estimated Cardinality:  ││                           │
     │          11997210         ││                           │
     └───────────────────────────┘└───────────────────────────┘
    
    
    (71 rows)
  • The following execution plan is for the SQL statement after AP acceleration is disabled.

    Disable AP acceleration

    tpch_10x=# SET rds_duckdb.execution = off;
    SET
    tpch_10x=# EXPLAIN SELECT
        100.00 * sum(
            CASE WHEN p_type LIKE 'PROMO%' THEN
                l_extendedprice * (1 - l_discount)
            ELSE
                0
            END) / sum(l_extendedprice * (1 - l_discount)) AS promo_revenue
    FROM
        lineitem,
        part
    WHERE
        l_partkey = p_partkey
        AND l_shipdate >= date '1995-09-01'
        AND l_shipdate < CAST('1995-10-01' AS date);
                                                         QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------
     Finalize Aggregate  (cost=1286740.42..1286740.43 rows=1 width=32)
       ->  Gather  (cost=1286739.96..1286740.37 rows=4 width=64)
             Workers Planned: 4
             ->  Partial Aggregate  (cost=1285739.96..1285739.97 rows=1 width=64)
                   ->  Parallel Hash Join  (cost=1235166.04..1282419.39 rows=189747 width=33)
                         Hash Cond: (part.p_partkey = lineitem.l_partkey)
                         ->  Parallel Seq Scan on part  (cost=0.00..43232.15 rows=500016 width=29)
                         ->  Parallel Hash  (cost=1233776.40..1233776.40 rows=111171 width=20)
                               ->  Parallel Seq Scan on lineitem  (cost=0.00..1233776.40 rows=111171 width=20)
                                     Filter: ((l_shipdate >= '1995-09-01'::date) AND (l_shipdate < '1995-10-01'::date))
     JIT:
       Functions: 17
       Options: Inlining true, Optimization true, Expressions true, Deforming true
    (13 rows)

Performance test

A standard TPC-H test in a Linux environment is used to evaluate the performance improvement that rds_duckdb provides for complex queries. For more information, see Performance test for the AP acceleration engine (rds_duckdb).