Hologres:Deploy read/write splitting with primary and replica instances (shared storage)

Last Updated:Feb 04, 2026

Hologres V1.1 and later provides a shared storage deployment model that uses primary and replica instances to ensure high availability (HA) for online production environments. This model supports fault and load isolation to meet HA requirements. This topic describes the core principles of this HA solution and explains how to configure primary and replica instances that use shared storage.

Single-instance automatic recovery HA solution

All Hologres compute nodes are scheduled in containers, which are referred to as Worker Nodes in the following figure. The Resource Manager performs periodic health checks. If a container fails to respond within one minute for reasons such as memory overflow, hardware failure, or software bugs, the Resource Manager automatically launches a new compute node and migrates the shard responsibilities to it. For example, if Worker Node3 times out, the Resource Manager replaces it with Worker Node4. The data state is stored in the Pangu distributed storage system and does not need to be migrated from the compute nodes. Because compute nodes are lightweight and stateless, the system can recover quickly from failures. This solution is enabled by default for every instance and requires no manual O&M intervention. During the recovery process, any query operator that attempts to access the recovering node fails immediately. Hologres V1.1 and later uses a new recovery mechanism that restores nodes in approximately one minute, which is 5 to 10 times faster than earlier versions.

[Figure: Single-instance HA solution]
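Because queries that touch a recovering node fail immediately and recovery takes about one minute, a client may want to retry with backoff instead of surfacing the first error. The following is a minimal sketch, not a Hologres API: `run_with_retry`, its parameters, and the choice of `ConnectionError` as the failure signal are all assumptions made for illustration. Replace the exception type with whatever your database driver actually raises.

```python
import time


def run_with_retry(query_fn, max_wait_s=120.0, initial_delay_s=1.0,
                   max_delay_s=15.0, sleep=time.sleep):
    """Call query_fn until it succeeds or about max_wait_s of delay is spent.

    Hypothetical helper: ConnectionError stands in for the error your
    driver raises while a compute node is being replaced.
    """
    delay = initial_delay_s
    waited = 0.0
    while True:
        try:
            return query_fn()
        except ConnectionError:
            if waited >= max_wait_s:
                # Recovery took longer than expected; give up and re-raise.
                raise
            sleep(delay)
            waited += delay
            # Exponential backoff, capped so retries stay reasonably frequent.
            delay = min(delay * 2, max_delay_s)
```

The default `max_wait_s` of 120 seconds is an arbitrary margin above the one-minute recovery time described above. Only retry queries that are safe to run more than once.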

Multi-instance HA solution with shared storage

Technical principles

The single-instance solution relies on real-time fault detection and node replacement. However, this approach still leaves a brief window of service unavailability during node recovery. Mission-critical scenarios require an HA solution that supports both fault and load isolation. Hologres V1.1 and later supports a multi-instance deployment model that uses shared storage. In this model, the primary instance has full capabilities: it supports read and write operations and lets you configure permissions and system parameters. Replica instances are read-only. All changes must be made on the primary instance, as shown in the following figure.

[Figure: Multiple instances with shared storage]

Primary and replica instances do not share compute resources, which ensures load and fault isolation. All instances share the same data and access control policies, and you are charged a single storage fee.

Memory states are automatically synchronized in real time between instances. Within the same region, this synchronization occurs at the instance level and takes only milliseconds. When data is written to the primary instance, the system automatically syncs it to the replica instances. As a result, replica instances consume some CPU and memory resources even when they are idle. This consumption is approximately 1/8 of the primary instance's usage. We recommend that you do not configure the primary and replica instances with significantly different specifications.
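As a rough worked example of the 1/8 figure, the sketch below estimates how much of a replica's capacity goes to synchronization and how much remains for queries. The function names and the use of CPU cores as the unit are assumptions for illustration. The 1/8 ratio is the approximate value stated above, not a guarantee.

```python
SYNC_OVERHEAD_FRACTION = 1 / 8  # approximate ratio stated in this topic


def replica_sync_overhead(primary_usage):
    """Estimate the resources a replica spends keeping up with the primary."""
    return primary_usage * SYNC_OVERHEAD_FRACTION


def replica_headroom(replica_capacity, primary_usage):
    """Estimate the capacity left on a replica for serving queries.

    Returns 0.0 if the estimated overhead alone exceeds the replica's capacity.
    """
    return max(replica_capacity - replica_sync_overhead(primary_usage), 0.0)
```

For example, if the primary instance uses 32 cores, a 32-core replica spends roughly 4 cores on synchronization and keeps about 28 for queries. A much smaller replica would lose a far larger share of its capacity, which is one way to read the recommendation to keep specifications similar.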

Usage notes

  • You can configure up to ten read-only replica instances. The resource configurations can differ between instances, but they should not vary significantly. All instances must have the same number of shards.

  • Each read-only replica instance has a unique access Endpoint. You can use different Endpoints to isolate business scenarios.

  • In Hologres V1.3.27 and later, the latency threshold for synchronization between the primary and replica instances is increased from 20 minutes to 60 minutes. If the resource utilization of a replica instance stays at 100% for more than 60 minutes, the replica instance automatically restarts to reduce synchronization latency. If utilization regularly stays at 100% for extended periods, tune query performance or scale out the instance.

  • The primary instance remains fully operational while it is being bound to a read-only replica instance.

  • It takes approximately 3 to 5 minutes to bind a read-only replica instance to a primary instance. You can use the replica instance only after the binding process is complete.

  • You cannot connect to a read-only replica instance that is not bound to a primary instance.

  • If MaxCompute directly reads data from the Hologres storage layer and Hologres uses a primary-replica architecture, you must configure the connection URL to point to the primary instance, not a replica instance. For more information, see Enable direct read from Hologres foreign table storage.
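The 60-minute restart behavior described in the notes above suggests alerting on sustained saturation well before a replica restarts. The following is a minimal sketch under stated assumptions: utilization is sampled once per minute, the samples come from your own monitoring system, and the function names and the 30-minute warning window are hypothetical choices, not Hologres settings.

```python
RESTART_AFTER_MIN = 60  # threshold for V1.3.27 and later, per the note above


def trailing_saturated_minutes(samples_pct, saturated_at=100.0):
    """Count how many of the most recent samples are at or above saturated_at.

    Assumes one utilization sample (in percent) per minute, oldest first.
    """
    count = 0
    for value in reversed(samples_pct):
        if value < saturated_at:
            break
        count += 1
    return count


def should_alert(samples_pct, warn_after_min=30):
    """Return True once saturation has lasted warn_after_min minutes."""
    return trailing_saturated_minutes(samples_pct) >= warn_after_min
```

Alerting at half the restart threshold leaves time to tune queries or scale out the replica before it restarts.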

Recommended scenarios

  • General-purpose scenarios:

    You can use the primary instance for data ingestion and data transformation, and use read-only replica instances for data analytics. This configuration implements read/write splitting.

  • Scenarios:

    • For online service queries that require high P99 stability, you can dedicate a read-only replica instance to isolate and protect the availability of the online service.

    • For online analytical processing (OLAP) queries, you can use a separate, analytics-focused replica instance that is distinct from the online service replica instance. This separation prevents large queries from affecting the performance of the online service.
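The scenarios above amount to a mapping from workload type to instance Endpoint. The sketch below shows one way an application could encode that mapping. The Endpoint strings, workload names, and the `endpoint_for` function are placeholders invented for this example. Use the Endpoints of your own instances.

```python
# Placeholder Endpoints; substitute the real Endpoint of each instance.
ENDPOINTS = {
    "primary": "primary.example.internal:80",
    "serving": "replica-serving.example.internal:80",
    "olap": "replica-olap.example.internal:80",
}

# Hypothetical workload names mapped to the instance that should handle them.
ROUTES = {
    "ingest": "primary",        # data ingestion writes go to the primary
    "transform": "primary",     # data transformation also writes data
    "online_query": "serving",  # latency-sensitive online service queries
    "olap_query": "olap",       # large analytical queries, isolated from serving
}


def endpoint_for(workload):
    """Return the Endpoint that a given workload type should connect to."""
    try:
        role = ROUTES[workload]
    except KeyError:
        raise ValueError(f"unknown workload: {workload!r}") from None
    return ENDPOINTS[role]
```

Keeping the online service and OLAP replicas as separate entries means a large analytical query never shares compute resources with the online service.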

Configure primary and replica instances with shared storage

The following limits apply to multi-instance deployments that use shared storage.

  • Only instances that run Hologres V1.1 or later can serve as a primary instance. If your instance runs an earlier version, refer to Common upgrade preparation errors or join the Hologres user group to request an upgrade. For more information, see How to get more online support?

  • You cannot connect to a read-only replica instance before it is bound to a primary instance.

  • The primary and replica instances must run the same Hologres version.

  • The primary and replica instances must reside in the same region.
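These limits, together with the replica-count and shard-count notes earlier in this topic, can be checked before you purchase or attach replicas. The following is a minimal pre-flight sketch. The `Instance` class and `check_deployment` function are hypothetical and do not call any Hologres or Alibaba Cloud API. You supply the instance attributes yourself.

```python
from dataclasses import dataclass

MAX_REPLICAS = 10             # "up to ten read-only replica instances"
MIN_PRIMARY_VERSION = (1, 1)  # primary must run V1.1 or later


@dataclass(frozen=True)
class Instance:
    name: str
    version: tuple  # for example (1, 3, 27) for V1.3.27
    region: str
    shard_count: int


def check_deployment(primary, replicas):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if primary.version < MIN_PRIMARY_VERSION:
        problems.append(f"{primary.name}: primary must run V1.1 or later")
    if len(replicas) > MAX_REPLICAS:
        problems.append(f"{len(replicas)} replicas exceeds the limit of {MAX_REPLICAS}")
    for replica in replicas:
        if replica.version != primary.version:
            problems.append(f"{replica.name}: version differs from primary")
        if replica.region != primary.region:
            problems.append(f"{replica.name}: region differs from primary")
        if replica.shard_count != primary.shard_count:
            problems.append(f"{replica.name}: shard count differs from primary")
    return problems
```

For example, `check_deployment(primary, [replica])` returns an empty list when the replica matches the primary's version, region, and shard count.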

Permission requirements for attach and detach operations

To attach or detach read-only replica instances, the RAM user must have the AliyunHologresFullAccess policy attached. For more information, see Grant permissions to RAM users.

To configure a highly available multi-instance deployment that uses shared storage, perform the following steps.

  1. Purchase a new Hologres instance

    Important

    The read-only replica instance must be in the same region as the primary instance.

    When you purchase a new instance, set Instance Type to Read-only Replica Instance. For Primary Instance ID for Read-only Replica Instance, select the ID of the primary instance, which must be in the same region. For more information about the other parameters, see Purchase a Hologres instance.

  2. Attach the read-only replica instance

    After you purchase the read-only replica instance, it is automatically attached to the primary instance that you selected on the purchase page. You can start to use the replica instance after its state changes to Running as Expected.

  3. Usage example

    When you use multi-instance deployments that use shared storage, take note of the following items.

    • After the configuration is complete, use the Endpoint of the replica instance to serve online traffic.

    • You must perform all operations, such as table creation and user authorization, on the primary instance. Replica instances support only data read operations.

    • Replica instances automatically inherit all objects, such as users and tables, from the primary instance. You cannot create users specifically for a replica instance at the access control layer.
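Because replica instances accept only reads, an application that holds connections to both instances needs a rule for where each statement goes. The sketch below is a deliberately conservative heuristic, not a SQL parser: `target_instance` is a hypothetical helper that sends a statement to a replica only when its first keyword is `SELECT` or `SHOW`, and sends everything else, including anything it does not recognize, to the primary instance.

```python
# First keywords treated as read-only in this sketch. Assumption: plain
# SELECT and SHOW statements do not modify data or metadata.
READ_ONLY_KEYWORDS = {"SELECT", "SHOW"}


def target_instance(sql):
    """Return "replica" for simple read statements and "primary" otherwise."""
    stripped = sql.lstrip()
    first_keyword = stripped.split(None, 1)[0].upper() if stripped else ""
    return "replica" if first_keyword in READ_ONLY_KEYWORDS else "primary"
```

Statements such as `CREATE TABLE`, `INSERT`, and `GRANT` route to the primary instance. Queries that begin with `WITH` also route to the primary, which is safe but may be more conservative than necessary. Statements that start with `SELECT` but write data, such as `SELECT ... INTO`, are not detected by this heuristic and must be handled separately.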