Tair (Redis® OSS-Compatible): Overview of Global Distributed Cache for Tair

Last Updated: Nov 27, 2024

Global Distributed Cache for Tair (Redis OSS-compatible) is an active geo-redundancy database system that is developed in-house by Alibaba Cloud to address high latency issues that may arise during cross-region or long-distance access. This system enables you to easily provide services from multiple sites in different regions at the same time. A distributed instance in the Global Distributed Cache architecture can consist of up to three child instances. Data is automatically synchronized among child instances in real time. Global Distributed Cache is designed to reduce the geographical distance between data and users, consequently lowering access latency and improving the response speed of applications. It also facilitates disaster recovery across different geographical locations.

Background information

If your business grows rapidly and expands into multiple regions, cross-region and long-distance access can cause high latency and degrade the user experience. Global Distributed Cache for Tair (Enterprise Edition) helps you reduce the latency caused by cross-region access. Global Distributed Cache has the following benefits:

  • Allows you to directly create child instances or specify the child instances that need to be synchronized without having to build redundancy into your application. This greatly reduces the complexity of application design and allows you to focus on application development.

  • Provides the geo-replication capability to implement geo-disaster recovery or active geo-redundancy.

This feature applies to cross-region data synchronization scenarios and global business deployment in industries such as multimedia, gaming, and e-commerce.

| Scenario | Description |
| --- | --- |
| Active geo-redundancy | In active geo-redundancy scenarios, multiple sites in different regions provide services at the same time. Active geo-redundancy is a type of high-availability architecture. Unlike a traditional disaster recovery design, all sites serve traffic at the same time, which allows applications to connect to nearby nodes. |
| Disaster recovery | Global Distributed Cache can synchronize data across child instances in both directions to support disaster recovery scenarios, such as zone-disaster recovery, disaster recovery based on three data centers across two regions, and three-region disaster recovery. |
| Load balancing | In specific scenarios such as large promotional events where you expect ultra-high queries per second (QPS) and a large amount of access traffic, you can balance loads across child instances to mitigate the risk of overloading a single instance. |
| Data synchronization | Global Distributed Cache can perform two-way data synchronization across child instances in a distributed instance. This feature can be used in scenarios such as data analysis and testing. |
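The active geo-redundancy and disaster recovery scenarios both come down to the application choosing which child instance to talk to. The following is a minimal client-side sketch of that choice, not part of the product: the function name `pick_child_instance`, the instance names, and the round-trip times are all hypothetical, and the sketch assumes the application measures latency and health on its own.

```python
from typing import Dict, Optional, Set


def pick_child_instance(rtt_ms: Dict[str, float],
                        unhealthy: Optional[Set[str]] = None) -> str:
    """Return the healthy child instance with the lowest measured RTT."""
    unhealthy = unhealthy or set()
    # Keep only the child instances that are not marked unhealthy.
    candidates = {name: rtt for name, rtt in rtt_ms.items()
                  if name not in unhealthy}
    if not candidates:
        raise RuntimeError("no healthy child instance available")
    return min(candidates, key=candidates.get)


# Hypothetical RTT measurements (in ms) taken by one application server.
rtt_ms = {"child-hangzhou": 3.0, "child-beijing": 28.0, "child-singapore": 75.0}

# Normal operation: connect to the nearest child instance.
nearest = pick_child_instance(rtt_ms)

# Disaster recovery: the nearest instance is down, so use the next closest.
fallback = pick_child_instance(rtt_ms, unhealthy={"child-hangzhou"})
```

Because every child instance is readable and writable, the fallback instance can serve the same traffic without a role change on the application side.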

Architecture of Global Distributed Cache

In the architecture of Global Distributed Cache for Tair (Enterprise Edition), a distributed instance is a logical collection of distributed child instances and synchronization channels. Data is synchronized in real time across child instances by using these synchronization channels. A distributed instance consists of the following components:

  • Child instances

    • A child instance is the basic service unit that constitutes a distributed instance. Each child instance is an independent Tair instance. All child instances are readable and writable. Data is synchronized in real time across child instances in both directions. A distributed instance supports geo-replication. You can create child instances in different regions to implement geo-disaster recovery or active geo-redundancy.

      Note

      A child instance must be a Tair (Enterprise Edition) DRAM-based instance.

  • Synchronization channels

    • A synchronization channel is a one-way link that is used to synchronize data in real time from one child instance to another. Two opposite synchronization channels are required to implement two-way replication between two child instances.

      Note

      Global Distributed Cache for Tair (Enterprise Edition) builds on the append-only files (AOFs) supported by open source Redis and adds information such as server-id and opid for synchronization. Global Distributed Cache transmits the resulting binlogs over synchronization channels to synchronize data.

  • Channel manager

    • The channel manager manages the lifecycle of synchronization channels and performs operations to handle exceptions that occur in child instances, such as a switchover between the primary and secondary databases and the rebuilding of secondary databases.
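Because each synchronization channel is one-way, the number of channels grows with the number of ordered pairs of child instances. The sketch below enumerates those channels. It assumes that every pair of child instances is connected directly in both directions (a full mesh), which this overview does not state explicitly; the function name `sync_channels` and the instance names are illustrative only.

```python
from itertools import permutations
from typing import List, Tuple

MAX_CHILD_INSTANCES = 3  # limit stated in this overview


def sync_channels(children: List[str]) -> List[Tuple[str, str]]:
    """List the one-way (source, target) channels for a full mesh."""
    if len(set(children)) != len(children):
        raise ValueError("child instance names must be unique")
    if len(children) > MAX_CHILD_INSTANCES:
        raise ValueError("a distributed instance has at most three child instances")
    # Every ordered pair is one channel; (A, B) and (B, A) together
    # provide two-way replication between A and B.
    return list(permutations(children, 2))


channels = sync_channels(["A", "B", "C"])
for source, target in channels:
    print(f"{source} -> {target}")
```

Under the full-mesh assumption, two child instances need two channels and three child instances need six.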

Benefits

Benefit

Description

High reliability

  • Global Distributed Cache can resume synchronization from the point of interruption and tolerates synchronization interruptions that last for days. It is not subject to the limits of the native Redis architecture for incremental synchronization across data centers or regions.

  • Exception handling operations, such as a switchover between the primary and secondary databases and the rebuilding of secondary databases, are performed automatically on child instances.

High performance

  • High throughput

    • For child instances in the standard architecture, a synchronization channel supports up to 50,000 transactions per second (TPS) in one direction.

    • For child instances in the cluster or read/write splitting architecture, the throughput linearly increases with the number of shards or nodes.

  • Low latency

    • For synchronization between regions in the same continent, the latency ranges from 100 milliseconds to several seconds, and the average latency is about 1.2 seconds.

    • For synchronization between regions in different continents, the latency ranges from 1 second to 5 seconds. The latency is determined by the throughput and round-trip time (RTT) of links.

High accuracy

  • Binlogs are synchronized to the peer instance in the order in which they are generated.

  • Backloop control is supported to prevent binlogs from being synchronized in a loop.

  • The exactly-once mechanism is supported to ensure that each synchronized binlog is applied only once.
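The three accuracy guarantees can be illustrated with a small model of how a child instance might decide whether to apply an incoming binlog entry. This is a sketch under stated assumptions, not the actual Tair implementation: it assumes that server-id identifies the child instance where a write originated, that opid increases monotonically per origin starting from 1, and that entries from one origin arrive in order. The `BinlogEntry` and `ChildInstance` classes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class BinlogEntry:
    server_id: str  # child instance where the write originated (assumed meaning)
    opid: int       # per-origin, monotonically increasing operation ID (assumed)
    key: str
    value: str


@dataclass
class ChildInstance:
    server_id: str
    data: Dict[str, str] = field(default_factory=dict)
    # Highest opid already applied, tracked separately for each origin.
    last_applied: Dict[str, int] = field(default_factory=dict)

    def apply(self, entry: BinlogEntry) -> bool:
        """Apply a replicated entry; return False if it is skipped."""
        # Backloop control: never re-apply a write that originated here.
        if entry.server_id == self.server_id:
            return False
        # Exactly once: skip entries that were already applied,
        # for example after a retransmission.
        if entry.opid <= self.last_applied.get(entry.server_id, 0):
            return False
        self.data[entry.key] = entry.value
        self.last_applied[entry.server_id] = entry.opid
        return True


b = ChildInstance("B")
results = [
    b.apply(BinlogEntry("A", 1, "k", "v1")),     # first delivery: applied
    b.apply(BinlogEntry("A", 1, "k", "v1")),     # retransmission: skipped
    b.apply(BinlogEntry("B", 7, "k", "stale")),  # looped back to origin: skipped
    b.apply(BinlogEntry("A", 2, "k", "v2")),     # next entry in order: applied
]
```

The duplicate check relies on in-order delivery: if entries from one origin could arrive out of order, comparing against the highest applied opid would drop valid writes.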

Billing

You are not charged for creating a distributed instance. Only child instances in the distributed instance are billed based on their specifications. Child instances are billed in the same manner as regular Tair instances. For more information, see Billable items.