Object Storage Service: Data replication

Last Updated: Nov 18, 2024

Data replication automatically replicates objects and object operations, such as creation, overwriting, and deletion, from a source bucket to a destination bucket. Object Storage Service (OSS) supports cross-region replication (CRR) and same-region replication (SRR).

Usage notes

  • In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS from other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about OSS regions and endpoints, see Regions, endpoints and open ports.

  • In this topic, access credentials are obtained from environment variables. For more information about how to configure access credentials, see Configure access credentials.

  • In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or Security Token Service (STS), see Initialization.

  • To enable data replication, you must have the oss:PutBucketReplication permission. To query data replication rules, you must have the oss:GetBucketReplication permission. To query the regions to which data can be replicated, you must have the oss:GetBucketReplicationLocation permission. To query the progress of a data replication task, you must have the oss:GetBucketReplicationProgress permission. To disable data replication, you must have the oss:DeleteBucketReplication permission. For more information, see Attach a custom policy to a RAM user.
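As a rough illustration of the permissions listed above, the following sketch builds a custom RAM policy document that grants the five replication actions on one bucket. The helper name build_replication_policy, the bucket name examplebucket, and the resource ARN format are assumptions for illustration only. Compare the result with Attach a custom policy to a RAM user before you use it.

```python
import json

# The five actions named in the usage notes above.
REPLICATION_ACTIONS = [
    "oss:PutBucketReplication",
    "oss:GetBucketReplication",
    "oss:GetBucketReplicationLocation",
    "oss:GetBucketReplicationProgress",
    "oss:DeleteBucketReplication",
]


def build_replication_policy(bucket_name):
    """Return a policy document (as a dict) that allows the replication actions on one bucket."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(REPLICATION_ACTIONS),
                # Assumed ARN format for a bucket-level resource.
                "Resource": ["acs:oss:*:*:" + bucket_name],
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so that it can be pasted into a policy editor.
    print(json.dumps(build_replication_policy("examplebucket"), indent=2))
```

Granting only the actions that a given RAM user needs, for example only the Get actions for a monitoring account, keeps the policy closer to least privilege.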

Enable data replication

Important

Before you enable data replication, make sure that the source bucket and the destination bucket have the same versioning state: both unversioned or both versioning-enabled.

The following sample code shows how to enable data replication from the srcexamplebucket bucket in the China (Hangzhou) region to the destexamplebucket bucket in the China (Beijing) region:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
from oss2.models import ReplicationRule
# Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"

# Specify the name of the source bucket. Example: srcexamplebucket.
bucket = oss2.Bucket(auth, endpoint, "srcexamplebucket", region=region)

replica_config = ReplicationRule(
    # Specify the destination bucket to which you want to replicate data. 
    target_bucket_name='destexamplebucket',
    # Specify the region in which the destination bucket is located. In this example, the destination bucket is in the China (Beijing) region. 
    # If you want to enable CRR, the source and destination buckets must be located in different regions. If you want to enable SRR, the source and destination buckets must be located in the same region. 
    target_bucket_location='oss-cn-beijing'
)

# (Optional) Specify the prefixes of the names of the objects that you want to replicate. After you specify a prefix, only objects whose names start with the prefix are replicated to the destination bucket. 
# prefix_list = ['prefix1', 'prefix2']
# Configure the data replication rule. 
# replica_config = ReplicationRule(
#     prefix_list=prefix_list,
#     # Specify that OSS replicates object creation and update operations from the source bucket to the destination bucket.
#     action_list=[ReplicationRule.PUT],
#     # Specify the destination bucket to which you want to replicate data. 
#     target_bucket_name='destexamplebucket1',
#     # Specify the region in which the destination bucket is located. 
#     target_bucket_location='yourTargetBucketLocation',
#     # Specify whether to replicate historical data. By default, historical data is replicated. In this example, historical data is not replicated. 
#     is_enable_historical_object_replication=False,
#     # Specify the link that is used to transfer data during data replication. 
#     target_transfer_type='oss_acc',
#     # Specify the role that you want to authorize OSS to use to replicate data. If you want to use SSE-KMS to encrypt the objects that are replicated to the destination bucket, you must specify this parameter. 
#     sync_role_name='roleNameTest',
#     # Replicate the objects that are encrypted by using SSE-KMS. 
#     sse_kms_encrypted_objects_status=ReplicationRule.ENABLED,
#     # Specify the CMK ID used in SSE-KMS encryption. If you want to use SSE-KMS to encrypt the objects that are replicated to the destination bucket, you must configure this parameter. 
#     replica_kms_keyid='9468da86-3509-4f8d-a61e-6eab1eac****',
# )

# Enable data replication. 
bucket.put_bucket_replication(replica_config)

Query data replication rules

The following sample code shows how to query the data replication rules of the bucket named examplebucket:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"

# Specify the name of the bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)

# Query the data replication rules. 
result = bucket.get_bucket_replication()
# Display the returned information. 
for rule in result.rule_list:
    print(rule.rule_id)
    print(rule.target_bucket_name)
    print(rule.target_bucket_location)
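
The progress and delete operations later in this topic take a rule ID. The following minimal sketch looks up rule IDs by destination bucket, using only the rule_list, rule_id, and target_bucket_name fields shown above. The helper name find_rule_ids is hypothetical, and bucket is assumed to be an oss2.Bucket instance created as in the preceding sample.

```python
def find_rule_ids(bucket, target_bucket_name):
    """Return the IDs of the replication rules whose destination is target_bucket_name.

    `bucket` is assumed to be an oss2.Bucket, or any object whose
    get_bucket_replication() result exposes a rule_list attribute.
    """
    result = bucket.get_bucket_replication()
    return [
        rule.rule_id
        for rule in result.rule_list
        if rule.target_bucket_name == target_bucket_name
    ]
```

For example, find_rule_ids(bucket, 'destexamplebucket') returns a list that is empty if no rule replicates to that bucket.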

Query the regions to which data can be replicated

The following sample code shows how to query the regions to which data can be replicated from the examplebucket bucket:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"

# Specify the name of the bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)

# Query the regions to which data can be replicated. 
result = bucket.get_bucket_replication_location()
for location in result.location_list:
    print(location)
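
One possible use of this list is to check a destination region before you create a replication rule. The sketch below assumes that the entries in location_list are location strings in the same form that you pass as target_bucket_location (for example, oss-cn-beijing). The helper name is_supported_target is hypothetical.

```python
def is_supported_target(bucket, target_location):
    """Return True if target_location appears in the bucket's replication location list.

    `bucket` is assumed to be an oss2.Bucket, or any object whose
    get_bucket_replication_location() result exposes a location_list attribute.
    """
    result = bucket.get_bucket_replication_location()
    return target_location in result.location_list
```

Calling is_supported_target(bucket, 'oss-cn-beijing') before put_bucket_replication lets a script stop early with a clear message instead of relying on the error returned by the service.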

Query the progress of a data replication task

You can query the progress of historical data replication tasks and incremental data replication tasks.

  • The progress of historical data replication tasks is expressed as a percentage. You can query the progress of historical data replication tasks only for buckets for which historical data replication is enabled.

  • The progress of incremental data replication tasks is expressed as a point in time. All data that was written to the source bucket before that point in time has been replicated.

The following sample code shows how to query the progress of the data replication task whose replication rule ID is test_replication_1:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"

# Specify the name of the bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)

# Query the progress of the data replication task. 
# Specify the ID of the data replication rule. Example: test_replication_1. 
result = bucket.get_bucket_replication_progress('test_replication_1')
print(result.progress.rule_id)
# Check whether historical data replication is enabled. 
print(result.progress.is_enable_historical_object_replication)
# Display the progress of historical data replication. 
print(result.progress.historical_object_progress)
# Display the progress of incremental data replication. 
print(result.progress.new_object_progress)
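
If a script needs to wait until historical data has been copied, the progress query can be polled. The following sketch makes several assumptions: historical_object_progress is reported as a fraction such as 0.85 (1.0 meaning complete), is_enable_historical_object_replication is truthy when historical replication is enabled, and the helper name, polling interval, and poll limit are hypothetical choices.

```python
import time


def wait_for_historical_replication(bucket, rule_id, poll_seconds=30,
                                    max_polls=20, sleep=time.sleep):
    """Poll the replication progress until historical replication completes.

    Returns True if the reported progress reaches 1.0 within max_polls polls,
    and False otherwise. Raises ValueError if historical data replication is
    not enabled for the rule, because no percentage is available in that case.
    `bucket` is assumed to be an oss2.Bucket or an object with the same method.
    """
    for _ in range(max_polls):
        progress = bucket.get_bucket_replication_progress(rule_id).progress
        if not progress.is_enable_historical_object_replication:
            raise ValueError("Historical data replication is not enabled for rule " + rule_id)
        # Assumed to be a fraction between 0 and 1.
        if float(progress.historical_object_progress) >= 1.0:
            return True
        sleep(poll_seconds)
    return False
```

The sleep parameter is injectable so that the loop can be exercised in tests without waiting. In production code, the default time.sleep is used.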

Disable data replication

You can delete the replication rule that is configured for the specified source bucket to disable data replication for the bucket.

The following sample code shows how to delete the test_replication_1 rule of the examplebucket bucket:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

# Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"

# Specify the name of the bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)

# Disable data replication. After data replication is disabled, the objects that are replicated to the destination bucket still exist. However, all changes to the objects in the source bucket are no longer replicated to the destination bucket. 
# Specify the ID of the data replication rule. Example: test_replication_1. 
result = bucket.delete_bucket_replication('test_replication_1')

References

  • For the complete sample code for data replication, visit GitHub.

  • For more information about the API operation that you can call to enable data replication, see PutBucketReplication.

  • For more information about the API operation that you can call to query data replication rules, see GetBucketReplication.

  • For more information about the API operation that you can call to query regions to which data can be replicated, see GetBucketReplicationLocation.

  • For more information about the API operation that you can call to query the progress of a data replication task, see GetBucketReplicationProgress.

  • For more information about the API operation that you can call to disable data replication, see DeleteBucketReplication.