Object Storage Service: Data verification

Last Updated: Nov 01, 2024

Object Storage Service (OSS) SDK for Python uses MD5 verification and CRC-64 verification to ensure data integrity when you upload, download, and copy objects.

Usage notes

  • In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS from other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about OSS regions and endpoints, see Regions, endpoints and open ports.

  • In this topic, access credentials are obtained from environment variables. For more information about how to configure access credentials, see Configure access credentials.

  • In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or Security Token Service (STS), see Initialization.

MD5 verification

If you set the Content-MD5 header in an object upload request, OSS calculates the MD5 hash of the data that it receives and compares it with the value of the header. If the two values differ, OSS returns the InvalidDigest error. This allows OSS to ensure data integrity for object uploads. If InvalidDigest is returned, upload the object again.

The following sample code shows how to configure MD5 verification in a PutObject operation:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
# Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region in which the bucket is located. In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"

# Specify the name of your bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)

# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt. 
object_name = 'exampledir/exampleobject.txt'
# Specify the full path of the local file that you want to upload. The content of the file is read into memory and uploaded to OSS. The content can be in any format, such as text, image, video, or audio.
with open('/Users/test/Desktop/demo.txt', 'rb') as file:
    content = file.read()

# Calculate the MD5 hash of the content that you want to upload.
content_md5 = oss2.utils.content_md5(content)
print('content_md5', content_md5)

# Include the Content-MD5 header in the upload request. The server verifies the MD5 hash of the uploaded content to ensure the integrity and validity of the uploaded content. 
headers = {'Content-MD5': content_md5}
bucket.put_object(object_name, content, headers=headers)
Note

MD5 verification can be used for the put_object, append_object, post_object, and upload_part operations.
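
To make the header value concrete: a Content-MD5 value is conventionally the Base64 encoding of the binary MD5 digest of the request body (RFC 1864). The following minimal sketch uses only the Python standard library. The function names content_md5 and content_md5_stream are hypothetical helpers written for this illustration, and it is an assumption that their output matches what the SDK helper used above produces. The streaming variant may be useful when a file is too large to read into memory at once.

```python
import base64
import hashlib
import io


def content_md5(data: bytes) -> str:
    """Return the Base64-encoded MD5 digest of data (hypothetical helper)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def content_md5_stream(fileobj, chunk_size: int = 64 * 1024) -> str:
    """Compute the same value chunk by chunk from a binary file-like object."""
    md5 = hashlib.md5()
    # iter() with a sentinel keeps calling read() until it returns b"".
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        md5.update(chunk)
    return base64.b64encode(md5.digest()).decode("ascii")


# Both helpers yield the same header value for the same bytes.
payload = b"example payload " * 10000
whole = content_md5(payload)
streamed = content_md5_stream(io.BytesIO(payload))
print(whole == streamed)
```

In real code, prefer the helper that the SDK provides and treat this sketch only as an explanation of what the header contains.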

CRC-64

Take note of the following items when you use CRC-64 to verify data:

Note
  • CRC-64 verification can be used for the put_object, get_object, append_object, and upload_part operations. By default, CRC-64 is enabled when you upload an object. If the CRC-64 value calculated on the client is different from the CRC-64 value returned by the OSS server, the SDK raises InconsistentError.

  • CRC-64 verification is not supported for range downloads.

  • CRC-64 calculation consumes CPU resources and slows down uploads and downloads.

  • CRC-64 in object download

    The following code shows how to perform CRC-64 verification when you download an object:

    # -*- coding: utf-8 -*-
    import oss2
    from oss2.credentials import EnvironmentVariableCredentialsProvider
    # Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    
    # Specify the endpoint of the region in which the bucket is located. In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
    endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
    # Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
    region = "cn-hangzhou"
    
    # Specify the name of your bucket.
    bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)
    
    # Specify the full path of the object. Do not include the bucket name in the full path. 
    object_name = 'yourObjectName'
    
    # Check whether CRC-64 is enabled for the bucket. CRC-64 is enabled by default.
    print('bucket.enable_crc:', bucket.enable_crc)
    
    # The value returned by bucket.get_object is a file-like object that can be iterated. 
    object_stream = bucket.get_object(object_name)
    print(object_stream.read())
    
    # The client-side CRC-64 checksum is available only after the stream returned by get_object has been read, so call read() before you compare the checksums.
    if object_stream.client_crc != object_stream.server_crc:
        print("The CRC checksum between client and server is inconsistent!")
  • CRC-64 in append upload

    If you specify init_crc in an append upload, the SDK performs CRC-64 verification on the result of the append operation. The following code shows how to specify init_crc:

    # -*- coding: utf-8 -*-
    import oss2
    from oss2.credentials import EnvironmentVariableCredentialsProvider
    # Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured. 
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    
    # Specify the endpoint of the region in which the bucket is located. In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
    endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
    # Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
    region = "cn-hangzhou"
    
    # Specify the name of your bucket.
    bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)
    
    object_name = "yourAppendObjectName"
    first_content = "yourFirstContent"
    second_content = "yourSecondContent"
    
    # Specify init_crc in the first append upload. The object is empty at this point, so init_crc is 0.
    # If init_crc is specified, the SDK performs CRC-64 verification on the returned result.
    result = bucket.append_object(object_name, 0, first_content, init_crc=0)
    
    # Specify init_crc in the second append upload.
    # Set init_crc to the CRC-64 value of the content that is already uploaded, which is returned by the previous append upload.
    result = bucket.append_object(object_name, result.next_position, second_content, init_crc=result.crc)
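
The role of init_crc in the append example can be illustrated without calling OSS. The following sketch is a plain-Python, bit-by-bit CRC-64 that continues from a previous checksum, so the checksum after the second append equals the checksum of the whole object. It assumes the ECMA-182 polynomial in its reflected form with all-ones initial and final XOR values (the parameters commonly listed as CRC-64/XZ). Whether this matches the checksum that OSS reports bit for bit is an assumption of this sketch, and the crc64 function is a hypothetical helper, not part of the SDK.

```python
# Reflected form of the ECMA-182 polynomial (CRC-64/XZ parameters).
# Assumption: this is the CRC-64 variant relevant to OSS checksums.
_POLY = 0xC96C5795D7870F42
_MASK = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes, init_crc: int = 0) -> int:
    """Return the CRC-64 of data, continuing from init_crc (hypothetical helper)."""
    crc = init_crc ^ _MASK  # undo the final XOR of the previous run
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
    return crc ^ _MASK


first_content = b"yourFirstContent"
second_content = b"yourSecondContent"

# The first append starts from an empty object, so init_crc is 0.
crc_after_first = crc64(first_content, init_crc=0)
# The second append continues from the checksum of the existing content.
crc_after_second = crc64(second_content, init_crc=crc_after_first)

# Chaining gives the same value as checksumming the whole object at once.
print(crc_after_second == crc64(first_content + second_content))
```

This bitwise loop is written for readability, not speed. In practice, rely on the checksums that the SDK computes and use a sketch like this only to reason about how the values chain across appends.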

References

For the complete sample code that is used for MD5 verification and CRC-64, visit GitHub.