Downloading a large object can fail if the network is unstable or another exception occurs, and in some cases the download still fails after multiple attempts. To resolve this issue, Object Storage Service (OSS) provides the resumable download feature. In resumable download, the object is split into multiple parts and each part is downloaded separately. After all parts are downloaded, the parts are combined into a complete object.
Usage notes
In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS from other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about OSS regions and endpoints, see Regions, endpoints and open ports.
In this topic, access credentials are obtained from environment variables. For more information about how to configure access credentials, see Configure access credentials.
In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or Security Token Service (STS), see Initialization.
To use resumable download, you must have the oss:GetObject permission. For more information, see Attach a custom policy to a RAM user.
Procedure
To use resumable download, perform the following steps:
Create a temporary local file with a name that consists of the original object name and a random suffix.
Specify the Range header in the HTTP request so that the object is read based on the range. Then, the read content is written to the corresponding position of the temporary local file.
After the download is complete, rename the temporary file as the destination file. If the destination file already exists, the downloaded data overwrites the data in the existing file. Otherwise, a new file is created.
Do not use multiple programs or threads to call the oss2.resumable_download method simultaneously to download the same object to the same destination file. Concurrent calls can overwrite each other's checkpoint information on the local disk or cause temporary file name conflicts.
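The three steps of the procedure can be illustrated with a minimal, network-free sketch. It is not the SDK's implementation: split_ranges, download_in_parts, and the read_range callable are hypothetical names introduced here, and read_range stands in for an HTTP request that carries a Range header. The temporary file is named after the destination path, which is also an assumption made for illustration.

```python
import os
import uuid


def split_ranges(total_size, part_size):
    """Return inclusive (start, end) byte ranges that cover total_size bytes."""
    return [(start, min(start + part_size, total_size) - 1)
            for start in range(0, total_size, part_size)]


def download_in_parts(read_range, total_size, dest_path, part_size):
    """Download by ranged reads into a temporary file, then rename it.

    read_range(start, end) is a hypothetical callable that returns the bytes
    of the inclusive range, as a 'Range: bytes=start-end' request would.
    """
    # Step 1: create a temporary file whose name ends with a random suffix.
    tmp_path = '{0}.tmp-{1}'.format(dest_path, uuid.uuid4().hex)
    with open(tmp_path, 'wb') as f:
        # Step 2: read each range and write it at the matching offset.
        for start, end in split_ranges(total_size, part_size):
            f.seek(start)
            f.write(read_range(start, end))
    # Step 3: rename the temporary file, replacing any existing destination file.
    os.replace(tmp_path, dest_path)
```

With OSS SDK for Python, a ranged read of this kind can be issued by passing byte_range=(start, end) to bucket.get_object. The sketch downloads the parts sequentially and keeps no checkpoint, so it cannot resume after an interruption.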
Examples
The following sample code shows how to perform resumable download:
# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
# Obtain access credentials from the environment variables. Before you run the sample code, make sure that you have configured environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the ID of the region that maps to the endpoint. Example: cn-hangzhou. This parameter is required if you use the signature algorithm V4.
region = "cn-hangzhou"
# Specify the name of your bucket.
bucket = oss2.Bucket(auth, endpoint, "yourBucketName", region=region)
# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt.
# Specify the full path of the local file. Example: D:\\localpath\\examplefile.txt.
oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt')
# If you do not specify a directory by using the store parameter, the .py-oss-download directory is created in the HOME directory to store the checkpoint information.
# You can configure the following optional parameters if you use OSS SDK for Python version 2.1.0 or later.
# import sys
# # If the length of the data to download cannot be determined, the value of total_bytes is None.
# def percentage(consumed_bytes, total_bytes):
#     if total_bytes:
#         rate = int(100 * (float(consumed_bytes) / float(total_bytes)))
#         print('\r{0}% '.format(rate), end='')
#         sys.stdout.flush()
# # If you use the store parameter to specify a directory, the checkpoint information is stored in the specified directory.
# oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt',
#                         store=oss2.ResumableDownloadStore(root='/tmp'),
#                         # Specify that resumable download is used when the length of the object is greater than or equal to the value of the multiget_threshold parameter. This parameter is optional. If you do not specify it, the value of oss2.defaults.multiget_threshold is used.
#                         multiget_threshold=100*1024,
#                         # Specify the size of each part. Unit: bytes. The valid part size ranges from 100 KB to 5 GB. If you do not specify this parameter, the value of oss2.defaults.multiget_part_size is used.
#                         part_size=100*1024,
#                         # Configure the callback function that you want to use to indicate the progress of the resumable download task.
#                         progress_callback=percentage,
#                         # If you use num_threads to set the number of concurrent download threads, set oss2.defaults.connection_pool_size to a value that is greater than or equal to the number of concurrent download threads. If you do not specify num_threads, the value of oss2.defaults.multiget_num_threads is used.
#                         num_threads=4)
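The checkpoint information is what allows a rerun to skip parts that were already downloaded. The following sketch illustrates that idea only. The JSON record layout and the names load_done_parts, save_done_parts, resume_download, and read_range are assumptions made for illustration and do not reflect the format that the SDK actually stores.

```python
import json
import os


def load_done_parts(checkpoint_path):
    """Return the set of part start offsets recorded as finished, or an empty set."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return set(json.load(f)['done'])


def save_done_parts(checkpoint_path, done):
    """Persist the finished part offsets so that a later run can skip them."""
    with open(checkpoint_path, 'w') as f:
        json.dump({'done': sorted(done)}, f)


def resume_download(read_range, total_size, tmp_path, checkpoint_path, part_size):
    """Fetch only the parts that the checkpoint does not list as finished.

    read_range(start, end) is a hypothetical callable that returns the bytes
    of the inclusive range.
    """
    done = load_done_parts(checkpoint_path)
    # Open for update without truncating data written by an earlier run.
    mode = 'r+b' if os.path.exists(tmp_path) else 'w+b'
    with open(tmp_path, mode) as f:
        for start in range(0, total_size, part_size):
            if start in done:
                continue
            end = min(start + part_size, total_size) - 1
            f.seek(start)
            f.write(read_range(start, end))
            f.flush()
            # Record the part only after its data has been written.
            done.add(start)
            save_done_parts(checkpoint_path, done)
    return done
```

If read_range raises an exception partway through, the checkpoint file still lists the parts that finished, and calling resume_download again with the same paths requests only the remaining ranges. Because both the checkpoint file and the temporary file are shared state, this sketch is subject to the same restriction as oss2.resumable_download: concurrent calls for the same destination would overwrite each other's records.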