Alibaba Cloud CDN provides scripts that can be used to automatically purge and prefetch content such as files and directories from origin servers in batches. Compared with manual operations, scripts greatly simplify the process. This topic describes how to use a Python script. A Windows operating system is used in the example.
Overview
After you specify a file that contains URLs to be purged or prefetched, the script splits the URL list into batches based on the configured batch size, and then purges or prefetches the URLs batch by batch. The script automatically detects whether a purge or prefetch task is completed. The next task does not start until the current one ends. The following items show how this feature works:
Process URLs in batches: If your URL list contains 100 URLs and you set a maximum of 10 URLs per batch, the script divides the URL list into 10 batches, each of which contains 10 URLs. If you set a larger or smaller value, the batch size changes accordingly. For example, if you allow 20 URLs per batch, the script divides the 100 URLs into 5 batches, each of which contains 20 URLs.
Run tasks by batch: When you run a script, the script submits purge or prefetch requests in sequence by batch. Tasks in each batch are executed concurrently.
Proceed to the next batch of tasks only after the current batch is completed: After purge or prefetch tasks in a batch are completed, the script continues to execute the tasks in the next batch. This process is automatically performed without manual intervention.
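The batch arithmetic described above can be illustrated with a minimal sketch. The function name split_into_batches is hypothetical and is not taken from the script itself. It only shows how a list of 100 URLs yields 10 batches of 10 or 5 batches of 20.

```python
def split_into_batches(urls, batch_size):
    """Split a list of URLs into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


# 100 illustrative URLs, matching the example in the text.
urls = [f"http://example.com/file{n}.jpg" for n in range(1, 101)]

batches_of_10 = split_into_batches(urls, 10)
batches_of_20 = split_into_batches(urls, 20)
print(len(batches_of_10), len(batches_of_20))
```

If the number of URLs is not a multiple of the batch size, the last batch in this sketch simply contains the remaining URLs.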
Scenarios
We recommend that you use scripts in the following scenarios:
Purge and prefetch operations are performed manually because no developer is available. The cost of operations and maintenance (O&M) is high.
The number of URLs to be purged or prefetched is large, and submitting them manually in batches is inefficient.
Whether the purge and prefetch tasks run as expected must be manually checked, which consumes a large amount of resources and time.
Limits
The Python version in the operating system must be 3.x. You can run the python --version or python3 --version command to check whether the Python version meets the requirements.
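The same requirement can also be checked from inside Python. The following is a minimal sketch. The helper name is_supported_python is hypothetical and is not part of the script in this topic.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True if the major version of the interpreter is 3."""
    return version_info[0] == 3


if not is_supported_python():
    # Stop early with a readable message instead of failing later on syntax errors.
    raise SystemExit("Python 3.x is required, found " + sys.version.split()[0])
```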
Before you begin
Create an AccessKey pair for a Resource Access Management (RAM) user. An Alibaba Cloud account has all permissions on resources. If the AccessKey pair of your Alibaba Cloud account is leaked, your resources are exposed to great risks. We recommend that you use the AccessKey pair of a RAM user. For information about how to obtain the AccessKey pair, see Create an AccessKey pair.
Grant the RAM user the permissions required to manage Alibaba Cloud CDN resources. In this example, the AliyunCDNFullAccess system policy is attached to the RAM user.
Use a system policy.
AliyunCDNFullAccess: grants full access to Alibaba Cloud CDN resources.
Use a custom policy.
For more information about how to create custom policies, see Create custom policies.
Configure the AccessKey pair in environment variables. For more information, see Configure environment variables in Linux, macOS, and Windows.
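After the environment variables are configured, a script can read the AccessKey pair from them instead of hard-coding it. The following sketch assumes the variable names ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET. This topic does not state the names, so adjust them to match your configuration. The function load_access_key is hypothetical.

```python
import os


def load_access_key(env=None):
    """Read the AccessKey pair from environment variables.

    The variable names below are an assumption; change them if your
    environment uses different names.
    """
    env = os.environ if env is None else env
    key_id = env.get("ALIBABA_CLOUD_ACCESS_KEY_ID")
    key_secret = env.get("ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    if not key_id or not key_secret:
        raise RuntimeError("AccessKey environment variables are not set")
    return key_id, key_secret


# Demonstration with an in-memory mapping instead of the real environment.
demo_env = {
    "ALIBABA_CLOUD_ACCESS_KEY_ID": "yourAccessKey",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET": "yourAccessKeySecret",
}
print(load_access_key(demo_env)[0])
```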
Step 1: Install dependencies
Run the following command to install Alibaba Cloud CDN SDK for Python. The current version is v20180510.
pip install aliyun-python-sdk-cdn
Run the following command to install the core library of Alibaba Cloud SDK for Python. The current version is 2.6.0.
pip install aliyun-python-sdk-core
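To confirm that both packages were installed into the interpreter you plan to use, you can check whether their modules are importable. This is a minimal sketch that assumes the packages expose the import names aliyunsdkcore and aliyunsdkcdn. The helper missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names=("aliyunsdkcore", "aliyunsdkcdn")):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required SDK modules are importable.")
```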
Step 2: Prepare a URL list file
Create a file that contains a list of URLs to be purged or prefetched, such as urllist.txt. Enter one URL per line. Make sure that each URL starts with http:// or https:// and is in a valid format. Sample content:
http://example.com/file1.jpg
http://example.com/file2.jpg
http://example.com/file3.jpg
...
http://example.com/fileN.jpg
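Before you submit a URL list, it can help to check that every line meets the requirements above. The following sketch is not part of the script in this topic. The function load_urls is a hypothetical helper that skips blank lines and separates URLs that start with http:// or https:// from all other lines.

```python
from urllib.parse import urlparse


def load_urls(lines):
    """Return (valid, invalid) URL lists from an iterable of text lines."""
    valid, invalid = [], []
    for raw in lines:
        url = raw.strip()
        if not url:
            continue  # ignore blank lines
        parsed = urlparse(url)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            valid.append(url)
        else:
            invalid.append(url)
    return valid, invalid


sample_lines = [
    "http://example.com/file1.jpg\n",
    "\n",
    "ftp://example.com/file2.jpg\n",
    "example.com/file3.jpg\n",
]
valid, invalid = load_urls(sample_lines)
print(valid)
print(invalid)
```

To check a real file, pass an open file object, for example load_urls(open("urllist.txt", encoding="utf-8")).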
Step 3: Create a script
Save the following code as a script and name it Refresh.py. This file name is an example. You can specify a custom name for the script.
Script sample code
Code execution process
Divide the URLs in the file into batches of the size specified by gop (100).
Process the URLs of each batch sequentially.
Proceed to the next batch after the current batch is completed.
You can change the size of each batch by configuring the gop variable.
View the help information
After you create a script, you can run the python $script -h command in a command line interface (CLI), such as Command Prompt, PowerShell, or Terminal, to display the command line help information of the Python script.
$script is a placeholder for the file name of your Python script. For example, if the file name of your script is Refresh.py, you can run the python Refresh.py -h command.
Run the following command in a CLI, such as Command Prompt, PowerShell, or Terminal. The script displays help information about the usage and parameters of the script.
python Refresh.py -h
After you run the command, the following content is returned:
script options explain:
-i <AccessKey> //The AccessKey ID that is used to log on to Alibaba Cloud. You can view your AccessKey pair in the Alibaba Cloud Management Console.
-k <AccessKeySecret> //The AccessKey secret that is used to log on to Alibaba Cloud. You can view your AccessKey pair in the Alibaba Cloud Management Console.
-r <filename> //The file path and file name. After the script is executed, the script reads the URLs in the file. Each line contains only one URL. Encode URLs that contain special characters. The encoded URLs must start with http or https.
-t <taskType> //The type of the task. Set the value to clear to create a purge task. Set the value to push to create a prefetch task.
-a [String,<domestic|overseas>] //Optional. The regions in which the content is prefetched. The default value is overseas.
domestic //Chinese mainland only.
overseas //Global (excluding the Chinese mainland).
-o [String,<File|Directory>] //Optional. The type of the resource to be purged.
File //File (default value).
Directory //Directory.
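Options of this shape can be parsed with the getopt module in the Python standard library. The following is a minimal sketch of such parsing. Whether Refresh.py uses getopt is not stated in this topic. The function parse_options and the dictionary keys are hypothetical, and the default values follow the help text above.

```python
import getopt


def parse_options(argv):
    """Parse -i, -k, -r, -t, -a, -o, and -h into a dictionary."""
    opts, _ = getopt.getopt(argv, "hi:k:r:t:a:o:")
    names = {
        "-i": "access_key_id",
        "-k": "access_key_secret",
        "-r": "filename",
        "-t": "task_type",
        "-a": "area",
        "-o": "object_type",
    }
    # Defaults taken from the help text: overseas and File.
    parsed = {"area": "overseas", "object_type": "File", "help": False}
    for flag, value in opts:
        if flag == "-h":
            parsed["help"] = True
        else:
            parsed[names[flag]] = value
    if not parsed["help"] and parsed.get("task_type") not in ("clear", "push"):
        raise ValueError("-t must be clear (purge) or push (prefetch)")
    return parsed


options = parse_options(
    ["-i", "yourAccessKey", "-k", "yourAccessKeySecret", "-r", "urllist.txt", "-t", "clear"]
)
print(options["task_type"], options["object_type"])
```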
Step 4: Run the script
Run the following command in a CLI, such as Command Prompt, PowerShell, or Terminal:
python Refresh.py -i <YourAccessKey> -k <YourAccessKeySecret> -r <PathToUrlFile> -t <TaskType>
<YourAccessKey>: the AccessKey ID of your Alibaba Cloud account.
<YourAccessKeySecret>: the AccessKey secret of your Alibaba Cloud account.
<PathToUrlFile>: the path to the file that contains the list of URLs. Example: urllist.txt.
<TaskType>: the task type. Valid values: clear (purge) and push (prefetch).
Sample commands
Assume that the AccessKey ID is yourAccessKey, the AccessKey secret is yourAccessKeySecret, the URL list file is urllist.txt, the URL list file and the Refresh.py script are in the same directory, and the task type is clear (purge). Run the following command in a CLI, such as Command Prompt, PowerShell, or Terminal:
python Refresh.py -i yourAccessKey -k yourAccessKeySecret -r urllist.txt -t clear
If the URL list file is in a different directory, such as D:\example\filename\urllist.txt, run the following command in a CLI, such as Command Prompt, PowerShell, or Terminal:
python Refresh.py -i yourAccessKey -k yourAccessKeySecret -r D:\example\filename\urllist.txt -t clear
Sample output:
python Refresh.py -i yourAccessKey -k yourAccessKeySecret -r urllist.txt -t clear
{'RequestId': 'C1686DCA-F3B5-5575-ADD1-05F96617D770', 'RefreshTaskId': '18392588710'}
[18392588710] is doing... ...
{'RequestId': '5BEAD371-9D82-5DA5-BE60-58EC2C915E82', 'RefreshTaskId': '18392588804'}
[18392588804] is doing... ...
{'RequestId': 'BD0B3D22-66CF-5B1D-A995-D912A5EA8E2F', 'RefreshTaskId': '18392588804'}
[18392588804] is doing... ...
[18392588804] is doing... ...
[18392588804] is doing... ...