This section describes how to use OssCheckpoint to read checkpoints from and write checkpoints to OSS buckets directly. A checkpoint is the model state saved at a specific point in time during training.
Prerequisites
OSS Connector for AI/ML is installed and configured. For more information, see Install OSS Connector for AI/ML and Configure OSS Connector for AI/ML.
OssCheckpoint
OssCheckpoint is suitable for scenarios that involve reading and writing intermediate results, such as model checkpoints, during training.
The following example shows how to read and write checkpoints by using OssCheckpoint.
import torch
from osstorchconnector import OssCheckpoint

ENDPOINT = "endpoint"
CRED_PATH = "/root/.alibabacloud/credentials"
CONFIG_PATH = "/etc/oss-connector/config.json"

# Create a checkpoint by using OssCheckpoint
checkpoint = OssCheckpoint(endpoint=ENDPOINT, cred_path=CRED_PATH, config_path=CONFIG_PATH)

# Read the checkpoint
CHECKPOINT_READ_URI = "oss://checkpoint/epoch.0"
with checkpoint.reader(CHECKPOINT_READ_URI) as reader:
    state_dict = torch.load(reader)

# Write the checkpoint
CHECKPOINT_WRITE_URI = "oss://checkpoint/epoch.1"
with checkpoint.writer(CHECKPOINT_WRITE_URI) as writer:
    torch.save(state_dict, writer)
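The read and write pattern above can be wrapped in small helpers so that the training loop only deals with epoch numbers. The following is a minimal sketch, not part of OSS Connector for AI/ML: `checkpoint_uri`, `save_state`, `load_state`, and `InMemoryCheckpoint` are hypothetical names, the `epoch.N` naming scheme is an assumption taken from the URIs in the example, and `pickle` stands in for `torch.save` and `torch.load` so that the sketch runs without PyTorch or an OSS bucket.

```python
import io
import pickle
from contextlib import contextmanager


def checkpoint_uri(bucket, prefix, epoch):
    """Build a URI such as oss://bucket/prefix/epoch.3 (assumed naming scheme)."""
    return f"oss://{bucket}/{prefix.strip('/')}/epoch.{epoch}"


class InMemoryCheckpoint:
    """Hypothetical stand-in that mimics only reader() and writer()."""

    def __init__(self):
        self._objects = {}

    @contextmanager
    def writer(self, uri):
        buffer = io.BytesIO()
        yield buffer
        # Reached only if the with-block finished without raising.
        self._objects[uri] = buffer.getvalue()

    @contextmanager
    def reader(self, uri):
        yield io.BytesIO(self._objects[uri])


def save_state(checkpoint, uri, state, dump=pickle.dump):
    """Serialize state to uri; pass dump=torch.save in a real training job."""
    with checkpoint.writer(uri) as writer:
        dump(state, writer)


def load_state(checkpoint, uri, load=pickle.load):
    """Deserialize the object stored at uri; pass load=torch.load for PyTorch."""
    with checkpoint.reader(uri) as reader:
        return load(reader)


store = InMemoryCheckpoint()
uri = checkpoint_uri("checkpoint", "run-1", 1)
save_state(store, uri, {"epoch": 1, "weights": [0.1, 0.2]})
restored = load_state(store, uri)
```

In a real job, an OssCheckpoint object would presumably take the place of `store`, because the helpers rely only on the `reader(uri)` and `writer(uri)` context managers shown in the example.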
Data types
Checkpoint objects created by using OssCheckpoint provide common I/O operations. For more information, see data types in OSS Connector for AI/ML.
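As an illustration of working with such I/O operations, the sketch below hashes a checkpoint stream chunk by chunk, which can help verify that a checkpoint was transferred intact. It assumes that the reader supports `read(size)` in the same way as a standard Python binary file object; check the data types reference before relying on this. `stream_sha256` is a hypothetical helper, and `io.BytesIO` stands in for a real reader.

```python
import hashlib
import io


def stream_sha256(reader, chunk_size=1024 * 1024):
    """Hash a binary file-like object chunk by chunk by calling read(size)."""
    digest = hashlib.sha256()
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            # An empty result signals the end of the stream.
            break
        digest.update(chunk)
    return digest.hexdigest()


# io.BytesIO stands in for a checkpoint reader in this sketch.
payload = b"checkpoint-bytes" * 1000
checksum = stream_sha256(io.BytesIO(payload), chunk_size=4096)
```

Reading in fixed-size chunks keeps memory usage bounded regardless of the checkpoint size.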
Parameters
The following table describes the parameters that you need to configure when you use OssCheckpoint.
Parameter | Type | Required | Description |
endpoint | string | Yes | The endpoint that is used to access OSS. For more information, see endpoints and data centers. |
cred_path | string | Yes | The path of the authentication file. The default value is |
config_path | string | Yes | The path of the OSS Connector configuration file. The default value is |
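Because all three parameters must be valid before any checkpoint can be read or written, it can help to check them up front. The following is a minimal sketch under stated assumptions: `check_checkpoint_params` is a hypothetical helper rather than part of OSS Connector for AI/ML, and it only verifies that the endpoint is a non-empty string and that both paths point to existing files. It does not validate the contents of those files.

```python
import os
import tempfile


def check_checkpoint_params(endpoint, cred_path, config_path):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    if not isinstance(endpoint, str) or not endpoint.strip():
        problems.append("endpoint must be a non-empty string")
    for name, path in (("cred_path", cred_path), ("config_path", config_path)):
        if not os.path.isfile(path):
            problems.append(f"{name} does not point to a file: {path}")
    return problems


# Demonstration with a temporary directory instead of real connector files.
with tempfile.TemporaryDirectory() as tmp:
    cred = os.path.join(tmp, "credentials")
    open(cred, "w").close()  # create an empty placeholder credentials file
    missing_config = os.path.join(tmp, "config.json")  # intentionally not created
    problems = check_checkpoint_params("endpoint", cred, missing_config)
```

Calling the helper before constructing OssCheckpoint reports configuration mistakes at startup rather than at the first read or write.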