To ensure that requests to OSS come from legitimate users or applications and that OSS Connector for AI/ML initializes properly, you must configure access credentials and the connector parameters described in this topic.
Prerequisites
OSS Connector for AI/ML is installed. For more information, see Installation.
Configure access credentials
Create an access credentials configuration file named credentials.
mkdir -p /root/.alibabacloud && touch /root/.alibabacloud/credentials
Configure parameters and save the configuration file.
Example:
{ "AccessKeyId": "<Access-key-id>", "AccessKeySecret": "<Access-key-secret>", "SecurityToken": "<Security-Token>", "Expiration": "2024-08-02T15:04:05Z" }
The following table describes the preceding parameters.
| Parameter | Required | Example | Description |
| --- | --- | --- | --- |
| AccessKeyId | Yes | STS.L4aB****************** | The AccessKey ID of your Alibaba Cloud account or a RAM user. If you use a security token for authentication, set this parameter to the AccessKey ID in the temporary access credentials. |
| AccessKeySecret | Yes | At32************************ | The AccessKey secret of your Alibaba Cloud account or a RAM user. If you use a security token for authentication, set this parameter to the AccessKey secret in the temporary access credentials. |
| SecurityToken | No | STS.6MC2*************************************** | The security token. This parameter is required when you use the temporary access credentials obtained from Security Token Service (STS) to access OSS. If you use the AccessKey ID and AccessKey secret of an Alibaba Cloud account or a RAM user for authentication, leave this parameter empty. |
| Expiration | No | 2024-08-02T15:04:05Z | The expiration time of the authentication information. After the expiration time, OSS Connector for AI/ML re-reads the authentication information. If you do not specify the Expiration parameter, the authentication information does not expire. If you use a security token for authentication, we recommend that you specify this parameter. If you use the AccessKey ID and AccessKey secret of an Alibaba Cloud account or a RAM user for authentication, leave this parameter empty. |
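Before you save the file, it can help to check that it parses as JSON and contains the required parameters. The following Python snippet is a minimal sketch of such a check, not part of OSS Connector for AI/ML. The function name validate_credentials is hypothetical, and the sketch assumes that Expiration always uses the UTC format shown in the example (2024-08-02T15:04:05Z).

```python
import json
from datetime import datetime

# Required parameters according to the table above.
REQUIRED_KEYS = ("AccessKeyId", "AccessKeySecret")


def validate_credentials(text):
    """Return a list of problems found in a credentials document (empty if none).

    Hypothetical helper: it only mirrors the rules in the parameter table and
    does not contact OSS or STS.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for key in REQUIRED_KEYS:
        if not data.get(key):
            problems.append(f"missing required parameter: {key}")

    expiration = data.get("Expiration")
    if expiration:
        try:
            # Assumption: the timestamp format matches the documented example.
            datetime.strptime(expiration, "%Y-%m-%dT%H:%M:%SZ")
        except (TypeError, ValueError):
            problems.append("Expiration is not in the form 2024-08-02T15:04:05Z")
    return problems
```

An empty list means that the document passed these checks. It does not prove that the credentials themselves are valid.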
Use AccessKey ID and AccessKey secret:
Replace <Access-key-id> and <Access-key-secret> in the example with the AccessKey ID and AccessKey secret of a RAM user. For more information about how to create an AccessKey ID and AccessKey secret, see Create an AccessKey pair.
{ "AccessKeyId": "LTAI************************", "AccessKeySecret": "At32************************" }
Use temporary access credentials:
Note: To keep data secure when access credentials are used in a production environment for a long period of time, we recommend that you use temporary access credentials so that the AccessKey ID and AccessKey secret are not leaked.
To authorize temporary access, you must first obtain temporary access credentials. For more information, see Use temporary access credentials provided by STS to access OSS. After you obtain the temporary access credentials, replace <Access-key-id>, <Access-key-secret>, and <Security-Token> with the AccessKey ID, AccessKey secret, and security token in the temporary access credentials.
{ "AccessKeyId": "STS.L4aB******************", "AccessKeySecret": "wyLTSm*************************", "SecurityToken": "************", "Expiration": "2024-08-15T15:04:05Z" }
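Because the connector re-reads the file after the Expiration time, the file should contain fresh temporary credentials by then. The following Python snippet is a rough sketch of how a job could decide when to renew them. The function names and the 300-second safety margin are illustrative assumptions, not connector behavior. The trailing Z in the timestamp denotes UTC.

```python
from datetime import datetime, timezone


def seconds_until_expiration(expiration, now=None):
    """Return the number of seconds between `now` and the Expiration value.

    `expiration` uses the documented form, for example 2024-08-15T15:04:05Z.
    `now` is an optional timezone-aware datetime, mainly useful for testing.
    """
    expires_at = datetime.strptime(expiration, "%Y-%m-%dT%H:%M:%SZ")
    expires_at = expires_at.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires_at - now).total_seconds()


def needs_refresh(expiration, margin_seconds=300, now=None):
    """Return True if the credentials expire within `margin_seconds`.

    The 300-second default is an arbitrary margin chosen for illustration.
    """
    return seconds_until_expiration(expiration, now) <= margin_seconds
```

A scheduler could call needs_refresh with the Expiration value from the file and, when it returns True, request new credentials from STS and rewrite the file.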
Run the chmod 400 /root/.alibabacloud/credentials command to grant read-only permissions on the credentials file to ensure the security of the AccessKey ID and AccessKey secret.
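If the credentials file is generated by a script, for example when temporary credentials are renewed, the create, write, and chmod steps can be combined. The following Python snippet is one possible sketch, assuming a Linux host. The function name write_credentials is hypothetical. It writes to a temporary file in the same directory and renames it into place, so a reader never sees a half-written file.

```python
import json
import os
import tempfile


def write_credentials(path, credentials):
    """Write `credentials` (a dict) to `path` as JSON with mode 400.

    Hypothetical helper. The temporary file is created in the target
    directory so that os.replace is an atomic rename on the same file system.
    """
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)  # same effect as mkdir -p

    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".credentials-")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(credentials, handle, indent=2)
        os.chmod(tmp_path, 0o400)    # same effect as chmod 400
        os.replace(tmp_path, path)   # replaces an existing read-only file too
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

For example, write_credentials("/root/.alibabacloud/credentials", {"AccessKeyId": "...", "AccessKeySecret": "..."}) produces the file described in this section. Replace the placeholder values with your own credentials.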
Configure OSS Connector
Create a configuration file named config.json for OSS Connector for AI/ML.
mkdir -p /etc/oss-connector/ && touch /etc/oss-connector/config.json
Configure parameters and save the configuration file.
In most cases, you can use the default configurations.
{ "logLevel": 1, "logPath": "/var/log/oss-connector/connector.log", "auditPath": "/var/log/oss-connector/audit.log", "datasetConfig": { "prefetchConcurrency": 24, "prefetchWorker": 2 }, "checkpointConfig": { "prefetchConcurrency": 24, "prefetchWorker": 4, "uploadConcurrency": 64 } }
The following table describes the parameters. Read the instructions in the table carefully before you change the configurations.
| Parameter | Required | Example | Description |
| --- | --- | --- | --- |
| logLevel | No | 1 | The log level. Default value: 1. We recommend that you set this parameter to 2. Valid values: 0, 1, 2, and 3. 0 specifies DEBUG, 1 specifies INFO, 2 specifies WARN, and 3 specifies ERROR. |
| logPath | No | /var/log/oss-connector/connector.log | The path of the OSS Connector for AI/ML log. Default value: /var/log/oss-connector/connector.log. |
| auditPath | No | /var/log/oss-connector/audit.log | The path of the OSS Connector for AI/ML audit log, which records read and write requests that have a latency of greater than 100 milliseconds. Default value: /var/log/oss-connector/audit.log. |
| datasetConfig | | | |
| prefetchConcurrency | No | 24 | The number of concurrent download tasks when you use a dataset to prefetch data from OSS. Default value: 24. |
| prefetchWorker | No | 2 | The number of available vCPUs when you use a dataset to prefetch data from OSS. Default value: 2. |
| checkpointConfig | | | |
| prefetchConcurrency | No | 24 | The number of concurrent download tasks when you use checkpoint read to prefetch data from OSS. Default value: 24. |
| prefetchWorker | No | 4 | The number of available vCPUs when you use checkpoint read to prefetch data from OSS. Default value: 4. |
| uploadConcurrency | No | 64 | The number of concurrent upload tasks when you use checkpoint write to upload data to OSS. Default value: 64. |
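If you change only a few parameters, you can generate config.json from the documented defaults instead of editing the whole file by hand. The following Python snippet is a minimal sketch under stated assumptions: build_config is a hypothetical helper, and the rule that every concurrency and worker value must be a positive integer is an assumption made for illustration, not a documented constraint.

```python
import copy
import json

# Default values taken from the example and the parameter table above.
DEFAULT_CONFIG = {
    "logLevel": 1,
    "logPath": "/var/log/oss-connector/connector.log",
    "auditPath": "/var/log/oss-connector/audit.log",
    "datasetConfig": {"prefetchConcurrency": 24, "prefetchWorker": 2},
    "checkpointConfig": {
        "prefetchConcurrency": 24,
        "prefetchWorker": 4,
        "uploadConcurrency": 64,
    },
}


def build_config(overrides=None):
    """Return the default configuration with `overrides` merged in."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    for key, value in (overrides or {}).items():
        if key not in config:
            raise KeyError(f"unknown parameter: {key}")
        if isinstance(config[key], dict):
            unknown = set(value) - set(config[key])
            if unknown:
                raise KeyError(f"unknown parameter in {key}: {sorted(unknown)}")
            config[key].update(value)  # merge nested section
        else:
            config[key] = value

    if config["logLevel"] not in (0, 1, 2, 3):
        raise ValueError("logLevel must be 0, 1, 2, or 3")
    for section in ("datasetConfig", "checkpointConfig"):
        for name, number in config[section].items():
            # Assumption: counts must be positive integers.
            if not isinstance(number, int) or number < 1:
                raise ValueError(f"{section}.{name} must be a positive integer")
    return config
```

For example, json.dumps(build_config({"logLevel": 2}), indent=2) returns text that can be saved as /etc/oss-connector/config.json.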
References
After you install and configure OSS Connector for AI/ML, you can perform the following operations in PyTorch training jobs:
Use OssMapDataset to build a map dataset suitable for random reading. For more information, see Use OSS data to build an OssMapDataset dataset for random reading.
Use OssIterableDataset to build an iterable dataset suitable for sequential streaming reading. For more information, see Use data in OSS to build an iterable dataset suitable for sequential streaming reading.
Use OssCheckpoint to perform read and write operations on checkpoints in OSS. For more information, see Store and access checkpoints in OSS.