By NVIDIA
NVIDIA makes available on the Alibaba Cloud platform a customized image optimized for the NVIDIA Pascal™ and Volta™-based Tesla GPUs. Running NGC containers on this virtual machine (VM) instance provides optimum performance for deep learning jobs.
For those familiar with the Alibaba platform, the process of launching the instance is as simple as logging into Alibaba, selecting the "NVIDIA GPU Cloud Machine Image" and one of the supported NVIDIA GPU instance types, configuring settings as needed, then launching the VM. After launching the VM, you can SSH into it and start running deep learning jobs using framework containers from the NGC container registry.
This article provides step-by-step instructions for accomplishing this.
These instructions assume the following:
Perform these preliminary setup tasks to simplify the process of launching the NVIDIA GPU Cloud machine image.
If you do not already have SSH keys set up specifically for Alibaba, you will need to set one up and have it on the machine you will use to SSH to the VM. In the examples, the key is named "alibaba-key".
When the key pair is created, the .pem file will immediately download. This is the ONLY time you can download it.
After downloading the .pem file, move it to the .ssh directory and restrict its permissions:
mv alibaba-key.pem ~/.ssh/
chmod 400 ~/.ssh/alibaba-key.pem
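SSH clients typically refuse a private key whose file permissions are too open, so it can help to confirm the mode before connecting. The following is a minimal sketch; key_is_private is a hypothetical helper name chosen for illustration, and it only checks for the exact mode 400 used above.

```shell
# key_is_private KEY_PATH
# Succeeds only if KEY_PATH is a regular file with mode exactly 400 (-r--------).
# Hypothetical helper for illustration; not part of any Alibaba or NVIDIA tooling.
key_is_private() {
    key_path=$1
    [ -f "$key_path" ] || return 1
    # ls -l prints the mode string first; keep only its first 10 characters.
    mode=$(ls -l "$key_path" | cut -c1-10)
    [ "$mode" = "-r--------" ]
}

# Example usage:
#   key_is_private ~/.ssh/alibaba-key.pem || chmod 400 ~/.ssh/alibaba-key.pem
```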
On Windows, the location will depend on the SSH client you use, so modify the path above and in the snippets or your SSH client configuration. See the Alibaba documentation for Creating an SSH key pair.
In order to create instances, you need to put them in a Security Group.
To use the Alibaba CLI, follow the Alibaba CLI Install Instructions and also install the ECS SDK.
sudo pip install aliyun-python-sdk-ecs
aliyuncli configure
Wait until your instance status is Running, then you can connect to the instance using SSH.
Once started, you can SSH into your instance as the root user using your SSH key. If you followed the setup in this tutorial, your key is in ~/.ssh/.
Command syntax:
ssh -i <KEYPATH> root@<IP>
Example:
ssh -i ~/.ssh/alibaba-key.pem root@47.89.248.188
Refer to Connect to a Linux Instance for more instructions on connecting to your instance.
A comprehensive set of example Python scripts for automating the CLI is provided at https://github.com/nvidia/ngc-examples/tree/master/ncsp. You can download the scripts and modify them to meet your requirements. The code examples that follow use environment variables and a structure similar to those scripts.
This flow and the code snippets in this section are for Linux or Mac OS X. If you are using Windows, you can use the Windows Subsystem for Linux and use the bash shell (where you will be in Ubuntu Linux).
Many of these CLI commands can take a significant amount of time to complete.
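Because of these delays, a follow-up command may fail if it runs before the previous operation has finished. One way to cope is a small retry wrapper, sketched below. The name retry is hypothetical, and the sketch assumes the wrapped command signals failure with a non-zero exit status, which you should confirm for the aliyuncli commands you use.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits with status 0 or MAX_ATTEMPTS is reached.
# Hypothetical helper for illustration; it is not provided by aliyuncli.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example usage (10 attempts, 15 seconds apart):
#   retry 10 15 aliyuncli ecs StartInstance --InstanceId "$INSTANCE_ID"
```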
For complete CLI documentation and sample scripts visit the Alibaba Documentation Center.
You need to specify a source ImageID when creating an instance. Use this command to find the latest ImageID of the NVIDIA-GPU-Cloud-Machine-Image:
aliyuncli ecs DescribeImages --RegionId us-west-1 \
--ImageName "NVIDIA-GPU-Cloud-Virtual-Machine" \
--output json --filter Images.Image[0].ImageId
It outputs the image ID, for example "m-rj9iy0xjiod3ghkyhz4p".
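Since the ID is printed with surrounding double quotes, a script that reuses it usually needs to strip them first. The sketch below wraps the same DescribeImages call in a function; latest_image_id and the NVIDIA_IMAGE_ID variable are hypothetical names used only for illustration.

```shell
# latest_image_id [REGION]
# Prints the image ID from DescribeImages with the surrounding quotes removed.
# REGION defaults to us-west-1, the region used in this article.
latest_image_id() {
    region=${1:-us-west-1}
    aliyuncli ecs DescribeImages --RegionId "$region" \
        --ImageName "NVIDIA-GPU-Cloud-Virtual-Machine" \
        --output json --filter 'Images.Image[0].ImageId' | tr -d '"'
}

# Example usage:
#   NVIDIA_IMAGE_ID=$(latest_image_id us-west-1)
```

The filter expression is quoted here only to keep the shell from treating the square brackets as a glob pattern; the CLI receives the same argument as in the command above.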
Creating an instance with the CLI is done using the aliyuncli ecs CreateInstance command.
Full syntax documentation: https://www.alibabacloud.com/help/doc-detail/25499.htm
Launch the instance and capture the resulting JSON:
aliyuncli ecs CreateInstance \
--RegionId us-west-1 \
--ImageId "m-rj9iy0xjiod3ghkyhz4p" \
--SecurityGroupId "sg-rj94krsusal2k5l6gnnz" \
--InstanceType ecs.gn5-c4g1.xlarge \
--InstanceName "my-instance" \
--InternetMaxBandwidthOut 10 \
--InstanceChargeType PostPaid \
--KeyPairName alibaba-key
The output shows the instance ID.
{
"InstanceId": "i-rj9a0iw25hryafj0fm4v",
"RequestId": "440ECC70-09F9-492C-AB9E-21AA9C4E0531"
}
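Later commands need the instance ID from this response. If you would rather not copy it by hand, the following sketch pulls a single string field out of a flat JSON response using sed. json_field is a hypothetical helper name, and this is not a general JSON parser; it only handles simple "name": "value" pairs like the ones shown here.

```shell
# json_field NAME
# Reads JSON on stdin and prints the string value of the first
# "NAME": "value" pair it finds. Intended only for flat responses.
# Hypothetical helper for illustration.
json_field() {
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example usage:
#   INSTANCE_ID=$(aliyuncli ecs CreateInstance ... | json_field InstanceId)
```

The same helper can read other string fields from similarly shaped responses by passing a different field name.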
Instances created via CLI are not automatically given a public IP address.
To assign a public IP address to the instance you just created, run:
aliyuncli ecs AllocatePublicIpAddress --RegionId us-west-1 \
--InstanceId "i-rj9a0iw25hryafj0fm4v"
Successful completion of the command will return the IP address:
{
"IpAddress": "47.89.248.188",
"RequestId": "65EB59AE-FA75-446F-B5C7-2BA0F9A77CDC"
}
Instances created via CLI are not started automatically.
To start the instance you just created, run:
aliyuncli ecs StartInstance --InstanceId "i-rj9a0iw25hryafj0fm4v"
Once started, you can SSH into your instance as the root user using your SSH key. If you followed the setup in this tutorial, your key is in ~/.ssh/.
Command syntax:
ssh -i <KEYPATH> root@<IP>
Example:
ssh -i ~/.ssh/alibaba-key.pem root@47.89.248.188
Refer to Connect to a Linux Instance for more instructions on connecting to your instance.
Once an instance is running, you can stop, (re)start, or delete your instance.
Stop:
aliyuncli ecs StopInstance --InstanceId INSTANCE_ID
Start or Restart:
aliyuncli ecs StartInstance --InstanceId INSTANCE_ID
Delete:
aliyuncli ecs DeleteInstance --InstanceId INSTANCE_ID
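If you run these commands often, a thin wrapper can reduce typing mistakes. The sketch below maps a short action name to the three commands above; instance_action is a hypothetical function name, and it adds nothing beyond argument checking.

```shell
# instance_action stop|start|delete INSTANCE_ID
# Dispatches to the StopInstance, StartInstance, or DeleteInstance command.
# Hypothetical wrapper for illustration.
instance_action() {
    action=$1
    instance_id=$2
    case "$action" in
        stop)   api=StopInstance ;;
        start)  api=StartInstance ;;
        delete) api=DeleteInstance ;;
        *)
            echo "usage: instance_action stop|start|delete INSTANCE_ID" >&2
            return 2
            ;;
    esac
    if [ -z "$instance_id" ]; then
        echo "missing INSTANCE_ID" >&2
        return 2
    fi
    aliyuncli ecs "$api" --InstanceId "$instance_id"
}

# Example usage:
#   instance_action stop "$INSTANCE_ID"
```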
Source: https://docs.nvidia.com/ngc/ngc-alibaba-setup-guide/index.html