After you activate Lingjun AI Computing Service, the networks of compute nodes that you purchase are isolated from the Alibaba Cloud public cloud. You must use other Alibaba Cloud services together with Lingjun AI Computing Service to implement network connectivity and status monitoring. The services include Virtual Private Cloud (VPC), Cloud Enterprise Network (CEN), and Application Real-time Monitoring Service (ARMS). This topic describes the Alibaba Cloud services that you must purchase and configure before you can use Lingjun AI Computing Service.
Background information
The compute nodes are in an isolated network environment after they are purchased. You can connect the compute nodes to CEN by using Lingjun connection instances. This way, compute nodes are connected to the Alibaba Cloud public cloud. To implement network connectivity, you must configure Lingjun services such as compute nodes, Lingjun connection instances, and Cloud Parallel File Storage (CPFS). You must also configure the following services:
CEN: You can use CEN to implement network connectivity between Lingjun AI Computing Service and the Alibaba Cloud public cloud. For more information about how to configure CEN, see the CEN configurations section of this topic.
VPC: After Lingjun AI Computing Service is connected to the Alibaba Cloud public cloud by using CEN, you must create and configure a VPC to connect Lingjun AI Computing Service to cloud services in other VPCs. For more information, see the VPC configurations section of this topic.
ARMS: ARMS is used to monitor cluster instances of Lingjun AI Computing Service in real time. This way, you can view the status and details of cluster instances on the dashboard in the Intelligent Computing Lingjun console. ARMS is automatically activated and configured after you activate Lingjun AI Computing Service.
CEN configurations
To connect Lingjun AI Computing Service to the Alibaba Cloud public cloud by using CEN, you must configure the following resources. For service descriptions and operation guides, see the CEN documentation.
Create a CEN instance. For more information, see CEN instances.
Create a transit router for the CEN instance. For more information, see Transit routers.
Configure the transit router.
You must connect VPCs in which other cloud services are deployed to the transit router of the CEN instance.
Intra-region connection: You can connect the VPCs to the transit router by using intra-region connections. This way, the VPCs in the same region can communicate with each other. For more information, see Create a VPC connection.
Inter-region connection: If you want to access Alibaba Cloud services that are deployed in different regions, you can create inter-region connections and allocate bandwidth resources to the connections. For more information, see Manage inter-region connections.
VPC configurations
The preceding network topology shows that Lingjun AI Computing Service and the connected cloud services are in different VPCs. Therefore, you must create a VPC and vSwitches to ensure network connectivity between Lingjun AI Computing Service and other cloud services. For more information, see Create a VPC with an IPv4 CIDR block.
If you already have a VPC, you can use it for Lingjun services. Make sure that the vSwitches in the VPC have idle IP addresses.
Monitoring networks: An IP address in the VPC is assigned to the networks that are used to monitor the network connectivity of Lingjun AI Computing Service.
ARMS and other cloud services: You can connect the VPCs in which other cloud services are deployed to the transit router of the CEN instance. This way, Lingjun AI Computing Service can connect to the other cloud services.
After the VPC and vSwitches are created, you can use the VPC and vSwitches for subsequent operations during cluster configurations.
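Before you reuse an existing VPC, it can help to estimate how many idle IP addresses a vSwitch still has. The following is a minimal sketch that uses only the Python standard library. The function names, the sample CIDR blocks, and the count of used addresses are hypothetical. The number of addresses reserved by the system in each vSwitch (4) is an assumption that you should verify against the VPC documentation.

```python
import ipaddress

# Assumption: the system reserves 4 addresses in every vSwitch CIDR block.
# Verify this value against the VPC documentation before relying on it.
RESERVED_PER_VSWITCH = 4


def idle_ip_count(vswitch_cidr: str, used_ips: int) -> int:
    """Return how many addresses are still free in a vSwitch CIDR block."""
    network = ipaddress.ip_network(vswitch_cidr, strict=True)
    usable = network.num_addresses - RESERVED_PER_VSWITCH
    # Never report a negative number of idle addresses.
    return max(usable - used_ips, 0)


def vswitch_in_vpc(vpc_cidr: str, vswitch_cidr: str) -> bool:
    """Check that the vSwitch CIDR block lies inside the VPC CIDR block."""
    vpc = ipaddress.ip_network(vpc_cidr)
    vswitch = ipaddress.ip_network(vswitch_cidr)
    return vswitch.subnet_of(vpc)


if __name__ == "__main__":
    # Hypothetical values: a /24 vSwitch with 200 addresses already in use.
    print(idle_ip_count("192.168.10.0/24", used_ips=200))      # 52
    print(vswitch_in_vpc("192.168.0.0/16", "192.168.10.0/24"))  # True
```

The count of used addresses is passed in by hand here. In practice you would read it from the VPC console.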
Configurations of an ACK Lingjun managed cluster
If you want to run large-scale data computing and high-performance data processing workloads more efficiently and stably, you can activate Container Service for Kubernetes (ACK).
If you use ACK for the first time, you must assign default roles to ACK and activate other cloud services. For more information, see What is ACK Lingjun?
For more information about the cloud services that you need to activate before you use an ACK Lingjun managed cluster, see the "Cloud service fee" section of the Billing of ACK Lingjun clusters topic.
Configurations of an ApsaraDB RDS for MySQL instance
Create an ApsaraDB RDS for MySQL instance. For more information, see Create an ApsaraDB RDS for MySQL instance.
Make sure that the ApsaraDB RDS for MySQL instance and the ACK Lingjun managed cluster are in the same VPC.
Create a database account and six databases for the ApsaraDB RDS for MySQL instance. For more information, see Create accounts and databases.
The six databases store the data of the following Machine Learning Platform for AI (PAI) components: dlc, notebook, eas, paiflow, pai_user, and pai_console.
Create a secret object for MySQL databases in the ACK Lingjun managed cluster based on the endpoint and account of the ApsaraDB RDS for MySQL instance.
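The last two steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the required procedure. The SQL statements are an alternative to creating the databases in the console, and the utf8mb4 character set is an assumption. The Secret name, the namespace, the key names (host, port, username, password), and the sample endpoint are all hypothetical, because this topic does not specify what the PAI components expect.

```python
import base64
import json

# Database names taken from the component list in this section.
PAI_DATABASES = ["dlc", "notebook", "eas", "paiflow", "pai_user", "pai_console"]


def create_database_statements(names=PAI_DATABASES, charset="utf8mb4"):
    """Build one CREATE DATABASE statement per component database."""
    return [
        f"CREATE DATABASE IF NOT EXISTS `{name}` DEFAULT CHARACTER SET {charset};"
        for name in names
    ]


def _b64(value) -> str:
    """Kubernetes stores Secret data as base64-encoded strings."""
    return base64.b64encode(str(value).encode("utf-8")).decode("ascii")


def mysql_secret_manifest(name, namespace, endpoint, port, username, password):
    """Build an Opaque Secret manifest. The key names are hypothetical."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {
            "host": _b64(endpoint),
            "port": _b64(port),
            "username": _b64(username),
            "password": _b64(password),
        },
    }


if __name__ == "__main__":
    for statement in create_database_statements():
        print(statement)
    manifest = mysql_secret_manifest(
        name="pai-mysql",                           # hypothetical Secret name
        namespace="default",                        # hypothetical namespace
        endpoint="rm-example.mysql.rds.example.com",  # placeholder endpoint
        port=3306,
        username="pai_admin",                       # placeholder account
        password="change-me",                       # placeholder password
    )
    # kubectl accepts JSON manifests, for example: kubectl apply -f secret.json
    print(json.dumps(manifest, indent=2))
```

Building the manifest as data keeps the credentials out of shell history. Replace every placeholder with the endpoint and account of your own ApsaraDB RDS for MySQL instance.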
NAS configurations
Create a File Storage NAS (NAS) file system. For more information, see Create a file system.
Make sure that the NAS file system and the ACK Lingjun managed cluster are in the same VPC.
Create a mount target. For more information, see CreateMountTarget.
Container Registry configurations
Create a Container Registry instance that is in the same VPC as the ACK Lingjun managed cluster.
Run the docker push command to push a base image to the image repositories in the Container Registry instance. For more information, see Use images of Container Registry to deploy applications in other cloud services.
In the ACK Lingjun managed cluster, create a secret object for the Docker registry from which the image is pulled.
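The registry secret mentioned above can be sketched as a Kubernetes Secret of the kubernetes.io/dockerconfigjson type. The following Python example builds such a manifest with the standard library only. The Secret name, the namespace, the registry address, and the credentials are hypothetical placeholders. Whether your cluster expects this exact Secret name is an assumption that this topic does not confirm.

```python
import base64
import json


def registry_secret_manifest(name, namespace, registry, username, password):
    """Build an image pull Secret of type kubernetes.io/dockerconfigjson."""
    # The "auth" field is base64("username:password"), as in ~/.docker/config.json.
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    docker_config = {
        "auths": {
            registry: {"username": username, "password": password, "auth": auth}
        }
    }
    # The whole Docker config JSON is base64-encoded again for the Secret data.
    encoded = base64.b64encode(json.dumps(docker_config).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }


if __name__ == "__main__":
    manifest = registry_secret_manifest(
        name="acr-pull-secret",            # hypothetical Secret name
        namespace="default",               # hypothetical namespace
        registry="registry.example.com",   # placeholder registry address
        username="example-user",           # placeholder credentials
        password="change-me",
    )
    print(json.dumps(manifest, indent=2))
```

You can save the printed JSON to a file and apply it with kubectl, or use the manifest only as a reference for the fields that such a secret contains.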
OAuth configurations
Use Open Authorization (OAuth) to create an application and obtain the name of the application. For more information, see Create an application.
Add OAuth scopes to the application. For more information, see Add OAuth scopes.
Create a secret for the application. For more information, see Create an application secret.
Configurations of an ARMS application monitoring agent
Install an ARMS application monitoring agent. For more information, see the "Step 1: Install the ARMS application monitoring agent" section of the Enable ARMS for a registered cluster topic.
Add the ARMS application monitoring agent to the ACK Lingjun managed cluster.
You must deploy add-ons when you create the ACK Lingjun managed cluster. The add-ons include the prometheus, kube-state-metrics, gpu-exporter, and node-exporter components.
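As a quick sanity check after cluster creation, you can compare the workloads that you see in the cluster against the add-on list above. The following Python sketch assumes that each add-on appears as a substring of some workload name. That naming convention and the sample workload names are hypothetical.

```python
# Add-on names taken from the list in this section.
REQUIRED_ADDONS = {"prometheus", "kube-state-metrics", "gpu-exporter", "node-exporter"}


def missing_addons(deployed_names):
    """Return the required add-ons that no deployed workload name contains."""
    return sorted(
        addon
        for addon in REQUIRED_ADDONS
        if not any(addon in name for name in deployed_names)
    )


if __name__ == "__main__":
    # Hypothetical workload names, for example copied from the cluster console.
    deployed = ["ack-prometheus", "kube-state-metrics", "node-exporter"]
    print(missing_addons(deployed))  # ['gpu-exporter']
```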