MLflow is an open source machine learning lifecycle management platform, which can be used to track model training information, manage machine learning models, and deploy machine learning models. This topic describes how to configure MLflow Model Registry for the model management feature.
Introduction to MLflow Model Registry
For more information about MLflow Model Registry, see MLflow Model Registry - MLflow documentation.
Prerequisites
A Container Service for Kubernetes (ACK) Pro cluster that runs Kubernetes 1.20 or later is created. For more information, see Create an ACK Pro cluster.
An ApsaraDB RDS for PostgreSQL instance is created. For more information, see Create an ApsaraDB RDS for PostgreSQL instance.
We recommend that you select the virtual private cloud (VPC) of the ACK cluster when you create the ApsaraDB RDS for PostgreSQL instance and add the VPC CIDR block to the whitelist of the instance so that you can use private IP addresses to access the instance. If the ApsaraDB RDS for PostgreSQL instance and ACK cluster reside in different VPCs, make sure that Internet access is enabled for the ApsaraDB RDS for PostgreSQL instance and add the VPC CIDR block of the ACK cluster to the whitelist of the instance. For more information, see Configure an IP address whitelist.
A regular user account named mlflow is created on the ApsaraDB RDS for PostgreSQL instance. For more information, see Create an account.
A database named mlflow_store is created on the ApsaraDB RDS for PostgreSQL instance to store model metadata. Authorized By is set to mlflow. For more information, see Create a database.
(Optional) A database named mlflow_basic_auth is created on the ApsaraDB RDS for PostgreSQL instance to store MLflow user authentication information. Authorized By is set to mlflow. For more information, see Create a database.
The Arena client is configured to manage models. The Arena version is 0.9.14 or later. For more information, see Configure the Arena client.
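The whitelist requirement in the prerequisites can be sanity-checked before deployment. The following Python sketch tests whether an address falls inside a CIDR block. The function name, the CIDR block, and the addresses are hypothetical placeholders; substitute the VPC CIDR block of your ACK cluster and the IP address of a node or pod.

```python
import ipaddress


def in_whitelisted_cidr(address: str, cidr: str) -> bool:
    """Return True if `address` belongs to the network described by `cidr`."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


# Hypothetical values for illustration only.
VPC_CIDR = "192.168.0.0/16"
print(in_whitelisted_cidr("192.168.12.34", VPC_CIDR))  # address inside the block
print(in_whitelisted_cidr("10.0.0.5", VPC_CIDR))       # address outside the block
```

An address that is outside every whitelisted block cannot connect to the instance, which is one cause to rule out if MLflow later fails to reach its database.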
Step 1: Deploy MLflow in the ACK cluster
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose .
Click Deploy. In the Deploy panel, set Application Name to mlflow and Namespace to kube-ai. In the Chart section, search for and click mlflow, and then click Next. In the message that appears, confirm whether to use mlflow as the default namespace of the chart.
To use AI Developer Console to manage models, you need to deploy MLflow in the kube-ai namespace and use the default release name mlflow.
To use Arena to manage models, deploy MLflow in any namespace and use the default release name mlflow.
In the Deploy panel, configure the parameters of the chart.
Configure the defaultArtifactRoot and backendStore parameters as shown in the following example:

trackingServer:
  # -- Specifies which mode mlflow tracking server run with, available options are `serve-artifacts`, `no-serve-artifacts` and `artifacts-only`
  mode: no-serve-artifacts
  # -- Specifies a default artifact location for logging, data will be logged to `mlflow-artifacts/:` if artifact serving is enabled, otherwise `./mlruns`
  defaultArtifactRoot: "./mlruns"

# For more information about how to configure backend store, please visit https://mlflow.org/docs/latest/tracking/backend-stores.html
backendStore:
  # -- Backend store uri e.g. `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`
  backendStoreUri: postgresql+psycopg2://mlflow:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_store

Replace backendStore.backendStoreUri with the address of the database named mlflow_store in the Prerequisites section. Example: postgresql+psycopg2://mlflow:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_store.
Important: If the ApsaraDB RDS instance and the ACK cluster reside in the same VPC, use the internal endpoint of the ApsaraDB RDS instance. Otherwise, use the public endpoint of the ApsaraDB RDS instance and ensure that the ACK cluster can access it.
Log on to the RDS PostgreSQL console and click Instance ID > Log On to Database > Internal/Public Endpoint to obtain the database endpoint, for example pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com. For more information, see Database connection.
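Passwords that contain reserved characters such as @ or / can break a hand-written database URI. The following Python sketch assembles the URI and percent-encodes the credentials. It assumes that the chart passes the value on to a parser that follows SQLAlchemy URL conventions, as the `<dialect>+<driver>` form suggests. The function name and the sample password are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlsplit


def build_store_uri(user: str, password: str, host: str, database: str,
                    port: Optional[int] = None) -> str:
    """Assemble a postgresql+psycopg2 URI with percent-encoded credentials."""
    # safe="" makes quote() also encode "/", so it cannot be read as a path separator.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    # The port is omitted when not given, matching the example in this topic.
    netloc = host if port is None else f"{host}:{port}"
    return f"postgresql+psycopg2://{user_enc}:{password_enc}@{netloc}/{database}"


uri = build_store_uri(
    user="mlflow",
    password="p@ss/word",  # hypothetical password with reserved characters
    host="pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com",
    database="mlflow_store",
)
print(uri)

# Parsing the result back confirms that the host and database name survived intact.
parts = urlsplit(uri)
print(parts.hostname, parts.path.lstrip("/"))
```

The same helper can produce the value for the mlflow_basic_auth database by changing the database argument.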
(Optional) To enable BasicAuth, configure the following parameters.
trackingServer:
  # -- Specifies which mode mlflow tracking server run with, available options are `serve-artifacts`, `no-serve-artifacts` and `artifacts-only`
  mode: no-serve-artifacts
  # -- Specifies a default artifact location for logging, data will be logged to `mlflow-artifacts/:` if artifact serving is enabled, otherwise `./mlruns`
  defaultArtifactRoot: "./mlruns"

  # Basic authentication configuration,
  # for more information, please visit https://mlflow.org/docs/latest/auth/index.html#configuration
  basicAuth:
    # -- Specifies whether to enable basic authentication
    enabled: true
    # -- Default permission on all resources, available options are `READ`, `EDIT`, `MANAGE` and `NO_PERMISSIONS`
    defaultPermission: NO_PERMISSIONS
    # -- Database location to store permissions and user data e.g. `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`
    databaseUri: postgresql+psycopg2://<username>:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_basic_auth
    # -- Default admin username if the admin is not already created
    adminUsername: admin
    # -- Default admin password if the admin is not already created
    adminPassword: password
    # -- Function to authenticate requests
    authorizationFunction: mlflow.server.auth:authenticate_request_basic_auth

# For more information about how to configure backend store, please visit https://mlflow.org/docs/latest/tracking/backend-stores.html
backendStore:
  # -- Backend store uri e.g. `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`
  backendStoreUri: postgresql+psycopg2://mlflow:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_store

Replace trackingServer.basicAuth.databaseUri with the address of the database named mlflow_basic_auth in the Prerequisites section. Example: postgresql+psycopg2://<username>:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_basic_auth.
Configure the trackingServer.basicAuth.adminUsername and trackingServer.basicAuth.adminPassword parameters to set the username and initial password of the MLflow administrator. An administrator is created from these values only if no administrator exists yet.
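With BasicAuth enabled, clients must send credentials with each request. The following Python sketch shows how a standard HTTP Basic Authorization header is derived from a username and password. The function name is hypothetical, and admin and password are the placeholder values from the example above, which you should replace with your own.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the HTTP Basic Authorization header for the given credentials."""
    # HTTP Basic authentication sends base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials taken from the chart example; do not use them in production.
headers = basic_auth_header("admin", "password")
print(headers)
```

The MLflow Python client does not need this header built by hand: it reads the MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD environment variables and sends the credentials itself.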
For more information about the parameters of MLflow, see MLflow.
Step 2: Access the MLflow web UI deployed in the ACK cluster
Run the following command to forward traffic from local port 5000 to the pod of the MLflow web UI service:
kubectl port-forward -n kube-ai services/mlflow 5000
Expected output:
Forwarding from 127.0.0.1:5000 -> 5000
Forwarding from [::1]:5000 -> 5000
Handling connection for 5000
Handling connection for 5000
...
Enter http://127.0.0.1:5000 into the address bar of the browser to access the MLflow web UI.
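You can also check the forwarded port from a script instead of the browser. The following Python sketch sends a GET request to the /health endpoint of the tracking server and reports whether it answers with HTTP 200. The function name is hypothetical, and the sketch sends no credentials, so it may report False if your deployment requires authentication for this endpoint.

```python
import urllib.error
import urllib.request


def mlflow_is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as unreachable.
        return False


if __name__ == "__main__":
    # Assumes that the kubectl port-forward command from this step is still running.
    print(mlflow_is_reachable("http://127.0.0.1:5000"))
```

If the check returns False, make sure that the kubectl port-forward command is still running and that the namespace in the command matches the namespace in which MLflow is deployed.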
What to do next: manage models
You can use the cloud-native AI suite to manage models in MLflow Model Registry. For more information about how to use AI Developer Console and the Arena CLI to manage models, see Manage models in MLflow Model Registry.