Container Service for Kubernetes: Configure MLflow Model Registry

Last Updated: Aug 09, 2024

MLflow is an open source platform for managing the machine learning lifecycle. You can use it to track model training information and to manage and deploy machine learning models. This topic describes how to configure MLflow Model Registry to support the model management feature.

Introduction to MLflow Model Registry

For more information about MLflow Model Registry, see MLflow Model Registry - MLflow documentation.

Prerequisites

  • A Container Service for Kubernetes (ACK) Pro cluster that runs Kubernetes 1.20 or later is created. For more information, see Create an ACK Pro cluster.

  • An ApsaraDB RDS for PostgreSQL instance is created. For more information, see Create an ApsaraDB RDS for PostgreSQL instance.

    We recommend that you select the virtual private cloud (VPC) of the ACK cluster when you create the ApsaraDB RDS for PostgreSQL instance and add the VPC CIDR block to the whitelist of the instance so that you can use private IP addresses to access the instance. If the ApsaraDB RDS for PostgreSQL instance and ACK cluster reside in different VPCs, make sure that Internet access is enabled for the ApsaraDB RDS for PostgreSQL instance and add the VPC CIDR block of the ACK cluster to the whitelist of the instance. For more information, see Configure an IP address whitelist.

  • A regular user account named mlflow is created on the ApsaraDB RDS for PostgreSQL instance. For more information, see Create an account.

  • A database named mlflow_store is created on the ApsaraDB RDS for PostgreSQL instance to store model metadata. Authorized By is set to mlflow. For more information, see Create a database.

  • (Optional) A database named mlflow_basic_auth is created on the ApsaraDB RDS for PostgreSQL instance to store MLflow user authentication information. Authorized By is set to mlflow. For more information, see Create a database.

  • The Arena client is configured to manage models. The Arena version is 0.9.14 or later. For more information, see Configure the Arena client.
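
Before you deploy MLflow, it can help to confirm that the whitelist and VPC settings described above actually allow connections to the database. The following is a minimal sketch, not an official tool: the helper name can_reach is hypothetical, and port 5432 (the PostgreSQL default) is an assumption that you should replace if your instance listens on a different port. Run it from a machine or pod that uses the same network path as the ACK cluster.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves network reachability (routing and whitelist).
    It does not verify the database account or password.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example call (the endpoint is the placeholder used in this topic):
# can_reach("pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com")
```

If the function returns False, recheck the IP address whitelist and whether you are using the internal or public endpoint.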

Step 1: Deploy MLflow in the ACK cluster

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Applications > Helm.

  3. Click Deploy. In the Deploy panel, set Application Name to mlflow and Namespace to kube-ai. In the Chart section, search for and click mlflow, and then click Next. In the message that appears, confirm whether to use mlflow as the default namespace of the chart.

    • To use AI Developer Console to manage models, deploy MLflow in the kube-ai namespace and use the default release name mlflow.

    • To use Arena to manage models, deploy MLflow in any namespace and use the default release name mlflow.

  4. In the Deploy panel, configure the parameters of the chart.

    1. Configure the defaultArtifactRoot and backendStore parameters as shown in the following example:

      trackingServer:
        # -- Mode in which the MLflow tracking server runs. Available options: `serve-artifacts`, `no-serve-artifacts`, and `artifacts-only`
        mode: no-serve-artifacts
        # -- Default artifact location for logging. Data is logged to `mlflow-artifacts:/` if artifact serving is enabled, otherwise to `./mlruns`
        defaultArtifactRoot: "./mlruns"
        
      # For more information about how to configure backend store, please visit https://mlflow.org/docs/latest/tracking/backend-stores.html
      backendStore:
        # -- Backend store uri e.g. `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`
        backendStoreUri: postgresql+psycopg2://mlflow:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_store

      Set backendStore.backendStoreUri to the connection URI of the mlflow_store database that you created in the Prerequisites section. Example: postgresql+psycopg2://mlflow:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_store.

      Important

      If the ApsaraDB RDS instance and the ACK cluster reside in the same VPC, use the internal endpoint of the ApsaraDB RDS instance. Otherwise, use the public endpoint of the ApsaraDB RDS instance and make sure that the ACK cluster can access it.

      Log on to the RDS PostgreSQL console, click Instance ID > Log On to Database > Internal/Public Endpoint to obtain the database endpoint pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com.

      For more information, see Database connection.

    2. (Optional) To enable BasicAuth, configure the following parameters.

      trackingServer:
        # -- Mode in which the MLflow tracking server runs. Available options: `serve-artifacts`, `no-serve-artifacts`, and `artifacts-only`
        mode: no-serve-artifacts
        # -- Default artifact location for logging. Data is logged to `mlflow-artifacts:/` if artifact serving is enabled, otherwise to `./mlruns`
        defaultArtifactRoot: "./mlruns"
        
        # Basic authentication configuration,
        # for more information, please visit https://mlflow.org/docs/latest/auth/index.html#configuration
        basicAuth:
          # -- Specifies whether to enable basic authentication
          enabled: true
          # -- Default permission on all resources, available options are `READ`, `EDIT`, `MANAGE` and `NO_PERMISSIONS`
          defaultPermission: NO_PERMISSIONS
          # -- Database location to store permissions and user data e.g. `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`
          databaseUri: postgresql+psycopg2://<username>:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_basic_auth
          # -- Default admin username if the admin is not already created
          adminUsername: admin
          # -- Default admin password if the admin is not already created
          adminPassword: password
          # -- Function to authenticate requests
          authorizationFunction: mlflow.server.auth:authenticate_request_basic_auth
          
      # For more information about how to configure backend store, please visit https://mlflow.org/docs/latest/tracking/backend-stores.html
      backendStore:
        # -- Backend store uri e.g. `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>`
        backendStoreUri: postgresql+psycopg2://mlflow:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_store

      • Set trackingServer.basicAuth.databaseUri to the connection URI of the mlflow_basic_auth database that you created in the Prerequisites section. Example: postgresql+psycopg2://<username>:<password>@pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com/mlflow_basic_auth.

      • Configure the trackingServer.basicAuth.adminUsername and trackingServer.basicAuth.adminPassword parameters to set the username and initial password of the MLflow administrator. The administrator account is created only if no administrator exists yet. Do not keep the example value password in a real deployment.

    For more information about the parameters of MLflow, see MLflow.
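
The backendStoreUri and databaseUri values in this step embed the database password directly in a URI. Characters such as @, / and : act as URI delimiters, so a password that contains them needs to be percent-encoded first. The following sketch shows one way to assemble such a URI. The function name build_postgres_uri is hypothetical and is not part of MLflow or the chart. The host is the placeholder endpoint used in this topic.

```python
from urllib.parse import quote


def build_postgres_uri(username, password, host, database, port=None):
    """Assemble a postgresql+psycopg2 URI with percent-encoded credentials.

    quote(..., safe="") encodes every reserved character, including
    '@', '/' and ':', so the password cannot be mistaken for a delimiter.
    """
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    netloc = f"{host}:{port}" if port is not None else host
    return f"postgresql+psycopg2://{userinfo}@{netloc}/{database}"


# Illustrative values only; replace them with your own account and endpoint.
uri = build_postgres_uri(
    "mlflow",
    "p@ss/w:rd",
    "pgm-xxxxxxxxxxxxxx.pg.rds.aliyuncs.com",
    "mlflow_store",
)
print(uri)
```

Paste the printed value into backendStore.backendStoreUri. The same helper can produce the value for trackingServer.basicAuth.databaseUri if you pass mlflow_basic_auth as the database name.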

Step 2: Access the MLflow web UI deployed in the ACK cluster

  1. Run the following command to forward traffic from local port 5000 to the pod of the MLflow web UI service. If you deployed MLflow in a namespace other than kube-ai, replace kube-ai with that namespace.

    kubectl port-forward -n kube-ai services/mlflow 5000

    Expected output:

    Forwarding from 127.0.0.1:5000 -> 5000
    Forwarding from [::1]:5000 -> 5000
    Handling connection for 5000
    Handling connection for 5000
    ...

  2. Enter http://127.0.0.1:5000 in the address bar of your browser to access the MLflow web UI.
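
While the port-forward is running, you can also check the tracking server from a script instead of the browser. The following sketch uses only the Python standard library. It assumes the server exposes the /health endpoint and the REST path /api/2.0/mlflow/registered-models/search as described in the MLflow documentation. The function names are hypothetical helpers, and the credentials are needed only if you enabled BasicAuth in Step 1.

```python
import base64
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # local end of the kubectl port-forward


def build_request(path, username=None, password=None, base_url=BASE_URL):
    """Create a GET request, adding an HTTP Basic Authorization header
    when both a username and a password are supplied."""
    req = urllib.request.Request(base_url.rstrip("/") + path)
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        req.add_header("Authorization", f"Basic {token}")
    return req


def is_healthy(username=None, password=None):
    """Return True if the server answers /health with HTTP 200."""
    req = build_request("/health", username, password)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status == 200


def list_registered_models(username=None, password=None):
    """Return the names of the registered models reported by the server."""
    req = build_request(
        "/api/2.0/mlflow/registered-models/search", username, password
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        payload = json.load(resp)
    # The key is absent when no models are registered yet.
    return [m["name"] for m in payload.get("registered_models", [])]


# Example calls (require the port-forward from the previous step):
# is_healthy()
# list_registered_models("admin", "<admin-password>")
```

On a fresh deployment, list_registered_models is expected to return an empty list because no models are registered yet.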

What to do next: manage models

You can use the cloud-native AI suite to manage models in MLflow Model Registry. For more information about how to use AI Developer Console and the Arena CLI to manage models, see Manage models in MLflow Model Registry.