
Platform For AI: Use FeatureStore to manage features in a recommendation system

Last Updated: Nov 14, 2024

This topic describes how to use FeatureStore to build and publish a recommendation system from scratch. To do so, you need to create a project in FeatureStore, register feature tables, and then publish a trained model online.

Background information

A recommendation system recommends personalized content or products to users based on their interests and preferences. How user and item features are extracted and configured has a major impact on the performance of such a system. This topic provides a solution that helps you use FeatureStore to build a recommendation system and shows how FeatureStore interacts with recommendation systems through the FeatureStore SDKs. The solution consists of the following steps: create a project in FeatureStore, register feature tables, create a model feature, export a training dataset, synchronize features from the offline data store to the online data store, train a model on the training dataset, deploy a service by using Elastic Algorithm Service (EAS), and then configure PAI-REC.

You can also directly run Python code in Notebook to complete the configuration. For more information, see DSW Gallery.

For more information about FeatureStore, see Overview.

If you have any questions when you use FeatureStore, join the DingTalk group (ID: 34415007523) for technical support.

Prerequisites

Before you perform the operations described in this topic, make sure that the following requirements are met:

  Platform for AI (PAI): PAI is activated and a PAI workspace is created. For more information, see Activate PAI and create the default workspace.

  MaxCompute: MaxCompute is activated and a MaxCompute project is created.

  Hologres: Hologres is activated and a Hologres instance is created.

  DataWorks: DataWorks is activated, a DataWorks workspace is created, and an exclusive resource group is created.

  Object Storage Service (OSS): OSS is activated. For more information, see Get started by using the OSS console.

Step 1: Prepare data

Synchronize data from simulated tables

In most recommendation scenarios, you need to prepare the following tables: user feature table, item feature table, and label table.

In this example, three simulated tables in the MaxCompute project pai_online_project are used: a user table, an item table, and a label table. Each partition of the user table and the item table contains approximately 100,000 data records and occupies about 70 MB of storage in the MaxCompute project. Each partition of the label table contains approximately 450,000 data records and occupies about 5 MB of storage.

You need to execute SQL statements in DataWorks to synchronize the user table, item table, and label table from the pai_online_project project to your own MaxCompute project. To synchronize data from the simulated tables, perform the following steps:

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, choose Data Development and Governance > DataStudio.

  3. On the DataStudio page, select the DataWorks workspace that you created and click Go to DataStudio.

  4. Move the pointer over Create and choose Create Node > MaxCompute > ODPS SQL. In the Create Node dialog box, configure the following node parameters:

    Engine Instance: Select the MaxCompute engine that you created.

    Node Type: Select ODPS SQL from the Node Type drop-down list.

    Path: Choose Business Flow > Workflow > MaxCompute.

    Name: Specify a name.

  5. Click Confirm.

  6. On the tab of the node that you created, run the following SQL statements to synchronize the user table, item table, and label table from the pai_online_project project to your MaxCompute project. Select the exclusive resource group that you created as the resource group.

    Synchronize the user table: rec_sln_demo_user_table_preprocess_all_feature_v1.

    CREATE TABLE IF NOT EXISTS rec_sln_demo_user_table_preprocess_all_feature_v1
    LIKE pai_online_project.rec_sln_demo_user_table_preprocess_all_feature_v1
    STORED AS ALIORC
    LIFECYCLE 90;
    
    INSERT OVERWRITE TABLE rec_sln_demo_user_table_preprocess_all_feature_v1
    PARTITION(ds='${bdp.system.bizdate}')
    SELECT * EXCEPT(ds)
    FROM pai_online_project.rec_sln_demo_user_table_preprocess_all_feature_v1
    WHERE ds = '${bdp.system.bizdate}';

    In the preceding code, ${bdp.system.bizdate} is a scheduling parameter that resolves to the data timestamp of the node. Backfill data for the node (see step 7) to generate the following partitions:

    ds=20231022

    ds=20231023

    ds=20231024

    Synchronize the item table: rec_sln_demo_item_table_preprocess_all_feature_v1.

    CREATE TABLE IF NOT EXISTS rec_sln_demo_item_table_preprocess_all_feature_v1
    LIKE pai_online_project.rec_sln_demo_item_table_preprocess_all_feature_v1
    STORED AS ALIORC
    LIFECYCLE 90;
    
    INSERT OVERWRITE TABLE rec_sln_demo_item_table_preprocess_all_feature_v1
    PARTITION(ds='${bdp.system.bizdate}')
    SELECT * EXCEPT(ds)
    FROM pai_online_project.rec_sln_demo_item_table_preprocess_all_feature_v1
    WHERE ds = '${bdp.system.bizdate}';

    In the preceding code, ${bdp.system.bizdate} is a scheduling parameter that resolves to the data timestamp of the node. Backfill data for the node (see step 7) to generate the following partitions:

    ds=20231022

    ds=20231023

    ds=20231024

    Synchronize the label table: rec_sln_demo_label_table.

    CREATE TABLE IF NOT EXISTS rec_sln_demo_label_table
    LIKE pai_online_project.rec_sln_demo_label_table
    STORED AS ALIORC
    LIFECYCLE 90;
    
    INSERT OVERWRITE TABLE rec_sln_demo_label_table
    PARTITION(ds='${bdp.system.bizdate}')
    SELECT * EXCEPT(ds)
    FROM pai_online_project.rec_sln_demo_label_table
    WHERE ds = '${bdp.system.bizdate}';

    In the preceding code, ${bdp.system.bizdate} is a scheduling parameter that resolves to the data timestamp of the node. Backfill data for the node (see step 7) to generate the following partitions:

    ds=20231022

    ds=20231023

    ds=20231024

  7. Backfill data for the node to populate the destination tables.

    1. Log on to the DataWorks console. In the left-side navigation pane, choose Data Development and Governance > Operation Center. On the Operation Center page, select a workspace from the drop-down list and click Go to Operation Center.

    2. In the left-side navigation pane, choose Auto Triggered Node O&M > Auto Triggered Nodes. The Auto Triggered Nodes page appears.

    3. On the Auto Triggered Nodes page, find the node that you want to manage and click DAG in the Actions column.

    4. Right-click the desired node and choose Run > Backfill Data for Current Node.

    5. In the Backfill Data dialog box, set Data Timestamp to 2023-10-22 to 2023-10-24 and click OK.

After you perform the preceding steps, you can view the user table rec_sln_demo_user_table_preprocess_all_feature_v1, item table rec_sln_demo_item_table_preprocess_all_feature_v1, and label table rec_sln_demo_label_table in your workspace. These three tables are used as examples to describe the operations.
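
Optionally, you can verify that the synchronization and backfill succeeded before you continue. The following minimal sketch assumes a PyODPS 3 node, in which the o entry object is built in, and uses the table and partition names from the preceding steps:

    # Check that the three synchronized tables and the backfilled partitions exist.
    # o is the built-in MaxCompute entry object of the PyODPS 3 node.
    tables = [
        'rec_sln_demo_user_table_preprocess_all_feature_v1',
        'rec_sln_demo_item_table_preprocess_all_feature_v1',
        'rec_sln_demo_label_table',
    ]
    for name in tables:
        assert o.exist_table(name), f'table {name} is missing'
        t = o.get_table(name)
        for ds in ('20231022', '20231023', '20231024'):
            print(name, ds, t.exist_partition(f'ds={ds}'))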

Configure data stores

In most cases, you need to configure an offline data store, such as a MaxCompute project, and an online data store, such as a Hologres instance, a GraphCompute instance, or a Tablestore instance, in FeatureStore. In this example, a MaxCompute project is configured as an offline data store and a Hologres instance is configured as an online data store.

  1. Log on to the PAI console. In the left-side navigation pane, choose Data Preparation > FeatureStore.

  2. On the FeatureStore page, select a workspace from the drop-down list and click Enter FeatureStore.

  3. Configure a MaxCompute data store.

    1. On the Store tab, click Create Store. In the Create Store panel, configure the following parameters for the MaxCompute data store:

      Type: Select MaxCompute from the Type drop-down list.

      Name: Specify a name.

      MaxCompute Project Name: Select the MaxCompute project that you created.

    2. Copy the authorization statement and click Go to. Execute the copied statement in DataWorks to authorize the Hologres instance to synchronize data from the MaxCompute project.

      Note

      To grant permissions to the Hologres instance, make sure that your account has admin permissions. For more information, see Manage user permissions by using commands or Manage user permissions in the MaxCompute console.

    3. Click Submit.

  4. Configure a Hologres data store.

    1. On the Store tab, click Create Store. In the Create Store panel, configure the following parameters for the Hologres data store:

      Type: Select Hologres from the Type drop-down list.

      Name: Specify a name.

      Instance ID: Select the Hologres instance that you created.

      Database Name: Select the database that you created in the Hologres instance.

    2. Click Submit.

    3. Grant permissions to access the Hologres instance. For more information, see Configure data sources.

Install FeatureStore SDK for Python

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, click Resource Groups.

  3. On the Exclusive Resource Groups tab, find the resource group that you want to manage. In the Actions column, choose O&M Assistant.

  4. Click Create Command. In the Create Command panel, configure the following parameters:

    Command Name: Specify a name. In this example, install is used.

    Command Type: Select Manual Installation (You cannot run pip commands to install third-party packages.).

    Command Content: Enter the following command:

    /home/tops/bin/pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple https://feature-store-py.oss-cn-beijing.aliyuncs.com/package/feature_store_py-1.3.1-py3-none-any.whl

    Timeout: Specify a timeout period.

  5. Click Create.

  6. Click Run Command. In the message that appears, click Run.

  7. Click Refresh to view the latest status of the command. If the state of the command changes to Successful, FeatureStore SDK is installed.
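
To verify the installation, you can run a minimal check such as the following in a PyODPS 3 node that is scheduled on the same exclusive resource group. The import path is the one used by the code later in this topic:

    # The import fails if FeatureStore SDK for Python is not installed on the resource group.
    from feature_store_py.fs_client import FeatureStoreClient
    print('FeatureStore SDK for Python is installed')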

Step 2: Create a project and register feature tables in FeatureStore

You can create a project and register feature tables in FeatureStore in the PAI console or by using FeatureStore SDK based on your business requirements. You must use FeatureStore SDK to export a training dataset and synchronize data. Therefore, you still need to install FeatureStore SDK for Python after you create a project and register feature tables in the PAI console.

Method 1: Use the PAI console

  1. Create a FeatureStore project.

    1. Log on to the PAI console. In the left-side navigation pane, choose Data Preparation > FeatureStore.

    2. On the FeatureStore page, select a workspace from the drop-down list and click Enter FeatureStore.

    3. Click Create Project. On the Create Project page, configure the following project parameters:

      Name: Specify a name. In this example, fs_demo is used.

      Description: Enter a custom description.

      Offline Store: Select the MaxCompute data store that you configured.

      Online Store: Select the Hologres data store that you configured.

    4. Click Submit.

  2. Create feature entities.

    1. On the FeatureStore page, find the created project and click the project name to go to the Project Details page.

    2. On the Feature Entity tab, click Create Feature Entity. In the Create Feature Entity panel, configure the following parameters for the user feature entity:

      Feature Entity Name: Specify a name. In this example, user is used.

      Join Id: Set this parameter to user_id.

    3. Click Submit.

    4. Click Create Feature Entity. In the Create Feature Entity panel, configure the following parameters for the item feature entity:

      Feature Entity Name: Specify a name. In this example, item is used.

      Join Id: Set this parameter to item_id.

    5. Click Submit.

  3. Create feature views.

    1. On the Feature View tab of the Project Details page, click Create Feature View. In the Create Feature View panel, configure the following parameters for the user feature view:

      View Name: Specify a name. In this example, user_table_preprocess_all_feature_v1 is used.

      Type: Select Offline.

      Write Mode: Select Use Offline Table.

      Store: Select the MaxCompute data store that you configured.

      Feature Table: Select the prepared user table rec_sln_demo_user_table_preprocess_all_feature_v1.

      Feature Field: Select the user_id primary key field.

      Synchronize Online Feature Table: Select Yes.

      Feature Entity: Select user.

      Feature Lifecycle: Use the default value.

    2. Click Submit.

    3. Click Create Feature View. In the Create Feature View panel, configure the following parameters for the item feature view:

      View Name: Specify a name. In this example, item_table_preprocess_all_feature_v1 is used.

      Type: Select Offline.

      Write Mode: Select Use Offline Table.

      Store: Select the MaxCompute data store that you configured.

      Feature Table: Select the prepared item table rec_sln_demo_item_table_preprocess_all_feature_v1.

      Feature Field: Select the item_id primary key field.

      Synchronize Online Feature Table: Select Yes.

      Feature Entity: Select item.

      Feature Lifecycle: Use the default value.

    4. Click Submit.

  4. Create a label table.

    1. On the Label Table tab of the Project Details page, click Create Label Table. In the Create Label Table panel, configure the following parameters for the label table:

      Store: Select the MaxCompute data store that you configured.

      Table Name: Select the prepared label table rec_sln_demo_label_table.

    2. Click Submit.

  5. Create a model feature.

    1. On the Model Features tab of the Project Details page, click Create Model Feature. In the Create Model Feature panel, configure the following parameters for the model feature:

      Model Feature Name: Specify a name. In this example, fs_rank_v1 is used.

      Select Feature: Select the user feature view and item feature view that you created.

      Label Table Name: Select the label table rec_sln_demo_label_table that you created.

    2. Click Submit.

    3. On the Model Features tab, find the model feature that you created and click the name of the model feature.

    4. On the Basic Information tab of the Model Feature Details panel, view the value of the Export Table Name parameter. In this example, the value is fs_demo_fs_rank_v1_trainning_set. You can use this table to generate features and train a model. A sketch that previews this table is provided at the end of Step 3.

  6. Install FeatureStore SDK for Python. For more information, see the Install FeatureStore SDK for Python section of this topic.

Method 2: Use FeatureStore SDK for Python

For more information about how to use FeatureStore SDK, see Feature Store.
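
FeatureStore SDK for Python can also create projects and register feature entities, feature views, and label tables programmatically; see the linked documentation for the full API. As a minimal connectivity sketch that uses only the client calls shown elsewhere in this topic, the following code verifies that a project created by either method is reachable through the SDK. Replace the AccessKey placeholders with your own credentials if you run the code outside a PyODPS node:

    from feature_store_py.fs_client import FeatureStoreClient

    # Connect to FeatureStore and fetch the objects registered for this example.
    fs = FeatureStoreClient(access_key_id='<your_access_key_id>',
                            access_key_secret='<your_access_key_secret>',
                            region='cn-beijing')
    project = fs.get_project('fs_demo')
    user_view = project.get_feature_view('user_table_preprocess_all_feature_v1')
    item_view = project.get_feature_view('item_table_preprocess_all_feature_v1')
    model = project.get_model('fs_rank_v1')
    print('fs_demo project and registered objects are reachable')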

Step 3: Export a training dataset and train a model

  1. Export a training dataset.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, choose Data Development and Governance > DataStudio.

    3. On the DataStudio page, select the DataWorks workspace that you created and click Go to DataStudio.

    4. Move the pointer over Create and choose Create Node > MaxCompute > PyODPS 3. In the Create Node dialog box, configure the following node parameters:

      Engine Instance: Select the MaxCompute engine that you created.

      Node Type: Set this parameter to PyODPS 3.

      Path: Choose Business Flow > Workflow > MaxCompute.

      Name: Specify a name.

    5. Click Confirm.

    6. Copy the following code to the code editor:

      from feature_store_py.fs_client import FeatureStoreClient
      from feature_store_py.fs_config import LabelInputConfig, PartitionConfig, FeatureViewConfig
      from feature_store_py.fs_config import TrainSetOutputConfig
      import datetime
      
      # args and o are built-in variables of the PyODPS 3 node. The dt scheduling
      # parameter is passed in by DataWorks.
      cur_day = args['dt']
      print('cur_day = ', cur_day)
      offset = datetime.timedelta(days=-1)
      pre_day = (datetime.datetime.strptime(cur_day, "%Y%m%d") + offset).strftime('%Y%m%d')
      print('pre_day = ', pre_day)
      
      # Connect to FeatureStore by using the credentials of the current MaxCompute account.
      access_key_id = o.account.access_id
      access_key_secret = o.account.secret_access_key
      fs = FeatureStoreClient(access_key_id=access_key_id, access_key_secret=access_key_secret, region='cn-beijing')
      cur_project_name = 'fs_demo'
      project = fs.get_project(cur_project_name)
      
      # Read the label table partition of the current day.
      label_partitions = PartitionConfig(name='ds', value=cur_day)
      label_input_config = LabelInputConfig(partition_config=label_partitions)
      
      # Join the user and item feature view partitions of the previous day.
      user_partitions = PartitionConfig(name='ds', value=pre_day)
      feature_view_user_config = FeatureViewConfig(name='user_table_preprocess_all_feature_v1',
                                                   partition_config=user_partitions)
      
      item_partitions = PartitionConfig(name='ds', value=pre_day)
      feature_view_item_config = FeatureViewConfig(name='item_table_preprocess_all_feature_v1',
                                                   partition_config=item_partitions)
      feature_view_config_list = [feature_view_user_config, feature_view_item_config]
      
      # Write the training dataset to the partition of the current day.
      train_set_partitions = PartitionConfig(name='ds', value=cur_day)
      train_set_output_config = TrainSetOutputConfig(partition_config=train_set_partitions)
      
      # Export the training dataset of the fs_rank_v1 model feature and wait for completion.
      model_name = 'fs_rank_v1'
      cur_model = project.get_model(model_name)
      task = cur_model.export_train_set(label_input_config, feature_view_config_list, train_set_output_config)
      task.wait()
      print("task_summary = ", task.task_summary)
    7. Click Properties on the right side of the tab. In the Properties panel, configure the following scheduling properties:

      Scheduling Parameter:

        Parameter Name: Set this parameter to dt.

        Parameter Value: Set this parameter to $[yyyymmdd-1].

      Resource Group: Select the exclusive resource group that you created.

      Dependencies: Select the user and item tables that you created.

    8. After the node is configured and tested, save and submit the node configurations.

    9. Backfill data for the node. For more information, see the Synchronize data from simulated tables section of this topic.

  2. Optional. View the export job.

    1. On the FeatureStore page, find the created project and click the project name to go to the Project Details page.

    2. On the Project Details page, click Jobs.

    3. On the Jobs tab, find the job that you want to manage and click the name of the job. In the panel that appears, view the basic information, configurations, and logs of the job.

  3. Train a model.

    EasyRec is an open source recommendation system framework that can be seamlessly connected to FeatureStore to train, export, and publish models. We recommend that you use EasyRec to train a model by using the fs_demo_fs_rank_v1_trainning_set table as the training dataset.

    • For more information about the open source code of EasyRec, see EasyRec.

    • For more information about EasyRec, see What is EasyRec?

    • For more information about how to use EasyRec to train models, see train_config.

If you have other questions about EasyRec, join the DingTalk group (ID: 32260796) for technical support.
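
Before you train the model, you can optionally preview the exported training dataset. The following minimal sketch assumes a PyODPS 3 node and uses the export table name shown in the console; the partition value is one of the backfilled dates from Step 1:

    # Inspect the schema and a few rows of the exported training dataset.
    train_table = o.get_table('fs_demo_fs_rank_v1_trainning_set')
    print(train_table.schema)  # columns joined from the label, user, and item tables
    for record in train_table.head(5, partition='ds=20231024'):
        print(record)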

Step 4: Publish the model

After you train and export the model, you can deploy and publish it. If you use a self-managed recommendation system, you can use FeatureStore SDK for Python, Go, C++, or Java to connect your recommendation system to FeatureStore. You can also join the DingTalk group (ID: 32260796) for technical support on how to connect your recommendation system to FeatureStore. FeatureStore is seamlessly integrated with other Alibaba Cloud services, which you can use to quickly build and publish a recommendation system.

In this example, Alibaba Cloud services are used to publish a model.

Step 1: Configure routine data synchronization nodes

Before you publish a model, you must configure routine data synchronization nodes to synchronize data from the offline data store to the online data store on a regular basis. Then, data can be read from the online data store in real time. In this example, data in the user and item tables needs to be synchronized on a regular basis. To configure routine data synchronization nodes, perform the following steps:

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, choose Data Development and Governance > DataStudio.

  3. On the DataStudio page, select the DataWorks workspace that you created and click Go to DataStudio.

  4. Synchronize data from the user table on a regular basis.

    1. Move the pointer over Create and choose Create Node > MaxCompute > PyODPS 3.

    2. Copy the following code to the code editor. The code is used to synchronize data from the user_table_preprocess_all_feature_v1 feature view on a regular basis.


      from feature_store_py.fs_client import FeatureStoreClient
      
      # args and o are built-in variables of the PyODPS 3 node. The dt scheduling
      # parameter is passed in by DataWorks.
      cur_day = args['dt']
      print('cur_day = ', cur_day)
      
      # Connect to FeatureStore by using the credentials of the current MaxCompute account.
      access_key_id = o.account.access_id
      access_key_secret = o.account.secret_access_key
      fs = FeatureStoreClient(access_key_id=access_key_id, access_key_secret=access_key_secret, region='cn-beijing')
      cur_project_name = 'fs_demo'
      project = fs.get_project(cur_project_name)
      
      # Publish the partition of the current day from the offline store to the online store.
      feature_view_name = 'user_table_preprocess_all_feature_v1'
      batch_feature_view = project.get_feature_view(feature_view_name)
      task = batch_feature_view.publish_table(partitions={'ds': cur_day}, mode='Overwrite')
      task.wait()
      task.print_summary()
    3. Click Properties on the right side of the tab. In the Properties panel, configure the following scheduling properties:

      Scheduling Parameter:

        Parameter Name: Set this parameter to dt.

        Parameter Value: Set this parameter to $[yyyymmdd-1].

      Resource Group: Select the exclusive resource group that you created.

      Dependencies: Select the user table that you created.

    4. After the node is configured and tested, save and submit the node configurations.

    5. Backfill data for the node. For more information, see the Synchronize data from simulated tables section of this topic.

  5. Synchronize data from the item table on a regular basis.

    1. Move the pointer over Create and choose Create Node > MaxCompute > PyODPS 3. In the Create Node dialog box, configure the node parameters.

    2. Click Confirm.

    3. Copy the following code to the code editor:

      Synchronize data from the item_table_preprocess_all_feature_v1 feature view:

      from feature_store_py.fs_client import FeatureStoreClient
      
      # args and o are built-in variables of the PyODPS 3 node. The dt scheduling
      # parameter is passed in by DataWorks.
      cur_day = args['dt']
      print('cur_day = ', cur_day)
      
      # Connect to FeatureStore by using the credentials of the current MaxCompute account.
      access_key_id = o.account.access_id
      access_key_secret = o.account.secret_access_key
      fs = FeatureStoreClient(access_key_id=access_key_id, access_key_secret=access_key_secret, region='cn-beijing')
      cur_project_name = 'fs_demo'
      project = fs.get_project(cur_project_name)
      
      # Publish the partition of the current day from the offline store to the online store.
      feature_view_name = 'item_table_preprocess_all_feature_v1'
      batch_feature_view = project.get_feature_view(feature_view_name)
      task = batch_feature_view.publish_table(partitions={'ds': cur_day}, mode='Overwrite')
      task.wait()
      task.print_summary()
    4. Click Properties on the right side of the tab. In the Properties panel, configure the following scheduling properties:

      Scheduling Parameter:

        Parameter Name: Set this parameter to dt.

        Parameter Value: Set this parameter to $[yyyymmdd-1].

      Resource Group: Select the exclusive resource group that you created.

      Dependencies: Select the item table that you created.

    5. After the node is configured and tested, save and submit the node configurations.

    6. Backfill data for the node. For more information, see the Synchronize data from simulated tables section of this topic.

  6. After the data is synchronized, view the latest features that are synchronized in the Hologres data store.
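
To spot-check the features in the online data store, you can query the Hologres database directly. Hologres is compatible with the PostgreSQL protocol, so the following sketch uses psycopg2; the endpoint, database, and online table names are placeholders that you can look up in the Hologres console, and the AccessKey pair is used as the username and password:

    import psycopg2

    # Connect to the Hologres instance over the PostgreSQL protocol.
    conn = psycopg2.connect(
        host='<hologres_endpoint>', port=80,
        dbname='<database>', user='<access_key_id>', password='<access_key_secret>')
    with conn.cursor() as cur:
        # Preview a few rows of the online table of a feature view.
        cur.execute('SELECT * FROM "<online_feature_table>" LIMIT 5')
        for row in cur.fetchall():
            print(row)
    conn.close()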

Step 2: Create and deploy a service by using EAS

The service receives requests from the recommendation engine, scores items based on the requests, and returns the scores. The EasyRec processor is integrated with FeatureStore SDK for C++, which implements feature extraction with low latency and high performance. After the EasyRec processor extracts features by using FeatureStore SDK for C++, it sends the features to the model for inference and returns the scores to the recommendation engine.

To deploy the service, perform the following steps:

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, choose Data Development and Governance > DataStudio.

  3. On the DataStudio page, select the DataWorks workspace that you created and click Go to DataStudio.

  4. Move the pointer over Create and choose Create Node > MaxCompute > PyODPS 3.

  5. Copy the following code to the code editor:

    import os
    import json
    
    # The ymd scheduling parameter must be configured for this node. It selects
    # the dated model export directory in model_path below.
    config = {
      "name": "fs_demo_v1",
      "metadata": {
        "cpu": 4,
        "rpc.max_queue_size": 256,
        "rpc.enable_jemalloc": 1,
        "gateway": "default",
        "memory": 16000
      },
      "model_path": f"oss://beijing0009/EasyRec/deploy/rec_sln_demo_dbmtl_v1/{args['ymd']}/export/final_with_fg",  # The path of the trained model. You can specify a custom path. 
      "model_config": {
        "access_key_id": f'{o.account.access_id}',
        "access_key_secret": f'{o.account.secret_access_key}',
        "region": "cn-beijing",  # Replace the value with the ID of the region in which PAI resides. In this example, cn-beijing is used. 
        "fs_project": "fs_demo",  # Replace the value with the name of your project in FeatureStore. In this example, fs_demo is used. 
        "fs_model": "fs_rank_v1",  # Replace the value with the name of your model feature in FeatureStore. In this example, fs_rank_v1 is used. 
        "fs_entity": "item",
        "load_feature_from_offlinestore": True,
        "steady_mode": True,
        "period": 2880,
        "outputs": "probs_is_click,y_ln_playtime,probs_is_praise",
        "fg_mode": "tf"
      },
      "processor": "easyrec-1.9",
      "processor_type": "cpp"
    }
    
    with open("echo.json", "w") as output_file:
        json.dump(config, output_file)
    
    # Run the following code if you deploy the service for the first time:
    os.system(f"/home/admin/usertools/tools/eascmd -i {o.account.access_id} -k {o.account.secret_access_key} -e pai-eas.cn-beijing.aliyuncs.com create echo.json")
    
    # Run the following line for routine updates:
    # os.system(f"/home/admin/usertools/tools/eascmd -i {o.account.access_id} -k {o.account.secret_access_key} -e pai-eas.cn-beijing.aliyuncs.com modify fs_demo_v1 -s echo.json")
  6. Click Properties on the right side of the tab. In the Properties panel, configure the following scheduling properties:

    Scheduling Parameter:

      Parameter Name: Set this parameter to ymd so that it matches args['ymd'] in the preceding code.

      Parameter Value: Set this parameter to $[yyyymmdd-1].

    Resource Group: Select the exclusive resource group that you created.

    Dependencies: Select the training job and the item_table_preprocess_all_feature_v1 feature view.

  7. After the node is configured and tested, run the node to view the deployment status.

  8. After the deployment is complete, comment out the line that runs the eascmd create command and uncomment the line that runs the eascmd modify command so that the job can run on a regular basis.

  9. Optional. View the deployed service on the Inference Service tab of the Elastic Algorithm Service (EAS) page. For more information, see Deploy a model service in the PAI console.

  10. Optional. Connect EAS to the virtual private cloud (VPC) in which the data store resides. Data stores, such as a Hologres data store, can be accessed only over the specified VPC. In this example, a Hologres data store is used.

    1. In the Hologres console, view the basic information about the Hologres instance, such as the VPC ID and vSwitch ID.

    2. In the upper-right corner of the Elastic Algorithm Service (EAS) page of the PAI console, click Configure Direct Connection.

    3. In the Configure Direct Connection dialog box, enter the VPC ID and vSwitch ID of the Hologres instance in the VPC and vSwitch fields, and configure the Security Group Name parameter. You can select an existing security group or create a security group. The security group must have the port that is used to connect to the Hologres data store enabled. In most cases, port 80 is used. Therefore, select a security group for which port 80 is enabled.

    4. Click OK. The service is available after it is updated.
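
You can also check the deployment state from the same node environment. The following minimal sketch assumes the eascmd desc subcommand, which prints the status of a deployed service, and reuses the tool path and endpoint from the deployment code:

    import os

    # Print the current status of the fs_demo_v1 service.
    os.system(f"/home/admin/usertools/tools/eascmd -i {o.account.access_id} "
              f"-k {o.account.secret_access_key} "
              f"-e pai-eas.cn-beijing.aliyuncs.com desc fs_demo_v1")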

Step 3: Configure PAI-REC

PAI-REC is a recommendation engine service that integrates FeatureStore SDK for Go and can be seamlessly integrated with FeatureStore and EAS.

To configure PAI-REC, perform the following steps:

  1. Configure the FeatureStoreConfs parameter.

    • RegionId: the ID of the region in which FeatureStore resides. In this example, cn-beijing is used.

    • ProjectName: the name of the project that you created in FeatureStore. In this example, fs_demo is used.

        "FeatureStoreConfs": {
            "pairec-fs": {
                "RegionId": "cn-beijing",
                "AccessId": "${AccessKey}",
                "AccessKey": "${AccessSecret}",
                "ProjectName": "fs_demo"
            }
        },
  2. Configure the FeatureConfs parameter.

    • FeatureStoreName: Set this parameter to pairec-fs, which is specified in the FeatureStoreConfs parameter.

    • FeatureStoreModelName: the name of the model feature that you created. In this example, fs_rank_v1 is used.

    • FeatureStoreEntityName: the name of the feature entity that you created. In this example, user is used. The parameter settings enable PAI-REC to extract features from the user feature entity in the fs_rank_v1 model by using FeatureStore SDK for Go.

        "FeatureConfs": {
            "recreation_rec": {
                "AsynLoadFeature": true,
                "FeatureLoadConfs": [
                    {
                        "FeatureDaoConf": {
                            "AdapterType": "featurestore",
                            "FeatureStoreName": "pairec-fs",
                            "FeatureKey": "user:uid",
                            "FeatureStoreModelName": "fs_rank_v1",
                            "FeatureStoreEntityName": "user",
                            "FeatureStore": "user"
                        }
                    }
                ]
            }
        },
  3. Configure the AlgoConfs parameter.

    The AlgoConfs parameter specifies the scoring service in EAS to which PAI-REC connects.

    • Name: the name of the service that you deployed by using EAS.

    • Url and Auth: the URL and token that are used to access the service that you deployed by using EAS. You can click the service name on the Elastic Algorithm Service (EAS) page, and then click View Endpoint Information on the Service Details tab to obtain the URL and token. For more information, see FAQ about EAS.

        "AlgoConfs": [
            {
                "Name": "fs_demo_v1",
                "Type": "EAS",
                "EasConf": {
                    "Processor": "EasyRec",
                    "Timeout": 300,
                    "ResponseFuncName": "easyrecMutValResponseFunc",
                    "Url": "eas_url_xxx",
                    "EndpointType": "DIRECT",
                    "Auth": "eas_token"
                }
            }
        ],