Platform For AI: ALS Matrix Factorization

Last updated: Nov 26, 2024

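Alternating Least Squares (ALS) is a matrix factorization algorithm commonly used in recommender systems, particularly for collaborative filtering. Its main goal is to factor a user-item rating matrix into the product of two low-rank matrices, which reduces dimensionality, fills in missing values, and uncovers latent user preferences and item features.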
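
To make the factorization concrete, the following is a minimal NumPy sketch of explicit-feedback ALS. It is not the implementation used by this component. The function name `als_factorize`, the dense matrix with a boolean mask for observed ratings, the random initialization, and the unweighted L2 penalty are all assumptions made for illustration; the default values simply mirror the component defaults listed on this page.

```python
import numpy as np


def als_factorize(ratings, observed, num_factors=10, num_iters=10, reg=0.1, seed=0):
    """Factor a users x items matrix into user and item factor matrices.

    Only entries where `observed` is True contribute to the squared error.
    Hypothetical helper for illustration, not the PAI component's code.
    """
    rng = np.random.default_rng(seed)
    num_users, num_items = ratings.shape
    user_factors = rng.random((num_users, num_factors))
    item_factors = rng.random((num_items, num_factors))
    ridge = reg * np.eye(num_factors)

    for _ in range(num_iters):
        # Hold item factors fixed and solve one ridge regression per user.
        for u in range(num_users):
            idx = observed[u]
            if not idx.any():
                continue  # no ratings for this user; keep current factors
            Y = item_factors[idx]
            user_factors[u] = np.linalg.solve(Y.T @ Y + ridge, Y.T @ ratings[u, idx])
        # Hold user factors fixed and solve one ridge regression per item.
        for i in range(num_items):
            idx = observed[:, i]
            if not idx.any():
                continue  # no ratings for this item; keep current factors
            X = user_factors[idx]
            item_factors[i] = np.linalg.solve(X.T @ X + ridge, X.T @ ratings[idx, i])
    return user_factors, item_factors


# Toy 3 x 3 rating matrix; zeros mark missing ratings in this example.
toy_ratings = np.array([[5.0, 3.0, 0.0], [4.0, 0.0, 1.0], [0.0, 1.0, 5.0]])
toy_observed = toy_ratings > 0
U, V = als_factorize(toy_ratings, toy_observed, num_factors=2, num_iters=20)
filled = U @ V.T  # dense approximation, including the missing cells
```

Each half-step has a closed-form solution because fixing one factor matrix turns the problem into independent regularized least-squares problems, which is where the "alternating" in the name comes from.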

Supported computing resources

MaxCompute/Flink

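Inputs and outputs

Input ports

Supported upstream components:

Output ports

The output user factors and item factors feed the downstream component: ALS Rating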

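Configure the component

Add the ALS Matrix Factorization component on the Designer workflow page, then configure its parameters in the panel on the right: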

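| Parameter type | Parameter | Description |
| --- | --- | --- |
| Field settings | user column name | Name of the user ID column in the input data source. The column must be of type BIGINT. |
| | item column name | Name of the item column in the input data source. The column must be of type BIGINT. |
| | rating column name | Name of the column in the input data source that holds each user's rating of an item. The column must be numeric. |
| Parameter settings | Number of factors | Default: 10. Valid range: (0,+∞). |
| | Number of iterations | Default: 10. Valid range: (0,+∞). |
| | Regularization coefficient | Default: 0.1. Valid range: (0,+∞). |
| | Checkbox | Whether to use the implicit preference model. |
| | Implicit preference coefficient | Default: 40. Valid range: (0,+∞). |
| | Output table lifecycle | Lifecycle of the output model table, in days. |
| Execution tuning | Number of nodes | Valid range: 1 to 9999. |
| | Memory per node | Valid range: 1024 MB to 64*1024 MB. |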
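
This page does not define the implicit preference model or its coefficient any further. In the widely used implicit-feedback formulation of ALS, a coefficient α converts each raw interaction value r into a binary preference (1 if r > 0, otherwise 0) and a confidence weight c = 1 + α·r. The sketch below assumes, without confirmation from this page, that the implicit preference coefficient plays the role of α; the function name is hypothetical.

```python
import numpy as np


def implicit_preference_and_confidence(ratings, alpha=40.0):
    """Map raw interaction values to (preference, confidence) arrays.

    Assumption: alpha corresponds to the implicit preference coefficient.
    """
    ratings = np.asarray(ratings, dtype=float)
    preference = (ratings > 0).astype(float)  # 1.0 where any interaction exists
    confidence = 1.0 + alpha * ratings        # larger values are trusted more
    return preference, confidence


# Rating values taken from the first five rows of the sample input data.
preference, confidence = implicit_preference_and_confidence([0, 1, 2, 2, 4])
print(preference)
print(confidence)
```

Under this assumption, a larger coefficient makes observed interactions dominate the loss, while unobserved pairs still contribute with a baseline confidence of 1.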

Example

Using the following data as input to the ALS algorithm template produces user factors and item factors as output:

  • Input data source

    | user_id | item_id | rating |
    | --- | --- | --- |
    | 10944750 | 13451 | 0 |
    | 10944751 | 13452 | 1 |
    | 10944752 | 13453 | 2 |
    | 10944753 | 13454 | 2 |
    | 10944754 | 13455 | 4 |
    | ... ... | ... ... | ... ... |

  • Output user factor table

    | user_id | factors |
    | --- | --- |
    | 8528750 | [0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,0.02046867,0.022253247,0.027391396,0.018985065,0.04889483] |
    | 282500 | [0.116156064,0.07193632,0.090851225,0.017075706,0.025412979,0.047022138,0.12534861,0.05869226,0.11170533,0.1640192] |
    | 4895250 | [0.038429666,0.061858658,0.04236993,0.055866677,0.031814687,0.0417443,0.012085311,0.0379342,0.10767074,0.028392972] |
    | ... ... | ... ... |

  • Output item factor table

    | item_id | factors |
    | --- | --- |
    | 24601 | [0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,0.0059205987,0.008568814,0.0015981696,0.0,0.013601779] |
    | 26699 | [0.0027524426,0.0043066847,0.0031336215,0.00269448,0.0022347474,0.0020477585,0.0027995422,0.0025390312,0.0033011117,0.003957773] |
    | 20751 | [0.03902271,0.050952066,0.032981463,0.03862796,0.048720762,0.027976315,0.02721664,0.018149626,0.0149896275,0.026251089] |
    | ... ... | ... ... |
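
In matrix factorization, the predicted score for a user-item pair is conventionally the dot product of the user's factor vector and the item's factor vector. The sketch below applies that convention to the first row of each sample factor table. It assumes the factors column is stored as a bracketed, comma-separated string as displayed; `parse_factors` and `predict_score` are hypothetical helpers, and this page does not confirm that the downstream scoring component computes scores this way.

```python
import json

import numpy as np


def parse_factors(text):
    """Parse a '[v1,v2,...]' factor string into a float array."""
    return np.array(json.loads(text), dtype=float)


def predict_score(user_factors, item_factors):
    """Dot product of two factor vectors of equal length."""
    if user_factors.shape != item_factors.shape:
        raise ValueError("user and item factor vectors must have the same length")
    return float(np.dot(user_factors, item_factors))


# First row of the sample user factor table (user_id 8528750).
user_vec = parse_factors(
    "[0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,"
    "0.02046867,0.022253247,0.027391396,0.018985065,0.04889483]"
)
# First row of the sample item factor table (item_id 24601).
item_vec = parse_factors(
    "[0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,"
    "0.0059205987,0.008568814,0.0015981696,0.0,0.013601779]"
)

score = predict_score(user_vec, item_vec)
print(score)
```

Both vectors have 10 entries, matching the default number of factors, so the dot product is well defined.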