The Factorization Machine (FM) algorithm is a nonlinear model that captures interactions among features. It is well suited to recommendation scenarios such as e-commerce, advertising, and live video streaming, in which commodities are promoted to users.
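To make "interactions among features" concrete, the following minimal sketch scores one sparse sample with the standard second-order FM formula: a bias, a linear term, and pairwise interactions computed from per-feature latent vectors. It illustrates the general FM model only and is not the implementation used by the components; the function name `fm_score` and the toy weights are hypothetical.

```python
from typing import Dict, List


def fm_score(x: Dict[int, float], w0: float, w: Dict[int, float],
             v: Dict[int, List[float]]) -> float:
    """Second-order FM score for one sparse sample x (feature index -> value)."""
    # 0th-order (bias) plus 1st-order (linear) terms.
    score = w0 + sum(w.get(i, 0.0) * xi for i, xi in x.items())
    k = len(next(iter(v.values()))) if v else 0
    # 2nd-order term, using the identity
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]
    for f in range(k):
        s = 0.0
        s_sq = 0.0
        for i, xi in x.items():
            term = v.get(i, [0.0] * k)[f] * xi
            s += term
            s_sq += term * term
        score += 0.5 * (s * s - s_sq)
    return score


# Toy example with hypothetical weights: two active features, latent size 2.
x = {1: 1.0, 3: 1.0}
w0 = 0.1
w = {1: 0.2, 3: -0.1}
v = {1: [0.5, 1.0], 3: [1.0, 0.5]}
score = fm_score(x, w0, w, v)
```

The interaction between features 1 and 3 is the dot product of their latent vectors, so two features that rarely co-occur in the training data can still receive a meaningful interaction weight.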
Configure the components
Machine Learning Designer, previously known as Machine Learning Studio, provides the FM algorithm in the FM Train and FM Prediction components. You can use the templates that contain the components to create FM experiments. If you use Machine Learning Studio, you can find the FM-Embedding for Rec-System template on the Home page and click Create. If you use Machine Learning Designer, you can search for the [Alink]FM-Embedding for Rec-System template on the Pipeline Templates tab of the Visualized Modeling (Designer) page and click Create.
You can use one of the following methods to configure the FM algorithm component.
Method 1: Configure the component on the pipeline page
Component | Tab | Parameter | Description |
---|---|---|---|
FM Train | Fields Setting | Feature Columns | Select feature columns based on the characteristics of the input table. Columns of the STRING and DOUBLE types are supported. |
FM Train | Fields Setting | Label Column | Select a label column based on the characteristics of the input table. Only columns of the DOUBLE type are supported. |
FM Train | Fields Setting | Advanced Options | This parameter is available only in Machine Learning Designer. If you select Advanced Options, the Flink configuration item parameter becomes available. |
FM Train | Fields Setting | Flink configuration item | This parameter is available only in Machine Learning Designer. Specify the Flink configuration items. For more information, see Configuration. |
FM Train | Parameters Setting | Task Type | Select the task type: regression or binary classification. |
FM Train | Parameters Setting | Number of iterations | Specify the total number of iterations. Default value: 10. |
FM Train | Parameters Setting | Regularization coefficient | Specify three floating-point numbers separated by commas (,). The numbers are the regularization coefficients of the 0th-order, 1st-order, and 2nd-order terms. |
FM Train | Parameters Setting | Learning rate | Specify the learning rate. If training diverges, set this parameter to a smaller value. |
FM Train | Parameters Setting | Parameter initialization standard deviation | Specify the standard deviation used to initialize the model parameters. The value is of the DOUBLE type. Default value: 0.05. |
FM Train | Parameters Setting | Dimensions | Specify three positive integers separated by commas (,). The integers are the lengths of the 0th-order, 1st-order, and 2nd-order terms. |
FM Train | Parameters Setting | Block size | Specify the name of the performance metric. |
FM Train | Parameters Setting | Output table lifecycle | This parameter is available only in Machine Learning Studio. Specify the lifecycle of the output table. |
FM Train | Tuning | Number of Workers | Specify the number of workers. This parameter must be used together with the Memory Size per Node (MB) parameter. Valid values: 1 to 9999. |
FM Train | Tuning | Memory Size per Node (MB) | Specify the memory size of each node. This parameter must be used together with the Number of Workers parameter. Valid values: 1024 to 64 × 1024. Unit: MB. |
FM Prediction | Parameters Setting | Prediction Result Column | Specify the name of the prediction result column. |
FM Prediction | Parameters Setting | Prediction Score Column | This parameter is available only in Machine Learning Studio. Specify the name of the prediction score column. |
FM Prediction | Parameters Setting | Output Detail Column | Specify the name of the prediction detail column. |
FM Prediction | Parameters Setting | Reserved Columns | Specify the columns that you want to reserve in the output table. |
FM Prediction | Parameters Setting | Advanced Configuration | This parameter is available only in Machine Learning Designer. If you select Advanced Configuration, the Number of Threads using by each worker and Type of ModelSize parameters become available. |
FM Prediction | Parameters Setting | Number of Threads using by each worker | Specify the number of threads used for prediction in each worker. Default value: 1. |
FM Prediction | Parameters Setting | Type of ModelSize | Specify the model size type. Default value: small. |
FM Prediction | Tuning | Number of Workers | Specify the number of workers. This parameter must be used together with the Memory Size per Core (MB) parameter. Valid values: 1 to 9999. |
FM Prediction | Tuning | Memory Size per Core (MB) | Specify the memory size of each core. This parameter must be used together with the Number of Workers parameter. Valid values: 1024 to 64 × 1024. Unit: MB. |
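The role of the learning rate, the three regularization coefficients, the initialization standard deviation, and the dimensions can be illustrated with a single stochastic gradient descent update. The sketch below is a generic FM regression update with squared loss, not the component's actual training code. It assumes that the third value of the dimensions setting is the latent vector length and that initial latent values are drawn from a zero-mean Gaussian; the helper names `init_factors` and `sgd_step` and the toy sample are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Factors = Dict[int, List[float]]


def init_factors(feature_ids: Sequence[int], k: int, init_stdev: float,
                 seed: int = 0) -> Factors:
    """Draw each latent value from a zero-mean Gaussian with the given stdev."""
    rng = random.Random(seed)
    return {i: [rng.gauss(0.0, init_stdev) for _ in range(k)] for i in feature_ids}


def sgd_step(x: Dict[int, float], y: float, w0: float, w: Dict[int, float],
             v: Factors, learn_rate: float,
             lambdas: Tuple[float, ...]) -> Tuple[float, float]:
    """One squared-loss SGD update; returns (new w0, prediction before update)."""
    lam0, lam1, lam2 = lambdas  # coefficients for the 0th, 1st, 2nd order terms
    k = len(next(iter(v.values())))
    # sums[f] = sum_i v[i][f] * x_i, shared by the prediction and the gradients.
    sums = [sum(v[i][f] * xi for i, xi in x.items()) for f in range(k)]
    squares = [sum((v[i][f] * xi) ** 2 for i, xi in x.items()) for f in range(k)]
    pred = w0 + sum(w[i] * xi for i, xi in x.items())
    pred += 0.5 * sum(s * s - q for s, q in zip(sums, squares))
    err = pred - y  # derivative of 0.5 * (pred - y)^2 with respect to pred
    w0 -= learn_rate * (err + lam0 * w0)
    for i, xi in x.items():
        w[i] -= learn_rate * (err * xi + lam1 * w[i])
        for f in range(k):
            grad = xi * (sums[f] - v[i][f] * xi)
            v[i][f] -= learn_rate * (err * grad + lam2 * v[i][f])
    return w0, pred


# Settings in the same comma-separated form that the parameters accept.
dim = [int(d) for d in "1,1,10".split(",")]
lambdas = tuple(float(c) for c in "0.01,0.01,0.01".split(","))
learn_rate, init_stdev = 0.01, 0.05

x, y = {1: 1.0, 3: 1.0}, 1.0  # one toy sample (hypothetical data)
w0, w = 0.0, {1: 0.0, 3: 0.0}
v = init_factors(list(x), dim[2], init_stdev)

errors = []
for _ in range(50):
    w0, pred = sgd_step(x, y, w0, w, v, learn_rate, lambdas)
    errors.append(abs(pred - y))
```

In this sketch every update is scaled by `learn_rate`, which is why lowering the learning rate is the usual remedy when training diverges.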
Method 2: Use PAI commands
Component | Parameter | Required | Description | Default value |
---|---|---|---|---|
FM Train | tensorColName | Yes | The name of the feature column. Data in the column must be in the key-value format, with multiple key-value pairs separated by commas (,). Example: 1:1.0,3:1.0. | None |
FM Train | labelColName | Yes | The name of the label column. Only columns of numeric data types are supported. If the task parameter is set to binary_classification, each label value must be 0 or 1. | None |
FM Train | task | Yes | The type of the task. Valid values: regression and binary_classification. | regression |
FM Train | numEpochs | No | The number of iterations. | 10 |
FM Train | dim | No | Three positive integers separated by commas (,). The integers are the lengths of the 0th-order, 1st-order, and 2nd-order terms. | 1,1,10 |
FM Train | learnRate | No | The learning rate. Note: If training diverges, set the learnRate parameter to a smaller value. | 0.01 |
FM Train | lambda | No | Three floating-point numbers separated by commas (,). The numbers are the regularization coefficients of the 0th-order, 1st-order, and 2nd-order terms. | 0.01,0.01,0.01 |
FM Train | initStdev | No | The standard deviation of parameter initialization. | 0.05 |
FM Prediction | predResultColName | No | The name of the prediction result column. | prediction_result |
FM Prediction | predScoreColName | No | The name of the prediction score column. | prediction_score |
FM Prediction | predDetailColName | No | The name of the prediction detail column. | prediction_detail |
FM Prediction | keepColNames | No | The columns that you want to reserve in the output table. | All columns |
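The input rules for tensorColName and labelColName can be checked locally before a job is submitted. The following sketch parses a key-value feature cell such as `1:1.0,3:1.0` and validates a label against the task type. The helper names `parse_kv_features` and `check_label` are hypothetical, and treating the keys as integer feature indices is an assumption based on the example value.

```python
from typing import Dict

VALID_TASKS = ("regression", "binary_classification")


def parse_kv_features(cell: str) -> Dict[int, float]:
    """Parse 'key:value,key:value' into a dict (keys assumed to be integers)."""
    features: Dict[int, float] = {}
    for pair in cell.split(","):
        key, value = pair.split(":")
        features[int(key.strip())] = float(value)
    return features


def check_label(label: float, task: str) -> float:
    """Return the label as a float, or raise if it breaks the rule for the task."""
    if task not in VALID_TASKS:
        raise ValueError(f"unsupported task: {task}")
    if task == "binary_classification" and label not in (0, 1):
        raise ValueError("binary_classification labels must be 0 or 1")
    return float(label)


features = parse_kv_features("1:1.0,3:1.0")
label = check_label(1, "binary_classification")
```

A malformed pair, such as one without a colon, raises `ValueError` during parsing, so bad rows can be caught before training starts.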