The Behavior Sequence Transformer (BST) algorithm uses the Transformer architecture to capture long-term time series information from user behavior sequences. It extracts implicit features from the behavior sequences and uses them to make predictions. The BST algorithm is well suited to business scenarios that involve behavior sequences, such as recommendation systems and user lifecycle value mining.
Scenarios
The BST algorithm is designed to support various prediction tasks, including classification and regression.
The input data of the BST algorithm consists of behavior sequences that have time series features. The input data is stored in LONGTEXT format in the database. An example of such data is the click behaviors of users over the previous seven days. The BST algorithm outputs predictions as integers or floating-point numbers, such as the amount that users are expected to pay, whether a user churns, and whether a payment is made.
Sample classification scenarios:
Predict the number of new paying users and the potential churns of regular-paying and high-paying users in gaming scenarios. For example, the in-game behaviors of paying users over the previous 14 days in a gaming operation scenario are constructed into the behavior sequence input of the BST algorithm. The BST algorithm extracts the relevant features from the behavior sequences to predict the potential churns in the following 14 days. A user is considered to have churned if the user does not log on for 14 consecutive days.
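The churn definition above can be turned into a label for the target column. The following is a minimal sketch, assuming logon dates are available per user; the function name `has_churned` and its arguments are hypothetical and are not part of PolarDB for AI.

```python
from datetime import date, timedelta

def has_churned(logon_dates, window_start, window_days=14):
    """Return 1 if the user never logs on during the prediction window, else 0.

    The window covers `window_days` consecutive days starting at `window_start`
    (the end date is exclusive). This is an illustrative labeling rule only.
    """
    window_end = window_start + timedelta(days=window_days)
    for day in logon_dates:
        if window_start <= day < window_end:
            return 0  # at least one logon in the window: not churned
    return 1  # no logon for the whole window: churned

logons = [date(2024, 1, 3), date(2024, 1, 10)]
label = has_churned(logons, date(2024, 1, 15))
print(label)  # 1, because there is no logon from Jan 15 through Jan 28
```

The resulting 0 or 1 value would go into the target column of the training table.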
Sample regression scenarios:
Predict the total spending of new users in a gaming scenario. For example, the in-game behaviors of new users within the first 24 hours in a gaming operation scenario are constructed into the behavior sequence input of the BST algorithm. The BST algorithm extracts the relevant features from the behavior sequences to predict the total spending of the new users in the following seven days.
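The regression label in this scenario can be computed in a similar way. The sketch below assumes that the seven-day horizon starts when the 24-hour observation window ends, which the text does not state explicitly; `spending_target` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta

def spending_target(payments, registered_at, observe_hours=24, horizon_days=7):
    """Sum the payment amounts that fall in the horizon after the observation window.

    `payments` is an iterable of (timestamp, amount) pairs.
    Assumption: the horizon begins right after the observation window ends.
    """
    start = registered_at + timedelta(hours=observe_hours)
    end = start + timedelta(days=horizon_days)
    return sum(amount for ts, amount in payments if start <= ts < end)

registered = datetime(2024, 1, 1)
payments = [
    (datetime(2024, 1, 1, 12), 5.0),   # inside the 24-hour observation window
    (datetime(2024, 1, 3), 10.0),      # inside the seven-day horizon
    (datetime(2024, 1, 8, 23), 2.5),   # inside the seven-day horizon
    (datetime(2024, 1, 9), 100.0),     # after the horizon ends
]
print(spending_target(payments, registered))  # 12.5
```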
Limits
The BST algorithm works best when the classes in the input data are balanced. If the input data is imbalanced, for example, when the majority class has more than 20 times as many samples as a minority class, we recommend that you use the K-means clustering algorithm provided in PolarDB for AI to preprocess the imbalanced classes, such as the non-paying group, so that the overall data distribution across classes is balanced. For more information, see K-means clustering algorithm.
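Before you train a model, you may want to check whether the labels exceed the 20-to-1 ratio mentioned above. The following is a minimal sketch that only measures the imbalance; it does not perform the K-means preprocessing, and the function names are hypothetical.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Return the size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("at least two classes are required")
    return max(counts.values()) / min(counts.values())

def needs_rebalancing(labels, threshold=20):
    """True if the majority class has more than `threshold` times the minority samples."""
    return imbalance_ratio(labels) > threshold

labels = [0] * 2100 + [1] * 100   # 21 non-paying users for every paying user
print(imbalance_ratio(labels))    # 21.0
print(needs_rebalancing(labels))  # True
```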
Format of the data table for algorithm model creation
Column | Required | Data type | Description | Example |
uid | Yes | VARCHAR | The ID of each data entry, such as the user ID or product ID. | 253460731706911258 |
event_list | Yes | LONGTEXT | A sequence of behaviors for model creation. Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are separated by commas (,) and sorted in ascending order based on their timestamps. | "[183, 238, 153, 152]" |
target | Yes | INT, FLOAT, and DOUBLE | The label of the sample, which is used to calculate the metrics of the algorithm model. | 0 |
val_row | No | INT | To prevent the model from overfitting, you can specify a validation set. Valid values:
Note In most cases, this parameter is used together with the version and val_flag model creation parameters.
| 1 |
other_feature | No | INT, FLOAT, DOUBLE, and LONGTEXT | Other features of the model. To use a feature, include the column name of the feature in the x_value_cols and x_statics_cols model creation parameters. Note
| 2 |
val_x_cols | No | LONGTEXT | A sequence of behaviors for model validation and parameter tuning. Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are separated by commas (,) and sorted in ascending order based on their timestamps. Note This parameter takes effect only if you set the version parameter to | "[183, 238, 153, 152]" |
val_y_cols | No | INT, FLOAT, and DOUBLE | The label of the behavior sequence used to tune the parameters. Note This parameter takes effect only if you set the version parameter to | 1 |
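The event_list format described in the table can be produced from raw behavior logs. The sketch below assumes each log record is a (timestamp, behavior ID) pair and follows the bracketed, comma-separated layout shown in the Example column. Whether the surrounding quotation marks are part of the stored value is not specified here, so they are omitted. `build_event_list` is a hypothetical helper.

```python
def build_event_list(events):
    """Serialize (timestamp, behavior_id) pairs into an event_list string.

    Behaviors are sorted in ascending order of timestamp, as the table requires.
    sorted() is stable, so events with equal timestamps keep their input order.
    """
    ordered = sorted(events, key=lambda event: event[0])
    return "[" + ", ".join(str(int(behavior_id)) for _, behavior_id in ordered) + "]"

raw_events = [(3, 153), (1, 183), (4, 152), (2, 238)]  # (timestamp, behavior ID)
print(build_event_list(raw_events))  # [183, 238, 153, 152]
```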
You can execute the CREATE MODEL statement to create an algorithm model. The following table describes the configuration options for the model_parameter parameter in the CREATE MODEL statement.
Parameter | Description |
model_task_type | The task type. Valid values:
|
batch_size | The batch size. A small batch size can increase the risk of overfitting in a model. Default value: 16. |
window_size | Used for embedded encoding of behavior IDs. The value must be greater than or equal to the maximum behavior ID value plus one. Otherwise, a parsing error occurs. |
sequence_length | The length of the behavior sequence involved in algorithm model calculations. The value must not exceed 3000. If the window_size parameter is greater than 900, do not set the sequence_length parameter to a value that is excessively large. |
success_id | The ID of the behavior for which the model makes a prediction. |
max_epoch | The maximum number of iterations. Default value: 1. |
learning_rate | The learning rate. Default value: 0.0002. |
loss | The loss function. Valid values:
|
val_flag | Specifies whether to perform validation after each model iteration. Valid values:
|
val_metric | The metric used for validation. Valid values:
|
auto_data_statics | Specifies whether to automatically generate statistical features. Valid values:
|
auto_heads | Specifies whether to automatically set the number of attention heads used in multi-head attention. Valid values:
Note
|
num_heads | If you set the auto_heads parameter to 0, you must specify this parameter. Default value: 4. |
x_value_cols | Specifies specific columns as numeric discrete features. The value cannot be empty. Note
|
x_statics_cols | Specifies specific columns as statistical features. The value cannot be empty. The length of data in each row of a specified column is fixed. Note
|
x_seq_cols | Specifies specific columns as sequence features. Note
|
version | The model version. Valid values:
|
data_normalization | Specifies whether to normalize data in the columns specified by the x_value_cols parameter. Valid values:
|
remove_seq_adjacent_duplicates | Specifies whether to remove adjacent duplicate values from the columns specified by the x_seq_cols parameter. Valid values:
|
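Two constraints in the preceding table can be checked on the client side before you create a model: window_size must be at least the maximum behavior ID plus one, and sequence_length must not exceed 3000. The following sketch shows such a check, together with one plausible reading of what removing adjacent duplicate values means. The helper names are hypothetical, and how PolarDB for AI implements remove_seq_adjacent_duplicates internally is not documented here.

```python
import json
from itertools import groupby

MAX_SEQUENCE_LENGTH = 3000  # upper limit stated for sequence_length

def min_window_size(event_lists):
    """Smallest valid window_size: the maximum behavior ID plus one.

    Raises ValueError if no behavior IDs are present at all.
    """
    max_id = max(bid for text in event_lists for bid in json.loads(text))
    return max_id + 1

def check_parameters(event_lists, window_size, sequence_length):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    required = min_window_size(event_lists)
    if window_size < required:
        problems.append(f"window_size must be at least {required}")
    if sequence_length > MAX_SEQUENCE_LENGTH:
        problems.append(f"sequence_length must not exceed {MAX_SEQUENCE_LENGTH}")
    return problems

def drop_adjacent_duplicates(behavior_ids):
    """Collapse runs of identical neighboring IDs into a single ID (illustration only)."""
    return [key for key, _ in groupby(behavior_ids)]

rows = ["[183, 238, 153, 152]", "[5, 899]"]
print(min_window_size(rows))                              # 900
print(check_parameters(rows, 900, 3000))                  # []
print(drop_adjacent_duplicates([1, 1, 2, 2, 2, 1, 3, 3])) # [1, 2, 1, 3]
```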
Format of the data table for algorithm model evaluation
Column | Required | Data type | Description | Example |
uid | Yes | VARCHAR(255) | The ID of each data entry, such as the user ID or product ID. | 123213 |
event_list | Yes | LONGTEXT | A sequence of behaviors for model evaluation. Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are separated by commas (,) and sorted in ascending order based on their timestamps. | "[183, 238, 153, 152]" |
target | Yes | INT, FLOAT, and DOUBLE | The label of the sample used to calculate the errors of the algorithm model. | 0 |
other_feature | No | INT, FLOAT, DOUBLE, and LONGTEXT | Other features of the model, which are the same as those in the model creation data table. To use a feature, include the column name of the feature in the x_value_cols and x_statics_cols model creation parameters. Note
| 2 |
You can execute the EVALUATE statement to evaluate an algorithm model. The following table describes the configuration options for the metrics parameter in the EVALUATE statement.
Parameter | Description |
metrics | The metric used for validation. Valid values:
|
Format of the data table for algorithm model prediction
Column | Required | Data type | Description | Example |
uid | Yes | VARCHAR(255) | The ID of each data entry, such as the user ID or product ID. | 123213 |
event_list | Yes | LONGTEXT | A sequence of behaviors for model prediction. Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are separated by commas (,) and sorted in ascending order based on their timestamps. | "[183, 238, 153, 152]" |
other_feature | No | INT, FLOAT, DOUBLE, and LONGTEXT | Other features of the model, which are the same as those in the model creation data table. To use a feature, include the column name of the feature in the x_value_cols and x_statics_cols model creation parameters. Note
| 2 |
Examples
Classification tasks are used in the following examples. For more information about the task types, see the description of the model_task_type parameter in this topic.
Create a BST model
/*polar4ai*/CREATE MODEL sequential_bst WITH (
    model_class = 'bst',
    x_cols = 'event_list,other_feature1',
    y_cols='target',
    model_parameter=(
        batch_size=128,
        window_size=900,
        sequence_length=3000,
        success_id=900,
        max_epoch=2,
        learning_rate=0.0008,
        val_flag=1,
        x_seq_cols='event_list',
        x_value_cols='other_feature1',
        val_metric='f1score',
        auto_data_statics='on',
        data_normalization=1,
        remove_seq_adjacent_duplicates='on',
        version=1)) AS (SELECT * FROM sequential_train);
sequential_train is a sample data table used for algorithm model creation.
Evaluate the model
/*polar4ai*/SELECT uid,target FROM EVALUATE(MODEL sequential_bst,
    SELECT * FROM sequential_eval) WITH
    (x_cols = 'event_list,other_feature1', y_cols='target', metrics='Fscore');
sequential_eval is a sample data table used for algorithm model evaluation.
Make a prediction by using the model
/*polar4ai*/SELECT uid,target FROM PREDICT(MODEL sequential_bst, SELECT * FROM sequential_test) WITH
    (x_cols= 'event_list,other_feature1',mode='async');
sequential_test is a sample data table used for algorithm model prediction.