This topic describes the Information Value (IV) method, which evaluates the predictive power of a feature.
Scenarios
IV is commonly used for feature selection. In risk control scenarios, a dataset may contain thousands or even tens of thousands of features, which makes it impractical to pick out the predictive ones manually. The IV method scores each feature so that the effective ones can be identified.
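To make the idea concrete, the following minimal Python sketch computes IV for a categorical feature against a binary target by using the textbook definition: the sum over categories of (p - q) * ln(p / q), where p and q are the shares of positive and negative samples that fall into the category. The function name `information_value`, the smoothing constant `eps`, and the sample data are assumptions made for illustration. This is not the implementation used by the CREATE FEATURE statement.

```python
import math
from collections import Counter


def information_value(xs, ys, eps=0.5):
    """Return the IV of a categorical feature xs against a 0/1 target ys.

    eps is an assumed additive smoothing constant that keeps the logarithm
    finite when a category contains only positives or only negatives.
    """
    total_pos = sum(1 for y in ys if y == 1)
    total_neg = len(ys) - total_pos  # assumes ys contains only 0 and 1
    pos = Counter(x for x, y in zip(xs, ys) if y == 1)
    neg = Counter(x for x, y in zip(xs, ys) if y == 0)
    categories = set(xs)
    k = len(categories)

    iv = 0.0
    for c in categories:
        p = (pos[c] + eps) / (total_pos + eps * k)  # smoothed share of positives
        q = (neg[c] + eps) / (total_neg + eps * k)  # smoothed share of negatives
        iv += (p - q) * math.log(p / q)             # each term is non-negative
    return iv


# Hypothetical toy data: one feature that separates the target, one that does not.
delay = [1, 1, 0, 0]
features = {
    "separating": ["a", "a", "b", "b"],
    "uninformative": ["a", "b", "a", "b"],
}
scores = {name: information_value(col, delay) for name, col in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A feature whose categories are distributed identically across both target classes gets an IV of 0, and the score grows as the distributions diverge, which is why sorting by IV is a common way to shortlist features.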
Syntax
CREATE FEATURE feature_name WITH ( feature_class = '', x_cols = '', y_cols = '', parameters=()) AS (SELECT select_expr [, select_expr] ... FROM table_reference)
Parameter description:
Parameter | Description |
---|---|
feature_name | The name of the feature. |
feature_class | The type of the feature. Set the value to iv. |
x_cols | The list of independent variables. Separate multiple variables with commas (,). |
y_cols | The dependent variable. |
parameters | Custom parameters for creating the feature. The IV method supports only categorical features. The value can only be set to categorical_features. Separate multiple features with commas (,). |
select_expr | The name of the column used to create the feature. |
table_reference | The name of the table containing the column used to create the feature. |
Example
/*polar4ai*/CREATE FEATURE iv_001 WITH ( feature_class = 'iv', x_cols = 'Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length', y_cols = 'Delay', parameters=(categorical_feature='Airline,Flight,AirportFrom,AirportTo,DayOfWeek')) AS (SELECT * FROM airlines_test_1000);