The Conditional Random Field Prediction component is an algorithm component of Machine Learning Designer (formerly known as Machine Learning Studio). It is based on the online prediction model Linear Conditional Random Field (LinearCRF) and is used for sequence labeling tasks. This topic describes how to configure the parameters of the component and provides an example of how to use it.
Configure parameters
You can configure parameters for the Conditional Random Field Prediction component
in the Machine Learning Platform for AI (PAI) console.
Parameter | Description |
---|---|
ID Columns | The column that contains the ID of each sample. Samples are stored in n-tuples. |
Feature Columns | The word to be annotated and its features. |
Target Columns | The column that you want to select. |
Prediction Result Column | The name of the prediction result column. Default value: prediction_result. |
Prediction Score Column | The name of the prediction score column. Default value: prediction_score. |
Prediction Detail Column | The name of the prediction detail column. You can leave this parameter empty if you do not need the prediction detail column. |
Example
For online prediction, LinearCRF requires a trained model that uses the
input/output (I/O) structure. The following table describes
the format of the training data table.
sentence_id | word | f1 | f2 | label |
---|---|---|---|---|
1 | Rockwell | NNP | POS | B-NP |
1 | International | NNP | NP | I-NP |
1 | Corp | NNP | PO | I-NP |
1 | 's | POS | NN | B-NP |
... | ... | ... | ... | ... |
The feature names word, f1, and f2 in the input format must be the same as the corresponding feature column names
in the training data table. In an online prediction request, each feature holds one value per word,
in word order, and the values of different words are separated with spaces. The following code shows the input format
of the online prediction model LinearCRF:
{
  "inputs": [
    {
      "word": {
        "dataType": 50,
        "dataValue": "Rockwell International Corp 's ..."
      },
      "f1": {
        "dataType": 50,
        "dataValue": "NNP NNP NNP POS ..."
      },
      "f2": {
        "dataType": 50,
        "dataValue": "POS NP PO NN ..."
      }
    }
  ]
}
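The request body above can be assembled from per-word feature values. The following Python code is a minimal sketch, not part of the component: the names `build_crf_request` and `DATA_TYPE` are hypothetical, the value 50 is copied from the example without assuming what it means, and the whitespace check is an added safeguard based only on the rule that values are separated with spaces. The sketch builds the body and does not send it.

```python
import json

# Value copied from the example request; its meaning is not stated in this topic.
DATA_TYPE = 50


def build_crf_request(rows, feature_names=("word", "f1", "f2")):
    """Build a request body from per-word feature tuples (hypothetical helper).

    rows: one tuple per word, with values ordered like feature_names.
    feature_names: must match the feature column names of the training table.
    """
    features = {}
    for index, name in enumerate(feature_names):
        values = [str(row[index]) for row in rows]
        for value in values:
            # A value that is empty or contains whitespace would shift the
            # alignment between words, because values are space-separated.
            if value.split() != [value]:
                raise ValueError(f"invalid value for {name!r}: {value!r}")
        features[name] = {"dataType": DATA_TYPE, "dataValue": " ".join(values)}
    return {"inputs": [features]}


rows = [
    ("Rockwell", "NNP", "POS"),
    ("International", "NNP", "NP"),
    ("Corp", "NNP", "PO"),
    ("'s", "POS", "NN"),
]
request_body = build_crf_request(rows)
print(json.dumps(request_body, indent=2))
```

Keeping one tuple per word makes it hard to produce feature strings of different lengths, which is the most likely way to misalign a request.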
In the outputValue section of the output, a prediction_result field, a prediction_score
field, and a prediction_detail field are generated in the JSON format for each word
in the input. The following code shows the output format of the
online prediction model LinearCRF:
{
  "outputs": [
    {
      "outputLabel": "CRFProcessor_Result",
      "outputValue": {
        "dataType": 50,
        "dataValue": {
          "Rockwell NNP POS": {
            "prediction_result": "B-NP",
            "prediction_score": 0.99,
            "prediction_detail": {"B-ADJP": 0.000145, "B-NP": 0.99, ...}
          },
          "International NNP NP": ...
        }
      }
    }
  ]
}
If your input format is invalid, the response contains an error message, as shown in
the following code:
{
  "outputs": [
    {
      "outputLabel": "CRFProcessor_Result",
      "outputValue": {
        "dataType": 50,
        "dataValue": "Failed: The input format is incorrect"
      }
    }
  ]
}
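Because dataValue holds an object on success and a plain string on failure, a client may want to tell the two cases apart before it reads the labels. The following Python code is a minimal sketch under stated assumptions: the response has already been decoded into a dictionary shaped like the examples above, the default column names prediction_result and prediction_score are in use, and each key in dataValue is the features of a word joined with spaces, as in the example. The name `parse_crf_response` is hypothetical.

```python
import json


def parse_crf_response(response, rows):
    """Return one (word, label, score) tuple per input row (hypothetical helper).

    response: decoded response dictionary shaped like the examples above.
    rows: the per-word feature tuples that were sent, for example
          ("Rockwell", "NNP", "POS").
    Raises RuntimeError if dataValue carries an error message.
    """
    data_value = response["outputs"][0]["outputValue"]["dataValue"]
    if isinstance(data_value, str):
        # Assumption: a string is either JSON text or an error message
        # such as "Failed: The input format is incorrect".
        try:
            data_value = json.loads(data_value)
        except json.JSONDecodeError:
            raise RuntimeError(data_value) from None
    if not isinstance(data_value, dict):
        raise RuntimeError(f"unexpected dataValue: {data_value!r}")

    results = []
    for row in rows:
        key = " ".join(row)  # for example "Rockwell NNP POS"
        entry = data_value[key]
        results.append(
            (row[0], entry["prediction_result"], entry["prediction_score"])
        )
    return results


# Sample responses trimmed from the examples above, for illustration only.
ok_response = {
    "outputs": [{
        "outputLabel": "CRFProcessor_Result",
        "outputValue": {
            "dataType": 50,
            "dataValue": {
                "Rockwell NNP POS": {
                    "prediction_result": "B-NP",
                    "prediction_score": 0.99,
                }
            },
        },
    }]
}
failed_response = {
    "outputs": [{
        "outputLabel": "CRFProcessor_Result",
        "outputValue": {
            "dataType": 50,
            "dataValue": "Failed: The input format is incorrect",
        },
    }]
}

print(parse_crf_response(ok_response, [("Rockwell", "NNP", "POS")]))
try:
    parse_crf_response(failed_response, [("Rockwell", "NNP", "POS")])
except RuntimeError as error:
    print("prediction failed:", error)
```

If you changed the Prediction Result Column or Prediction Score Column parameter, replace the field names in the sketch with the names that you configured.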