Conditional Random Field Prediction

The Conditional Random Field Prediction component is an algorithm component of Machine Learning Designer (formerly known as Machine Learning Studio). It is based on the Linear Conditional Random Field (LinearCRF) online prediction model and is used for sequence labeling tasks. This topic describes how to configure the parameters of the component and provides an example of how to use it.

Configure parameters

You can configure parameters for the Conditional Random Field Prediction component in the Machine Learning Platform for AI (PAI) console.


Parameter	Description
ID Columns	The column that contains the ID of each sample. Samples are stored in n-tuples.
Feature Columns	The columns that contain the word to be annotated and its features.
Target Columns	The column that you want to select.
Prediction Result Column	The name of the prediction result column. Default value: prediction_result.
Prediction Score Column	The name of the prediction score column. Default value: prediction_score.
Prediction Detail Column	The name of the prediction detail column. You can leave this parameter empty if you do not need the prediction detail column.

Example

In the online prediction phase, the LinearCRF online prediction model requires a trained model that uses the input/output (I/O) structure described in this example. The following table describes the format of the training data table.


sentence_id	word	f1	f2	label
1	Rockwell	NNP	POS	B-NP
1	International	NNP	NP	I-NP
1	Corp	NNP	PO	I-NP
1	's	POS	NN	B-NP
...	...	...	...	...

The feature names word, f1, and f2 in the input format must be the same as the corresponding column names of the features in the training data table. In an input request for online prediction, separate the features of different words with spaces. The following code shows the input format of the online prediction model LinearCRF:

{
    "inputs": [
        {
            "word": {
                "dataType": 50,
                "dataValue": "Rockwell International Corp 's ..."
            },
            "f1": {
                "dataType": 50,
                "dataValue": "NNP NNP NNP POS ..."
            },
            "f2": {
                "dataType": 50,
                "dataValue": "POS NP PO NN ..."
            }
        }
    ]
}
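
As a rough illustration of how such a request could be assembled, the following Python sketch joins the values of each feature column of one sentence with spaces. The helper name build_crf_request and the row layout are assumptions made for this example and are not part of PAI. The dataType value 50 is copied from the request format shown above.

```python
import json

# Hypothetical helper, not part of PAI: builds the request body for one sentence.
FEATURE_COLUMNS = ["word", "f1", "f2"]  # must match the training column names
DATA_TYPE = 50  # value copied from the documented request format


def build_crf_request(rows, feature_columns=FEATURE_COLUMNS):
    """Join the values of each feature column with spaces, in word order."""
    entry = {}
    for column in feature_columns:
        values = [str(row[column]) for row in rows]
        entry[column] = {"dataType": DATA_TYPE, "dataValue": " ".join(values)}
    return {"inputs": [entry]}


# Rows of sentence 1 from the training data table (label column omitted).
sentence_rows = [
    {"word": "Rockwell", "f1": "NNP", "f2": "POS"},
    {"word": "International", "f1": "NNP", "f2": "NP"},
    {"word": "Corp", "f1": "NNP", "f2": "PO"},
    {"word": "'s", "f1": "POS", "f2": "NN"},
]

request_body = build_crf_request(sentence_rows)
print(json.dumps(request_body, indent=2))
```

Because every feature string is built from the same list of rows, the i-th token of word, f1, and f2 always describes the same word.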

In the outputValue section of the output, the dataValue field contains a JSON object with one entry for each word in the input. Each entry is keyed by the word and its features and contains a prediction_result field, a prediction_score field, and a prediction_detail field. The following code shows the output format of the online prediction model LinearCRF:

{
    "outputs": [
        {
            "outputLabel": "CRFProcessor_Result",
            "outputValue": {
                "dataType": 50,
                "dataValue": {
                    "Rockwell NNP POS": {
                        "prediction_result": "B-NP",
                        "prediction_score": 0.99,
                        "prediction_detail": {"B-ADJP": 0.000145, "B-NP": 0.99, ...}
                    },
                    "International NNP NP": ...
                }
            }
        }
    ]
}
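
The following Python sketch shows one way a client might read the predicted label and score of each word from a response of this shape. The function name extract_labels is hypothetical, and the sample response is illustrative only: the entry for the second word and all of its scores are made up, because the documented example elides them.

```python
import json

# Illustrative response only. The second entry and its scores are invented
# for this sketch; the documented example elides them with "...".
sample_response = json.loads("""
{
  "outputs": [
    {
      "outputLabel": "CRFProcessor_Result",
      "outputValue": {
        "dataType": 50,
        "dataValue": {
          "Rockwell NNP POS": {
            "prediction_result": "B-NP",
            "prediction_score": 0.99,
            "prediction_detail": {"B-ADJP": 0.000145, "B-NP": 0.99}
          },
          "International NNP NP": {
            "prediction_result": "I-NP",
            "prediction_score": 0.98,
            "prediction_detail": {"B-NP": 0.02, "I-NP": 0.98}
          }
        }
      }
    }
  ]
}
""")


def extract_labels(response):
    """Return (key, label, score) tuples from a successful response."""
    data_value = response["outputs"][0]["outputValue"]["dataValue"]
    results = []
    for key, fields in data_value.items():
        results.append(
            (key, fields["prediction_result"], fields["prediction_score"])
        )
    return results


labels = extract_labels(sample_response)
for key, label, score in labels:
    print(f"{key!r}: {label} ({score})")
```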

If the input format is invalid, the dataValue field of the output contains an error message instead of prediction results, as shown in the following code:

{
    "outputs": [
        {
            "outputLabel": "CRFProcessor_Result",
            "outputValue": {
                "dataType": 50,
                "dataValue": "Failed: The input format is incorrect"
            }
        }
    ]
}
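
A client may want to tell this error response apart from a successful one before it reads any labels. The following Python sketch is one possible check. It assumes, based only on the two examples in this topic, that dataValue is a string when the request fails and an object when it succeeds. The function name get_prediction_data is hypothetical.

```python
def get_prediction_data(response):
    """Return the dataValue object, or raise ValueError for an error message.

    Assumption: a failed request returns dataValue as a plain string, as in
    the error example above, and a successful request returns an object.
    """
    data_value = response["outputs"][0]["outputValue"]["dataValue"]
    if isinstance(data_value, str):
        raise ValueError(data_value)
    return data_value


# Error response copied from the example above.
error_response = {
    "outputs": [
        {
            "outputLabel": "CRFProcessor_Result",
            "outputValue": {
                "dataType": 50,
                "dataValue": "Failed: The input format is incorrect",
            },
        }
    ]
}

try:
    get_prediction_data(error_response)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```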