One Sample T Test is a statistical method used to assess whether the population mean of a variable differs significantly from a specified value. The test assumes that the data follows a normal distribution, an assumption that matters most when the sample size is small. By calculating the t statistic and comparing it against the t distribution for the corresponding degrees of freedom, you can determine whether the difference between the sample mean and the specified value is statistically significant.
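The calculation described above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library, not the component's implementation. The function name `one_sample_t` and the sample values are hypothetical.

```python
import math
import statistics


def one_sample_t(sample, mu=0.0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test.

    Hypothetical helper for illustration only.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.mean(sample)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(sample)
    if sd == 0:
        raise ValueError("the sample has zero variance")
    # Standard error of the mean, then the t statistic.
    std_error = sd / math.sqrt(n)
    return (mean - mu) / std_error, n - 1


# Made-up observations, tested against a hypothesized mean of 47.
sample = [44.0, 46.0, 45.0, 47.0, 43.0]
t_stat, df = one_sample_t(sample, mu=47.0)
print(f"t = {t_stat:.4f}, df = {df}")
```

The t statistic is then compared with the t distribution for `df` degrees of freedom. A large absolute value of t suggests that the sample mean differs from the hypothesized mean.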
Configure the component
Method 1: Configure the component in Machine Learning Designer
Add a One Sample T Test component on the pipeline page and configure the following parameters:
Category | Parameter | Description |
Fields Setting | Sample 1 Column | The column that contains Sample 1. |
Parameters Setting | Alternative Hypothesis Type | The alternative hypothesis type. Valid values: two.sided, less, and greater. |
Parameters Setting | Confidence Level | The confidence level of the test results. Valid values: 0.8, 0.9, 0.95, 0.99, 0.995, and 0.999. |
Parameters Setting | Hypothesized Mean | The hypothesized mean. |
Parameters Setting | Core number | The number of cores. The value must be a positive integer. |
Parameters Setting | Memory Size per Core | The memory size of each core. Valid values: 1 to 65536. Unit: MB. |
Method 2: Use PAI commands
Use PAI commands to configure the parameters of One Sample T Test. For more information, see SQL Script.
pai -name t_test -project algo_public
    -DxTableName=pai_t_test_all_type
    -DxColName=col1_double
    -DoutputTableName=pai_t_test_out
    -DxTablePartitions=ds=2010/dt=1
    -Dalternative=less
    -Dmu=47
    -DconfidenceLevel=0.95
Parameter | Required | Default value | Description |
xTableName | Yes | None | The name of the input table. |
xColName | Yes | None | The column that you want to select from the input table for testing. |
outputTableName | Yes | None | The name of the output table. |
xTablePartitions | No | Null | The partitions that you want to select from the input table. |
alternative | No | two.sided | The alternative hypothesis type. Valid values: two.sided, less, and greater. |
mu | No | 0 | The hypothesized mean. |
confidenceLevel | No | 0.95 | The confidence level. Valid values: 0.8, 0.9, 0.95, 0.99, 0.995, and 0.999. |
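If you generate the command from a script, a small helper can check the documented valid values before the command is submitted. The following is a sketch only: the function `build_t_test_command` and its argument names are hypothetical and are not part of PAI. Only the `-D` parameter names, defaults, and valid values come from the preceding table.

```python
# Valid values taken from the parameter table.
VALID_ALTERNATIVES = {"two.sided", "less", "greater"}
VALID_CONFIDENCE_LEVELS = {0.8, 0.9, 0.95, 0.99, 0.995, 0.999}


def build_t_test_command(x_table, x_col, output_table, partitions=None,
                         alternative="two.sided", mu=0, confidence_level=0.95):
    """Assemble a PAI t_test command string (hypothetical helper)."""
    if alternative not in VALID_ALTERNATIVES:
        raise ValueError(f"unsupported alternative: {alternative!r}")
    if confidence_level not in VALID_CONFIDENCE_LEVELS:
        raise ValueError(f"unsupported confidence level: {confidence_level!r}")

    parts = [
        "pai -name t_test -project algo_public",
        f"-DxTableName={x_table}",
        f"-DxColName={x_col}",
        f"-DoutputTableName={output_table}",
    ]
    # xTablePartitions is optional, so it is added only when provided.
    if partitions:
        parts.append(f"-DxTablePartitions={partitions}")
    parts += [
        f"-Dalternative={alternative}",
        f"-Dmu={mu}",
        f"-DconfidenceLevel={confidence_level}",
    ]
    return "\n    ".join(parts)


command = build_t_test_command(
    "pai_t_test_all_type", "col1_double", "pai_t_test_out",
    partitions="ds=2010/dt=1", alternative="less", mu=47,
)
print(command)
```

The helper only builds the text of the command. How the command is submitted, for example through an SQL Script component, is outside the scope of this sketch.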
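Sample output
The following sample output corresponds to a two-sided test against a hypothesized mean of 0, which are the default values of alternative and mu, rather than the settings used in the sample command.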
{
  "AlternativeHypthesis": "mean not equals to 0",
  "ConfidenceInterval": "(44.72234194006504, 46.27765805993496)",
  "ConfidenceLevel": 0.95,
  "alpha": 0.05,
  "df": 99,
  "mean": 45.5,
  "p": 0,
  "stdDeviation": 3.919647479510927,
  "t": 116.081867662439
}
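To see how the output fields relate to each other, you can recompute some of them from the others. The following sketch rests on several assumptions: the sample size is df + 1, stdDeviation is the sample standard deviation, the hypothesized mean is 0 (as stated in the AlternativeHypthesis field), and the confidence interval is symmetric around the mean. The variable names are illustrative.

```python
import json
import math

# The sample output, copied verbatim (including the output's own key spelling).
raw = """
{
  "AlternativeHypthesis": "mean not equals to 0",
  "ConfidenceInterval": "(44.72234194006504, 46.27765805993496)",
  "ConfidenceLevel": 0.95,
  "alpha": 0.05,
  "df": 99,
  "mean": 45.5,
  "p": 0,
  "stdDeviation": 3.919647479510927,
  "t": 116.081867662439
}
"""
result = json.loads(raw)

n = result["df"] + 1                      # assumed: df = n - 1
std_error = result["stdDeviation"] / math.sqrt(n)
mu = 0.0                                  # hypothesized mean stated in the output

# Recompute the t statistic from the mean and the standard error.
t_check = (result["mean"] - mu) / std_error

# Parse the interval string and derive the critical value it implies.
low, high = (float(x) for x in result["ConfidenceInterval"].strip("()").split(","))
margin = (high - low) / 2
implied_critical = margin / std_error

print(f"recomputed t      = {t_check:.6f}")
print(f"interval midpoint = {(low + high) / 2:.6f}")
print(f"implied critical  = {implied_critical:.4f}")
```

Under these assumptions the recomputed t statistic agrees with the reported `t`, and the interval is centered on `mean`. Its half-width is about 1.984 standard errors, in line with the tabulated two-sided critical value for roughly 100 degrees of freedom at a confidence level of 0.95.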