One Sample T Test

One Sample T Test is a statistical method used to assess whether the population mean of a variable differs significantly from a specified value. The test assumes that the data follows a normal distribution, an assumption that matters most when the sample size is small. The T statistic is calculated from the sample and compared against the T distribution with the corresponding degrees of freedom (the sample size minus one) to determine whether the difference between the mean and the specified value is significant.
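As a rough illustration of the calculation described above, the following Python sketch computes the T statistic and the degrees of freedom using only the standard library. The function name `one_sample_t` and the sample values are made up for illustration and are not part of PAI; the component performs the equivalent calculation on the input table.

```python
import math
import statistics


def one_sample_t(sample, mu=0.0):
    """Return (t, df) for a one-sample T test of the sample mean against mu.

    Hypothetical helper for illustration only; not a PAI API.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("the T statistic is undefined when all values are equal")
    standard_error = sd / math.sqrt(n)
    t = (mean - mu) / standard_error
    df = n - 1  # degrees of freedom
    return t, df


# Illustrative data: five observations tested against a hypothesized mean of 47.
t, df = one_sample_t([44.0, 46.0, 45.0, 47.0, 43.0], mu=47.0)
print(round(t, 4), df)  # -2.8284 4
```

Converting the T statistic into a p-value requires the cumulative distribution function of the T distribution, which the Python standard library does not provide; a statistics package or a T distribution table is needed for that step.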

Configure the component

Method 1: Configure the component in Machine Learning Designer

Add a One Sample T Test component on the pipeline page and configure the following parameters:

Category	Parameter	Description
Fields Setting	Sample 1 Column	The column that contains Sample 1.
Parameters Setting	Alternative Hypothesis Type	The alternative hypothesis type. Valid values: two.sided (two-tailed test: the mean is not equal to the hypothesized mean); less (left-tailed test: the mean is less than the hypothesized mean); greater (right-tailed test: the mean is greater than the hypothesized mean).
	Confidence Level	The confidence level of the test results. Valid values: 0.8, 0.9, 0.95, 0.99, 0.995, 0.999.
	Hypothesized Mean	The hypothesized mean.
	Core number	The number of cores. The value must be a positive integer.
	Memory Size per Core	The memory size of each core. Valid values: 1 to 65536. Unit: MB.

Method 2: Use PAI commands

Use PAI commands to configure the parameters of the One Sample T Test component. For more information, see SQL Script.

pai -name t_test -project algo_public
    -DxTableName=pai_t_test_all_type
    -DxColName=col1_double
    -DoutputTableName=pai_t_test_out
    -DxTablePartitions=ds=2010/dt=1
    -Dalternative=less
    -Dmu=47
    -DconfidenceLevel=0.95

Parameter	Required	Default value	Description
xTableName	Yes	None	The name of the input table.
xColName	Yes	None	The column that you want to select from the input table for testing.
outputTableName	Yes	None	The name of the output table.
xTablePartitions	No	Null	The partitions that you want to select from the input table.
alternative	No	two.sided	The alternative hypothesis type. Valid values: two.sided, less, and greater.
mu	No	0	The hypothesized mean.
confidenceLevel	No	0.95	The confidence level. Valid values: 0.8, 0.9, 0.95, 0.99, 0.995, and 0.999.
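The parameter rules in the preceding table can be checked before a command is submitted. The following Python sketch assembles the command text shown above and rejects values outside the documented sets. The helper name `build_t_test_command` and its argument names are hypothetical; the sketch only builds a string and does not call any PAI API.

```python
VALID_ALTERNATIVES = ("two.sided", "less", "greater")
VALID_CONFIDENCE_LEVELS = (0.8, 0.9, 0.95, 0.99, 0.995, 0.999)


def build_t_test_command(x_table, x_col, output_table, partitions=None,
                         alternative="two.sided", mu=0, confidence_level=0.95):
    """Assemble the text of a t_test PAI command (hypothetical helper)."""
    if alternative not in VALID_ALTERNATIVES:
        raise ValueError(f"alternative must be one of {VALID_ALTERNATIVES}")
    if confidence_level not in VALID_CONFIDENCE_LEVELS:
        raise ValueError(f"confidenceLevel must be one of {VALID_CONFIDENCE_LEVELS}")

    parts = [
        "pai -name t_test -project algo_public",
        f"-DxTableName={x_table}",
        f"-DxColName={x_col}",
        f"-DoutputTableName={output_table}",
    ]
    if partitions is not None:  # optional parameter, omitted when not set
        parts.append(f"-DxTablePartitions={partitions}")
    parts += [
        f"-Dalternative={alternative}",
        f"-Dmu={mu}",
        f"-DconfidenceLevel={confidence_level}",
    ]
    # One parameter per line, indented as in the example command.
    return "\n    ".join(parts)


command = build_t_test_command(
    "pai_t_test_all_type", "col1_double", "pai_t_test_out",
    partitions="ds=2010/dt=1", alternative="less", mu=47,
)
print(command)
```

The defaults in the function signature mirror the default values in the table, so omitting `alternative`, `mu`, and `confidence_level` yields a two-tailed test against a mean of 0 at the 0.95 confidence level.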

Sample output

{
    "AlternativeHypthesis": "mean not equals to 0",
    "ConfidenceInterval": "(44.72234194006504, 46.27765805993496)",
    "ConfidenceLevel": 0.95,
    "alpha": 0.05,
    "df": 99,
    "mean": 45.5,
    "p": 0,
    "stdDeviation": 3.919647479510927,
    "t": 116.081867662439
}
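The fields in the sample output are related to each other, and the relationships can be checked by hand. The following Python sketch recomputes the T statistic from `mean`, `stdDeviation`, and `df`, assuming a sample size of `df + 1` and a hypothesized mean of 0. The value 0 is inferred from the alternative hypothesis string, which suggests that this sample output comes from a two-tailed run with the default `mu` rather than from the example command with `-Dmu=47`. The variable names are illustrative.

```python
import json
import math

sample_output = json.loads("""
{
    "AlternativeHypthesis": "mean not equals to 0",
    "ConfidenceInterval": "(44.72234194006504, 46.27765805993496)",
    "ConfidenceLevel": 0.95,
    "alpha": 0.05,
    "df": 99,
    "mean": 45.5,
    "p": 0,
    "stdDeviation": 3.919647479510927,
    "t": 116.081867662439
}
""")

n = sample_output["df"] + 1  # assumed sample size: degrees of freedom plus one
mu0 = 0.0                    # assumed hypothesized mean, inferred from the output text
standard_error = sample_output["stdDeviation"] / math.sqrt(n)

# The T statistic is the distance between the sample mean and mu0 in standard errors.
t_recomputed = (sample_output["mean"] - mu0) / standard_error

# The confidence interval is reported as a string, so parse its two bounds.
lower, upper = (float(x) for x in sample_output["ConfidenceInterval"].strip("()").split(","))
center = (lower + upper) / 2
half_width = (upper - lower) / 2

# Half-width divided by the standard error gives the critical value that was applied.
implied_critical_value = half_width / standard_error

print(t_recomputed, center, implied_critical_value)
```

Under these assumptions, the recomputed statistic agrees with the reported `t`, the interval is centered on `mean`, and the implied critical value is about 1.984, which is close to the tabulated two-tailed 95% critical value for 99 degrees of freedom. The relationship `alpha = 1 - ConfidenceLevel` also holds for this output.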