Platform For AI: One Sample T Test

Last Updated: Nov 28, 2024

One Sample T Test is a statistical method used to assess whether the population mean of a variable differs significantly from a specified value (the hypothesized mean). The test assumes that the data is normally distributed, an assumption that matters most when the sample size is small. The significance of the difference is determined by calculating the T statistic and evaluating it against the T distribution with the corresponding degrees of freedom.
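For reference, the T statistic for a sample of size n with mean x̄ and sample standard deviation s, tested against a hypothesized mean μ0, is t = (x̄ − μ0) / (s / √n), with n − 1 degrees of freedom. The following is a minimal Python sketch of that calculation using only the standard library. The function name `one_sample_t` and the sample values are hypothetical and chosen for illustration; this is not the component's implementation.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of `sample` against `mu0`."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(sample)
    std = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    if std == 0:
        raise ValueError("the sample has zero variance")
    standard_error = std / math.sqrt(n)
    t = (mean - mu0) / standard_error
    return t, n - 1  # degrees of freedom = n - 1


# Hypothetical data, tested against a hypothesized mean of 47.
t, df = one_sample_t([44.0, 46.0, 45.0, 47.0, 43.0], mu0=47)
print(f"t = {t:.6f}, df = {df}")
```

A negative T statistic indicates that the sample mean lies below the hypothesized mean. Whether the difference is significant depends on the degrees of freedom and on the alternative hypothesis type.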

Configure the component

Method 1: Configure the component in Machine Learning Designer

Add a One Sample T Test component on the pipeline page and configure the following parameters:

| Category | Parameter | Description |
| --- | --- | --- |
| Fields Setting | Sample 1 Column | The column that contains Sample 1. |
| Parameters Setting | Alternative Hypothesis Type | The alternative hypothesis type. Valid values: two.sided (two-tailed test: the population mean is not equal to the hypothesized mean), less (left-tailed test: the population mean is less than the hypothesized mean), greater (right-tailed test: the population mean is greater than the hypothesized mean). |
| | Confidence Level | The confidence level of the test results. Valid values: 0.8, 0.9, 0.95, 0.99, 0.995, and 0.999. |
| | Hypothesized Mean | The hypothesized mean, which is the value that the population mean is tested against. |
| | Core number | The number of cores. The value must be a positive integer. |
| | Memory Size per Core | The memory size of each core. Valid values: 1 to 65536. Unit: MB. |
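The three alternative hypothesis types differ only in which tail of the T distribution is used to compute the p-value. The following Python sketch illustrates this. It is an assumption-laden illustration, not the component's implementation: the helpers `t_pdf`, `t_cdf`, and `p_value` are hypothetical names, and the cumulative distribution is approximated by numerical integration (Simpson's rule) so that only the standard library is needed.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def p_value(t, df, alternative="two.sided"):
    """p-value of a T statistic for the given alternative hypothesis type."""
    cdf = t_cdf(t, df)
    if alternative == "less":       # left tail
        return cdf
    if alternative == "greater":    # right tail
        return 1.0 - cdf
    if alternative == "two.sided":  # both tails
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical T statistic and degrees of freedom.
for alt in ("two.sided", "less", "greater"):
    print(alt, round(p_value(-2.262157, 9, alt), 4))
```

The null hypothesis is rejected when the p-value is less than 1 minus the confidence level. For the same T statistic, a one-tailed test in the direction of the observed difference yields half the two-tailed p-value.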

Method 2: Use PAI commands

Configure the One Sample T Test component by using a PAI command. For more information about how to run PAI commands, see SQL Script.

pai -name t_test -project algo_public
    -DxTableName=pai_t_test_all_type
    -DxColName=col1_double
    -DoutputTableName=pai_t_test_out
    -DxTablePartitions=ds=2010/dt=1
    -Dalternative=less
    -Dmu=47
    -DconfidenceLevel=0.95

| Parameter | Required | Default value | Description |
| --- | --- | --- | --- |
| xTableName | Yes | None | The name of the input table. |
| xColName | Yes | None | The column that you want to select from the input table for testing. |
| outputTableName | Yes | None | The name of the output table. |
| xTablePartitions | No | Null | The partitions that you want to select from the input table. |
| alternative | No | two.sided | The alternative hypothesis type. Valid values: two.sided, less, and greater. |
| mu | No | 0 | The hypothesized mean. |
| confidenceLevel | No | 0.95 | The confidence level. Valid values: 0.8, 0.9, 0.95, 0.99, 0.995, and 0.999. |
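If you generate the command from a script, it can help to validate the parameter values before submitting it. The following Python sketch assembles the command text from the parameters listed above and rejects values outside the documented valid sets. The helper `build_t_test_command` and its argument names are hypothetical and not part of PAI. The sketch only builds a string and does not submit anything.

```python
VALID_ALTERNATIVES = ("two.sided", "less", "greater")
VALID_CONFIDENCE_LEVELS = (0.8, 0.9, 0.95, 0.99, 0.995, 0.999)


def build_t_test_command(x_table, x_col, output_table, partitions=None,
                         alternative="two.sided", mu=0, confidence_level=0.95):
    """Return the text of a t_test PAI command (hypothetical helper)."""
    if alternative not in VALID_ALTERNATIVES:
        raise ValueError(f"alternative must be one of {VALID_ALTERNATIVES}")
    if confidence_level not in VALID_CONFIDENCE_LEVELS:
        raise ValueError(f"confidenceLevel must be one of {VALID_CONFIDENCE_LEVELS}")

    # Required parameters first, in the same order as the sample command.
    args = [
        f"-DxTableName={x_table}",
        f"-DxColName={x_col}",
        f"-DoutputTableName={output_table}",
    ]
    if partitions:  # optional: omit the flag to use the default
        args.append(f"-DxTablePartitions={partitions}")
    args += [
        f"-Dalternative={alternative}",
        f"-Dmu={mu}",
        f"-DconfidenceLevel={confidence_level}",
    ]
    return "pai -name t_test -project algo_public\n    " + "\n    ".join(args)


# Rebuild the sample command shown above.
command = build_t_test_command(
    "pai_t_test_all_type", "col1_double", "pai_t_test_out",
    partitions="ds=2010/dt=1", alternative="less", mu=47, confidence_level=0.95,
)
print(command)
```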

Sample output

{
    "AlternativeHypthesis": "mean not equals to 0",
    "ConfidenceInterval": "(44.72234194006504, 46.27765805993496)",
    "ConfidenceLevel": 0.95,
    "alpha": 0.05,
    "df": 99,
    "mean": 45.5,
    "p": 0,
    "stdDeviation": 3.919647479510927,
    "t": 116.081867662439
}
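The fields in the sample output are related to each other, which makes the result easy to sanity-check. The following Python sketch parses the sample output and recomputes the T statistic from mean, stdDeviation, and df. It assumes that stdDeviation is the sample standard deviation, that the sample size equals df + 1, and that the hypothesized mean is 0, as stated in the AlternativeHypthesis field. The variable names are illustrative.

```python
import json
import math

# The sample output shown above, copied verbatim.
sample_output = """
{
    "AlternativeHypthesis": "mean not equals to 0",
    "ConfidenceInterval": "(44.72234194006504, 46.27765805993496)",
    "ConfidenceLevel": 0.95,
    "alpha": 0.05,
    "df": 99,
    "mean": 45.5,
    "p": 0,
    "stdDeviation": 3.919647479510927,
    "t": 116.081867662439
}
"""

result = json.loads(sample_output)

n = result["df"] + 1  # assumption: df = n - 1
mu0 = 0               # from "mean not equals to 0"
standard_error = result["stdDeviation"] / math.sqrt(n)
t_recomputed = (result["mean"] - mu0) / standard_error

# alpha is the complement of the confidence level.
alpha_matches = math.isclose(result["alpha"], 1 - result["ConfidenceLevel"])

# The null hypothesis is rejected when p is less than alpha.
reject_null = result["p"] < result["alpha"]

print(f"recomputed t = {t_recomputed:.6f}, reported t = {result['t']:.6f}")
print(f"alpha matches: {alpha_matches}, reject null hypothesis: {reject_null}")
```

In this sample, p is less than alpha, so the null hypothesis that the mean equals 0 is rejected at the 0.95 confidence level.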