# One-Sample T-Test

#### 1. Functionalities

• To determine, from the t-test results, whether the mean of your data is significantly different from a specified mean
• To view the basic descriptive statistics of your data
• To view descriptive plots (box plot, mean-SD plot, Q-Q plot, histogram, and density plot) and judge whether your data are close to a normal distribution

• Your data contain only 1 group of values (or a numeric vector)
• The values are independent observations and approximately normally distributed

#### Case Example

Suppose we collected the ages of 50 independent lymph node-positive patients and wanted to know whether the mean age of lymph node-positive patients in general was 50 years.

#### Output 1. Descriptive Results

Explanations
• The band inside the box is the median
• The box spans the 25th to the 75th percentile, so its height is the interquartile range
• Outliers, if any, are shown in red
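The box-plot quantities described above can be reproduced by hand. The following is a minimal Python sketch, not the tool's own code: the ages are hypothetical, the function name `box_summary` is made up for illustration, and the 1.5 × IQR fence used to flag outliers is an assumed (though common) convention, since the text does not say which rule the plot applies.

```python
import statistics

def box_summary(values):
    """Return the quartiles, the IQR, and the points outside the 1.5 * IQR fences."""
    # "inclusive" treats the data as the whole population of interest when
    # interpolating; other quantile conventions give slightly different quartiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr   # assumed outlier rule
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

ages = [38, 41, 43, 44, 45, 46, 47, 49, 52, 80]  # hypothetical ages
summary = box_summary(ages)
print(summary)
```

Here the value 80 lies above the upper fence, so it is the kind of point a box plot would highlight as an outlier.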

Explanations
• Normal Q–Q Plot: to compare the quantiles of your data (vertical axis) with the quantiles of a standard normal distribution (horizontal axis). Points that fall close to a straight line suggest that the data are approximately normally distributed.
• Histogram: to roughly assess the probability distribution of a given variable by depicting the frequencies of observations occurring in certain ranges of values
• Density Plot: to estimate the probability density function of the data
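To make the Q-Q plot concrete, the sketch below computes the points such a plot would draw: each sorted observation paired with a standard normal quantile. It is an illustration only; the ages are hypothetical, `qq_points` is a made-up helper name, and the plotting positions `(i - 0.5) / n` are one common convention that may differ from what the tool uses.

```python
from statistics import NormalDist

def qq_points(values):
    """Pair each sorted observation with the matching standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Each tuple is (horizontal coordinate, vertical coordinate).
    return list(zip(theoretical, ordered))

ages = [41, 45, 38, 52, 47, 43, 39, 48, 44, 42]  # hypothetical ages
points = qq_points(ages)
for theo, obs in points:
    print(f"{theo:6.3f}  {obs}")
```

If the printed pairs lie roughly on a straight line when plotted, the normality assumption is plausible for these data.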

Normal Q–Q plot

Histogram

When the number of bins is set to 0, the plot uses the default number of bins.

Density plot

#### Output 2. Test Results

Explanations
• P Value < 0.05, then the population mean IS significantly different from the specified mean. (Reject the null hypothesis in favor of the alternative hypothesis)
• P Value >= 0.05, then the population mean IS NOT significantly different from the specified mean. (Fail to reject the null hypothesis)

Because P < 0.05, we concluded that the mean age of the lymph node-positive population was significantly different from 50 years. Thus the general age was not 50. If we reset the specified mean to 44, we could get P > 0.05.
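The decision rule above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the tool's implementation: the ten ages are hypothetical (they are not the 50 patients of the case example), the helper names are made up, and the two-sided p-value is obtained by numerically integrating the t density with Simpson's rule, where a statistics library would normally be used.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # approximates P(0 <= T <= |t|)
    return max(0.0, 1 - 2 * area)

def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)   # standard error of the mean
    t = (statistics.mean(values) - mu0) / se
    return t, n - 1, two_sided_p(t, n - 1)

ages = [41, 45, 38, 52, 47, 43, 39, 48, 44, 42]  # hypothetical ages
t50, df, p50 = one_sample_t(ages, 50)
t44, _, p44 = one_sample_t(ages, 44)
print(f"mu0=50: t={t50:.3f}, df={df}, p={p50:.4f}")
print(f"mu0=44: t={t44:.3f}, df={df}, p={p44:.4f}")
```

With these made-up data the sample mean is 43.9, so the test against 50 gives a small p-value while the test against 44 does not, mirroring the pattern described in the case example.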

# Independent Two-Sample T-Test

#### 1. Functionalities

• To determine, from the t-test results, whether the means of two sets of your data are significantly different from each other
• To view the basic descriptive statistics of your data
• To view descriptive plots (box plot, mean-SD plot, Q-Q plot, histogram, and density plot) and judge whether your data are close to a normal distribution

• Your data contain 2 separate groups/sets (or 2 numeric vectors)
• The 2 groups/sets are independent of each other, and each is approximately normally distributed

#### Case Example

Suppose we collected the ages of 50 independent lymph node-positive patients. Among them, 25 were estrogen receptor (ER) positive and 25 were ER negative. We wanted to know whether the ages of ER-positive patients were, in general, significantly different from those of ER-negative patients; in other words, whether ER status is related to age.

#### Output 1. Descriptive Results

Explanations
• The band inside the box is the median
• The box spans the 25th to the 75th percentile, so its height is the interquartile range
• Outliers, if any, are shown in red

Explanations
• Normal Q–Q Plot: to compare the quantiles of your data (vertical axis) with the quantiles of a standard normal distribution (horizontal axis). Points that fall close to a straight line suggest that the data are approximately normally distributed.
• Histogram: to roughly assess the probability distribution of a given variable by depicting the frequencies of observations occurring in certain ranges of values
• Density Plot: to estimate the probability density function of the data

Normal Q-Q plot

Histogram

When the number of bins is set to 0, the plot uses the default number of bins.

Density plot

#### Output 2. Test Result 1

Check the equality of the 2 variances
Explanations
• P Value < 0.05, then refer to the Welch Two-Sample t-test
• P Value >= 0.05, then refer to the Two-Sample t-test

In this example, the P value of the F test was about 0.15 (> 0.05), so there was no evidence that the two variances differed. Thus, we should refer to the results from 'Two-Sample t-test'.
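For orientation, the statistic behind an F test of equal variances is simply the ratio of the two sample variances. The sketch below computes that ratio and its degrees of freedom for two small hypothetical groups; `variance_ratio` is an illustrative name, and the p-value itself (which needs the F distribution, typically from a statistics library) is deliberately left out.

```python
import statistics

def variance_ratio(x, y):
    """F statistic (ratio of sample variances) with its two degrees of freedom."""
    f = statistics.variance(x) / statistics.variance(y)
    return f, len(x) - 1, len(y) - 1

group1 = [2, 4, 6, 8]  # hypothetical values, sample variance 20/3
group2 = [1, 2, 3, 4]  # hypothetical values, sample variance 5/3
f_stat, df1, df2 = variance_ratio(group1, group2)
print(f"F = {f_stat:.3f} on ({df1}, {df2}) degrees of freedom")
```

A ratio near 1 is consistent with equal variances; how far from 1 counts as significant depends on both degrees of freedom.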

#### Output 3. Test Result 2

Decide the T Test

Explanations
• P Value < 0.05, then the population mean of Group 1 IS significantly different from that of Group 2. (Reject the null hypothesis in favor of the alternative hypothesis)
• P Value >= 0.05, then there is NO significant difference between the means of Group 1 and Group 2. (Fail to reject the null hypothesis)

In this example, we concluded that the age of the ER-positive lymph node-positive population was not significantly different from that of the ER-negative population (P = 0.55, from 'Two-Sample t-test').
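The two variants named above differ in how the standard error and the degrees of freedom are computed. The following Python sketch shows both side by side for hypothetical ages; the function names are made up, and the p-value (from the t distribution with the returned degrees of freedom) is left to a statistics library.

```python
import math
import statistics

def pooled_t(x, y):
    """Two-Sample t statistic and df, assuming equal variances (pooled variance)."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, n1 + n2 - 2

def welch_t(x, y):
    """Welch t statistic and Welch-Satterthwaite df, allowing unequal variances."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1   # squared standard error of mean(x)
    b = statistics.variance(y) / n2   # squared standard error of mean(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

er_positive = [45, 52, 38, 61, 49, 55, 43, 58]  # hypothetical ages
er_negative = [47, 50, 41, 59, 53, 44, 56, 48]  # hypothetical ages
t_p, df_p = pooled_t(er_positive, er_negative)
t_w, df_w = welch_t(er_positive, er_negative)
print(f"pooled: t={t_p:.3f}, df={df_p}")
print(f"Welch:  t={t_w:.3f}, df={df_w:.2f}")
```

When the two groups have the same size, as here, the two t statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances the statistics themselves can diverge.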

# Dependent T-Test for Paired Samples

In the paired case, we compare the differences between the 2 groups to zero. Thus, it becomes a one-sample test problem.

#### 1. Functionalities

• To determine whether the mean difference between the 2 paired samples is equal to 0
• To view the basic descriptive statistics of your data
• To view descriptive plots (box plot, mean-SD plot, Q-Q plot, histogram, and density plot) and judge whether your data are close to a normal distribution

• Your data contain 2 separate groups/sets (or 2 numeric vectors)
• The 2 samples are matched or paired
• The differences of paired samples are approximately normally distributed

#### 3. Examples for Matched or Paired Data

• One person's pre-test and post-test scores
• When there are two samples that have been matched or paired

#### Case Example

Suppose we wanted to know whether a certain drug had an effect on people's sleeping hours. We recruited 10 people and collected their sleeping hours before and after taking the drug. This was a paired case. We wanted to know whether the sleeping hours before and after the drug were significantly different; or, equivalently, whether the differences between before and after were significantly different from 0.
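The reduction of a paired comparison to a one-sample problem can be sketched as follows. This is an illustration only: the sleeping hours are hypothetical, `paired_t` is a made-up helper name, and the p-value (from the t distribution with the returned degrees of freedom) is left to a statistics library.

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic and df: a one-sample t-test of the differences against 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]   # after minus before
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)      # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1

before = [6.0, 7.5, 5.5, 6.5, 7.0]  # hypothetical sleeping hours before the drug
after = [6.5, 7.0, 6.5, 7.0, 7.5]   # hypothetical sleeping hours after the drug
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

Note that only the within-person differences enter the calculation, which is why the normality assumption applies to the differences rather than to each group separately.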

#### Output 1. Descriptive Results

Basic Descriptives of the Difference

Explanations
• The band inside the box is the median
• The box spans the 25th to the 75th percentile, so its height is the interquartile range
• Outliers, if any, are shown in red

Explanations
• Normal Q–Q Plot: to compare the quantiles of the differences (vertical axis) with the quantiles of a standard normal distribution (horizontal axis). Points that fall close to a straight line suggest that the differences are approximately normally distributed.
• Histogram: to roughly assess the probability distribution of a given variable by depicting the frequencies of observations occurring in certain ranges of values
• Density Plot: to estimate the probability density function of the difference

Normal Q-Q plot

Histogram

When the number of bins is set to 0, the plot uses the default number of bins.

Density plot

#### Output 2. Test Results

Explanations
• P Value < 0.05, then Group 1 (Before) and Group 2 (After) are significantly different. (Reject the null hypothesis in favor of the alternative hypothesis)
• P Value >= 0.05, then there is NO significant difference between the 2 groups. (Fail to reject the null hypothesis)

With the default settings, we concluded that the drug had no significant effect on sleeping hours (P = 0.2).