# One-way ANOVA to Compare Means from Multiple Factor Groups and Multiple Comparison Post-Hoc Correction for Specific Groups

#### 1. Functionalities

• To determine if the means differ significantly among the factor groups
• To determine if the means differ significantly among pairs, given that one-way ANOVA finds significant differences among factor groups.

• Your data contain several separate factor groups shown in 2 vectors
• One vector is the observed values; one vector is to mark your values in different factor groups
• The factor groups are independent, and the values within each group are approximately normally distributed
• The groups share the same variance, so their means can be compared

#### Case Example

Suppose we want to find whether passive smoking had a measurable effect on the incidence of cancer. In a study, we studied 6 groups of smokers: nonsmokers (NS), passive smokers (PS), non-inhaling smokers (NI), light smokers (LS), moderate smokers (MS), and heavy smokers (HS). The study measured the forced mid-expiratory flow (FEF). We wanted to know whether FEF differed among the 6 groups.

#### Step 1. Data Preparation

1. Give names to your Values and Factor Group

2. Input data

The example here is the FEF data from the smoking groups. Detailed information can be found in Output 1.

Data points can be separated by , ; /Enter /Tab /Space

Data can be copied from a CSV file (one column) and pasted into the box

Sample Values

Factor group

Missing values are entered as NA so that the 2 sets have equal length; otherwise, an error occurs

Uploaded data will replace the example data

2. Show 1st row as column names?

3. Use 1st column as row names? (No duplicates)

The correct separator and quote settings ensure that the data are read successfully

Find some example data here
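The input rules above (several accepted separators, NA for missing values, two sets of equal length) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `parse_values`, `parse_labels`, and `pair_complete_cases` are hypothetical, the numbers are made up, and dropping incomplete pairs is an assumed way of handling NA, not a description of how the tool itself is implemented.

```python
import re

SEPARATORS = r"[,;\s]+"  # comma, semicolon, Enter, Tab, or Space

def parse_values(text):
    """Split pasted text into numbers; the token 'NA' becomes None."""
    tokens = [t for t in re.split(SEPARATORS, text.strip()) if t]
    return [None if t == "NA" else float(t) for t in tokens]

def parse_labels(text):
    """Split pasted text into group labels; the token 'NA' becomes None."""
    tokens = [t for t in re.split(SEPARATORS, text.strip()) if t]
    return [None if t == "NA" else t for t in tokens]

def pair_complete_cases(values, groups):
    """Pair each value with its group label and drop pairs containing NA."""
    if len(values) != len(groups):
        raise ValueError("Values and factor groups must have equal length")
    return [(v, g) for v, g in zip(values, groups)
            if v is not None and g is not None]

# Made-up input mixing all accepted separators
values = parse_values("3.78, 3.30; NA\n2.80\t3.15")
groups = parse_labels("NS NS PS PS HS")
pairs = pair_complete_cases(values, groups)
print(pairs)
```

The length check mirrors the error mentioned above: without NA placeholders the two sets would no longer line up value by value.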

#### Hypothesis

Null hypothesis

The means from each group are equal

Alternative hypothesis

At least two factor groups have significantly different means

In this example, we wanted to know whether the FEF values differed among the 6 smoking groups

#### Output 1. Descriptive Results

The categories in the Factor Group

Descriptive statistics by group

Explanations
• The band inside the box is the median
• The box measures the difference between 75th and 25th percentiles
• Outliers, if any, are shown in red
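The quantities behind the box plot can be computed directly. The sketch below is a minimal illustration with made-up numbers; the function name `box_summary` is hypothetical, and the 1.5 x IQR rule for flagging outliers is the common box-plot convention, assumed here rather than confirmed for this tool.

```python
import statistics

def box_summary(values):
    """Median, quartiles, IQR, and outliers under the 1.5 * IQR convention."""
    # method="inclusive" interpolates between data points (assumed convention)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                       # height of the box
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "outliers": outliers}

summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```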

#### Output 2. ANOVA Table

Explanations
• DFFactor = [number of factor group categories] -1
• DFResiduals = [number of sample values] - [number of factor group categories]
• MS = SS/DF
• F = MSFactor / MSResiduals
• If P Value < 0.05, the population means differ significantly among the factor groups (reject the null hypothesis)
• If P Value >= 0.05, there is no significant difference among the factor groups (fail to reject the null hypothesis)
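The entries of the ANOVA table follow directly from the formulas above. Here is a minimal sketch with made-up values; the function name `one_way_anova` is hypothetical. The P value would come from the F distribution with (DFFactor, DFResiduals) degrees of freedom and is left out so that the sketch needs only the standard library.

```python
def one_way_anova(values, groups):
    """Return DF, SS, MS, and F for a one-way layout."""
    n = len(values)
    by_group = {}
    for y, g in zip(values, groups):
        by_group.setdefault(g, []).append(y)
    k = len(by_group)                   # number of factor group categories
    grand_mean = sum(values) / n

    # Between-group and within-group sums of squares
    ss_factor = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                    for ys in by_group.values())
    ss_resid = sum((y - sum(ys) / len(ys)) ** 2
                   for ys in by_group.values() for y in ys)

    df_factor, df_resid = k - 1, n - k
    ms_factor, ms_resid = ss_factor / df_factor, ss_resid / df_resid
    return {"DF": (df_factor, df_resid),
            "SS": (ss_factor, ss_resid),
            "MS": (ms_factor, ms_resid),
            "F": ms_factor / ms_resid}

table = one_way_anova([1, 2, 3, 2, 3, 4, 5, 6, 7],
                      ["A", "A", "A", "B", "B", "B", "C", "C", "C"])
print(table)
```

With three groups of three values, DFFactor = 3 - 1 = 2 and DFResiduals = 9 - 3 = 6, matching the rules listed above.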

In this example, the P value for the smoking groups was significant, so we could conclude that FEF differed significantly among the 6 groups.

When P < 0.05, if you want to find which pairs of factor groups are significantly different, please continue with Multiple Comparison.

#### Hypothesis

Null hypothesis

For a given pair of factor groups, the two means are equal

Alternative hypothesis

For a given pair of factor groups, the two means are significantly different

In this example, we wanted to know which pairs of the 6 smoking groups differed in FEF values

#### Step 2. Choose Multiple Comparison Methods

Explanations
• Bonferroni correction is a generic but very conservative approach
• Bonferroni-Holm is less conservative and uniformly more powerful than Bonferroni
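• False Discovery Rate-BH, developed by Benjamini and Hochberg, controls the false discovery rate and is more powerful than the family-wise methods above
• False Discovery Rate-BY, developed by Benjamini and Yekutieli, also controls the false discovery rate; it is more conservative than BH but remains valid under arbitrary dependence among the tests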
• Scheffe procedure controls for the search over any possible contrast
• Tukey Honest Significant Difference compares all pairs of group means; its Tukey-Kramer form is used when the group sizes are unequal
• Dunnett is useful for comparing all treatment groups with a control group
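The p-value adjustments in this list (Bonferroni, Bonferroni-Holm, FDR-BH, FDR-BY) can be compared on a small set of raw p-values. The sketch below is illustrative only: the function names and the p-values are made up, and the running maximum or minimum that keeps adjusted values monotone follows the usual convention for these procedures, which is assumed here.

```python
def bonferroni(p):
    m = len(p)
    return [min(1.0, x * m) for x in p]

def holm(p):
    """Step-down: smallest p gets multiplier m, next m - 1, and so on."""
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    adjusted, running = [0.0] * m, 0.0
    for pos, idx in enumerate(order):
        running = max(running, (m - pos) * p[idx])
        adjusted[idx] = min(1.0, running)
    return adjusted

def fdr(p, yekutieli=False):
    """Benjamini-Hochberg; with yekutieli=True the BY constant C is applied."""
    m = len(p)
    c = sum(1.0 / j for j in range(1, m + 1)) if yekutieli else 1.0
    order = sorted(range(m), key=lambda i: p[i], reverse=True)
    adjusted, running = [0.0] * m, 1.0
    for pos, idx in enumerate(order):
        rank = m - pos                  # rank counted from the smallest p
        running = min(running, p[idx] * m * c / rank)
        adjusted[idx] = running
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]         # made-up pairwise p-values
print("Bonferroni:", bonferroni(raw))
print("Holm:      ", holm(raw))
print("FDR-BH:    ", fdr(raw))
print("FDR-BY:    ", fdr(raw, yekutieli=True))
```

On any input, the Holm values never exceed the Bonferroni values, and the BY values are never smaller than the BH values, which matches the ordering of conservativeness described in the list.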
#### Output 3. Multiple Comparison Results

Pairwise P Value Table

Explanations
• In the matrix, P < 0.05 indicates a statistically significant difference between the pair
• In the matrix, P >= 0.05 indicates no statistically significant difference between the pair

In this example, we used the Bonferroni-Holm method to find the pairs with P < 0.05. HS was significantly different from the other groups; LS was significantly different from MS and NS; MS was significantly different from NI and PS; NI was significantly different from NS.

# Two-way ANOVA to Compare Means from Multiple Groups and Multiple Comparison Post-Hoc Correction for Specific Groups

#### 1. Functionalities

• To determine if the means differ significantly among the levels of Factor1 after controlling for Factor2
• To determine if the means differ significantly among the levels of Factor2 after controlling for Factor1
• To determine if Factor1 and Factor2 interact in their effect on the outcome
• To determine which pairs of means differ significantly, given that two-way ANOVA finds significant differences among groups

• Your data contain several separate factor groups (or 2 vectors)
• The factor groups/sets are independent, and the values within each group are approximately normally distributed
• The groups share the same variance, so their means can be compared

#### Case Example

Suppose we were interested in the effects of sex and 3 dietary groups on SBP. The 3 dietary groups included strict vegetarians (SV), lactovegetarians (LV), and normal (NOR) people, and we measured the SBP. The effects of sex and dietary group might be related to (interact with) each other. We wanted to know the effect of dietary group and sex on the SBP and whether the two factors interacted.

#### Step 1. Data Preparation

1. Give names to your Values and 2 Factors Group variables

2. Input data

The example here is the full metastasis-free follow-up time (months) of 100 lymph-node-positive patients under 3 tumor grades and 2 levels of ER.

Data points can be separated by , ; /Enter /Tab /Space

Data can be copied from a CSV file (one column) and pasted into the box

Sample Values

Factor 1

Factor 2

Missing values are entered as NA so that the 3 sets have equal length; otherwise, an error occurs

Uploaded data will replace the example data

2. Show 1st row as column names?

3. Use 1st column as row names? (No duplicates)

The correct separator and quote settings ensure that the data are read successfully

Find some example data here

#### Hypothesis

Null hypothesis

1. The population means under the first factor are equal.

2. The population means under the second factor are equal

3. There is no interaction between the two factors

Alternative hypothesis

1. The first factor has an effect

2. The second factor has an effect

3. There is interaction between the two factors

In this example, we wanted to know whether the metastasis-free follow-up time differed by tumor grade after controlling for ER

#### Output 1. Descriptive Results

The categories in the Factor 1

The categories in the Factor 2

#### Output 2. ANOVA Table

Explanations
• DFFactor = [number of factor group categories] -1
• DFInteraction = DFFactor1 x DFFactor2
• DFResiduals = [number of sample values] - [number of factor1 group categories] x [number of factor2 group categories]
• MS = SS/DF
• F = MSFactor / MSResiduals
• If P Value < 0.05, the population means differ significantly among the factor groups (reject the null hypothesis)
• If P Value >= 0.05, there is no significant difference among the factor groups (fail to reject the null hypothesis)
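The degrees of freedom and F ratios above can be traced through a small worked case. The sketch below assumes a balanced design (every Factor1 x Factor2 cell holds the same number of values), which keeps the sums of squares simple; unbalanced data need a different decomposition. The function name `two_way_anova_balanced`, the labels, and the numbers are all made up for illustration, and P values (from the F distribution) are omitted.

```python
from collections import defaultdict

def mean(xs):
    return sum(xs) / len(xs)

def two_way_anova_balanced(values, factor1, factor2):
    """DF, SS, MS, and F for a balanced two-factor design with interaction."""
    cells = defaultdict(list)
    for y, f1, f2 in zip(values, factor1, factor2):
        cells[(f1, f2)].append(y)
    levels1, levels2 = sorted(set(factor1)), sorted(set(factor2))
    a, b = len(levels1), len(levels2)
    sizes = {len(ys) for ys in cells.values()}
    if a < 2 or b < 2 or len(cells) != a * b or len(sizes) != 1 or min(sizes) < 2:
        raise ValueError("Sketch needs >= 2 levels per factor and equal "
                         "cell sizes of at least 2")
    r = sizes.pop()                     # replicates per cell
    grand = mean(values)
    mean1 = {l: mean([y for y, f in zip(values, factor1) if f == l])
             for l in levels1}
    mean2 = {l: mean([y for y, f in zip(values, factor2) if f == l])
             for l in levels2}

    ss = {"Factor1": b * r * sum((m - grand) ** 2 for m in mean1.values()),
          "Factor2": a * r * sum((m - grand) ** 2 for m in mean2.values())}
    ss_cells = r * sum((mean(ys) - grand) ** 2 for ys in cells.values())
    ss["Interaction"] = ss_cells - ss["Factor1"] - ss["Factor2"]
    ss["Residuals"] = sum((y - mean(ys)) ** 2
                          for ys in cells.values() for y in ys)

    df = {"Factor1": a - 1, "Factor2": b - 1,
          "Interaction": (a - 1) * (b - 1),
          "Residuals": len(values) - a * b}
    ms_resid = ss["Residuals"] / df["Residuals"]
    table = {}
    for name in df:
        ms = ss[name] / df[name]
        f = None if name == "Residuals" else ms / ms_resid
        table[name] = {"DF": df[name], "SS": ss[name], "MS": ms, "F": f}
    return table

table = two_way_anova_balanced(
    [1, 3, 5, 7, 2, 4, 10, 12],
    ["a1", "a1", "a1", "a1", "a2", "a2", "a2", "a2"],
    ["b1", "b1", "b2", "b2", "b1", "b1", "b2", "b2"])
for name, row in table.items():
    print(name, row)
```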

In this example, dietary type and sex both had significant effects on SBP (P<0.001), and the interaction between dietary type and sex was also significant (P<0.001).

When P < 0.05, if you want to find which pairs of factor groups are significantly different, please continue with Multiple Comparison.

#### Hypothesis

Null hypothesis

The means from each group are equal

Alternative hypothesis

At least two groups have significantly different means

In this example, we wanted to know whether the metastasis-free follow-up time differed by tumor grade (three ordered levels)

#### Step 2. Choose Multiple Comparison Methods

Explanations
• Scheffe procedure controls for the search over any possible contrast
• Tukey Honest Significant Difference compares all pairs of group means; its Tukey-Kramer form is used when the group sizes are unequal
#### Output 3. Test Results

Pairwise P Value Table under Each Factor

Explanations
• In the matrix, P < 0.05 indicates a statistically significant difference between the pair
• In the matrix, P >= 0.05 indicates no statistically significant difference between the pair

In this example, all the pairs (normal vs LV, SV vs LV, SV vs normal, and male vs female) differed significantly in SBP.

# Kruskal-Wallis Non-parametric Test to Compare Multiple Samples and Multiple Comparison Post-Hoc Correction for Specific Groups

This method compares the ranks of the observed data rather than the mean and SD. It is an alternative to one-way ANOVA that does not assume a particular data distribution.
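To make the rank-based idea concrete, here is a minimal sketch of the Kruskal-Wallis H statistic: all values are ranked together (ties share the average rank), and H measures how far the group rank sums are from what equal distributions would give. The function names and the numbers are made up; the P value, which comes from a chi-square distribution with (number of groups - 1) degrees of freedom, is not computed here.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1        # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def kruskal_wallis_h(values, groups):
    """Return (H with tie correction, degrees of freedom)."""
    n = len(values)
    rank_sum, count = {}, {}
    for r, g in zip(average_ranks(values), groups):
        rank_sum[g] = rank_sum.get(g, 0.0) + r
        count[g] = count.get(g, 0) + 1
    h = (12 / (n * (n + 1))
         * sum(rank_sum[g] ** 2 / count[g] for g in rank_sum)
         - 3 * (n + 1))
    # Tie correction: equals 1 when there are no ties
    ties = {}
    for v in values:
        ties[v] = ties.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction, len(rank_sum) - 1

h, df = kruskal_wallis_h([1, 2, 3, 4, 5, 6, 7, 8, 9],
                         ["A", "A", "A", "B", "B", "B", "C", "C", "C"])
print(h, df)
```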

#### 1. Functionalities

• To determine if the distributions (mean ranks) differ significantly among the factor groups
• To determine which pairs differ significantly, given that the Kruskal-Wallis test finds significant differences among groups

• Your data contain several separate factor groups shown in two vectors
• One vector is the observed values; one vector is to mark your values in different factor groups
• The factor groups are independent; no particular distribution is assumed for the values

#### Case Example

Suppose we want to find whether passive smoking had a measurable effect on the incidence of cancer. In a study, we studied 6 groups of smokers: nonsmokers (NS), passive smokers (PS), non-inhaling smokers (NI), light smokers (LS), moderate smokers (MS), and heavy smokers (HS). The study measured the forced mid-expiratory flow (FEF). We wanted to know whether FEF differed among the 6 groups.

#### Step 1. Data Preparation

1. Give names to your Values and Factor Group

2. Input data

The example here is the FEF data from the smoking groups. Detailed information can be found in Output 1.

Data points can be separated by , ; /Enter /Tab /Space

Data can be copied from a CSV file (one column) and pasted into the box

Sample Values

Factor group

Missing values are entered as NA so that the 2 sets have equal length; otherwise, an error occurs

Uploaded data will replace the example data

2. Show 1st row as column names?

3. Use 1st column as row names? (No duplicates)

The correct separator and quote settings ensure that the data are read successfully

Find some example data here

#### Hypothesis

Null hypothesis

The groups have the same distribution (equal mean ranks)

Alternative hypothesis

At least two factor groups differ significantly in their distributions (mean ranks)

In this example, we wanted to know whether the FEF values differed among the 6 smoking groups

#### Output 1. Descriptive Results

The categories in the Factor Group

Descriptive statistics by group

#### Output 2. Test Results

In this example, the Kruskal-Wallis rank sum test was significant for the smoking groups, so we could conclude that FEF differed significantly among the 6 groups.

When P < 0.05, if you want to find which pairs of factor groups are significantly different, please continue with Multiple Comparison.

#### Hypothesis

Null hypothesis

For a given pair of factor groups, the two groups have the same distribution (equal mean ranks)

Alternative hypothesis

For a given pair of factor groups, the two groups differ significantly

In this example, we wanted to know which pairs of the 6 smoking groups differed in FEF values

#### Step 2. Choose Multiple Comparison Methods

Explanations
• Bonferroni adjusted p-values = min(1, pm); m = k(k-1)/2 pairwise comparisons among k groups
• Sidak adjusted p-values = min(1, 1 - (1 - p)^m)
• Holm's adjusted p-values = min[1, p(m+1-i)]; i is the ordering index, with p-values ordered from smallest to largest
• Holm-Sidak adjusted p-values = min[1, 1 - (1 - p)^(m+1-i)]; same ordering as Holm
• Hochberg's adjusted p-values = min[1, p*i]; i is the ordering index, with p-values ordered from largest to smallest
• Benjamini-Hochberg adjusted p-values = min[1, pm/(m+1-i)]; same ordering as Hochberg
• Benjamini-Yekutieli adjusted p-values = min[1, pmC/(m+1-i)]; C = 1 + 1/2 + ... + 1/m; same ordering as Hochberg
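Three of the formulas above (Sidak, Holm-Sidak, Hochberg) are sketched here to show how the ordering index i enters the calculation. The function names and the raw p-values are made up. One labeled assumption: the running maximum (step-down) or minimum (step-up) that keeps the adjusted values monotone is the usual convention for reporting adjusted p-values and is not stated in the list itself.

```python
def n_pairs(k):
    """Number of pairwise comparisons among k groups: m = k(k-1)/2."""
    return k * (k - 1) // 2

def sidak(p):
    m = len(p)
    return [min(1.0, 1 - (1 - x) ** m) for x in p]

def holm_sidak(p):
    """Step-down; i indexes p-values ordered from smallest to largest."""
    m = len(p)
    order = sorted(range(m), key=lambda idx: p[idx])
    adjusted, running = [0.0] * m, 0.0
    for pos, idx in enumerate(order):
        i = pos + 1
        running = max(running, 1 - (1 - p[idx]) ** (m + 1 - i))
        adjusted[idx] = min(1.0, running)
    return adjusted

def hochberg(p):
    """Step-up; i indexes p-values ordered from largest to smallest."""
    m = len(p)
    order = sorted(range(m), key=lambda idx: p[idx], reverse=True)
    adjusted, running = [0.0] * m, 1.0
    for pos, idx in enumerate(order):
        i = pos + 1
        running = min(running, p[idx] * i)
        adjusted[idx] = running
    return adjusted

print(n_pairs(6))                       # comparisons among 6 groups
raw = [0.01, 0.04, 0.03, 0.005]         # made-up raw p-values
print("Sidak:     ", sidak(raw))
print("Holm-Sidak:", holm_sidak(raw))
print("Hochberg:  ", hochberg(raw))
```

With 6 groups, as in the smoking example, `n_pairs(6)` gives m = 15 pairwise comparisons.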
#### Output 3. Test Results

Reject the null hypothesis if p <= 0.025

In this example, FEF was not significantly different in the LS-NI, LS-PS, and NI-PS pairs. For all other pairs, P < 0.025.