One-way ANOVA to Compare Means from Multiple Factor Groups and Multiple Comparison Post-Hoc Correction for Specific Groups

1. Functionalities

  • To determine if the means differ significantly among the factor groups
  • To determine if the means differ significantly among pairs, given that one-way ANOVA finds significant differences among factor groups.

2. About your data

  • Your data contain several separate factor groups, entered as 2 vectors
  • One vector holds the observed values; the other marks the factor group each value belongs to
  • The factor groups are independent, and the values within each group are approximately normally distributed
  • The groups share the same variance, so their means can be compared

Case Example

Suppose we want to find whether passive smoking had a measurable effect on the incidence of cancer. In a study, we examined 6 smoking groups: nonsmokers (NS), passive smokers (PS), non-inhaling smokers (NI), light smokers (LS), moderate smokers (MS), and heavy smokers (HS). The study measured the forced mid-expiratory flow (FEF). We wanted to know whether FEF differed among the 6 groups.

Please follow the Steps, and Outputs will give real-time analytical results.


One-way ANOVA

Step 1. Data Preparation

1. Give names to your Values and Factor Group


2. Input data


The example here is the FEF data from the smoking groups. Detailed information can be found in Output 1.

Please follow the example to input your data

Data points can be separated by a comma, semicolon, Enter, Tab, or Space

Data can be copied from a CSV file (one column) and pasted into the box

Sample Values

Factor group

Enter missing values as NA so that the 2 sets have equal length; otherwise, an error will occur
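The input rules above (several accepted separators, NA for missing values, equal lengths) can be sketched in Python. This is only an illustration of the rules, not the tool's own code: the function names and the sample numbers are hypothetical.

```python
import re

def parse_field(text):
    """Split a pasted field on commas, semicolons, and any whitespace
    (Space, Tab, Enter)."""
    return [tok for tok in re.split(r"[,;\s]+", text.strip()) if tok]

def pair_values_with_groups(values_text, groups_text):
    """Pair each value with its group label, skipping pairs marked NA.

    Raises ValueError when the two fields have different lengths, which is
    the situation the NA placeholder is meant to prevent.
    """
    values = parse_field(values_text)
    groups = parse_field(groups_text)
    if len(values) != len(groups):
        raise ValueError(
            f"length mismatch: {len(values)} values vs {len(groups)} labels"
        )
    pairs = []
    for value, group in zip(values, groups):
        if value == "NA" or group == "NA":
            continue  # missing observation: keep alignment, drop the pair
        pairs.append((float(value), group))
    return pairs

# Hypothetical input mixing all the accepted separators
pairs = pair_values_with_groups("3.5, 4.1; NA\n2.9\t3.3", "NS NS PS PS HS")
print(pairs)
```

The third value is NA, so its pair is dropped while the remaining values stay aligned with their labels.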


Uploaded data will replace the example data

Please refer to the example data format to upload new data

2. Show 1st row as column names?

3. Use 1st column as row names? (No duplicates)

The correct separator and quote settings ensure that the data are read successfully

Find some example data here


Hypothesis

Null hypothesis

The means from each group are equal

Alternative hypothesis

At least two factor groups have significantly different means

In this example, we wanted to know if the FEF values were different among the 6 smoking groups

Output 1. Descriptive Results


The categories in the Factor Group


Descriptive statistics by group


Explanations
  • The band inside the box is the median
  • The height of the box is the interquartile range, the difference between the 75th and 25th percentiles
  • Outliers, if any, are shown in red
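The box plot quantities can be reproduced by hand. The Python sketch below is illustrative only: the sample values are hypothetical, the 1.5 x IQR rule for flagging outliers is the common box plot convention and is assumed here rather than confirmed for this tool, and quartile definitions vary slightly between software packages.

```python
import statistics

def box_summary(values, whisker=1.5):
    """Return the median, quartiles, IQR, and points outside the fences.

    The fence multiplier `whisker` defaults to 1.5, an assumed convention.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "outliers": outliers}

# Hypothetical sample with one extreme value
summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary)
```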



Output 2. ANOVA Table



Explanations
  • DF_Factor = [number of factor group categories] - 1
  • DF_Residuals = [number of sample values] - [number of factor group categories]
  • MS = SS/DF
  • F = MS_Factor / MS_Residuals
  • P Value < 0.05: the population means differ significantly among the factor groups (reject the null hypothesis in favor of the alternative)
  • P Value >= 0.05: there is no significant difference among the factor groups (fail to reject the null hypothesis)
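To make the table entries concrete, here is a minimal Python sketch that computes DF, SS, MS, and F for a one-way layout. The function name and the three small groups are hypothetical; the P value is left out because it requires the F distribution, which the Python standard library does not provide.

```python
def one_way_anova_table(groups):
    """Compute the one-way ANOVA table entries.

    `groups` maps each factor group label to its list of observed values.
    """
    all_values = [v for vals in groups.values() for v in vals]
    n_values, n_groups = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_values

    ss_factor = 0.0     # variation between group means
    ss_residuals = 0.0  # variation within groups
    for vals in groups.values():
        group_mean = sum(vals) / len(vals)
        ss_factor += len(vals) * (group_mean - grand_mean) ** 2
        ss_residuals += sum((v - group_mean) ** 2 for v in vals)

    df_factor = n_groups - 1
    df_residuals = n_values - n_groups
    ms_factor = ss_factor / df_factor
    ms_residuals = ss_residuals / df_residuals
    return {
        "df_factor": df_factor, "df_residuals": df_residuals,
        "ss_factor": ss_factor, "ss_residuals": ss_residuals,
        "ms_factor": ms_factor, "ms_residuals": ms_residuals,
        "f": ms_factor / ms_residuals,
    }

# Hypothetical data: three groups of three values each
table = one_way_anova_table({"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]})
print(table)
```

A large F means the group means vary much more than the spread within groups would explain.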

In this example, the smoking-group factor was significant, so we could conclude that FEF differed significantly among the 6 groups.


When P < 0.05, if you want to find which pairwise factor groups are significantly different, please continue with Multiple Comparison


Multiple Comparison


Hypothesis

Null hypothesis

For a given pair of factor groups, the two means are equal

Alternative hypothesis

For a given pair of factor groups, the two means are significantly different

In this example, we wanted to know which pairs of the 6 smoking groups differed in FEF values


Step 2. Choose Multiple Comparison Methods



Explanations
  • Bonferroni correction is a generic but very conservative approach
  • Bonferroni-Holm is less conservative than Bonferroni and uniformly more powerful
  • False Discovery Rate-BH, developed by Benjamini and Hochberg, controls the false discovery rate rather than the family-wise error rate, so it is more powerful than the Bonferroni-type corrections
  • False Discovery Rate-BY, developed by Benjamini and Yekutieli, also controls the false discovery rate; it is more conservative than BH but remains valid under arbitrary dependence among the tests
  • Scheffe procedure controls the error rate over all possible contrasts, not only pairwise comparisons
  • Tukey Honest Significant Difference is designed for comparing all pairs of group means; for unequal group sizes it relies on the Tukey-Kramer adjustment
  • Dunnett is useful for comparing all treatment groups with a single control group
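The difference between Bonferroni and Bonferroni-Holm can be seen on a small set of raw P values. The Python sketch below uses hypothetical P values and function names; it follows the usual step-down form of Holm's procedure and is not taken from this tool's implementation.

```python
def bonferroni(pvals):
    """Multiply every P value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    """Holm step-down adjustment.

    The smallest P value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values in the same order as the
    raw ones.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, pvals[idx] * (m - rank))
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

raw = [0.01, 0.04, 0.03]  # hypothetical raw pairwise P values
bonf_adj = bonferroni(raw)
holm_adj = holm(raw)
print(bonf_adj, holm_adj)
```

Every Holm-adjusted value is less than or equal to its Bonferroni counterpart, which is why Holm never rejects fewer hypotheses.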
    Output 3. Multiple Comparison Results


    Pairwise P Value Table


    Explanations
    • In the matrix, P < 0.05 indicates a statistically significant difference between the pair
    • In the matrix, P >= 0.05 indicates no statistically significant difference between the pair

    In this example, we used the Bonferroni-Holm method to find the pairs with P < 0.05. HS was significantly different from the other groups; LS was significantly different from MS and NS; MS was significantly different from NI and PS; NI was significantly different from NS.


    Two-way ANOVA to Compare Means from Multiple Groups and Multiple Comparison Post-Hoc Correction for Specific Groups

    1. Functionalities

    • To determine if the means differ significantly among the levels of Factor1 after controlling for Factor2
    • To determine if the means differ significantly among the levels of Factor2 after controlling for Factor1
    • To determine if Factor1 and Factor2 interact in their effect on the outcome
    • To determine which pairs of groups differ significantly in their means, given that two-way ANOVA finds significant differences among groups

    2. About your data

    • Your data contain several separate factor groups (defined by 2 factor vectors)
    • The factor groups are independent, and the values within each group are approximately normally distributed
    • The groups share the same variance, so their means can be compared

    Case Example

    Suppose we were interested in the effects of sex and 3 dietary groups on systolic blood pressure (SBP). The 3 dietary groups were strict vegetarians (SV), lactovegetarians (LV), and people with a normal diet (NOR), and we measured their SBP. The effects of sex and dietary group might be related to (interact with) each other. We wanted to know the effects of dietary group and sex on SBP and whether the two factors interacted.

    Please follow the Steps, and Outputs will give real-time analytical results.


    Step 1. Data Preparation

    1. Give names to your Values and 2 Factors Group variables


    2. Input data


    The example here is the full metastasis-free follow-up time (months) of 100 lymph node positive patients, classified by 3 tumor grades and 2 levels of ER.

    Please follow the example to input your data

    Data points can be separated by a comma, semicolon, Enter, Tab, or Space

    Data can be copied from a CSV file (one column) and pasted into the box

    Sample Values

    Factor 1

    Factor 2

    Enter missing values as NA so that the 3 sets have equal length; otherwise, an error will occur


    Uploaded data will replace the example data

    Please refer to the example data format to upload new data

    2. Show 1st row as column names?

    3. Use 1st column as row names? (No duplicates)

    The correct separator and quote settings ensure that the data are read successfully

    Find some example data here


    Hypothesis

    Null hypothesis

    1. The population means under the first factor are equal.

    2. The population means under the second factor are equal

    3. There is no interaction between the two factors

    Alternative hypothesis

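    1. The first factor has an effect: the population means under the first factor are not all equal

    2. The second factor has an effect: the population means under the second factor are not all equal

    3. There is an interaction between the two factors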

    In this example, we wanted to know whether the metastasis-free follow-up time differed by tumor grade after controlling for ER

    Output 1. Descriptive Results


    The categories in the Factor 1

    The categories in the Factor 2





    Output 2. ANOVA Table



    Explanations
    • DF_Factor = [number of factor group categories] - 1
    • DF_Interaction = DF_Factor1 x DF_Factor2
    • DF_Residuals = [number of sample values] - [number of factor1 group categories] x [number of factor2 group categories]
    • MS = SS/DF
    • F = MS_Factor / MS_Residuals
    • P Value < 0.05: the population means differ significantly among the factor groups (reject the null hypothesis in favor of the alternative)
    • P Value >= 0.05: there is no significant difference among the factor groups (fail to reject the null hypothesis)
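As a quick check of the degrees-of-freedom rules, the Python sketch below applies them to the layout of the example data (100 values, 3 tumor grades, 2 ER levels). The function name is hypothetical, and the sketch assumes a model with interaction in which every combination of the two factors contains at least one observation.

```python
def two_way_degrees_of_freedom(n_values, n_levels_factor1, n_levels_factor2):
    """Degrees of freedom for a two-way ANOVA with an interaction term."""
    df_factor1 = n_levels_factor1 - 1
    df_factor2 = n_levels_factor2 - 1
    df_interaction = df_factor1 * df_factor2
    df_residuals = n_values - n_levels_factor1 * n_levels_factor2
    return {
        "factor1": df_factor1,
        "factor2": df_factor2,
        "interaction": df_interaction,
        "residuals": df_residuals,
    }

# 100 patients, 3 tumor grades, 2 ER levels
df = two_way_degrees_of_freedom(100, 3, 2)
print(df)
```

The four entries add up to the number of values minus 1, which is a convenient sanity check on any two-way table.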

    In this example, dietary type and sex both had significant effects on SBP (P < 0.001), and dietary type also interacted significantly with sex (P < 0.001).


    When P < 0.05, if you want to find which pairwise factor groups are significantly different, please continue with Multiple Comparison


    Multiple Comparison


    Hypothesis

    Null hypothesis

    The means from each group are equal

    Alternative hypothesis

    At least two groups have significantly different means

    In this example, we wanted to know whether the metastasis-free follow-up time differed among the tumor grades (three ordered levels)


    Step 2. Choose Multiple Comparison Methods



    Explanations
    • Scheffe procedure controls the error rate over all possible contrasts, not only pairwise comparisons
    • Tukey Honest Significant Difference is designed for comparing all pairs of group means; for unequal group sizes it relies on the Tukey-Kramer adjustment
    Output 2. Test Results


    Pairwise P Value Table under Each Factor


    Explanations
    • In the matrix, P < 0.05 indicates a statistically significant difference between the pair
    • In the matrix, P >= 0.05 indicates no statistically significant difference between the pair

    In this example, all the pairs (normal vs LV, SV vs LV, SV vs normal, and male vs female) differed significantly in SBP.


    Kruskal-Wallis Non-parametric Test to Compare Multiple Samples and Multiple Comparison Post-Hoc Correction for Specific Groups

    This method compares the ranks of the observed data rather than the means and SDs. It is an alternative to one-way ANOVA that makes no assumption about the distribution of the data.
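To show what comparing ranks means in practice, here is a minimal Python sketch of the Kruskal-Wallis H statistic with the standard correction for ties. The function names and the toy data are hypothetical, and the P value (from a chi-square distribution with one fewer degrees of freedom than the number of groups) is not computed.

```python
def midranks(values):
    """Rank the values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = average_rank
        pos = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic; assumes the values are not all identical."""
    values = [v for vals in groups.values() for v in vals]
    n = len(values)
    ranks = midranks(values)

    weighted_rank_sums = 0.0
    start = 0
    for vals in groups.values():
        rank_sum = sum(ranks[start:start + len(vals)])
        weighted_rank_sums += rank_sum ** 2 / len(vals)
        start += len(vals)
    h = 12 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n)
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    return h / correction

# Hypothetical data: three completely separated groups
h_stat = kruskal_wallis_h({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
print(h_stat)
```

Only the ordering of the values enters the statistic, so replacing 9 with 900 in this example leaves H unchanged.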

    1. Functionalities

    • To determine if the values differ significantly among the factor groups, based on their ranks
    • To determine which pairs of groups differ significantly, given that the Kruskal-Wallis test finds significant differences among groups

    2. About your data

    • Your data contain several separate factor groups, entered as two vectors
    • One vector holds the observed values; the other marks the factor group each value belongs to
    • The separate factor groups are independent; no assumption is made about their distribution

    Case Example

    Suppose we want to find whether passive smoking had a measurable effect on the incidence of cancer. In a study, we examined 6 smoking groups: nonsmokers (NS), passive smokers (PS), non-inhaling smokers (NI), light smokers (LS), moderate smokers (MS), and heavy smokers (HS). The study measured the forced mid-expiratory flow (FEF). We wanted to know whether FEF differed among the 6 groups.

    Please follow the Steps, and Outputs will give real-time analytical results.


    Step 1. Data Preparation

    1. Give names to your Values and Factor Group


    2. Input data


    The example here is the FEF data from the smoking groups. Detailed information can be found in Output 1.

    Please follow the example to input your data

    Data points can be separated by a comma, semicolon, Enter, Tab, or Space

    Data can be copied from a CSV file (one column) and pasted into the box

    Sample Values

    Factor group

    Enter missing values as NA so that the 2 sets have equal length; otherwise, an error will occur


    Uploaded data will replace the example data

    Please refer to the example data format to upload new data

    2. Show 1st row as column names?

    3. Use 1st column as row names? (No duplicates)

    The correct separator and quote settings ensure that the data are read successfully

    Find some example data here


    Hypothesis

    Null hypothesis

    The groups have the same distribution, so their mean ranks are equal

    Alternative hypothesis

    At least two factor groups differ significantly in their mean ranks

    In this example, we wanted to know if the FEF values were different among the 6 smoking groups

    Output 1. Descriptive Results


    The categories in the Factor Group


    Descriptive statistics by group



    Output 2. Test Results



    In this example, the smoking-group factor was significant, so we could conclude from the Kruskal-Wallis rank sum test that FEF differed significantly among the 6 groups.


    When P < 0.05, if you want to find which pairwise factor groups are significantly different, please continue with Multiple Comparison


    Multiple Comparison


    Hypothesis

    Null hypothesis

    For a given pair of factor groups, the two groups have the same distribution

    Alternative hypothesis

    For a given pair of factor groups, the two groups differ significantly

    In this example, we wanted to know which pairs of the 6 smoking groups differed in FEF values


    Step 2. Choose Multiple Comparison Methods



    Explanations
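    • Bonferroni adjusted p-values = min(1, pm); m = k(k-1)/2 is the number of pairwise comparisons among k groups
    • Sidak adjusted p-values = min(1, 1 - (1 - p)^m)
    • Holm's adjusted p-values = min[1, p(m+1-i)]; i is the ordering index of the p-values sorted from smallest to largest
    • Holm-Sidak adjusted p-values = min[1, 1 - (1 - p)^(m+1-i)]; same ordering as Holm
    • Hochberg's adjusted p-values = min[1, p*i]; i is the ordering index of the p-values sorted from largest to smallest
    • Benjamini-Hochberg adjusted p-values = min[1, pm/(m+1-i)]; p-values sorted from largest to smallest
    • Benjamini-Yekutieli adjusted p-values = min[1, pmC/(m+1-i)]; C = 1 + 1/2 + ... + 1/m; p-values sorted from largest to smallest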
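A few of these formulas are worked through in the Python sketch below, using the 6 smoking groups to obtain m. The function names and raw p-values are hypothetical. The Benjamini-Hochberg function adds a running minimum so that the adjusted values keep the order of the raw ones; that step is the usual step-up convention and is assumed here, not taken from the formulas above.

```python
def n_pairwise_comparisons(k):
    """Number of pairwise comparisons among k groups: m = k(k-1)/2."""
    return k * (k - 1) // 2

def bonferroni_adjust(p, m):
    return min(1.0, p * m)

def sidak_adjust(p, m):
    return min(1.0, 1 - (1 - p) ** m)

def benjamini_hochberg_adjust(pvals):
    """Apply p*m/(m+1-i) with i indexing the p-values from largest to smallest."""
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for i, idx in enumerate(order, start=1):
        candidate = min(1.0, pvals[idx] * m / (m + 1 - i))
        running_min = min(running_min, candidate)  # assumed monotonicity step
        adjusted[idx] = running_min
    return adjusted

m = n_pairwise_comparisons(6)          # 6 smoking groups
p_bonf = bonferroni_adjust(0.002, m)   # hypothetical raw p-value
p_sidak = sidak_adjust(0.002, m)
bh = benjamini_hochberg_adjust([0.01, 0.04, 0.03])  # hypothetical p-values
print(m, p_bonf, p_sidak, bh)
```

The Sidak value is slightly smaller than the Bonferroni value for the same raw p-value, so Sidak is marginally less conservative.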
    Output 2. Test Results


    Reject Null Hypothesis if p <= 0.025


    In this example, FEF was not significantly different between the LS-NI, LS-PS, and NI-PS pairs. For all other pairs, P < 0.025.