# Chi-square Test and Exact Binomial Method for One Proportion

#### 1. Functionalities

• To determine if the population rate/proportion behind your data is significantly different from the specified rate/proportion
• To determine how compatible the sample rate/proportion is with a population rate/proportion
• To determine the probability of success in a Bernoulli experiment

• Your data come from a binomial distribution (the proportion of success)
• You know the size of the whole sample and the number of specified events (the proportion of sub-group)
• You have a specified proportion (p0)

#### Case Example

Suppose that 20% of women in the general population are infertile, and that a treatment may affect infertility. 200 women who were trying to get pregnant accepted the treatment. Among 40 women who got the treatment, 10 were still infertile. We wanted to know whether the rate of infertility among treated women differed significantly from the general infertility rate of 20%.

#### Output 2. Test Results

1. Normal Theory Method with Yates' Continuity Correction, when np0(1-p0) >= 5

2. Exact Binomial Method, when np0(1-p0) < 5

Explanations
• If P Value < 0.05, the population proportion/rate IS significantly different from the specified proportion/rate. (Accept the alternative hypothesis)
• If P Value >= 0.05, the population proportion/rate IS NOT significantly different from the specified proportion/rate. (Do not reject the null hypothesis)

From the default settings, we concluded that the rate of infertility among treated women was not significantly different from the general infertility rate (P = 0.55). In this case, np0(1-p0) = 40*0.2*0.8 = 6.4 > 5, so the Normal Theory Method was preferable.
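The two methods listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code behind this page. The function names are hypothetical, and the exact method shown uses one common two-sided convention (twice the smaller tail, capped at 1); other software may define the two-sided exact P value differently.

```python
import math


def normal_theory_test(x, n, p0):
    """Chi-square statistic (1 df) with Yates' continuity correction, plus P value."""
    expected = n * p0
    variance = n * p0 * (1.0 - p0)
    # Continuity correction: shrink the absolute deviation by 0.5 (never below 0).
    corrected = max(abs(x - expected) - 0.5, 0.0)
    stat = corrected ** 2 / variance
    # Upper tail of a chi-square with 1 df, written with the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def exact_binomial_test(x, n, p0):
    """Two-sided exact P value: twice the smaller binomial tail, capped at 1 (assumed convention)."""
    pmf = [math.comb(n, k) * p0 ** k * (1.0 - p0) ** (n - k) for k in range(n + 1)]
    lower_tail = sum(pmf[: x + 1])  # P(X <= x)
    upper_tail = sum(pmf[x:])       # P(X >= x)
    return min(1.0, 2.0 * min(lower_tail, upper_tail))


# Case example: 10 infertile women among 40 treated, p0 = 0.2.
x, n, p0 = 10, 40, 0.2
if n * p0 * (1.0 - p0) >= 5:
    stat, p_value = normal_theory_test(x, n, p0)
else:
    p_value = exact_binomial_test(x, n, p0)
print(round(p_value, 2))  # 0.55
```

With the case example, n*p0*(1-p0) = 6.4, so the normal theory branch runs and reproduces the P value reported above.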

# Chi-square Test for Two Independent Proportions

#### 1. Functionalities

• To determine whether the population rates/proportions behind your 2 groups of data are significantly different

• Your 2 groups of data come from a binomial distribution (the proportion of success)
• You know the size of the whole sample and the number of specified events (the proportion of sub-group) for each of the 2 groups
• The 2 groups are independent observations

#### Case Example

Suppose all women in the study had at least one birth. We investigated 3220 women with breast cancer as cases. Among them, 683 had at least one birth after 30 years old. We also investigated 10245 women without breast cancer as controls. Among them, 1498 had at least one birth after 30 years old. We wanted to know whether the underlying probability of having a first birth after 30 years old differed between the breast cancer and non-breast cancer groups.

#### Output 1. Data Preview

Data Table

Percentage plots of (1) Case and (2) Control

#### Output 2. Test Results

Explanations
• If P Value < 0.05, the population proportions/rates are significantly different in the two groups. (Accept the alternative hypothesis)
• If P Value >= 0.05, the population proportions/rates are NOT significantly different in the two groups. (Do not reject the null hypothesis)

From the default settings, we concluded that women with breast cancer were significantly more likely to have had their first child after 30 years old than women without breast cancer (P < 0.001).
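As a rough check of this result, the chi-square test for a 2 x 2 table can be computed by hand. The sketch below uses only the Python standard library. The function name is hypothetical, and applying Yates' continuity correction is an assumption here, since this page does not state whether the default settings use it.

```python
import math


def two_proportion_chi2(x1, n1, x2, n2, yates=True):
    """Chi-square test (1 df) comparing two independent proportions x1/n1 and x2/n2."""
    a, b = x1, n1 - x1  # group 1: events, non-events
    c, d = x2, n2 - x2  # group 2: events, non-events
    n = n1 + n2
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction for a 2 x 2 table (assumed option).
        diff = max(diff - n / 2.0, 0.0)
    stat = n * diff ** 2 / (n1 * n2 * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Case example: 683 of 3220 cases and 1498 of 10245 controls.
stat, p_value = two_proportion_chi2(683, 3220, 1498, 10245)
print(p_value < 0.001)  # True
```

The sample proportions are 683/3220 (about 21%) for cases and 1498/10245 (about 15%) for controls, which gives the direction of the difference.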

# Chi-square Test for More than Two Independent Proportions

#### 1. Functionalities

• To determine whether the population rates/proportions behind your multiple groups of data are significantly different

• Your group data come from a binomial distribution (the proportion of success)
• You know the size of the whole sample and the number of specified events (the proportion of sub-group) for each group
• The multiple groups are independent observations

#### Case Example

Suppose we wanted to study the relationship between age at first birth and the development of breast cancer. Thus, we investigated 3220 breast cancer cases and 10245 women without breast cancer. Then, we categorized the women into different age groups. We wanted to know whether the probability of having cancer differed among the age groups, that is, whether age was related to breast cancer.

Data Table

#### Output 2. Test Results

Explanations
• If P Value < 0.05, the population proportions/rates are significantly different. (Accept the alternative hypothesis)
• If P Value >= 0.05, the population proportions/rates are NOT significantly different. (Do not reject the null hypothesis)

In this default setting, we concluded that the probability of having cancer was significantly different among the age groups. (P < 0.001)
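The test for several groups generalizes the two-group case: each group's event count is compared with the count expected under a common proportion, and the statistic is referred to a chi-square distribution with (number of groups - 1) degrees of freedom. The sketch below uses only the Python standard library. The function names and the counts in the usage example are hypothetical and are not the data from the Data Table.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail chi-square probability for a positive integer df (closed form)."""
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        terms = [half ** j / math.factorial(j) for j in range(df // 2)]
        return math.exp(-half) * sum(terms)
    # Odd df: start from the 1-df tail and add half-integer gamma terms.
    terms = [half ** (j - 0.5) / math.gamma(j + 0.5)
             for j in range(1, (df - 1) // 2 + 1)]
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


def k_proportion_chi2(events, totals):
    """Pearson chi-square test that k independent groups share one proportion."""
    n = sum(totals)
    p_bar = sum(events) / n  # pooled proportion under the null hypothesis
    stat = sum((x - m * p_bar) ** 2 / (m * p_bar * (1.0 - p_bar))
               for x, m in zip(events, totals))
    df = len(events) - 1
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts: events out of 100 subjects in each of three groups.
stat, df, p_value = k_proportion_chi2([15, 25, 40], [100, 100, 100])
print(df, p_value < 0.001)  # 2 True
```

The per-group form of the statistic used here is algebraically the same as summing (observed - expected)^2 / expected over all cells of the 2 x k table.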

# Chi-square Test for Trend in Multiple Independent Samples

#### 1. Functionalities

• To determine whether the population rates/proportions behind your multiple groups of data show a trend across the ordered groups

• Your group data come from a binomial distribution (the proportion of success)
• You know the size of the whole sample and the number of specified events (the proportion of sub-group) for each group
• The multiple groups are independent observations

#### Case Example

Suppose we wanted to study the relationship between age at first birth and the development of breast cancer. Thus, we investigated 3220 breast cancer cases and 10245 women without breast cancer. Then, we categorized the women into different age groups. In this example, we wanted to know whether the rate of cancer showed a trend, increasing or decreasing, from the youngest to the oldest age group.

Data Table

Cell-Column %

#### Output 2. Test Results

Explanations
• If P Value < 0.05, Case-Control (Row) is significantly associated with the grouped Factors (Column). (Accept the alternative hypothesis)
• If P Value >= 0.05, Case-Control (Row) is not significantly associated with the grouped Factors (Column). (Do not reject the null hypothesis)

In this default setting, we concluded that the proportion of cancer varied among different ages. (P = 0.01)
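A trend test differs from the general test for several proportions in that it assigns an ordered score to each group and asks whether the proportions rise or fall with that score. The sketch below shows one standard form, a 1-df chi-square test for linear trend in proportions, using only the Python standard library. The function name, the default scores 1, 2, ..., k, and the counts in the usage example are assumptions for illustration and are not taken from the Data Table.

```python
import math


def chi2_trend_test(events, totals, scores=None):
    """Chi-square test (1 df) for a linear trend in proportions across ordered groups."""
    if scores is None:
        # Assumed default: equally spaced scores 1, 2, ..., k.
        scores = list(range(1, len(events) + 1))
    n = sum(totals)
    x = sum(events)
    p_bar = x / n  # pooled proportion
    sum_ns = sum(m * s for m, s in zip(totals, scores))
    sum_ns2 = sum(m * s * s for m, s in zip(totals, scores))
    # a: covariance-like term between event counts and scores; its sign gives the direction.
    a = sum(e * s for e, s in zip(events, scores)) - x * sum_ns / n
    # b: variance of a under the null hypothesis of no trend.
    b = p_bar * (1.0 - p_bar) * (sum_ns2 - sum_ns ** 2 / n)
    stat = a * a / b
    p_value = math.erfc(math.sqrt(stat / 2.0))  # upper tail, chi-square with 1 df
    return stat, p_value, a


# Hypothetical counts: events out of 100 subjects in three ordered groups.
stat, p_value, a = chi2_trend_test([15, 25, 40], [100, 100, 100])
print(a > 0, p_value < 0.001)  # True True
```

A positive value of `a` means the proportion tends to increase with the score, and a negative value means it tends to decrease.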