## Question 532:


This question asks us to test whether the observed frequencies fit the expected frequencies. The Chi-Square Goodness of Fit test is appropriate for this. It's a straightforward comparison: we want to know whether the observed counts differ from the expected counts by more than chance fluctuation would explain. To do so we use the following formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

• O means the observed frequencies
• E means the expected frequencies

We're told the expected proportion for each gene type, so now we need to generate the expected frequencies by multiplying each proportion by 145, which is the total sample size.
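The multiplication can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the original problem.

```python
# Expected proportions for each gene type, as given in the problem.
expected_props = {"AA": 0.25, "Aa": 0.50, "aa": 0.25}
n = 145  # total sample size

# Expected count = expected proportion * total sample size.
expected_counts = {gene: prop * n for gene, prop in expected_props.items()}

for gene, count in expected_counts.items():
    print(f"{gene}: {count}")
```

This gives 36.25 for AA, 72.5 for Aa, and 36.25 for aa.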

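| Gene | Expected Freq. | Observed** | Expected* | (O-E)² | (O-E)²/E |
|------|----------------|------------|-----------|--------|----------|
| AA | .25 | 20 (.138) | 36.25 | 264.0625 | 7.28 |
| Aa | .50 | 90 (.62) | 72.5 | 306.25 | 4.22 |
| aa | .25 | 35 (.24) | 36.25 | 1.5625 | 0.043 |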

* Found by multiplying Expected Freq. by 145
** Values in parentheses are the proportions observed out of 145 (e.g., 20/145 = .138)
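The per-row values in the table can be reproduced as follows. This is a sketch under the assumption that the observed counts and expected proportions are exactly those listed above; the names `observed`, `expected_props`, and `contributions` are made up for illustration.

```python
# Observed counts and expected proportions from the table.
observed = {"AA": 20, "Aa": 90, "aa": 35}
expected_props = {"AA": 0.25, "Aa": 0.50, "aa": 0.25}
n = sum(observed.values())  # 145

# Each gene type contributes (O - E)^2 / E to the statistic.
contributions = {}
for gene, o in observed.items():
    e = expected_props[gene] * n
    contributions[gene] = (o - e) ** 2 / e

# Summing the contributions gives the Chi-Square statistic.
chi_square = sum(contributions.values())

for gene, c in contributions.items():
    print(f"{gene}: {c:.3f}")
print(f"chi-square = {chi_square:.3f}")
```

The contributions come out to about 7.284, 4.224, and 0.043, for a total of about 11.552.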

Now we add up all the values in the final column (each squared difference divided by E) to get 11.552, which is our Chi-Square statistic. To interpret this, we can compare it to the critical value from a Chi-Square table, or we can get the p-value directly with the Excel function =CHIDIST(11.552,2), where the second parameter is the degrees of freedom (groups - 1 = 3 - 1 = 2). We get .003, which is less than the alpha of .05, so we reject the null hypothesis. For Chi-Square, the null hypothesis is that there is no difference between the observed and expected frequencies. At p = .003 we have strong evidence to conclude that the observed frequencies DO NOT fit the expected frequencies. This is mostly due to the lower occurrence of AA, which was observed at .138 when it was expected to occur at .25.
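As a cross-check on the spreadsheet result, the p-value and the critical value can also be computed by hand. The sketch below relies on the fact that, for exactly 2 degrees of freedom, the Chi-Square upper-tail probability has the closed form exp(-x/2). That shortcut does not hold for other degrees of freedom, where a table or a statistics library would be needed. The variable names are illustrative.

```python
import math

chi_square = 11.5517  # statistic from the table, before rounding
df = 2                # groups - 1 = 3 - 1
alpha = 0.05

# Closed form for the upper-tail probability; valid only when df == 2.
p_value = math.exp(-chi_square / 2)

# Critical value at alpha: solve exp(-x / 2) = alpha for x (df == 2 only).
critical_value = -2 * math.log(alpha)

print(f"p-value = {p_value:.4f}")
print(f"critical value = {critical_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The p-value is about .0031 and the critical value about 5.991, so both routes lead to the same decision: 11.552 exceeds 5.991, and .003 is below .05.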