## Question 622:

Question 1:

To find the area between two points on the normal curve, we find the area below the larger value and subtract the area below the smaller value. We can use either a table of normal probabilities or the z-score to percentile calculator. We'll want the 1-sided (cumulative) area.

Before we do this, we need to convert our raw values into standard scores (z-scores) by subtracting the mean and dividing the result by the standard deviation. For the larger value we get a z-score of (7.36-6.56)/2.4 = .3333 and for the smaller we get (3.25-6.56)/2.4 = -1.379. Now we look up these values in the normal table (or the calculator, which I prefer).

For the larger z-score of .333 we get an area of 63.06% and for the smaller z-score of -1.379 we get 8.39%. Subtracting the smaller area from the larger (before rounding) gives 54.66% of the area. Expressed as a probability, there is a .5466 probability that a randomly selected day will fall between 3.25 and 7.36.
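
The same lookup can be sketched in Python as a check on the table or calculator. This is a minimal sketch that uses the standard library's `NormalDist`; the variable names are illustrative and not part of the original question.

```python
from statistics import NormalDist

# Values taken from the worked answer above.
mean, sd = 6.56, 2.4
low, high = 3.25, 7.36

# Convert the raw values to z-scores.
z_high = (high - mean) / sd
z_low = (low - mean) / sd

# One-sided (cumulative) areas below each z-score.
std_normal = NormalDist()  # mean 0, standard deviation 1
area_high = std_normal.cdf(z_high)
area_low = std_normal.cdf(z_low)

# Area between the two points is the difference of the cumulative areas.
prob_between = area_high - area_low
print(f"z_high={z_high:.4f}, z_low={z_low:.4f}, P(between)={prob_between:.4f}")
```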

Question 2:

We can use the F-test, which is the ratio of the two variances, to test whether the variances differ.

1. The ratio is 6.52/3.47 = 1.87 on 24 degrees of freedom in the numerator and 21 in the denominator.
2. We can use the Excel function =FDIST(1.87,24,21) = .0742, which provides the 1-tailed probability. We need the 2-tailed probability, so we multiply this value by 2 to get .1484, which is our p-value.
3. Since our rejection criterion (alpha) is .01 and the p-value of .1484 is greater than .01, we would NOT reject the null hypothesis (that the variances are equal) and CANNOT conclude that there is a difference between the variances, that is, in the risk of the stocks.
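
For readers without Excel, the steps above can be approximated with the standard library alone. This is a rough sketch, not a replacement for a statistics package: `f_pdf` and `f_upper_tail` are hypothetical helper names, and the tail area is obtained by numerically integrating the F density with Simpson's rule, which assumes more than 2 numerator degrees of freedom so the density is 0 at x = 0.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom."""
    if x <= 0:
        return 0.0  # valid at x = 0 only when d1 > 2 (assumed here)
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_upper_tail(f_stat, d1, d2, steps=2000):
    """Approximate P(F > f_stat) using Simpson's rule on [0, f_stat]."""
    h = f_stat / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, d1, d2) + f_pdf(f_stat, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

f_stat = 1.87        # variance ratio as reported in step 1
alpha = 0.01

one_tailed = f_upper_tail(f_stat, 24, 21)  # counterpart of =FDIST(1.87,24,21)
two_tailed = 2 * one_tailed
reject_null = two_tailed < alpha

print(f"one-tailed p={one_tailed:.4f}, two-tailed p={two_tailed:.4f}, reject={reject_null}")
```

The numerical integration is used only to keep the sketch dependency-free; a library routine for the F distribution would normally be preferable.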

A note on using the F-test: there are multiple ways to test for unequal variances. The F-test is one of them, but it has been criticized as being too sensitive to outliers in the data (the standard deviation is based on the mean, and both are therefore affected by outliers). Other tests, such as Levene's test, are recommended; however, the raw data are needed for them. Since the sample sizes are similar here, it is usually considered appropriate to proceed with t-tests even if the variances differ somewhat, because t-tests are robust to such a violation of homogeneity of variance.

Question 3:

We'd conduct a 1-sample proportion test and since the sample is reasonably large we can use the normal approximation to the binomial. We are testing whether the observed proportion 113/350 = .323 is greater than .30 while taking into account chance fluctuations.

The NULL hypothesis is that the sample is not different than the benchmark of 30% return rate. The alternative hypothesis is that the sample is greater than 30%.

1. The standard rule of thumb is that you're usually OK using the normal approximation if n*p > 5 and n*q > 5, where q is just 1-p. For this example that would be n*p = 350*.30 = 105 and n*q = 350*.70 = 245, so we're just fine.
2. We need to get a z-score out of the observed proportion. For continuous data we usually have the mean and standard deviation. Here we only have the observed count (113) and no standard deviation. Fortunately, for the binomial distribution, the standard deviation of the count is equal to the square root of n*p*q. For this data that would be SQRT(350*.323*.677) = a standard deviation of 8.75.
3. We now need to make a continuity correction, since we have a discrete distribution and are approximating it with a continuous one. We add .5 to the expected count of 105 (.30*350) that we are testing against to get 105.5.
4. Now we compute the z-score with the new data: (113 - 105.5)/8.75 = a z of .857. Using the z-score to percentile calculator gives us a one-sided probability of .195 or 19.5%, much higher than the 2.5% alpha level set.
5. So now we interpret what we got. We see that 32.3% of customers returned the product, which is above 30%. Given our sample size, however, we do not have sufficient evidence to conclude that the true return rate is greater than 30%. At the same time, there is pretty good evidence that this company is NOT beating the benchmark of 30%, as the observed proportion is ABOVE 30%.
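
The steps above can be sketched in Python as follows. This is a minimal sketch that mirrors the worked answer (standard deviation from the observed proportion, continuity correction added to the expected count); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 350          # sample size
returns = 113    # observed number of returns
p0 = 0.30        # benchmark return rate
alpha = 0.025    # one-sided alpha level used in the answer

# Rule-of-thumb check for the normal approximation (both should exceed 5).
expected_returns = n * p0
expected_non_returns = n * (1 - p0)

# Standard deviation of the count, based on the observed proportion as in step 2.
p_hat = returns / n
sd_count = math.sqrt(n * p_hat * (1 - p_hat))

# Continuity correction: add .5 to the expected count we are testing against.
corrected_expected = expected_returns + 0.5

# z-score and one-sided (upper-tail) probability.
z = (returns - corrected_expected) / sd_count
p_one_sided = 1 - NormalDist().cdf(z)

print(f"sd={sd_count:.2f}, z={z:.3f}, one-sided p={p_one_sided:.4f}")
print("reject null" if p_one_sided < alpha else "do not reject null")
```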

If we were to use the binomial probabilities instead of approximating, we'd have a probability of .161 versus the normal approximation of .195, which is pretty close.
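
An exact binomial tail can be summed directly, as in the sketch below. The answer does not say whether the exact figure counts 113 itself, so this sketch computes both P(X >= 113) and P(X > 113) for comparison; `binom_upper_tail` is a hypothetical helper name.

```python
from math import comb

def binom_upper_tail(k_min, n, p):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

n, returns, p0 = 350, 113, 0.30

p_at_least = binom_upper_tail(returns, n, p0)       # P(X >= 113)
p_more_than = binom_upper_tail(returns + 1, n, p0)  # P(X > 113)

print(f"P(X >= 113) = {p_at_least:.4f}")
print(f"P(X > 113)  = {p_more_than:.4f}")
```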

Note: The way this question and data are set up has us testing whether the sample is greater than the benchmark. I'd think a better way would be testing whether the sample is below the benchmark of 30%, but the observed proportion of .323 is above .30, so the p-value for that test would be about .825.