## Question 634:

## Answer:

If we assume the 3497 inspections carried out in the quarter represent a sample of the total number of drains, we can estimate the percentage of obstructed drains, put a 95% confidence interval around that percentage, and test whether there is sufficient evidence that more than 3% of drains are obstructed.

By adding up all Ratings from 5 through 3 we get 3228 out of 3497 drains that are not obstructed or 92.3%. All ratings 2 and below get us the number obstructed or 269/3497 = 7.69% obstructed.

It is possible that the sample contains more obstructed drains than the network as a whole purely by chance. To test this we can conduct a 1-proportion test against the test proportion of 3%.

Since the sample is reasonably large we can use the normal approximation to the binomial. We are testing whether the observed proportion 269/3497 = .0769 is greater than .03 (3% of 3497 would be about 105 drains) while taking into account chance fluctuations.

- The NULL hypothesis is that the proportion of drains blocked is less than or equal to 3%. The alternative hypothesis is that the proportion is above 3%.
- The standard rule of thumb is that you're usually ok using the normal approximation if n*p > 5 and n*q > 5, where q is just 1-p. With the hypothesized 3% that would be about 105 and 3392; with the observed counts it would be 269 and 3228. Either way we're fine.
- We need to get a z-score out of the observed proportion. For continuous data we usually have the mean and standard deviation. Here we only have the count (269) and no standard deviation. Fortunately, for a binomial count the standard deviation is equal to the square root of n*p*q. For this data that would be SQRT( 3497*.0769*.923 ) = a standard deviation of 15.76 drains.
- Our test statistic will be z, which is the difference between the actual proportion and the hypothesized proportion, divided by the standard error under the NULL hypothesis. That standard error is the square root of (hypothesized proportion * (1 - hypothesized proportion)) / sample size: SQRT( (.03*.97) / 3497 ) = .00288.
- Now we compute the z-score with the proportions: (.0769 - .03)/.00288 = a z of 16.266 (using unrounded values). This z-value is extremely large and won't be found in a normal table, meaning the result is significant at less than .00001.
- So, if the true proportion were 3%, the probability of seeing 269 or more blocked drains out of 3497 by chance alone is less than .001%, which is very remote.
- We would reject the NULL hypothesis and conclude the required level of service is not achieved.
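The steps above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the function name `one_proportion_z_test` is made up for illustration, and the one-sided p-value comes from the normal approximation described above rather than an exact binomial test.

```python
import math

def one_proportion_z_test(blocked, n, p0):
    """One-sided (upper-tail) z-test for a single proportion.

    blocked: number of obstructed drains observed
    n:       number of drains inspected
    p0:      hypothesized proportion under the NULL (here 0.03)
    """
    p_hat = blocked / n
    # Standard error uses the hypothesized proportion, not the observed one
    se0 = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    # Upper-tail area of the standard normal: P(Z >= z)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, se0, z, p_value

p_hat, se0, z, p_value = one_proportion_z_test(269, 3497, 0.03)
print(f"observed proportion = {p_hat:.4f}")
print(f"standard error      = {se0:.5f}")
print(f"z                   = {z:.3f}")
print(f"one-sided p-value   = {p_value:.3g}")
```

The p-value printed here is far smaller than anything a printed normal table lists, which matches the conclusion in the last bullet.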

Finally, we can generate a confidence interval around the observed proportion using a similar procedure as above. For a 95% confidence interval, the critical z-value is 1.96.

- The margin of error is equal to this value times the standard error (SE) of the observed proportion.
- The SE is calculated as the square root of (observed proportion * (1 - observed proportion)) divided by the sample size, which is SQRT( (.0769 * (1 - .0769)) / 3497 ) = .0045061.
- Multiplying this by the critical z gives 1.96*.0045061 = a margin of .0088318.
- We now add and subtract this from the observed proportion to get a 95% confidence interval between .068 and .0857, or we can be 95% confident the total proportion of obstructed inlet pits is between 6.8% and 8.57%. You can confirm this result by using a confidence interval around a proportion calculator.
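As a rough check on the interval above, here is a small Python sketch of the same normal-approximation (Wald) calculation. The function name `proportion_confidence_interval` is hypothetical, and 1.96 is the critical value already quoted for 95% confidence.

```python
import math

def proportion_confidence_interval(blocked, n, z_crit=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_obs = blocked / n
    # Standard error uses the observed proportion this time
    se = math.sqrt(p_obs * (1 - p_obs) / n)
    margin = z_crit * se
    return p_obs - margin, p_obs + margin, se, margin

lower, upper, se, margin = proportion_confidence_interval(269, 3497)
print(f"standard error  = {se:.7f}")
print(f"margin of error = {margin:.7f}")
print(f"95% CI          = {lower:.4f} to {upper:.4f}")
```

Small differences in the last decimal place compared with the hand calculation come from rounding the observed proportion to .0769 before adding the margin.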

Because the whole interval lies above 3%, this also confirms that there is an unacceptably high level of obstructed drainage pits in the network.