Sample Mean Hypothesis Testing

Imagine that you have a portfolio of stocks. You want to compare the performance of your stocks to the market as a whole. You learn that the population mean, μ, of stock returns is 7%, and that the population standard deviation, σ, is 2%. You have several scenarios to analyze involving sample means and confidence intervals.

First, consider a scenario where you have a portfolio of 10 stocks with a sample mean return of 14%. To estimate the range within which the true population mean likely falls, we construct a 90% confidence interval. Using the known population standard deviation (σ = 2%), the formula for the confidence interval is:

CI = sample mean ± Zα/2 * (σ/√n)

For a 90% confidence level, Zα/2 ≈ 1.645. The calculation is as follows:

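CI = 14% ± 1.645 * (2% / √10) ≈ 14% ± 1.645 * 0.6325% ≈ 14% ± 1.040%

Therefore, the 90% confidence interval is approximately (12.960%, 15.040%). The whole interval lies above the market mean of 7%, so this portfolio's average return is well above the market's.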
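The interval arithmetic above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name z_interval is an illustrative choice, not something from the original problem, and returns are expressed in percentage points.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Two-sided z confidence interval for a mean when sigma is known.

    Hypothetical helper for illustration; all inputs are in percentage points.
    """
    alpha = 1.0 - confidence
    # Critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Scenario from the text: n = 10 stocks, sample mean 14%, sigma = 2%.
low, high = z_interval(14.0, 2.0, 10, 0.90)
print(f"90% CI: ({low:.3f}%, {high:.3f}%)")
```

The same function covers the other scenarios by changing the sample mean and the confidence level, for example z_interval(2.0, 2.0, 10, 0.95).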

Second, consider the probability that the sample mean equals the population mean. The sample mean is modeled as a continuous random variable, and the probability that a continuous random variable takes any single exact value is zero. The probability that the sample mean is exactly equal to the population mean is therefore zero.

Next, suppose that you observe a sample mean of 2% returns from a portfolio of 10 stocks. To estimate the population mean, you construct a 95% confidence interval. Using Zα/2 ≈ 1.96 for 95%, the calculation is:

CI = 2% ± 1.96 * (2% / √10) ≈ 2% ± 1.96 * 0.6325% ≈ 2% ± 1.240%

This yields an approximate 95% confidence interval of (0.760%, 3.240%), which lies entirely below the market mean of 7%.

Again, the probability that the sample mean exactly matches the population mean remains zero due to the properties of continuous distributions.

Lastly, suppose a sample mean of 8% is observed with the same parameters (n = 10, σ = 2%). The probability that the sample mean exactly equals the population mean of 7% is still zero. The confidence interval, however, shows whether 7% is a plausible value for the population mean. The standard error is unchanged at about 0.6325%, so the 90% interval is 8% ± 1.040%, or (6.960%, 9.040%), and the 95% interval is 8% ± 1.240%, or (6.760%, 9.240%). Both intervals contain 7%, so a sample mean of 8% is consistent with a population mean of 7%.
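The question of whether 7% is plausible given a sample mean of 8% can also be framed as a two-sided z-test. The sketch below assumes the null hypothesis μ = 7% and the known σ = 2%; the function name z_test_two_sided is hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu0, sigma, n):
    """Return the z statistic and two-sided p-value for H0: mu = mu0.

    Hypothetical helper; assumes sigma is known and the sample mean is
    approximately normal.
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Two-sided p-value: probability of a statistic at least this extreme.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Scenario from the text: sample mean 8%, population mean 7%, n = 10.
z, p_value = z_test_two_sided(8.0, 7.0, 2.0, 10)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.3f}")
```

Here z is about 1.58 and the p-value is a little above 0.11, so the null hypothesis is not rejected at either the 5% or the 10% level.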

Sample Proportion Hypothesis Testing

Now, consider a population proportion of morbidly obese individuals at 2%. To apply normal approximation methods, the sample size n must be sufficiently large to ensure that np ≥ 5 and n(1-p) ≥ 5. For p = 0.02, solving np ≥ 5 yields:

n ≥ 5 / 0.02 = 250

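The second condition, n(1-p) ≥ 5, requires only n ≥ 5 / 0.98 ≈ 5.1, so it is satisfied automatically. Thus, the sample size should be at least 250 individuals to use the normal approximation confidently.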
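The sample size rule can be written as a small function. This is a sketch under the np ≥ 5 and n(1-p) ≥ 5 rule of thumb stated above; the name min_n_for_normal_approx is hypothetical, and exact fractions are used so that floating-point rounding cannot push the ceiling up by one.

```python
from fractions import Fraction
from math import ceil


def min_n_for_normal_approx(p, threshold=5):
    """Smallest n with n*p >= threshold and n*(1-p) >= threshold.

    Hypothetical helper; p is converted to an exact fraction via its
    decimal string so that, e.g., 0.02 becomes exactly 1/50.
    """
    p_exact = Fraction(str(p))
    n_for_successes = ceil(threshold / p_exact)
    n_for_failures = ceil(threshold / (1 - p_exact))
    # Both conditions must hold, so the larger requirement binds.
    return max(n_for_successes, n_for_failures)


print(min_n_for_normal_approx(0.02))  # prints 250
```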

Suppose you collect data from a sample of 500 individuals, and the sample proportion of morbidly obese individuals is 4% (0.04). To construct a 90% confidence interval for this proportion, we use:

CI = p̂ ± Zα/2 * √[p̂(1 - p̂)/n]

Using Zα/2 ≈ 1.645 and p̂ = 0.04:

Standard error = √[0.04 * 0.96 / 500] ≈ √(0.0384 / 500) ≈ √0.0000768 ≈ 0.00876

CI = 0.04 ± 1.645 * 0.00876 ≈ 0.04 ± 0.0144

This results in a 90% confidence interval of approximately (0.0256, 0.0544), meaning the true proportion lies between about 2.56% and 5.44% with 90% confidence.
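The proportion interval can be reproduced in the same way. The sketch below implements the normal-approximation interval from the formula above, often called the Wald interval; the function name wald_interval is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, confidence):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Hypothetical helper; p_hat is the sample proportion as a fraction.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Standard error estimated from the sample proportion itself.
    standard_error = sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Scenario from the text: 500 individuals, sample proportion 4%.
low, high = wald_interval(0.04, 500, 0.90)
print(f"90% CI: ({low:.4f}, {high:.4f})")
```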

This interval does not include the population proportion of 2%: its lower bound of about 2.56% is above 2%. The observed sample proportion is therefore not consistent with a population proportion of 2% at the 10% significance level. Under the normal approximation, which treats the sample proportion as continuous, the probability that it exactly equals the population proportion of 2% is zero. Strictly speaking, the sample proportion is discrete, since it is a count divided by n, so the exact binomial probability of observing exactly 2% is positive.
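The zero-probability statement belongs to the continuous normal approximation. As a rough illustration of the discrete case, the sketch below computes the exact binomial probability that a sample of 500 contains exactly 10 morbidly obese individuals, that is, a sample proportion of exactly 2%, assuming the population proportion is 2%. The function name is hypothetical.

```python
from math import comb


def prob_exact_count(p, n, count):
    """Binomial probability of observing exactly `count` successes in n trials.

    Hypothetical helper for illustration.
    """
    return comb(n, count) * p**count * (1.0 - p) ** (n - count)


# A sample proportion of exactly 2% in n = 500 means exactly 10 individuals.
prob = prob_exact_count(0.02, 500, 10)
print(f"P(sample proportion = 2%) = {prob:.4f}")
```

The result is roughly 0.13, which is nonzero because a count can only take whole-number values.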
