Consider Once Again the Coffee-Tea Example Presented in Example 10.9

Consider once again the coffee-tea example presented in Example 10.9. The following two tables are the same as the table in Example 10.9, except that each entry has been divided by 10 (left table) or multiplied by 10 (right table).

Table 10.7. Beverage preferences among a group of 100 people (left) and 10,000 people (right).

a. Compute the p-value of the observed support count for each table, i.e., for 15 and 1500. What pattern do you observe as the sample size increases?

The p-value for the smaller sample (100 people) is 0.5319, indicating that the observed association is not statistically significant. Conversely, the p-value for the larger sample (10,000 people) is 4.104 × 10⁻¹⁰, a very small value signifying strong statistical significance. This stark contrast highlights the pattern: as the sample size increases, the p-value shrinks dramatically, so the same underlying proportions become statistically significant even though the effect size is unchanged (Kirk, 2016).

In the Excel calculations, an expected table is generated for each observed table in order to compute the p-value with the chi-square test. For the 100-person table, the marginal totals are 80 coffee drinkers and 20 tea drinkers out of 100 people, so the expected counts are 16, 64, 4, and 16. The CHITEST function in Excel then computes the p-value from the observed versus expected counts. The decreasing trend in the p-value with increasing sample size demonstrates that larger samples afford greater power to detect statistically significant associations (Kass, 2008).
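As a cross-check, the same p-values can be reproduced outside Excel. The following sketch is not part of the original assignment; it assumes the cell counts implied by the support count (15) and the marginal totals (80 coffee drinkers, 20 tea drinkers) given above, i.e., 15/65/5/15 for 100 people and 1500/6500/500/1500 for 10,000 people, and uses Pearson's chi-square test, which is what CHITEST computes:

```python
# Illustrative sketch: reproduce the two p-values with Pearson's chi-square test.
# Cell counts are inferred from the support count (15) and marginal totals above.
from scipy.stats import chi2_contingency

# rows = {coffee, no coffee}, columns = {tea, no tea}
small = [[15, 65],          # 100 people (Example 10.9 entries divided by 10)
         [5, 15]]
large = [[1500, 6500],      # 10,000 people (entries multiplied by 10)
         [500, 1500]]

for label, table in [("n = 100", small), ("n = 10,000", large)]:
    chi2, p, dof, expected = chi2_contingency(table, correction=False)
    print(f"{label}: chi-square = {chi2:.4f}, p-value = {p:.4g}")

# Expected output: p ≈ 0.532 for n = 100 and p ≈ 4.1e-10 for n = 10,000,
# matching the values reported above.
```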

b. Compute the odds ratio and interest factor for the two contingency tables presented in this problem and the original table of Example 10.9. (See Section 5.7.1 for definitions of these two measures.) What pattern do you observe?

The odds ratio (OR) compares the odds of drinking tea among coffee drinkers with the odds among non-coffee drinkers, OR = (f11 × f00)/(f10 × f01). Using the counts from part (a), OR = (15 × 15)/(65 × 5) ≈ 0.69 for the 100-person table, and exactly the same value for the original table and the 10,000-person table, because multiplying every cell by a constant leaves the ratio unchanged. The same holds for the interest factor, I = s(Coffee, Tea)/(s(Coffee) × s(Tea)) = 0.15/(0.80 × 0.20) ≈ 0.94, which is identical for all three tables (Bland & Altman, 2000). The pattern is that these effect-size measures do not change with sample size: they capture the strength of the association, which here is weak (both values are close to, and slightly below, 1), no matter how many people are surveyed.
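A short sketch (illustrative, not part of the original text) makes this invariance concrete; it assumes the same inferred cell counts as above:

```python
# Illustrative sketch: odds ratio and interest factor for a 2x2 table with
# counts f11, f10, f01, f00 (rows = {coffee, no coffee}, cols = {tea, no tea}).
def effect_sizes(f11, f10, f01, f00):
    n = f11 + f10 + f01 + f00
    odds_ratio = (f11 * f00) / (f10 * f01)
    # interest factor (lift): s(A,B) / (s(A) * s(B))
    interest = (f11 / n) / (((f11 + f10) / n) * ((f11 + f01) / n))
    return odds_ratio, interest

# 100, 1,000 (the original Example 10.9), and 10,000 people: the same table scaled.
for scale in (1, 10, 100):
    print(100 * scale, effect_sizes(15 * scale, 65 * scale, 5 * scale, 15 * scale))
# Every line prints the same pair: OR ≈ 0.692, interest factor ≈ 0.938.
```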

c. The odds ratio and interest factor are measures of effect size. Are these two effect sizes significant from a practical point of view?

From a practical perspective, the significance of effect sizes depends on domain-specific thresholds. Here, the odds ratio (≈ 0.69) and the interest factor (≈ 0.94) are both close to 1, so the association between coffee and tea is weak and arguably not important in practice, even though it becomes highly statistically significant with 10,000 respondents. More generally, in medical research a large odds ratio might indicate a meaningful effect, but if the corresponding p-value is not significant, the result might be attributable to chance; conversely, a small but statistically significant effect may lack clinical relevance. Therefore, statistical evidence (the p-value) and effect size (the odds ratio and interest factor) should be considered jointly when assessing practical significance. Large effect sizes with non-significant p-values may simply call for larger samples or more refined measures (Sullivan & Feinn, 2012).

d. What would you conclude about the relationship between p-values and effect size for this situation?

The relationship between p-values and effect sizes in this situation is straightforward but easy to misread. As the sample size increases, the p-value decreases sharply, indicating stronger statistical significance, even though the effect sizes (odds ratio and interest factor) stay exactly the same across the three tables. This illustrates that p-values are sensitive to sample size and tend to shrink with larger samples, which can lead to trivial effects being flagged as statistically significant. Effect sizes directly measure the magnitude of the association, and their practical importance should be interpreted alongside p-values to avoid misjudging the significance of findings (Higgins & Thompson, 2004).

2. Consider the different combinations of effect size and p-value applied to an experiment where we want to determine the efficacy of a new drug:

(i) effect size small, p-value small;
(ii) effect size small, p-value large;
(iii) effect size large, p-value small;
(iv) effect size large, p-value large.

Whether an effect size is small or large depends on the domain, which in this case is medical. For this problem, consider a small p-value to be less than 0.001 and a large p-value to be above 0.05. Assume that the sample size is relatively large, e.g., thousands of patients with the condition that the drug is intended to treat.

a. Which combination(s) would very likely be of interest?

In medical research, the combination of a large effect size with a small p-value (iii) is the most compelling, as it indicates both a meaningful treatment effect and statistical significance. Specifically, a large effect size suggests a substantial benefit of the drug, while a small p-value (less than 0.001) indicates that the observed effect is unlikely to be due to chance. Therefore, this combination would warrant further investigation and clinical consideration.

b. Which combination(s) would very likely not be of interest?

The combination of a small effect size with a large p-value (ii) would generally not be of interest, as it indicates weak or negligible impact along with lack of statistical significance. Similarly, the combination of a large effect size with a large p-value (iv), although indicating a potentially meaningful effect, is statistically unreliable due to the high p-value, which suggests the observed effect could be due to chance. These scenarios are less informative for decision-making in clinical contexts (Fisher, 1925).

c. If the sample size were small, would that change your answers?

Yes. Smaller sample sizes reduce statistical power, which can lead to large p-values even when the true effect is large. In that case, a large effect size might fail to reach significance simply because there is not enough data, making a combination like (iii) less likely to be observed in small samples. Conversely, small samples can produce large apparent effects purely by chance, so significant results are more likely to be spurious or exaggerated. Therefore, with smaller samples one must interpret statistical findings with caution, placing more emphasis on effect sizes and confidence intervals than on p-values alone (Button et al., 2013).

Sample Paper for the Above Instruction

The analysis of contingency tables and effect sizes plays a crucial role in understanding the significance of observed associations, especially in large datasets such as those involving beverage preferences or clinical trials. This paper explores how p-values, odds ratios, and effect sizes behave as sample sizes increase, their practical implications, and how they guide decision-making in medical research and statistical analysis.

Introduction

Statistical significance and effect size are fundamental concepts in hypothesis testing and data analysis. While p-values indicate the likelihood of observing data as extreme as the current sample under the null hypothesis, effect sizes quantify the magnitude of an observed association. Understanding their interplay, especially as sample sizes vary, is essential for accurate interpretation of research findings (Cohen, 1988).

Impact of Sample Size on P-Values

The p-value is inherently sensitive to sample size. Larger samples increase the power to detect small but consistent effects, often resulting in very small p-values even when the effect is trivial. The coffee-tea example illustrates this: with 100 respondents, the p-value was around 0.532, indicating no significant association, whereas expanding the sample to 10,000 respondents reduced the p-value to a highly significant level (approximately 4.1 × 10⁻¹⁰), even though the underlying proportions, and hence the effect size, were unchanged.
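The reason is mechanical. If every cell of the table is multiplied by a factor $k$, every expected count $E_{ij} = (\text{row total} \times \text{column total})/N$ is also multiplied by $k$, so the Pearson chi-square statistic grows linearly with the sample size (a standard derivation, not taken from the original text):

$$\chi^2_{(k)} \;=\; \sum_{i,j} \frac{(kO_{ij} - kE_{ij})^2}{kE_{ij}} \;=\; k \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \;=\; k\,\chi^2_{(1)}.$$

With the degrees of freedom fixed at 1, a statistic that grows linearly with $N$ drives the p-value toward zero even when the proportions in the table never change.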

Effect Size Measures: Odds Ratio and Interest Factor

Effect sizes such as the odds ratio and the interest factor describe the strength of an association. In the coffee-tea example, both measures are unchanged across the three tables, because they depend only on the relative proportions of the cells, not on how many people were surveyed. Unlike p-values, they are therefore not driven by sample size, although larger samples estimate them more precisely. Even so, a large odds ratio might not be practically important if the baseline risk is negligible, so effect measures must be interpreted against domain-specific thresholds to judge their practical significance (Bland & Altman, 2000).
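A one-line check (standard algebra, not taken from the original text, writing $f_{11}, f_{10}, f_{01}, f_{00}$ for the cell counts and $f_{1+}, f_{+1}$ for the marginals) shows why scaling every count by $k$ leaves both measures untouched:

$$OR_{(k)} = \frac{(kf_{11})(kf_{00})}{(kf_{10})(kf_{01})} = \frac{f_{11}f_{00}}{f_{10}f_{01}} = OR, \qquad I_{(k)} = \frac{(kN)(kf_{11})}{(kf_{1+})(kf_{+1})} = \frac{N f_{11}}{f_{1+} f_{+1}} = I.$$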

Practical Significance versus Statistical Significance

Statistical significance (a small p-value) does not necessarily imply clinical or practical importance. For instance, a large odds ratio with a non-significant p-value in a small sample might reflect a meaningful effect obscured by insufficient data. Conversely, a small effect size with a significant p-value might not have practical relevance. Therefore, integrating effect sizes with p-values provides a more comprehensive understanding of the findings' real-world implications.

Effect Sizes and P-Values in Medical Research

In clinical trials evaluating a new drug, the combination of a large effect size and a small p-value (e.g., p < 0.001) is the most persuasive outcome, whereas a small effect size paired with a large p-value (e.g., p > 0.05) likely lacks both importance and reliability. When sample sizes are small, lack of statistical power can mask true effects, leading to non-significant p-values despite large effects. This underscores the importance of adequate sample sizes in clinical research to detect meaningful effects convincingly (Fisher, 1925; Sullivan & Feinn, 2012).

Conclusion

Understanding the relationship between effect size and p-value is essential in data analysis, especially in large datasets. Proper interpretation requires considering both measures, the context of the domain, and the sample size. Small p-values and large effect sizes typically indicate strong findings, but reliance on p-values alone can be misleading, particularly with large samples. Integrating these metrics leads to more valid and practical conclusions about associations in research studies, whether in consumer preferences or clinical efficacy.

References

  • Bland, J. M., & Altman, D. G. (2000). The odds ratio. BMJ, 320(7247), 1468.
  • Button, K. S., et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376.
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Lawrence Erlbaum Associates.
  • Fisher, R. A. (1925). Statistical methods for research workers. Oliver and Boyd.
  • Higgins, J. P., & Thompson, S. G. (2004). Controlling the risk of spurious findings from meta-regression. Statistics in Medicine, 23(7), 1073–1092.
  • Kass, R. E. (2008). Statistical inference: Overview, background, and examples. The American Statistician, 62(2), 67–86.
  • Kirk, R. E. (2016). Experimental design: Procedures for the behavioral sciences. Sage Publications.
  • Sullivan, G. M., & Feinn, R. (2012). Using effect size—or why the p-value is not enough. Journal of Graduate Medical Education, 4(3), 279–282.