Research Methods 10rm - Chi Square

Author: Ed Nelson
Department of Sociology M/S SS97
California State University, Fresno
Fresno, CA 93740
Email: [email protected]

Note to the Instructor: This is the tenth in a series of 13 exercises for an introductory research methods class. The exercise focuses on analyzing data using Chi Square as a test of significance and practicing with CROSSTABS in SDA. The data come from the 2017 Monitoring the Future Survey of high school seniors in the United States, available through the Inter-university Consortium for Political and Social Research at the University of Michigan. The exercise involves examining relationships between variables, conducting Chi Square tests, interpreting results, and understanding expected cell frequencies.

The purpose of this exercise is to introduce students to Chi Square as a statistical test of significance, particularly in the context of social survey data involving nominal or ordinal variables. Using data from the 2017 Monitoring the Future Survey, the exercise guides students through the process of analyzing relationships between variables, performing cross-tabulations, and interpreting Chi Square results. This practical approach helps students understand how to test hypotheses about populations based on sample data, accounting for sampling error and significance levels.

Students begin by familiarizing themselves with the survey data, which come from a large sample of over 12,000 high school seniors in the United States. The survey uses a complex sampling design, and the data are weighted so that the sample better represents the population. The exercise emphasizes that survey questions serve as measures of underlying concepts; the resulting variables can then be examined for relationships using cross-tabulation.

As an illustrative example, the exercise investigates the relationship between sex (the independent variable) and frequency of alcohol consumption (the dependent variable). Using SDA, students cross-tabulate sex with alcohol use, first with the original measure of alcohol consumption (v2105) and then with a dichotomous version (v2105d) that collapses alcohol use into 'never' and 'one or more times.' Because the table reports column percentages, the relative frequencies of each level of alcohol use can be compared directly between the sexes.
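The column-percentage calculation can be sketched in a few lines of Python. This is only an illustration outside SDA: the counts below are hypothetical (they are not the Monitoring the Future results), the `column_percentages` helper is a name introduced here, and the sketch assumes sex is the column variable and ignores the survey weights.

```python
# Hypothetical counts, NOT the actual Monitoring the Future results.
# Rows: dependent variable (alcohol use); columns: independent variable (sex).
counts = {
    "never": {"male": 400, "female": 450},
    "one or more times": {"male": 600, "female": 550},
}


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())).keys())
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        label: {c: 100.0 * row[c] / col_totals[c] for c in columns}
        for label, row in table.items()
    }


percentages = column_percentages(counts)
for label, row in percentages.items():
    # Each column sums to 100, so rows can be compared across the sexes.
    print(label, {c: round(p, 1) for c, p in row.items()})
```

Reading across a row then gives the comparison the exercise asks for: the percentage of males versus the percentage of females in the same category of alcohol use.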

Students are encouraged to interpret the observed differences cautiously. A small difference between groups in the sample (e.g., 2.7 percentage points) does not necessarily reflect a true difference in the population, because every sample is subject to sampling variability. This introduces the concept of sampling error, which shrinks as sample size grows. The key question is whether the observed difference is statistically significant, which leads to the discussion of the Chi Square test.
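A rough way to see why sampling error shrinks with larger samples is to compute the standard error of a difference between two proportions at several sample sizes. The sketch below assumes simple random sampling, which is a simplification (the survey itself uses a complex, weighted design), and the two proportions are hypothetical values chosen to be 2.7 percentage points apart.

```python
import math


def se_difference(p1, n1, p2, n2):
    """Standard error of (p1 - p2) under simple random sampling."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical proportions, 2.7 percentage points apart.
p_male, p_female = 0.600, 0.627

for n_per_group in (100, 1000, 6000):
    se = se_difference(p_male, n_per_group, p_female, n_per_group)
    # The same observed difference is judged against a smaller
    # standard error as the groups get larger.
    print(f"n per group = {n_per_group:5d}  standard error = {se:.4f}")
```

The same 2.7-point gap is large relative to the standard error only when the groups are big, which is why a formal test is needed rather than a judgment by eye.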

The exercise explains that the Chi Square test assesses the null hypothesis that the two variables are independent, that is, unrelated in the population. If the test yields a p-value less than 0.05, the null hypothesis is rejected, which provides evidence that the variables are related in the population. Running Chi Square in SDA produces the test statistic (Pearson Chi Square), the degrees of freedom, and the p-value. A very small p-value (e.g., less than 0.005, well below the 0.05 cutoff) indicates a statistically significant relationship and supports the research hypothesis that sex and alcohol consumption are related.
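The decision rule can be sketched in Python for a statistic and degrees of freedom read off a table such as SDA's output. The function names and the example values are assumptions introduced here for illustration; the p-value is computed from the standard closed-form expressions for the upper tail of the chi-square distribution, using only the standard library.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square statistic with integer df."""
    if statistic < 0 or df < 1:
        raise ValueError("statistic must be >= 0 and df must be >= 1")
    y = statistic / 2.0
    # Starting values: df = 2 gives exp(-y); df = 1 gives erfc(sqrt(y)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Each pass adds one term, raising the degrees of freedom by 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def decide(statistic, df, alpha=0.05):
    """Return the p-value and whether the null hypothesis is rejected."""
    p = chi_square_p_value(statistic, df)
    return p, p < alpha


# Hypothetical statistic and df, not taken from the survey.
p, reject = decide(10.0, 1)
print(f"p = {p:.4f}, reject null hypothesis: {reject}")
```

A statistic near 3.84 with one degree of freedom sits right at the 0.05 boundary, so values above it lead to rejecting the null hypothesis of independence.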

The exercise also introduces expected frequencies: the cell counts that would be expected if the null hypothesis were true. Each expected frequency is computed from the marginal totals as the row total times the column total, divided by the overall sample size. The Chi Square statistic summarizes how far the observed cell frequencies fall from these expected frequencies. The exercise notes that all expected frequencies should generally be greater than five; if some are small, categories may need to be combined to satisfy the test's assumptions.
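As a minimal sketch of that computation, the Python below derives expected frequencies from the marginal totals, sums (observed - expected)^2 / expected over the cells, and checks the smallest expected count. The observed counts are hypothetical, the function names are introduced here for illustration, and the sketch works on unweighted counts rather than the weighted data SDA uses.

```python
def expected_frequencies(observed):
    """Expected count per cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(observed):
    """Return (statistic, degrees of freedom, expected table)."""
    expected = expected_frequencies(observed)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, expected


# Hypothetical counts: rows = never / one or more times, columns = male / female.
observed = [[400, 450], [600, 550]]
statistic, df, expected = pearson_chi_square(observed)
smallest = min(min(row) for row in expected)

print("expected:", expected)
print(f"chi square = {statistic:.3f}, df = {df}")
# Rule of thumb from the exercise: expected frequencies greater than five.
print("all expected frequencies > 5:", smallest > 5)
```

If the final check fails, the usual remedy described in the exercise is to combine categories and rebuild the table before rerunning the test.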

Finally, students are tasked with applying these concepts to analyze additional relationships between pairs of variables from the survey data, using cross-tabulation, Chi Square testing, and interpretation of results. The exercise emphasizes the critical thinking involved in drawing inferences about the population based on sample data and understanding the implications of statistical significance.
