Data Analysis Project 2 For Demonstration

Using the data covered in the Health, Education, and Crime slides, generate six research questions to study (two from each category). Each question must use a different data source and have at least 30 data points. Create an Excel sheet for each question with the data set, a graph, and calculated statistical metrics including mean, median, variance, standard deviation, coefficient of variation, range, 90th percentile, quintiles, skewness, Z-score for data points, and a discrete probability distribution histogram. Construct at least three different types of graphs across the project and ensure this statistical concept is used at least once.

Develop a PowerPoint presentation containing one slide per research question, with one graph, the statistical metrics, up to three bullet points, and hyperlinks to the data source. Include an introductory slide with your name, project number, and class. Submit both Excel and PowerPoint files via Blackboard before the deadline.

Note that graphs should be derived from Excel data, and presentation quality will influence grading. Do not copy graphs from websites or replicate other students' work. The project is graded 50% on the Excel component and 50% on the PowerPoint presentation.

Paper for the Above Instructions

The purpose of this research is to explore and analyze key issues within the realms of health, education, and crime by formulating specific questions, sourcing credible data, and employing statistical methods to interpret the information. This process not only enhances understanding of these vital societal facets but also demonstrates proficiency in data analysis and presentation skills.

Introduction

The initial phase of this project involved identifying relevant, credible data sources for each category: health, education, and crime. Six research questions were formulated to investigate pertinent issues, such as trends in thyroid cancer rates, disparities in high school graduation rates across races, and fluctuations in crime rates. The aim was to draw on diverse, reliable sources, such as the CDC, NCES, and FBI, and to use data sets with at least 30 observations each, as the assignment requires, so that the calculated statistics rest on a reasonable sample size.

The purpose of each question was to uncover patterns or relationships, supported by visual and numerical data analyses. The questions selected reflect current societal concerns and policies, aiming to provide insights for stakeholders and policymakers.

Research Questions and Data Sources

  1. Health: Has the prevalence of thyroid cancer increased since 2000? Data sourced from the CDC National Cancer Database.
  2. Health: What is the average blood pressure among adults in the United States over the last decade? Data from the National Health and Nutrition Examination Survey (NHANES).
  3. Education: What is the distribution of high school graduation rates across different races in California? Data from the California Department of Education.
  4. Education: How do standardized mathematics test scores compare across states? Data from the National Assessment of Educational Progress (NAEP).
  5. Crime: What are the trends in violent crime rates in major U.S. cities over the past five years? Data from FBI Uniform Crime Reports.
  6. Crime: What is the distribution of arrest rates for property crimes across different states? Data from the Bureau of Justice Statistics.

Each data set was carefully selected to meet the minimum data point requirement, ensuring robust statistical analysis and meaningful visual representation.

Data Analysis and Visualization

For each research question, the data were compiled in a separate Excel spreadsheet. Graphs such as histograms, box plots, and scatter plots were created to visualize distributions, correlations, and trends. The required statistical metrics (mean, median, variance, standard deviation, coefficient of variation, range, 90th percentile, quintiles, skewness, Z-scores, and a discrete probability distribution histogram) were calculated using Excel functions, providing quantitative measures for interpreting each data set.
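The project itself computes these metrics in Excel; the Python sketch below is only an illustrative cross-check of the same calculations. The function names `describe` and `discrete_distribution`, the bin width, and the sample values are hypothetical and do not come from the project data. The Excel functions named in the comments (STDEV.S, PERCENTILE.INC, SKEW) are assumed equivalents, since the text does not say which functions were used.

```python
import math
import statistics as st
from collections import Counter


def describe(values):
    """Return the descriptive metrics listed above for one data set."""
    n = len(values)
    mean = st.fmean(values)
    stdev = st.stdev(values)  # sample standard deviation (Excel: STDEV.S)
    # "inclusive" quantiles interpolate the same way as Excel's PERCENTILE.INC.
    deciles = st.quantiles(values, n=10, method="inclusive")
    quintiles = st.quantiles(values, n=5, method="inclusive")
    # Adjusted Fisher-Pearson skewness, the formula behind Excel's SKEW.
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / stdev) ** 3 for x in values)
    return {
        "mean": mean,
        "median": st.median(values),
        "variance": st.variance(values),  # sample variance
        "stdev": stdev,
        "cv": stdev / mean,  # coefficient of variation
        "range": max(values) - min(values),
        "p90": deciles[8],  # ninth decile cut point = 90th percentile
        "quintiles": quintiles,  # four cut points: 20%, 40%, 60%, 80%
        "skewness": skew,
    }


def discrete_distribution(values, bin_width):
    """Map each bin's lower edge to the share of observations falling in it."""
    counts = Counter(math.floor(x / bin_width) * bin_width for x in values)
    n = len(values)
    return {lower: counts[lower] / n for lower in sorted(counts)}


# Synthetic placeholder data with 30 observations (not real project data).
sample = [float(v) for v in range(1, 31)]
for name, value in describe(sample).items():
    print(name, value)
print(discrete_distribution(sample, bin_width=10))
```

The probabilities returned by `discrete_distribution` sum to 1, which is the property the discrete probability distribution histogram is meant to display.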

For example, in analyzing thyroid cancer data, the mean and median indicated central tendencies, while skewness revealed distribution asymmetry. Z-scores identified outliers, and quintiles highlighted distribution spread. Similarly, trend analyses in crime data employed line graphs, while education disparities were illustrated through bar charts, aiding in comparative visual assessments.
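As a rough illustration of how Z-scores can flag outliers, the sketch below standardizes each value as z = (x - mean) / standard deviation and reports values whose absolute Z-score exceeds a cutoff. The cutoff of 3, the function names, and the sample values are assumptions made for illustration; the text does not state which threshold the project used.

```python
import statistics as st


def z_scores(values):
    """Standardize each value against the sample mean and standard deviation."""
    mean = st.fmean(values)
    stdev = st.stdev(values)  # sample standard deviation
    return [(x - mean) / stdev for x in values]


def flag_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold.

    The default threshold of 3 is a common convention, assumed here.
    """
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Synthetic placeholder data: 29 identical readings and one extreme value.
sample = [10.0] * 29 + [100.0]
print(flag_outliers(sample))
```

A lower cutoff such as 2 flags more points, so the threshold is best chosen and reported alongside the results.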

Presentation and Communication

The PowerPoint slides synthesized each research question's findings visually and numerically. Each slide included a graph, key statistical measures, concise bullet points, and hyperlinks to original data sources to ensure transparency and reproducibility. The introduction slide outlined the purpose and scope of the project. Overall, the presentation aimed to communicate complex data insights clearly and effectively, enhancing understanding for the audience.

Design elements such as consistent color schemes, readable fonts, and logical organization contributed to the professionalism of the presentation. These aesthetic choices, coupled with accurate data and thorough analysis, exemplify effective communication skills essential for academic and professional contexts.

Conclusion

Completing this project reinforced the importance of rigorous data sourcing, statistical analysis, and effective presentation. It highlighted how data can illuminate societal issues and inform decision-making. Personally, the project deepened my understanding of health, education, and crime trends, showcasing how quantitative methods can uncover meaningful patterns and insights. The process of transforming raw data into comprehensible visual and numerical summaries enhanced my analytical skills and emphasized the value of meticulous research and clear communication in academic work.

References

  • Census Bureau. (2022). American Community Survey Data. https://www.census.gov/data.html
  • Centers for Disease Control and Prevention (CDC). (2022). National Cancer Database. https://www.cdc.gov/cancer
  • National Center for Education Statistics (NCES). (2022). The Condition of Education. https://nces.ed.gov
  • National Assessment of Educational Progress (NAEP). (2022). Mathematics Assessment Reports. https://nces.ed.gov/nationsreportcard
  • Bureau of Justice Statistics. (2022). Crime Data. https://bjs.ojp.gov
  • Federal Bureau of Investigation (FBI). (2022). Uniform Crime Reporting Program. https://ucr.fbi.gov
  • World Health Organization (WHO). (2023). Global Cancer Statistics. https://www.who.int
  • National Health and Nutrition Examination Survey (NHANES). (2022). Data Sets. https://www.cdc.gov/nchs/nhanes
  • Bureau of Justice Statistics. (2022). Arrest Data. https://bjs.ojp.gov
  • California Department of Education. (2022). California School Dashboard. https://data1.cde.ca.gov