STAT200: Written Assignment 1 - Descriptive Statistics

STAT200: Written Assignment #1 - Descriptive Statistics Data Analysis Plan

Assignment: Prepare Descriptive Statistics Data Analysis Plan. This is a plan-only assignment; no statistics will be calculated or graphs created. The dataset is a subsample from the US Department of Labor’s Consumer Expenditure Surveys (CE) describing household composition and annual expenditures.

Steps:

Step 1: Review the dataset. The CE dataset contains four socioeconomic variables (SE-) and four expenditure variables (USD-). Income is a required variable. The following variables are available:

- SE-Income: annual household income, quantitative.

- SE-MaritalStatus: marital status, qualitative.

- SE-AgeHeadHousehold: age of the head of household, quantitative.

- SE-FamilySize: total number of people in the family, quantitative.

- USD-AnnualExpenditures: total annual expenditures, USD.

- USD-Housing: annual housing expenditures, USD.

- USD-Electricity: annual electricity expenditures, USD.

- USD-Water: annual water expenditures, USD.

Step 2: Develop descriptive statistics data analysis plan.

Task 1: Develop a scenario. Imagine that you are the head of a household and have to determine a household budget plan based on the data available from the dataset. For instance, you are a 35-year-old single parent with a high school diploma and one child.

Task 2: Select variables for analysis that match the scenario developed in Task 1. The dataset provides information on household consumption; there are socioeconomic variables and expenditure variables. The socioeconomic variable names start with “SE-” and the expenditure variable names start with “USD-”. All expenditures are in US dollars. Income is a required variable. Select two additional socioeconomic variables (one qualitative and one quantitative) and two expenditures for your analysis that match the scenario you developed for Task 1. For instance, using the example scenario of a 35-year-old single parent with a high school diploma and one child, you could select “income,” “education,” and “number of children” as socioeconomic variables and then pick two household expenditure items to show the distribution of costs and compare that with your income.
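The selection rule in Task 2 can be checked mechanically. The following is a minimal sketch, not part of the assignment template: the variable names and types come from Step 1, while the function name `check_selection` and the treatment of the USD- variables as quantitative are assumptions made for illustration.

```python
# Variable catalog transcribed from Step 1. Labeling the USD- variables as
# "quantitative" is an assumption (they are dollar amounts).
VARIABLES = {
    "SE-Income": "quantitative",
    "SE-MaritalStatus": "qualitative",
    "SE-AgeHeadHousehold": "quantitative",
    "SE-FamilySize": "quantitative",
    "USD-AnnualExpenditures": "quantitative",
    "USD-Housing": "quantitative",
    "USD-Electricity": "quantitative",
    "USD-Water": "quantitative",
}

REQUIRED = "SE-Income"


def check_selection(selected):
    """Return a list of problems with a Task 2 selection (empty list = valid).

    Hypothetical helper: it only encodes the rule stated in Task 2.
    """
    problems = []

    unknown = [v for v in selected if v not in VARIABLES]
    if unknown:
        problems.append(f"unknown variables: {unknown}")

    if REQUIRED not in selected:
        problems.append("SE-Income is required")

    # Exactly two additional SE- variables: one qualitative, one quantitative.
    extra_se = [
        v for v in selected
        if v in VARIABLES and v.startswith("SE-") and v != REQUIRED
    ]
    if sorted(VARIABLES[v] for v in extra_se) != ["qualitative", "quantitative"]:
        problems.append(
            "need one qualitative and one quantitative SE- variable besides income"
        )

    # Exactly two expenditure variables.
    usd = [v for v in selected if v in VARIABLES and v.startswith("USD-")]
    if len(usd) != 2:
        problems.append("need exactly two USD- expenditure variables")

    return problems


plan = [
    "SE-Income",
    "SE-MaritalStatus",
    "SE-AgeHeadHousehold",
    "USD-AnnualExpenditures",
    "USD-Housing",
]
print(check_selection(plan))  # prints []
```

A selection that omits the qualitative variable, or names only one expenditure, returns one message per violated rule, so the list can be read as a checklist before filling in Table 1.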

Task 3: Determine appropriate measures of central tendency and dispersion for the selected variables. For each quantitative variable, select at least one measure of central tendency and at least one measure of dispersion (see the table below for a list of measures). For the qualitative variable, select one measure of central tendency. When determining the measures of central tendency and dispersion, think about what is appropriate given the level of measurement and type of variable. Refer to the text and the information posted in our LEO classroom for help with this task (Note: you will use this information to provide a rationale for your choice of measures).

Measures of Central Tendency

Measures of Dispersion

Task 4: Determine an appropriate graph and/or table for each of the selected variables. Select one graph or table for each variable (see the table below for a list of graphs and tables). When determining the graphs and tables, think about what is appropriate given the level of measurement and type of variable. Refer to the text and the information posted in our LEO classroom for help with this task (Note: you will use this information to provide a rationale for your choice of graphs and/or tables).

Types of Graphs

Types of Tables

Step 3: Complete the “Assignment #1: Descriptive Statistics Data Analysis Plan Template.” Remember, you will not be conducting any statistical analysis, drawing any graphs, or compiling any tables for the first assignment. Rather, you need to wait for feedback from your instructor on this assignment and use that feedback to complete Assignment #2. Here are the main sections for this assignment (i.e., completing the plan template):

  • Identifying Information. Fill in information on name, class, instructor, and date.
  • Scenario. In this section, briefly (2-3 sentences) describe the scenario you developed in Step #2, Task 1.
  • Complete Table 1: Variables Selected for the Analysis. Enter information on the variables selected for analysis in Step #2, Task 2. For each selected variable, be sure to include its name as listed in the dataset, its description, and its variable type.
  • Reason(s) for Selecting the Variables and Expected Outcome(s): In this section, for each selected variable, please answer the following questions: Why did I choose this variable? What interests me about this variable? What do I think will be the outcome?
  • Complete Table 2. Numerical Summaries of the Selected Variables. Enter information on the selected measures of central tendency and dispersion for each selected variable. Be sure to briefly explain why you chose those measures. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
  • Complete Table 3. Type of Graphs and/or Tables for Selected Variables. Enter information on the selected graph and/or table for each selected variable. Be sure to briefly explain why you chose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.

Assignment Submission: Note that submission details have been provided in prior course communications; this plan is intended to be submitted via the designated course platform after you have completed the template.

Notes: You must include at least two figures or tables in your final plan. These must be created by you (not copied from sources). You must cite at least 10 references in your final paper, with at least five from peer-reviewed scholarly journals accessible via the UC Library. The paper should be in proper APA format and should be a minimum of 8 pages in length (double-spaced), excluding title page and references. You are not to perform statistical calculations, produce graphs, or compile tables for this assignment; the goal is planning and justification for the analyses you would conduct in Assignment #2.

Dataset description and variable information used in this assignment reflect a subsample from the US Department of Labor’s Consumer Expenditure Surveys (CE), including socioeconomic variables starting with SE- and expenditure variables starting with USD-; income is included as a required variable. The example scenario used for Task 1 is a hypothetical household profile intended to guide variable selection and analysis planning.

Paper For Above Instructions

Introduction and rationale. The Descriptive Statistics Data Analysis Plan is a planning exercise designed to help students articulate a thoughtful approach to analyzing a real-world dataset prior to performing any calculations. The CE dataset provides a compact example with both socioeconomic and expenditure dimensions, enabling a student to illustrate how to select variables that align with a plausible budget scenario and to justify decisions about central tendency, dispersion, and graphical representations. By emphasizing the alignment between scenario, variable choice, and chosen descriptive measures, the plan fosters critical thinking about data characteristics (e.g., measurement level, skewness, outliers) and the implications for summarization and visualization.

Scenario and variable selection. The scenario anchors decision-making around a representative household. For clarity and feasibility, the plan should include income as a mandatory variable and two additional socioeconomic variables (one qualitative, one quantitative) alongside two expenditures. The CE dataset contains four SE variables—SE-Income (quantitative), SE-MaritalStatus (qualitative), SE-AgeHeadHousehold (quantitative), SE-FamilySize (quantitative)—and four USD expenditures—USD-AnnualExpenditures, USD-Housing, USD-Electricity, USD-Water. Given the constraint of selecting two expenditures, reasonable choices include USD-AnnualExpenditures and USD-Housing to showcase overall spending and housing-related costs. The inclusion of SE-MaritalStatus as the qualitative variable and SE-AgeHeadHousehold as a quantitative variable supports contrasting category-based summaries with numerical descriptions of head-of-household age, thereby illustrating the different measures appropriate for mixed-variable analyses.

Measures of central tendency and dispersion. The plan must justify the chosen measures for each quantitative variable and for the qualitative variable. For income, age of the head of household, family size, annual expenditures, and housing expenditures, typical choices are the mean or median for central tendency (depending on the distribution and the presence of outliers) and the standard deviation, variance, or interquartile range for dispersion. For the qualitative variable SE-MaritalStatus, the mode, supported by a frequency distribution, is a sensible descriptor of central tendency. The rationale should reference each variable’s level of measurement (nominal for qualitative, interval/ratio for quantitative) and anticipated data characteristics (e.g., potential skewness in income, the possibility of outliers in expenditures).
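To see why skewness and outliers matter when choosing between these measures, consider a small sketch using Python's standard `statistics` module. The income figures and marital status labels below are invented for illustration and are not drawn from the CE dataset; the plan itself still reports no calculations.

```python
import statistics

# Hypothetical household incomes in USD (made up for illustration).
# One very high income stands in for the right skew expected in income data.
incomes = [32000, 35000, 38000, 41000, 45000, 250000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # stays near the typical household

stdev_income = statistics.stdev(incomes)    # sample standard deviation
q1, _, q3 = statistics.quantiles(incomes, n=4)  # quartile cut points
iqr_income = q3 - q1                        # spread of the middle half

print(mean_income)    # 73500
print(median_income)  # 39500.0

# Hypothetical category labels; the actual coding of SE-MaritalStatus
# in the dataset may differ.
marital = ["Married", "Not Married", "Married", "Married", "Not Married"]
mode_marital = statistics.mode(marital)     # most frequent category
print(mode_marital)   # Married
```

In this made-up sample the mean is almost double the median because of a single household, which is the kind of reasoning the rationale column in Table 2 can cite when pairing the median with the interquartile range for a skewed variable. For a nominal variable only the mode is meaningful, since the categories have no numeric order.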

Graphs and tables. For quantitative variables, histograms or box plots are appropriate to display distributional properties; for the qualitative variable, a bar chart (or pie chart) can illustrate category frequencies. The plan should specify one graphic or table per variable, with a brief justification for each choice based on the variable’s measurement level and the distributional considerations (e.g., skewness, modality, sample size). The template sections should be completed with descriptions and rationales rather than actual numerical results.
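The pairing of variable type with display type described above can be written down as a simple lookup, together with the frequency table that underlies a bar or pie chart. This is a rough sketch: the function names `suggest_display` and `frequency_table` are hypothetical, and the sample marital status labels are invented rather than taken from the dataset.

```python
from collections import Counter


def suggest_display(kind):
    """Map a variable type to the displays named in the plan (hypothetical helper)."""
    if kind == "qualitative":
        return ["bar chart", "pie chart"]
    if kind == "quantitative":
        return ["histogram", "box plot"]
    raise ValueError(f"unknown variable type: {kind!r}")


def frequency_table(values):
    """Return (category, count, relative frequency) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(category, n, n / total) for category, n in counts.most_common()]


print(suggest_display("quantitative"))  # ['histogram', 'box plot']

# Invented labels standing in for SE-MaritalStatus values.
sample = ["Married", "Not Married", "Married", "Married"]
for category, n, share in frequency_table(sample):
    print(f"{category:<12} {n:>3} {share:.0%}")
```

Each bar in a bar chart corresponds to one row of such a table, which is why a bar chart suits a nominal variable while a histogram, which bins a numeric scale, does not.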

Template completion and notes. The final plan should include the following sections: Identifying Information (names, class, instructor, date), Scenario (2–3 sentences), Table 1 (Variables Selected for Analysis: variable name, description, and type), Reasons for Selecting the Variables and Expected Outcome(s) (one rationale per variable), Table 2 (Numerical Summaries: measures of central tendency and dispersion with rationale), Table 3 (Type of Graphs/Tables for Selected Variables with rationale). The plan explicitly states that no calculations or graphs will be produced in Assignment #1 and that feedback from the instructor will be used to inform Assignment #2.

Conclusion. This descriptive statistics plan provides a structured framework for how to approach data summarization and visualization when working with the CE dataset. By carefully selecting a mix of qualitative and quantitative variables and pairing them with appropriate measures and graphs, students demonstrate readiness to carry out the subsequent analysis steps in Assignment #2 while maintaining a clear link between the research scenario and the analytical choices.
