Topic MAT 232 Statistical Literacy
Respond to one of the following questions in your initial post: Should you use the median or the mean to describe a data set if the data are not skewed? Are the standard deviation or the interquartile range factors? This is a friendly reminder that your week 3 DB post is due today (11:59 PM your time). For this post, choose one of the questions; whichever question you choose, make sure to provide some numerical results (make up your own example to show the mean and median, and why the mean might be misleading). Look at my week 3 slides for ideas about a misleading-mean example. Also, read the examples from page in your book. Those examples are fantastic and will guide you in writing up your post. As usual, remember that this is a discussion post: you can always re-post, post extra examples, references, corrections, etc., once you have made your initial post. Just as in a discussion, where you go back and forth, you can do that with your initial post. Your initial post should be 150 to 250 words in length.
Paper For Above instruction
When analyzing a data set, the choice between the mean and the median depends largely on the shape of the distribution. If the data are not skewed, meaning they are approximately symmetric, the mean is a reliable measure of central tendency because it uses every data point and lands near the center of the data. The median, the middle value when the data are ordered, is barely affected by extreme values, which makes it the better choice for skewed distributions.
Consider an example: a small, symmetric dataset of house prices in a neighborhood, such as \$200,000, \$220,000, \$240,000, \$260,000, and \$280,000. The mean price is \$240,000, calculated as the sum (\$200,000 + \$220,000 + \$240,000 + \$260,000 + \$280,000 = \$1,200,000) divided by 5. The median, the middle value, is also \$240,000. Here the mean and median are identical because the data are symmetric and free of outliers, so the mean gives a good representation of the typical house price.
Now, imagine a skewed dataset: house prices of \$200,000, \$220,000, \$240,000, \$260,000, and \$1,500,000. The mean becomes \$484,000 (\$2,420,000 divided by 5), which is misleading because it is pulled upward by a single outlier, the expensive house; no house in the list is priced anywhere near it. The median, at \$240,000, better represents the typical house price in this case. This example illustrates that when data are skewed, the median is the more accurate measure of central tendency, while the mean can be distorted by outliers or extreme values.
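The arithmetic in the two house-price examples can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the variable names `symmetric_prices` and `skewed_prices` are made up for illustration.

```python
from statistics import mean, median

# House prices from the two examples (variable names are hypothetical).
symmetric_prices = [200_000, 220_000, 240_000, 260_000, 280_000]
skewed_prices = [200_000, 220_000, 240_000, 260_000, 1_500_000]

for label, prices in [("symmetric", symmetric_prices), ("skewed", skewed_prices)]:
    # mean() adds every price and divides by the count;
    # median() sorts the prices and picks the middle one.
    print(f"{label}: mean = {mean(prices):,.0f}, median = {median(prices):,.0f}")
```

Replacing one price with an extreme value changes the mean but leaves the median where it was, which is the point of the comparison.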
In conclusion, for symmetrical data without significant outliers, the mean is appropriate because it considers all data points and summarizes the data effectively. When data are skewed or contain outliers, the median is preferable as it better reflects the typical value without being distorted by extreme values. Additionally, measures like the interquartile range (IQR) are more robust for understanding variability in skewed data, compared to the standard deviation, which can be sensitive to outliers.
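The same comparison can be made for spread. The sketch below computes the sample standard deviation and the IQR for both house-price lists. It assumes the "inclusive" quartile method of Python's `statistics.quantiles`; other quartile conventions can give noticeably different IQR values on a sample this small, so the result should be read as an illustration under that assumption. The helper name `iqr` and the variable names are hypothetical.

```python
from statistics import quantiles, stdev

# House prices from the two examples (variable names are hypothetical).
symmetric_prices = [200_000, 220_000, 240_000, 260_000, 280_000]
skewed_prices = [200_000, 220_000, 240_000, 260_000, 1_500_000]

def iqr(values):
    """Interquartile range, Q3 - Q1, using the 'inclusive' quartile method.

    The method choice is an assumption; with only five values,
    other conventions produce different quartiles.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

for label, prices in [("symmetric", symmetric_prices), ("skewed", skewed_prices)]:
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{label}: SD = {stdev(prices):,.0f}, IQR = {iqr(prices):,.0f}")
```

Under this quartile convention the IQR is the same for both lists, because it only looks at the middle half of the data, while the standard deviation of the skewed list is many times larger.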