Justifying Your Prediction By Writing A Confidence Argument
After you make a prediction, you are generally asked how confident you are in it. How do you state your prediction and provide evidence to support your conclusion? There are three mandatory parts to a confidence argument.
1) What is the R^2 value?
The R^2 value tells us how well the model fits our data. R^2 is a number between 0 and 1. The closer to 1, the stronger the model. Here are some basic guidelines for this class:
- If 0.7
The R^2 doesn't tell us about our prediction at all! It tells us about the model we are using to make the prediction. So it is very possible for you to use a strong model to make a horrible prediction.
2) How far into the future/past are you predicting?
You need to note exactly how many years into the future from your last data point you are looking (or how many years into the past from your first data point). Then you need to determine how the length of time impacts your confidence level. A "rule of thumb" is that you should be concerned about predictions more than 25 years away from your data in either direction. 25 years is about enough time for generational/societal/technological shifts to occur that will likely impact your prediction. If you feel your prediction is too far away from the data, you must state WHY: what do you think might change in that time frame that would impact your prediction?
3) Real world support.
This is often the most difficult part of the argument for students to write. You must provide some reasons for why you think the trend will continue (or reasons for why you think it will not). Answers that are NOT acceptable for this portion of the argument are:
- "It makes sense." I'm glad when things make sense to you, but you need to explain to your reader WHY/HOW they make sense. We can't see into your mind!
- "The trend has been increasing/decreasing for the last ___ years, so I bet it will continue to do so." This is faulty logic. The marriage rate in the US was going strong for many, many decades, until the '60s came, with free love, and all of a sudden the trend changed.
4) "Overall…"
At the end of your argument you should always include a final sentence that sums up your overall confidence level. It's helpful to start the sentence with "Overall…," but you can write it however you wish.
Examples:
Let's look at some of the predictions made on the previous page for women's world records in the mile run:
Prediction #1: What will the women's world record be in 1999?
We predicted it would be 246 seconds. How much confidence do you have in your prediction?
(Part 1 of the confidence argument is in blue. Part 2 of the confidence argument is in red. Part 3 of the confidence argument is in green. In practice you will probably end up combining parts 2 and 3 on many occasions. The overall conclusion sentence is in orange.)
My R^2 value is .9342, which shows the model is a good fit for my data. I am only looking three years into the future from my data, which isn't too long. I don't think that women have reached the physical limits of how fast the female body can run a mile, so this is certainly possible. I do notice that the data points seem to be leveling off slightly toward the end of the graph. This may mean that my prediction is a little lower than what the actual record will be in 1999. Overall, I have moderately strong confidence in this prediction.
Prediction #2: When will the women's world record be 3 minutes?
We predicted that the women's world record will be 3 minutes in the year 2070. How much confidence do you have in your prediction?
(Part 1 of the confidence argument is in blue. Part 2 of the confidence argument is in red. Part 3 of the confidence argument is in green. In practice you will probably end up combining parts 2 and 3 on many occasions. The overall conclusion sentence is in orange.)
My R^2 value is .9342, which shows the model is a good fit for my data. I am predicting 74 years into the future from my data, which is a pretty long time. I don't think it is physically possible for the female body to run a mile in three minutes. I think that before 2070 we will have reached the physical limits of the female body, making this prediction impossible. Therefore, I have no confidence in this prediction.
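Part 2 of the argument comes down to simple subtraction, which can be sketched in Python. This is a minimal illustration, not part of the original reading: the function names are made up for this example, and the last data year of 1996 is inferred from the statements that 1999 is three years and 2070 is 74 years past the data.

```python
# Rule of thumb from the reading: be concerned about predictions more than
# 25 years away from the data in either direction.
RULE_OF_THUMB_YEARS = 25


def years_outside_data(target_year, last_data_year, first_data_year=None):
    """Return how many years target_year lies beyond the data (0 if inside).

    first_data_year is optional; supply it when predicting into the past.
    """
    if target_year > last_data_year:
        return target_year - last_data_year
    if first_data_year is not None and target_year < first_data_year:
        return first_data_year - target_year
    return 0


def is_far_from_data(target_year, last_data_year, first_data_year=None):
    """True when the prediction is more than 25 years away from the data."""
    distance = years_outside_data(target_year, last_data_year, first_data_year)
    return distance > RULE_OF_THUMB_YEARS


# Assumption: the mile-run data ends in 1996 (inferred, not stated outright).
LAST_DATA_YEAR = 1996

for year in (1999, 2070):
    gap = years_outside_data(year, LAST_DATA_YEAR)
    flag = "too far, explain WHY" if is_far_from_data(year, LAST_DATA_YEAR) else "close to the data"
    print(f"{year}: {gap} years past the data ({flag})")
```

The check only tells you whether to be concerned. You still have to say what might change over that span.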
Note: As a teacher, I am not looking for every student to have the same answer. I am looking for every student to have a well-constructed argument. To give you an example of how student answers can differ but still be correct, here is an alternative answer to the previous question:
My R^2 value is .9342, which shows the model is a good fit for my data. I am predicting 74 years into the future from my data, which is a pretty long time. Science is finding more and more ways to expand what the human body can do. I think we are just at the beginning of a performance-enhancing drug revolution that someday may allow our bodies to do things we never thought were possible. While I don't think it's very likely that a woman will be able to run a 3-minute mile in 2070, I think that it is possible. Therefore, I have some small confidence in this prediction.
[Table: US Rate of Divorce. Source: US National Center for Health Statistics. Columns: Year, Rate per 1,000 population. Data not shown.]
[Table: Percent of Children Below Poverty Level. Source: Bureau of Labor Statistics. Columns: Year, Percentage. Data not shown.]
Assignment Three
Part 1: Children In Poverty
1. Open the file ChildrenBelowPovertyLevel.xls containing data from the Census Bureau.
a. Make an X-Y scatter plot of the data, including the trendline and the R-squared value. Note that Excel will, in most cases, put a legend on your graph by default. When there is only one data series (as here), you don't need a legend, and it really should be removed. The chart should include all the details discussed in the reading on graphs. Paste this chart in your Word document. (3 points)
b. Predict what percentage of children will be below the poverty level in the year 2012 using the trendline equation. Type your result in your Word document. (1 point)
c. How much confidence do you have in this prediction? In 3-4 sentences, write an argument that either supports or does not support your prediction of the percentage of children below the poverty level in 2012. (Important: Use the language you learned in the reading for this week. There are three major components you must include in your argument to receive full credit.) (3 points)
d. Use the regression equation (the equation on the graph) to predict when 100% of children in the US will be below the poverty level. Show your work and type your answer into your Word document. (As long as you show how you set up the problem, that is enough. You do not need to show every step you used when solving.)
e. How much confidence do you have in this prediction?
f. Predict the percentage of children below the poverty level in the year 2016 using the trendline equation. Type your result in your Word document.
g. The actual percentage of children below the poverty level in the year 2012 was 21.3%. In 2016 that percentage fell to 17.6%. In your analysis, compare these facts to your previous predictions. Write a thoughtful analysis of the usefulness of linear modeling, including when and how it is most effective and cautions to consider. Your analysis should be a comprehensive evaluation; simple summaries will not suffice. (3 points)
Part 2: Divorce in the US (13 Points - each part is worth 2 points unless noted)
2. Open the file DivorceRate.xls with data on US divorce rates from 1960 to 2015.
a. Make an X-Y scatter graph of years and divorce rates, and add a trendline with an equation, the R^2 value, and all graph components. Paste this into your Word document. (3 points)
b. Use the trendline to predict the divorce rate in 2018. Type your answer. (1 point)
c. Write a confidence argument for your prediction of the divorce rate in 2018, including the three key components discussed earlier. (3 points)
d. Use the regression equation to predict when the divorce rate will reach zero. Show your setup and answer in your Word document. (1 point)
e. Write a confidence argument for this prediction. (1 point)
f. Use the regression equation to predict the divorce rate in 2050. Type your answer, including units. (1 point)
g. Write a confidence argument for this prediction. (1 point)
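The two kinds of questions in the assignment (what is the value in a given year, and in what year is a given value reached) both use the trendline equation y = slope * x + intercept. The Python sketch below shows the setup for each. The function names are illustrative, and the slope and intercept are hypothetical placeholders, not values from either spreadsheet; read your own coefficients off your chart.

```python
def predict_value(year, slope, intercept):
    """Plug a year into the trendline: y = slope * year + intercept."""
    return slope * year + intercept


def year_reaching(target, slope, intercept):
    """Solve target = slope * year + intercept for the year."""
    if slope == 0:
        raise ValueError("a flat trendline never reaches a different value")
    return (target - intercept) / slope


# Hypothetical coefficients for illustration only (NOT from the data files).
slope = 0.25
intercept = -480.0

print(predict_value(2012, slope, intercept))  # value predicted for 2012
print(year_reaching(100, slope, intercept))   # year the trendline hits 100
```

Solving for the year is just the first function run backwards: subtract the intercept from the target, then divide by the slope. That is all the setup the "show your work" parts ask for.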
Paper for the Above Instructions
The task of predicting future trends based on existing data involves careful analysis of the statistical models used and an understanding of the broader context influencing these trends. Specifically, it requires evaluating how well the model fits historical data, considering the time span of predictions, seeking real-world supporting evidence, and assessing the overall confidence in these predictions. This comprehensive approach ensures that predictions are not only statistically sound but also contextually plausible.
Firstly, the R-squared (R^2) value is a key indicator of the model's goodness of fit. R^2 measures the proportion of variance in the dependent variable that is predictable from the independent variable. An R^2 close to 1 suggests a strong fit, meaning the model explains most of the variation in the data. For instance, in predicting the percentage of children below the poverty level, an R^2 of 0.85 would indicate a good fit; however, it would not guarantee prediction accuracy. Conversely, a low R^2 (e.g., below 0.4) suggests a weak fit, and relying on such a model may lead to unreliable predictions. It is crucial to interpret R^2 alongside other factors, including residual analysis and contextual understanding, because a strong model fit does not necessarily translate to precise predictions.
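To make the "proportion of variance explained" idea concrete, here is a minimal Python sketch that fits a least-squares line and computes R^2 as 1 minus the ratio of residual variation to total variation. The function names and the toy data are assumptions made for illustration; they do not come from the assignment files.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """R^2 = 1 - SS_res / SS_tot for the line y = slope * x + intercept."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up toy data, for illustration only.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

slope, intercept = linear_fit(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"R^2={r_squared(xs, ys, slope, intercept):.4f}")
```

Note that nothing in this calculation looks at the year being predicted, which is why a high R^2 describes the model and not any particular forecast.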
Secondly, the temporal scope of the prediction impacts its reliability. Predictions closer to the existing data range tend to be more accurate. A general rule suggests that forecasts beyond 25 years can involve significant uncertainty, due to societal, technological, or policy changes that are difficult to anticipate. When predicting, for example, the year in which 100% of children might be below the poverty line, the prediction must be supported by plausible assumptions about socio-economic trends. If the forecast is 50 years into the future, the confidence diminishes significantly due to the increased likelihood of unforeseen changes.
Thirdly, real-world support is crucial for validating the trend continuation hypothesis. Predictions solely based on statistical models without context might be misleading. For example, an increasing trend in divorce rates might plateau or decline due to policy interventions or cultural shifts. Therefore, providing plausible reasons for trend continuance or change is essential. This might include economic forecasts, policy initiatives, or technological developments that could alter the trajectory. A well-argued prediction considers both the data and the broader societal forces influencing the trend.
Finally, integrating these components provides an overall confidence assessment. For instance, if the R^2 value is high, the prediction is made over a short time span, and there is supporting evidence suggesting trend persistence, confidence may be high. Conversely, a low R^2, a long forecast period, or a lack of supporting evidence points to weaker confidence. Concluding with a synthesis of these factors offers a comprehensive view of the prediction's reliability, guiding decision-making and further research.