Linear Modeling of NYC MTA Transit Fare Over Time to Predict
In this activity you will investigate the rising cost of the MTA transit fare over a period of time in New York City. The goal is to use data to develop a simple mathematical model which can be used to make reasonable predictions. You will incorporate algebraic skills such as graphing, rate of change, and linear functions to complete this activity.
Steps:
Introduction: write a brief introduction of the goals of this activity, including what data you will analyze and what a simple linear model can reveal about fare trends.
Scatter Plot: use the provided data to obtain a scatter plot of time vs. price and describe the trend. Discuss whether the trend appears linear or if there are indications of nonlinearity or irregularities in the data.
Defining your variables: identify your independent and dependent variables (time in years and cost in dollars). State them clearly with appropriate symbols (for example, t for time and P for price).
Average Rate of Change: compute the average rate of change per year of the cost over each time period and discuss whether the values suggest a linear trend.
Linear Modeling: assuming the trend is linear, generate a linear model. For ease of calculation, rescale the time values for 2009 through 2015 with 2009 as year 0. Using two data points from the rescaled table, develop a linear model P = a t + b. For example, using (0, 2.25) and (4, 2.75) gives P = 0.125 t + 2.25. Predict the price in 2021 by substituting t = 12. According to this model, the price is about $3.75 in 2021.
Using Excel: obtain a scatter plot with the best-fit line; generate the linear model in Excel; use it to predict the price in 2021.
Reflection: what have you learned from this activity? What challenges did you face and how did you overcome them? What mathematical skills and knowledge did you use to complete this assignment? Did you need to look back at any prior mathematical skills? Did you seek assistance from anyone and was it helpful? Compare the predictions from the two models and discuss whether your computed value is reasonable in the context of current fare trends. Discuss potential pitfalls with this modeling approach and reflect on how this exercise shapes your view of mathematics.
Addendum: Excel instructions for creating a scatter plot and regression line. Follow the steps to input data, generate a scatter plot, and add a regression line with the display of the equation on the chart.
Paper For Above Instructions
Introduction and objective. This inquiry-based, problem-solving exercise on linear modeling of NYC MTA transit fares uses empirical data to build a simple, interpretable model that forecasts fare changes over time. The central aim is not to produce a perfect forecast but to practice a process: collecting time-series data, visualizing relationships, evaluating linearity, and deriving a basic predictive equation. This approach aligns with foundational practices in statistics and data modeling, emphasizing transparency, explicit assumptions, and critical reflection (James et al., 2013; Hastie et al., 2009).
Data visualization and trend assessment. A scatter plot of time versus fare provides an initial visual assessment of the relationship. If the points broadly align along a straight line, a linear model is a reasonable first approximation. Deviations from linearity or structured patterns (e.g., curvature, or abrupt shifts corresponding to policy changes) suggest that a simple linear model may be inadequate, or that a transformation or piecewise segments may be needed (James et al., 2013). In the MTA case, historical fare data typically show gradual increases with occasional step changes due to policy decisions. The goal is to interpret the plot, not to force a linear fit where another form is more appropriate (Hastie et al., 2009).
Variables and scaling. The independent variable is time, measured in years. The dependent variable is price, measured in dollars. A common practice is to center or scale time to simplify interpretation and numerical stability, such as setting 2009 as t = 0. This rescaling does not change the underlying relationship but makes the slope interpretation more intuitive for predictions within the chosen window (Montgomery et al., 2012).
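The rescaling described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original activity; the names BASE_YEAR and rescale_year are hypothetical helpers introduced here.

```python
# Assumption: 2009 is the base year (t = 0), as stated in the activity.
BASE_YEAR = 2009


def rescale_year(year, base_year=BASE_YEAR):
    """Return the number of years elapsed since base_year."""
    return year - base_year


# The activity rescales the years 2009 through 2015.
years = list(range(2009, 2016))
t_values = [rescale_year(y) for y in years]

print(t_values)            # [0, 1, 2, 3, 4, 5, 6]
print(rescale_year(2021))  # 12
```

Shifting the time axis this way leaves the slope unchanged; only the intercept moves, so that it equals the modeled price in the base year.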
Average rate of change and linearity assessment. The average rate of change (ARC) per year is computed as ΔP/Δt over successive intervals. If ARCs are relatively stable across intervals, a linear model is plausible. If ARCs show systematic variation, a nonlinear model or piecewise segments may be more appropriate. Linear models assume constant marginal change in price per unit time, which is a reasonable first approximation for many public transit fare histories, especially when data points are sparse or irregular (Weisberg, 2005).
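As a rough sketch of the ARC calculation, the Python below computes ΔP/Δt for each pair of consecutive points. The function name average_rates_of_change is hypothetical, and the only data used are the two rescaled points quoted in the activity, (0, 2.25) and (4, 2.75); the remaining rows of the activity's table would need to be added to the list for a full comparison across intervals.

```python
def average_rates_of_change(points):
    """Return delta P / delta t for each pair of consecutive (t, P) points."""
    rates = []
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        if t1 == t0:
            raise ValueError("consecutive points must have different t values")
        rates.append((p1 - p0) / (t1 - t0))
    return rates


# Only the two points quoted in the activity; extend with the full table.
points = [(0, 2.25), (4, 2.75)]
rates = average_rates_of_change(points)

print(rates)  # [0.125]
```

With more than two points, rates that stay close to one another support a linear model, while rates that drift steadily up or down point toward curvature or piecewise behavior.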
Linear modeling and estimation. Assuming linearity, a simple model P = a t + b is fitted. In the illustration, with 2009 rescaled to t = 0, the two chosen data points are (0, 2.25) and (4, 2.75). The slope a is computed as (2.75 - 2.25) / (4 - 0) = 0.50 / 4 = 0.125 dollars per year. Because the first point lies at t = 0, the intercept is simply its price, b = 2.25. The resulting linear model is P = 0.125 t + 2.25, where t is years since 2009. For 2021 (twelve years after 2009), this model predicts P = 0.125 × 12 + 2.25 = 3.75 dollars. While simplistic, this example demonstrates how to derive a linear relationship from a small, rescaled dataset and how to interpret the slope as the annual fare increase (Montgomery et al., 2012; James et al., 2013).
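The two-point calculation above can be checked with a short Python sketch. The helper line_through is a hypothetical name introduced for illustration; the points and the 2021 target year come from the activity.

```python
def line_through(p1, p2):
    """Return slope a and intercept b of the line P = a*t + b through two points."""
    (t1, y1), (t2, y2) = p1, p2
    if t1 == t2:
        raise ValueError("the two points must have different t values")
    a = (y2 - y1) / (t2 - t1)
    b = y1 - a * t1
    return a, b


# Rescaled points from the activity: t = 0 is 2009, t = 4 is 2013.
a, b = line_through((0, 2.25), (4, 2.75))

# 2021 corresponds to t = 2021 - 2009 = 12.
predicted_2021 = a * (2021 - 2009) + b

print(a, b)            # 0.125 2.25
print(predicted_2021)  # 3.75
```

Choosing a different pair of points from the table would generally give a different slope and intercept, which is one reason to compare this result with a fit that uses all the data.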
Excel application and model comparison. Using Excel, one can input the same data and generate a scatter plot with a least-squares regression line. The equation displayed on the chart is the fitted line and provides an alternative estimate for P in 2021. A regression-based approach, as described in many statistical texts, can produce a different slope and intercept from the two-point method because it uses every data point rather than just two (Microsoft, 2023). Comparing the two models (a line through a single rescaled pair of points versus a regression line across all points) highlights a key lesson: different modeling choices yield different forecasts, underscoring the importance of examining the assumptions behind each method (James et al., 2013; Hyndman & Athanasopoulos, 2018).
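For readers who want to see what a least-squares line involves outside of Excel, the following Python sketch implements the standard ordinary least-squares formulas for a single predictor. It is an illustration under stated assumptions, not a reproduction of Excel's internals; the function name least_squares_line is hypothetical. Since the full fare table is not reproduced here, the example runs only on the two points quoted in the activity, where the fitted line necessarily coincides with the two-point model.

```python
def least_squares_line(ts, ps):
    """Return slope a and intercept b minimizing the sum of squared errors of P = a*t + b."""
    n = len(ts)
    if n != len(ps) or n < 2:
        raise ValueError("need at least two (t, P) pairs of equal length")
    t_mean = sum(ts) / n
    p_mean = sum(ps) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("t values must not all be identical")
    sxy = sum((t - t_mean) * (p - p_mean) for t, p in zip(ts, ps))
    a = sxy / sxx
    b = p_mean - a * t_mean
    return a, b


# Placeholder input: only the two points quoted in the activity.
# Replace these lists with the full rescaled table to mirror the Excel fit.
ts = [0, 4]
ps = [2.25, 2.75]

a, b = least_squares_line(ts, ps)
print(a, b)        # 0.125 2.25
print(a * 12 + b)  # 3.75
```

Once the complete table is supplied, the slope and intercept will usually differ from the two-point values, and that difference is exactly what the model comparison in this step asks students to discuss.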
Limitations and interpretation. A linear model assumes constant incremental increases over time, which may not reflect real-world policy changes, inflation, or price elasticity. Fare data for urban transit systems are influenced by political decisions, subsidies, and inflation, which can introduce nonlinearity or regime shifts. As such, predictions should be viewed as conditional on historical patterns and the chosen modeling frame, not as precise forecasts. The exercise also reinforces mathematical thinking: recognizing when a model is appropriate, understanding its parameters, and evaluating its predictive usefulness (Wooldridge, 2019; Kutner et al., 2005).
Reflection and learning outcomes. Students typically report increased familiarity with plotting data, interpreting scatter plots, and translating a visual relationship into a formula. They also learn to articulate the assumptions of linear modeling and to anticipate potential pitfalls, such as overreliance on a single data point or ignoring data quality and context. The process emphasizes critical thinking about mathematics as a tool for explanation and prediction, consistent with the goals of inquiry-based learning (Hastie et al., 2009; Hyndman & Athanasopoulos, 2018).
Conclusion. The activity demonstrates a structured approach to building a simple predictive model for a real-world quantity, the MTA fare, while illustrating core statistical principles: data visualization, variable definition, assessment of linearity, model construction, and interpretation. The exercise also shows the value of comparing methods and recognizing the limits of any single model in the presence of real-world complexity (James et al., 2013; Montgomery et al., 2012).
References
- James, G., Witten, D., Hastie, T., Tibshirani, R. An Introduction to Statistical Learning with Applications in R. Springer, 2013.
- Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer, 2009.
- Montgomery, D. C., Peck, E. A., Vining, G. G. Introduction to Linear Regression Analysis. 5th ed. Wiley, 2012.
- Weisberg, S. Applied Linear Regression. 3rd ed. Wiley, 2005.
- Kutner, M. H., Nachtsheim, C. J., Neter, J., Li, W. Applied Linear Statistical Models. 5th ed. McGraw-Hill, 2005.
- Wooldridge, J. M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019.
- Hyndman, R. J., Athanasopoulos, G. Forecasting: Principles and Practice. OTexts, 2018. (Available online at https://otexts.com/fpp2/)
- Metropolitan Transportation Authority (MTA). Fares information. https://new.mta.info/fares
- Metropolitan Transportation Authority (MTA). History of fares and passes. https://new.mta.info/fares/history
- Microsoft. Regression analysis in Excel (Analysis ToolPak). Microsoft Office Support. https://support.microsoft.com