Financial Modeling Assignment Question 150 You Have Been C
You have been consulted by the government to investigate the demand for Cityrail services. It is believed that demand is positively related to petrol prices and the percentage of trains arriving on time during peak periods. The following monthly data were recorded: train ticket sales, petrol price, and percentage of trains on time for various months. You are required to use Excel’s Data Analysis functions to build a predictive model, interpret coefficients, predict values, construct confidence intervals, assess statistical significance, evaluate model adequacy, analyze residuals, and test for heteroskedasticity using multiple formal tests such as Breusch-Pagan, White’s, and Goldfeld-Quandt tests.
The objective of this study is to develop a statistical model of the relationship between Cityrail ticket sales and two key factors: petrol prices and train punctuality. Using multiple linear regression, the analysis aims to show how these variables influence passenger demand, and so to inform government decision-making on transport policy.
The data were recorded monthly and comprise ticket sales, the petrol price, and the percentage of trains arriving on time during peak periods. Ticket sales serve as the dependent variable, with petrol price and punctuality as the independent variables. This setup makes it possible to estimate how fluctuations in fuel costs and service punctuality affect ridership.
The initial phase uses the Regression tool in Excel’s Data Analysis add-in to fit the multiple regression. The output includes a coefficient for each predictor along with its standard error, t statistic, and p-value. The sign of each coefficient gives the direction of the relationship: a positive coefficient means ticket sales tend to rise as that predictor rises, although it does not imply strict proportionality. Specifically, the slope coefficients (b1 for petrol prices and b2 for train punctuality) quantify the expected change in ticket sales for a one-unit change in each predictor, holding the other variable constant.
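The same least-squares fit that Excel performs can be sketched outside the spreadsheet as a cross-check. The following is a minimal illustration, assuming NumPy is available; the function name `fit_ols` and all of the monthly figures are hypothetical and are not the assignment data.

```python
import numpy as np

def fit_ols(sales, petrol, on_time):
    """Least-squares estimates (b0, b1, b2) for
    sales = b0 + b1*petrol + b2*on_time + error."""
    y = np.asarray(sales, dtype=float)
    # Design matrix: a column of ones for the intercept, then the two predictors.
    X = np.column_stack([np.ones(len(y)), petrol, on_time])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical monthly figures, invented purely for illustration.
petrol = [1.10, 1.15, 1.22, 1.18, 1.30, 1.35, 1.28, 1.40]   # $ per litre
on_time = [88, 91, 90, 93, 89, 92, 94, 91]                  # % of trains on time
sales = [402, 415, 421, 424, 430, 442, 441, 447]            # ticket sales ('000)

b0, b1, b2 = fit_ols(sales, petrol, on_time)
print(f"sales_hat = {b0:.2f} + {b1:.2f}*petrol + {b2:.2f}*on_time")
```

Run on the actual data, the three values should agree with the Coefficients column of Excel's regression output, which makes this a convenient check on the spreadsheet setup.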
Interpreting these coefficients is fundamental. A positive b1 indicates that higher petrol prices are associated with higher ticket demand, holding punctuality constant. This matches the stated expectation and is consistent with commuters shifting from private vehicles to rail when fuel is expensive. Similarly, a positive b2 indicates that improved punctuality is associated with higher ridership, emphasizing the importance of operational efficiency.
The intercept term (b0) in the regression model often lacks practical interpretability due to its hypothetical nature. It represents the expected ticket sales when both petrol prices and punctuality are zero, a scenario that is unrealistic. Therefore, while statistically necessary for the model, b0 does not have substantive real-world meaning in this context.
Forecasting ticket sales for specific scenarios involves plugging the given values into the regression equation. For instance, predicting sales for a month with petrol at $1.30/litre and a 90% punctuality rate allows for estimating demand based on the established model. Confidence intervals provide a range within which the mean expected sales are likely to fall with a specified level of certainty, here 95%. This requires calculating the standard error of the prediction at that point and applying the appropriate t-distribution multiplier.
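Written out, the point prediction and the 95% confidence interval for mean sales take the form below. The notation is introduced here for illustration: X is the design matrix (a column of ones, petrol price, punctuality), x_0 = (1, 1.30, 90)^T is the scenario of interest, n is the number of months, and s is the standard error of the regression. The value 90 assumes punctuality is recorded in percentage points; if the data store it as a proportion, 0.90 would be used instead.

```latex
\hat{y}_0 = b_0 + b_1(1.30) + b_2(90), \qquad
\hat{y}_0 \;\pm\; t_{0.025,\,n-3}\; s\,
\sqrt{\mathbf{x}_0^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_0},
\qquad
s^2 = \frac{\mathrm{SSE}}{n-3}
```

In Excel, one possible route is to obtain the critical value with T.INV.2T(0.05, n-3) and to build the quadratic form with MMULT, MINVERSE, and TRANSPOSE, since the regression output does not report it directly.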
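Assessing the statistical significance of predictors involves examining the p-value associated with each coefficient. A p-value below 0.05 indicates a statistically significant relationship at the 5% significance level. If both petrol price and punctuality have p-values below this threshold, the null hypothesis of a zero coefficient is rejected for each, and both can be regarded as statistically significant predictors of ticket sales. Statistical significance alone does not show how large or how practically important an effect is; that is read from the coefficient itself.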
The p-value is the probability, computed assuming the null hypothesis (coefficient equals zero) is true, of obtaining a t statistic at least as extreme as the one observed. A low p-value is strong evidence against the null hypothesis and supports keeping the variable in the model.
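For coefficient b_j, the test statistic and two-sided p-value can be written as follows, where se(b_j) is the coefficient's standard error and T_{n-3} denotes a t-distributed variable with n - 3 degrees of freedom (n observations less three estimated parameters). The symbols are notation chosen here for illustration.

```latex
H_0:\ \beta_j = 0 \quad \text{vs.} \quad H_1:\ \beta_j \neq 0, \qquad
t_j = \frac{b_j}{\operatorname{se}(b_j)}, \qquad
p_j = 2\,P\!\left(T_{n-3} \ge |t_j|\right)
```

These correspond to the "t Stat" and "P-value" columns of Excel's regression output.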
The coefficient of multiple determination (R²) is the proportion of the variance in ticket sales explained jointly by petrol prices and punctuality. Values closer to 1 indicate a closer fit. Adjusted R² accounts for the number of predictors relative to the number of observations, penalizing additional predictors that contribute little, which makes it the better basis for comparing models with different numbers of predictors.
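Both measures follow directly from the residual and total sums of squares. The sketch below uses plain Python; the function name `r_squared` is a hypothetical helper, and the formulas assume a model that includes an intercept.

```python
def r_squared(actual, fitted, n_predictors):
    """Return (R2, adjusted R2) for a regression that includes an intercept."""
    n = len(actual)
    mean_y = sum(actual) / n
    # SSE: variation left unexplained by the model.
    sse = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    # SST: total variation of the observations around their mean.
    sst = sum((y - mean_y) ** 2 for y in actual)
    r2 = 1 - sse / sst
    # Adjust for the degrees of freedom used by the predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return r2, adj_r2
```

For the model discussed here, `n_predictors` would be 2 (petrol price and punctuality), and the results should match "R Square" and "Adjusted R Square" in Excel's summary output.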
Residual analysis is crucial for validating the regression model. Plotting residuals against fitted values and predictors helps detect patterns indicating model inadequacies, such as heteroskedasticity or non-linearity. Random scatter around zero suggests the model’s assumptions are reasonably satisfied. Conversely, discernible patterns warrant further investigation or model reformulation.
A plot of residuals against time, with the months in chronological order, can reveal autocorrelation that violates the independence assumption. Cycles, trends, or long runs of same-signed residuals suggest autocorrelation or model misspecification.
The Durbin-Watson statistic measures first-order autocorrelation among the residuals. Values close to 2 imply no autocorrelation, values approaching 0 suggest positive autocorrelation, and values toward 4 indicate negative autocorrelation. The formal test at the 0.05 significance level compares the statistic with tabulated lower and upper critical bounds (dL and dU) for the given sample size and number of predictors; a statistic falling between the two bounds leaves the test inconclusive.
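Excel's regression tool does not report this statistic, so it has to be computed from the residuals. A minimal sketch in plain Python follows; the function name `durbin_watson` is a hypothetical helper, and the residuals are assumed to be listed in chronological order.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals.

    The residuals must be in time order (month 1, month 2, ...).
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den
```

Residuals that alternate in sign push the statistic toward 4, while residuals that drift slowly push it toward 0, in line with the interpretation given above.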
Beyond the regression analysis, testing for heteroskedasticity (non-constant variance of the errors) is essential. Graphical methods, such as scatter plots of residuals against predictors or fitted values, offer preliminary insights. Formal tests like Breusch-Pagan, White’s, and Goldfeld-Quandt provide statistical evidence to confirm or refute heteroskedasticity. Each test works differently: the Breusch-Pagan test regresses the squared residuals on the explanatory variables and checks whether they have joint explanatory power; White’s test adds the squares and cross-products of the explanatory variables to that auxiliary regression, so it does not assume a particular form of heteroskedasticity; and the Goldfeld-Quandt test orders the observations by a variable suspected of driving the variance and compares the residual variances of two subsamples with an F test.
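As one illustration, the n·R² (studentized, or Koenker) form of the Breusch-Pagan test can be sketched as below. It assumes NumPy is available and that the residuals from the main regression have already been computed; the function name `breusch_pagan` is hypothetical. Because this model has exactly two predictors, the reference distribution is chi-square with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-x/2); that shortcut does not apply for other degrees of freedom.

```python
import math
import numpy as np

def breusch_pagan(residuals, petrol, on_time):
    """n*R^2 form of the Breusch-Pagan test for a two-predictor model.

    Returns (LM statistic, p-value). Under homoskedasticity, LM is
    approximately chi-square with 2 degrees of freedom (one per predictor).
    """
    e2 = np.asarray(residuals, dtype=float) ** 2
    n = len(e2)
    # Auxiliary regression: squared residuals on a constant and both predictors.
    Z = np.column_stack([np.ones(n), petrol, on_time])
    gamma, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    fitted = Z @ gamma
    r2_aux = 1 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    lm = n * r2_aux
    # Chi-square upper tail for exactly 2 degrees of freedom: exp(-x/2).
    p_value = math.exp(-lm / 2)
    return lm, p_value
```

A p-value below 0.05 would lead to rejecting the null hypothesis of constant variance. The same auxiliary regression can be run in Excel by adding a column of squared residuals and regressing it on the two predictors.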
Applying these tests in unison provides a comprehensive assessment of heteroskedasticity. Consistent results across tests would reinforce the need for remedies, such as employing robust standard errors or transforming variables, to ensure valid inference and reliable prediction.