Question 15.11: Regression Analysis And Significance Testing
In order to determine whether the sales volume of a company (y, in millions of dollars) is related to advertising expenditures (x₁, in millions of dollars) and the number of salespeople (x₂), data were gathered for 10 years. Part of the Excel output is shown below:

ANOVA

| Source     | df | SS     | MS | F |
|------------|----|--------|----|---|
| Regression |    | 321.11 |    |   |
| Residual   |    | 63.39  |    |   |

|           | Coefficients | Standard Error |
|-----------|--------------|----------------|
| Intercept | 7.0174       | 1.8972         |
| x₁        | 8.6233       | 2.3968         |
| x₂        | 0.0858       | 0.1845         |
Solution
The regression analysis provided aims to explore the relationship between sales volume (dependent variable y) and two independent variables: advertising expenditure (x₁) and the number of salespeople (x₂). The statistical output from Excel includes the sums of squares (SS), mean squares (MS), F-statistic, coefficients, and standard errors necessary for detailed interpretation of the model and hypothesis testing.
First, the regression equation can be formulated from the coefficients. Using the provided regression coefficients for the intercept and the predictors, the equation is:
ŷ = 7.0174 + 8.6233x₁ + 0.0858x₂
This equation predicts the company's sales volume from the advertising expenditure and the number of salespeople. The intercept (7.0174) is the expected sales when both independent variables are zero; in practice this is a theoretical baseline rather than a realistic operating scenario.
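The fitted equation can be applied directly to candidate input values. A minimal Python sketch follows; the inputs x₁ = 3.0 ($3M advertising) and x₂ = 45 (45 salespeople) are hypothetical illustration values, not part of the original data:

```python
def predict_sales(x1, x2):
    """Predicted sales (millions of $) from the fitted regression equation."""
    return 7.0174 + 8.6233 * x1 + 0.0858 * x2

# Hypothetical inputs: $3M advertising, 45 salespeople
y_hat = predict_sales(3.0, 45)
print(round(y_hat, 2))  # ≈ 36.75 million dollars
```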
To assess the significance of each predictor, hypothesis testing is performed. The null hypothesis for each coefficient posits that it is zero, implying no effect of that variable on sales. The t-statistic for each predictor is calculated by dividing the coefficient by its standard error. For x₁, the t-value is approximately 3.598 (8.6233 / 2.3968), suggesting statistical significance at common levels; for x₂, the t-value is about 0.465 (0.0858 / 0.1845), which is unlikely to be significant.
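The division of each coefficient by its standard error can be checked with a short sketch using the values from the Excel output:

```python
# t-statistic = coefficient / standard error, per predictor
coef = {"x1": 8.6233, "x2": 0.0858}
se   = {"x1": 2.3968, "x2": 0.1845}

t_stats = {name: coef[name] / se[name] for name in coef}
print({k: round(v, 3) for k, v in t_stats.items()})  # x1 ≈ 3.598, x2 ≈ 0.465
```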
The coefficient of determination (R²) measures the proportion of variance in sales explained by the model. It is calculated as:
R² = SS Regression / SS Total
Given that the regression sum of squares (SS Regression) is 321.11 and the total sum of squares is the sum of the regression and residual components, 321.11 + 63.39 = 384.50, R² ≈ 0.835, indicating that approximately 83.5% of the variability in sales is explained by advertising expenditure and the number of salespeople.
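This ratio can be verified directly from the two sums of squares in the ANOVA table:

```python
ss_reg = 321.11          # regression sum of squares (from ANOVA table)
ss_res = 63.39           # residual sum of squares (from ANOVA table)
ss_total = ss_reg + ss_res           # 384.50
r_squared = ss_reg / ss_total
print(round(r_squared, 3))           # ≈ 0.835
```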
The adjusted R² accounts for the number of predictors and the sample size, providing a more accurate measure of model fit. It can be calculated as:
Adjusted R² = 1 - [(1 - R²)*(n - 1)/(n - k - 1)]
Where n = 10 (years) and k = 2 (predictors). Plugging in the values yields an adjusted R² of approximately 0.788, indicating a strong model fit after penalizing for the number of predictors.
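As a quick check, the adjusted R² can be recomputed from the given sums of squares with a minimal sketch:

```python
n, k = 10, 2                               # years of data, number of predictors
r_squared = 321.11 / (321.11 + 63.39)      # ≈ 0.835
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
print(round(adj_r_squared, 3))             # ≈ 0.788
```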
To test the overall significance of the regression model, an F-test is employed. The F-statistic is the ratio of the regression mean square to the residual mean square: F = (321.11 / 2) / (63.39 / 7) ≈ 160.56 / 9.06 ≈ 17.73. Compared with the critical F-value at alpha = 0.01 with degrees of freedom (2, 7), which is approximately 9.55, the calculated F-value is much larger, so the model is statistically significant, confirming that at least one predictor significantly explains the variance in sales.
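The F-statistic follows from the ANOVA table values; a short sketch of the computation, using the standard-table critical value F(2, 7) ≈ 9.55 at alpha = 0.01:

```python
ss_reg, ss_res = 321.11, 63.39
df_reg, df_res = 2, 7                # k and n - k - 1

ms_reg = ss_reg / df_reg             # ≈ 160.56
ms_res = ss_res / df_res             # ≈ 9.06
f_stat = ms_reg / ms_res
print(round(f_stat, 2))              # ≈ 17.73

f_crit = 9.55                        # F(2, 7) at alpha = 0.01, from a standard table
print(f_stat > f_crit)               # True: reject H0, model is significant
```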
Next, individual t-tests determine whether each predictor's coefficient is significantly different from zero at alpha = 0.01. For x₁, the t-value of approximately 3.598 exceeds the two-tailed critical value (~3.499 for df = 7), so the coefficient is significant, implying advertising expenditure significantly influences sales. Conversely, for x₂, the t-value (~0.465) is well below the critical threshold, indicating its coefficient is not significantly different from zero; thus, the number of salespeople may not have a statistically significant effect in this model.
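The two comparisons above can be sketched as follows; the critical value 3.499 is the standard-table two-tailed t at alpha = 0.01 with df = 7:

```python
t_x1 = 8.6233 / 2.3968    # ≈ 3.598
t_x2 = 0.0858 / 0.1845    # ≈ 0.465
t_crit = 3.499            # two-tailed t critical value, alpha = 0.01, df = 7

print(abs(t_x1) > t_crit)  # True  -> x1 (advertising) is significant
print(abs(t_x2) > t_crit)  # False -> x2 (salespeople) is not significant
```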
In conclusion, the regression equation demonstrates that advertising expenditure has a significant positive impact on sales, while the number of salespeople does not significantly contribute. The model explains a substantial portion of sales variability, and the overall regression is statistically significant, validating the use of the predictors for forecasting sales.