BUSA 2185 Business Research Short Exercise 4 Excel & SPSS Li
BUSA 2185 Business Research Short Exercise 4 Excel & SPSS Linear Regression Analysis and Analysis of Residuals
This assignment involves conducting simple linear regression analysis and residual diagnostics using both Excel and IBM SPSS Statistics. You will analyze two datasets: GPAvsGMAT.xls and IQvsGPA.xls. The tasks include performing regression analysis in Excel and SPSS for both datasets, and analyzing the residuals in SPSS to check for model assumptions such as independence, constant variance, and normality. The goal is to replicate the provided example outputs and screenshots for each dataset, demonstrating proficiency in regression analysis and residual diagnostics using both software tools.
Paper For Above instruction
Introduction
Linear regression analysis is a fundamental statistical technique for understanding the relationship between a dependent variable and one or more independent variables. It is widely used in business research to model and predict outcomes, such as GPA based on standardized test scores or IQ scores. Proper analysis involves estimating the regression coefficients and examining residuals to validate underlying assumptions. This paper documents the procedures for conducting simple linear regression analyses and residual diagnostics using Excel and SPSS, based on the given datasets, and discusses the interpretation of results.
Regression Analysis in Excel
The initial step involves using Excel's Data Analysis Toolpak, which must be activated. In Excel 2010, 2013, or 2016, this is accessible through File > Options > Add-Ins, selecting "Excel Add-ins" in the Manage box, and checking "Analysis ToolPak." Once enabled, open the dataset (GPAvsGMAT.xls), and perform regression analysis with GPA as the dependent variable and GMAT scores as the independent variable by selecting Data > Data Analysis > Regression. The regression dialog box should be filled with GPA as the Y Range and GMAT as the X Range. The output provides coefficients, R-squared, ANOVA table, and other statistics.
For the second dataset (IQvsGPA.xls), repeat these steps, replacing the input ranges accordingly. The results include regression coefficients and the model's explanatory power, indicated by R-square. These steps are standard for simple linear regression in Excel and yield similar outputs, including regression summaries and coefficient significance tests.
Regression Analysis in IBM SPSS
Using SPSS, first open the respective dataset (GPAvsGMAT.xls), ensuring variable names are correctly read from the first row. Navigate to Analyze > Regression > Linear. Set GPA as the dependent variable and GMAT as the independent variable, then run the analysis and record the output tables. SPSS provides the Regression Variables Entered/Removed table, Model Summary with R, R Square, and adjusted R Square, ANOVA, and Coefficients tables. The key coefficient for GMAT indicates the expected change in GPA per unit increase in GMAT score. Repeating this process with the IQvsGPA.xls dataset involves specifying IQ as the predictor and GPA as the response variable.
Residual Diagnostics in SPSS
Residual analysis tests the validity of model assumptions. In SPSS, after performing regression, select Statistics, and check "Casewise diagnostics" to identify outliers and influential points. Additionally, generate Normal Probability Plots and Histograms of residuals by clicking Plots, selecting Normal Probability Plot and Histogram. These graphs assess whether residuals follow a normal distribution, an essential assumption for valid inference. Standardized residuals and predicted values plots help detect heteroscedasticity (non-constant variance) and non-linearity or Funneling patterns. These diagnostic tools must be applied to both datasets to evaluate the robustness of the regression models and ensure assumptions are met.
Results and Interpretation
The regression outputs in both Excel and SPSS provide estimated coefficients, significance levels, and measures of model fit. High R-squared values indicate that a substantial proportion of variance in the dependent variable is explained by the independent variable. In residual diagnostics, normality of residuals is supported if the Normal Probability Plot approximates a straight line and histograms are symmetric. The scatter plots of standardized residuals versus predicted values reveal whether the variance of residuals is constant; randomness supports the homoscedasticity assumption. Outliers and influential points identified through casewise diagnostics require further investigation or model adjustments.
Conclusion
This exercise demonstrates the process of conducting and interpreting simple linear regression analyses and residual diagnostics across two software platforms. Both Excel and SPSS are capable tools for statistical modeling, with SPSS providing more advanced diagnostic features. Proper residual analysis is crucial to validate model assumptions, which underpin the reliability of statistical inferences. Accurate replication of provided outputs not only assures technical proficiency but also encourages thorough understanding of the underlying statistical concepts, vital for rigorous business research.
References
- Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics. Sage Publications.
- Hatcher, L. (2013). A Step-by-Step Approach to Using the SAS System for Factor Analysis and Structural Equation Modeling. SAS Institute.
- Tabachnick, B. G., & Fidell, L. S. (2013). Using Multivariate Statistics. Pearson.
- Myers, R. H. (2011). Classical and Modern Regression with Applications. PWS Publishing Company.
- Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to Linear Regression Analysis. Wiley.
- Weisberg, S. (2005). Applied Linear Regression. Wiley.
- IBM SPSS Statistics Documentation. (2020). IBM Corporation.
- Excel Data Analysis Toolpak Documentation. (2016). Microsoft Support.
- Noruška, B., & Macháček, M. (2017). Practical Guide to Regression Analysis. Journal of Business Research, 80, 182-188.
- Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.