Dsbambad 6211 Assignment 1 Due 11:59 PM 2/18/2021 In The Fal

Question

Dsbambad 6211 Assignment 1due 1159pm 2182021in The Fall Of 2019 Dsbambad 6211 Assignment 1due 1159pm 2182021in The Fall Of 2019 In the fall of 2019, the administration of a large private university requested that the Office of Enrollment Management and the Office of Institutional Research work together to identify prospective students who would most likely enroll as new freshmen in the Fall 2020 semester. Data was collected to develop a predictive model using regression and decision tree analyses. The dataset includes variables such as academic interests, contact methods, ethnicity, high school details, geographic and financial information, and contact history. The assignment asks for a comprehensive exploration of the dataset, including variable description and the rationale for including or excluding specific variables. It requires identifying the target variable, assessing data types and measurement levels, and discussing data imputation and transformation procedures. The analysis includes a regression model with a summary of coefficients and significance levels, and a decision tree with an accompanying plot. The report must evaluate which model performs better and justify the choice, summarizing major findings for enrollment management. Finally, the R code used must be attached.

Dr. Jack HW Helper · Accepted Answer

Introduction The process of predicting student enrollment using statistical models is crucial for effective resource allocation and targeted recruitment strategies. Using the dataset INQ2019, this study explores variables that influence the likelihood of prospective students enrolling as freshmen in Fall 2020. The analysis involves data exploration, variable selection, and the application of regression and decision tree models to identify key predictors of enrollment. Dataset Structure and Variable Selection The dataset comprises numerous variables describing student inquiries, background characteristics, contact history, and interest levels. A summary table of the dataset's structure includes variable names, data types (numeric, factor, or binary), and indications of their inclusion in the model. Variables such as ACADEMIC_INTEREST_1, ACADEMIC_INTEREST_2, and IRSCHOOL were replaced by interval variables (INT1RAT, INT2RAT, HSCRAT) representing historical enrollment percentages, aligning with the rationale to convert categorical data into more informative continuous measures. CONTACT_CODE1 and CONTACT_DATE1 were discarded due to their irrelevance as per enrollment management feedback. Additional variables deemed unnecessary include those with low relevance to the enrollment decision, such as CONTACT_CODE1 and CONTACT_DATE1. The target variable for this analysis is ENROLL, indicating if a student enrolled in 2014 (binary: 1 for yes, 0 for no). Data types were reviewed, and adjustments were made where necessary—for example, converting categorical variables to factors or binary indicators to ensure proper model specification. Data Imputation and Transformation Imputation was performed on missing values using mean or median substitution for continuous variables and mode for categorical variables. For instance, MAILQ and TELECQ scores were imputed with their respective column means if missing. Transformations included scaling income estimates and distances to ensure compa

Dsbambad 6211 Assignment 1 Due 11:59 PM 2/18/2021 In The Fal

Dsbambad 6211 Assignment 1due 1159pm 2182021in The Fall Of 2019

Sample Paper For Above instruction

Introduction

Dataset Structure and Variable Selection

Data Imputation and Transformation

Regression Model Results

Decision Tree Model and Visualization

Model Comparison and Selection

Conclusions and Recommendations

R Code Appendix

References