CSE 5160 Machine Learning Spring 2021 Assignment 1

Question

CSE 5160 Machine Learning Spring 2021 Assignment 1 CSE 5160 Machine Learning Spring 2021 Assignment 1 Analyze and demonstrate your understanding of fundamental machine learning algorithms by completing three tasks: constructing a decision tree with impurity measures, applying k-nearest neighbors classification with data normalization, and implementing naive Bayes classification with data discretization. Follow the detailed procedures below and provide comprehensive explanations and visualizations where appropriate. Detailed Tasks 1. Decision Tree Construction and Classification Given a dataset, manually train a decision tree by carefully selecting attributes to split the data at each node. Use the Gini index to measure impurity, and detail the process of attribute selection at each step. Then, illustrate the final decision tree diagram and use it to classify the specific test instance: X = (Outlook = rainy, Temperature = hot, Humidity = high, Windy = FALSE). 2. K-Nearest Neighbors Classification Preprocess the provided training data by applying min-max normalization on the features Speed and Weight. Plot the normalized instances on a 2D plane, representing Speed on the X-axis and Weight on the Y-axis, marking classes 'no' with '−' and 'yes' with '+'. Using the normalized data, classify the test instance X = (Speed = 5.20, Weight = 500) as qualified or not, adopting the KNN algorithm with k values of 1, 3, and 5, respectively. Provide detailed calculations and reasoning for each case. 3. Naive Bayes Classifier Perform classification of the test instance: X = (HM = No, MS = Divorced, AI = 120000) using the given dataset. For the continuous attribute AnnualIncome, employ discretization at a threshold of 91000 to convert it into a binary attribute or estimate the probability density function for more accurate modeling. Calculate the posterior probabilities for each class and determine the predicted class label based on maximum posterior probability. Note: Ensure your ex

Dr. Jack HW Helper · Accepted Answer

Decision Tree Construction, Classification, KNN, and Naive Bayes Analysis Introduction The process of machine learning involves various algorithms that classify and predict data based on underlying patterns. In this comprehensive analysis, I will illustrate the steps to manually construct a decision tree utilizing the Gini index, classify a test instance using k-nearest neighbors with data normalization, and evaluate a classification using the naive Bayes approach with discretized data. Each method demonstrates different facets of supervised learning algorithms, contributing to a holistic understanding of their applications and mechanisms. Decision Tree Construction Using Gini Index The dataset provided for the decision tree task contains features such as Outlook, Temperature, Humidity, and Windy, with the target attribute being Play (Yes/No). The goal is to select the attribute for splitting at each node based on the lowest Gini impurity. Initially, I calculated the Gini impurity for the root node, which includes all instances. The impurity is computed as: Gini = 1 - Σ (pi)^2 where pi are the proportions of each class in the node. For the initial set, there are 9 'no' and 5 'yes' instances, so: p_no = 9/14, p_yes = 5/14 Leading to a Gini impurity of approximately 0.459. Attribute Selection and Splitting Next, I evaluated each attribute's potential to reduce impurity after splitting: Outlook: The attribute divides the data into three subsets with different distributions of Play, with impurity reductions calculated accordingly. Temperature, Humidity, and Windy were similarly assessed based on their potential splits. Among all, Outlook provided the largest gain in purity, with the following splits: Sunny: Instances with Play primarily 'no'. Overcast: All instances with Play 'yes', impurity zero. Rain: Contains a mixture, but with lower impurity after split. This process is repeated recursively on each subset, resulting in the final decision tree, which I visually depi

CSE 5160 Machine Learning Spring 2021 Assignment 1

CSE 5160 Machine Learning Spring 2021 Assignment 1

Detailed Tasks

Note:

Sample Paper For Above instruction

Decision Tree Construction, Classification, KNN, and Naive Bayes Analysis

Introduction

Decision Tree Construction Using Gini Index

Attribute Selection and Splitting

Final Decision Tree Diagram

Classifying Test Instance

K-Nearest Neighbors Classification

Data Preprocessing and Visualization

Classification with KNN

Naive Bayes Classification

Conclusion

References

CSE 5160 Machine Learning Spring 2021 Assignment 1

Detailed Tasks

Note:

Sample Paper For Above instruction

Decision Tree Construction, Classification, KNN, and Naive Bayes Analysis

Introduction

Decision Tree Construction Using Gini Index

Attribute Selection and Splitting

Final Decision Tree Diagram

Classifying Test Instance

K-Nearest Neighbors Classification

Data Preprocessing and Visualization

Classification with KNN

Naive Bayes Classification

Conclusion

References

Related Assignments