Research And Answer The Questions Submit Responses In a Separate Doc

Research and answer the questions. Submit responses in a separate document. Be sure to label the questions correctly. Choose 4 of the 5 problems.

1. The RIPPER algorithm (by Cohen [1]) is an extension of an earlier algorithm called IREP (by Fürnkranz and Widmer). Both algorithms apply the reduced-error pruning method to decide whether a rule should be pruned. Reduced-error pruning uses a validation set to estimate the generalization error of a classifier. Consider the following pair of rules:

R1: A → C
R2: A ∧ B → C

R2 is obtained by adding a new conjunct, B, to the left-hand side of R1. For this question, you will determine whether R2 is preferred over R1 from the perspectives of rule growing and rule pruning.

To determine whether a rule should be pruned, IREP computes the measure v_IREP = (p + (N − n)) / (P + N), where P is the total number of positive examples in the validation set, N is the total number of negative examples in the validation set, p is the number of positive examples in the validation set covered by the rule, and n is the number of negative examples in the validation set covered by the rule. v_IREP is the classification accuracy on the validation set, treating examples covered by the rule as predicted positive and uncovered examples as predicted negative; IREP favors rules with higher v_IREP. In contrast, RIPPER applies the measure v_RIPPER = (p − n) / (p + n). Do (a), (b), and (c) below:

(a) Suppose R1 covers 350 positive examples and 150 negative examples, while R2 covers 300 positive examples and 50 negative examples. Compute FOIL's information gain for the rule R2 with respect to R1.
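As a quick numeric check, FOIL's information gain for the extended rule can be sketched in Python (the function name is illustrative; the formula is the standard gain of the new rule relative to the old one):

```python
import math

def foil_gain(p0, n0, p1, n1):
    """FOIL's information gain when a rule covering p0 positives and
    n0 negatives is extended into one covering p1 positives and n1 negatives:
    p1 * (log2(p1/(p1+n1)) - log2(p0/(p0+n0)))."""
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

# R1 covers 350 positives and 150 negatives; R2 covers 300 and 50.
gain = foil_gain(350, 150, 300, 50)
print(round(gain, 2))  # ~87.65
```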

(b) Consider a validation set with 500 positive examples and 500 negative examples. Suppose R1 covers 200 positive examples and 50 negative examples, while R2 covers 100 positive examples and 5 negative examples. Compute v_IREP for both rules and determine which rule IREP prefers.
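A sketch of this computation, assuming the standard IREP validation-set accuracy measure v_IREP = (p + (N − n)) / (P + N):

```python
def v_irep(p, n, P=500, N=500):
    # Accuracy on the validation set: covered positives (p) are correct,
    # and uncovered negatives (N - n) are correct.
    return (p + (N - n)) / (P + N)

v1 = v_irep(200, 50)   # R1
v2 = v_irep(100, 5)    # R2
print(v1, v2)          # IREP prefers the rule with the larger value
```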

(c) Compute v_RIPPER for both rules in the previous scenario. Which rule does RIPPER prefer?
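The corresponding sketch for RIPPER's pruning measure, using the v_RIPPER = (p − n) / (p + n) formula given above:

```python
def v_ripper(p, n):
    # RIPPER's pruning metric: margin of positives over negatives
    # among the validation examples the rule covers.
    return (p - n) / (p + n)

v1 = v_ripper(200, 50)   # R1
v2 = v_ripper(100, 5)    # R2
print(round(v1, 4), round(v2, 4))
```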

2. C4.5rules is an implementation of an indirect method for generating rules from a decision tree. RIPPER is an implementation of a direct method that extracts rules directly from the data. (Do both (a) and (b) below.)

(a) Discuss the strengths and weaknesses of both methods.

(b) Consider a data set with a large class imbalance (some classes are much bigger than others). Which method (C4.5rules or RIPPER) is better for finding high-accuracy rules for the small classes?

3. Consider a training set with 100 positive examples and 400 negative examples. For each of the following candidate rules:

R1: A → + (covers 4 positive and 1 negative examples)

R2: B → + (covers 30 positive and 10 negative examples)

R3: C → + (covers 100 positive and 90 negative examples)

Determine which is the best and worst candidate rule according to:

(a) Rule accuracy

(b) FOIL’s information gain

(c) The likelihood ratio statistic

(d) The Laplace measure

(e) The m-estimate measure (with k=2 and p+ = 0.2)
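The five measures above can be sketched for all three rules as follows. The likelihood-ratio statistic here uses the natural log (the usual chi-square form; some texts use log base 2, which only rescales the values), and the Laplace and m-estimate forms assume k = 2 classes:

```python
import math

P, N = 100, 400                                   # training set totals
rules = {"R1": (4, 1), "R2": (30, 10), "R3": (100, 90)}  # (p, n) covered

def accuracy(p, n):
    return p / (p + n)

def foil_gain(p, n):
    # Gain relative to the empty rule, which covers all P positives and N negatives.
    return p * (math.log2(p / (p + n)) - math.log2(P / (P + N)))

def likelihood_ratio(p, n):
    # 2 * sum_i f_i * ln(f_i / e_i), with expected counts from class priors.
    e_pos = (p + n) * P / (P + N)
    e_neg = (p + n) * N / (P + N)
    return 2 * (p * math.log(p / e_pos) + n * math.log(n / e_neg))

def laplace(p, n, k=2):
    return (p + 1) / (p + n + k)

def m_estimate(p, n, k=2, p_plus=0.2):
    return (p + k * p_plus) / (p + n + k)

for name, (p, n) in rules.items():
    print(name, round(accuracy(p, n), 3), round(foil_gain(p, n), 2),
          round(likelihood_ratio(p, n), 2), round(laplace(p, n), 3),
          round(m_estimate(p, n), 3))
```

Comparing each column across R1, R2, and R3 identifies the best and worst rule under each measure.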

(Note: These are optional extra credit questions; provide detailed calculations and reasoning.)

4. Given the Bayesian belief network in Figure 1 and the data in Table 1, perform the following:

(a) Draw the probability tables for each node.

(b) Using the network, compute P(Engine = Bad, Air Conditioner = Broken).
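For part (a), each node's probability table is estimated from the relative frequencies in Table 1. A minimal sketch of that counting step, using made-up records in place of Table 1 (the column layout, attribute names, and values here are hypothetical; substitute the actual data):

```python
from collections import Counter

# Hypothetical stand-ins for Table 1 rows: (Mileage, Engine, AirConditioner).
records = [
    ("Hi", "Good", "Working"), ("Hi", "Bad", "Broken"),
    ("Lo", "Good", "Working"), ("Lo", "Good", "Broken"),
    ("Hi", "Bad", "Working"), ("Lo", "Bad", "Broken"),
]

def cpt(child_idx, parent_idx):
    """Estimate P(child | parent) as relative frequencies over the records."""
    pair = Counter((r[parent_idx], r[child_idx]) for r in records)
    parent = Counter(r[parent_idx] for r in records)
    return {pv_cv: cnt / parent[pv_cv[0]] for pv_cv, cnt in pair.items()}

print(cpt(child_idx=1, parent_idx=0))  # e.g. P(Engine | Mileage)
```

For part (b), once the tables are filled in, P(Engine = Bad, Air Conditioner = Broken) is obtained by multiplying the relevant entries and summing out the remaining variables according to the structure in Figure 1.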

5. For the Bayesian network detailed below, compute:

(a) P(B=good, F=empty, G=empty, S=yes)

(b) P(B=bad, F=empty, G=not empty, S=no)

(c) Given B=bad, compute the probability the car will start.
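The three parts all reduce to factoring the joint as a product of each node's probability given its parents, then (for part (c)) marginalizing and normalizing. The sketch below assumes, purely for illustration, a structure in which G depends on B and F and S depends on G, with made-up CPT values; replace both the structure and the numbers with those given in the problem:

```python
# Hypothetical priors and CPTs -- illustration only.
p_b_good = 0.9                      # P(B = good)
p_f_not_empty = 0.8                 # P(F = not empty)
# P(G = empty | B, F), keyed by (b_good, f_not_empty) -- assumed values.
p_g_empty = {(True, True): 0.05, (True, False): 0.95,
             (False, True): 0.2,  (False, False): 0.99}
# P(S = yes | G), keyed by g_empty -- assumed values.
p_s_yes = {True: 0.1, False: 0.95}

def joint(b_good, f_empty, g_empty, s_yes):
    """P(B, F, G, S) = P(B) P(F) P(G | B, F) P(S | G) under the assumed structure."""
    pb = p_b_good if b_good else 1 - p_b_good
    pf = (1 - p_f_not_empty) if f_empty else p_f_not_empty
    pg = p_g_empty[(b_good, not f_empty)]
    if not g_empty:
        pg = 1 - pg
    ps = p_s_yes[g_empty] if s_yes else 1 - p_s_yes[g_empty]
    return pb * pf * pg * ps

# Part (a)-style query with the hypothetical numbers:
print(joint(b_good=True, f_empty=True, g_empty=True, s_yes=True))

# Part (c)-style conditional: P(S = yes | B = bad), marginalizing F and G.
num = sum(joint(False, f, g, True) for f in (True, False) for g in (True, False))
print(num / (1 - p_b_good))
```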

References

  • Cohen, W. W. (1995). Fast Effective Rule Induction. Proceedings of the Twelfth International Conference on Machine Learning.
  • Fürnkranz, J., & Widmer, G. (1994). Incremental reduced error pruning. Proceedings of the Eleventh International Conference on Machine Learning.
  • Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann.
  • Friedman, N. (2004). Inferring causal relationships from observational data. Proceedings of the National Academy of Sciences.
  • Kohavi, R. (1995). The power of simplicity: A review of decision tree classifiers. Machine Learning Journal.
  • Friedman, N., et al. (1997). Bayesian network classifiers. Machine Learning.
  • Koller, D., & Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
  • Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
  • Russell, S., & Norvig, P. (2016). Artificial Intelligence: A Modern Approach. Pearson.