Summary
8.3 Rule-Based Models
The complexity of the model tree can be further reduced by either removing entire rules or removing some of the conditions that define a rule.
In Figure 8.12, pruning has a large effect on the model, while smoothing has a large impact only on the unpruned models.
The number of terms in the linear models decreases as more rules are created. This makes sense: each additional rule partitions the data further, so fewer data points remain to support a deep tree within each rule.
Next week, I will continue to read Chapter 8.

8.4 Bagged Trees
Bagging, short for bootstrap aggregation, is a general approach that uses bootstrapping in conjunction with any regression model to construct an ensemble.
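The recipe can be sketched in a few lines: draw a bootstrap sample, fit a model to it, repeat, and average the predictions. A minimal sketch in plain numpy, using a depth-1 regression stump as the base learner just to keep it self-contained (the data, function names, and the choice of stump are my own illustration, not the book's code):

```python
import numpy as np

def fit_stump(X, y):
    """Depth-1 regression tree on one feature: pick the split that
    minimizes squared error, predict each side's mean."""
    order = np.argsort(X)
    Xs, ys = X[order], y[order]
    best = None
    for i in range(1, len(Xs)):
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, Xs[i - 1], left.mean(), right.mean())
    _, split, left_mean, right_mean = best
    return lambda x: np.where(x <= split, left_mean, right_mean)

def bagged_predict(X, y, x_new, n_boot=10, seed=0):
    """Bagging: fit one model per bootstrap sample, then average."""
    rng = np.random.default_rng(seed)
    n = len(X)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # sample n points with replacement
        model = fit_stump(X[idx], y[idx])
        preds.append(model(x_new))
    return np.mean(preds, axis=0)          # the ensemble prediction

# Hypothetical toy data: a noiseless step function
X = np.linspace(0, 1, 100)
y = (X > 0.5).astype(float)
y_hat = bagged_predict(X, y, np.array([0.1, 0.9]))
```

In practice the base learner would be a full regression tree rather than a stump; the bookkeeping is the same.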
Advantages:
1. Bagged models reduce variance and are more stable, because predictions are averaged across the ensemble.
2. Bagged models provide their own internal estimate of predictive performance, via the out-of-bag samples, and this estimate correlates well with either cross-validation estimates or test set estimates.
Most of the improvement in predictive performance is obtained by aggregating across about ten bootstrap replications.
Caveats:
1. Computational cost and memory requirements increase as the number of bootstrap samples increases, although the independent fits make parallel computing a natural remedy.
2. A bagged model is less interpretable than a model that is not bagged, although variable importance measures can recover some insight.
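Because each bootstrap replicate is fit independently of the others, the extra cost parallelizes trivially. A minimal sketch with the standard library's executor (the fit function is a stand-in; for CPU-bound pure-Python fitting one would typically use processes rather than threads):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fit_one(seed, y):
    """Stand-in for fitting one model on one bootstrap sample:
    here it just returns the mean of a resample of y."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(y), size=len(y))
    return y[idx].mean()

def bagged_mean(y, n_boot=10):
    # Each bootstrap fit is independent, so they can run concurrently
    with ThreadPoolExecutor() as pool:
        fits = list(pool.map(lambda s: fit_one(s, y), range(n_boot)))
    return np.mean(fits)
```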