10/20/2016

Chapter 6 (6.3, 6.4)

Today, I finished reading 6.3 and 6.4 of Chapter 6.

Summary

PCR and PLS have similar predictive ability, but PLS achieves it with far fewer components.
The NIPALS algorithm works fairly efficiently for data sets of small-to-moderate size (< 2500 samples and < 30 predictors). When the number of samples and predictors climbs, the algorithm becomes inefficient.
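To make the deflation idea concrete, here is a minimal NumPy sketch of a NIPALS-style PLS fit for a single response. It is only an illustration under my own assumptions: the function name `pls1_nipals` is made up, the data are centered but not scaled, and it is not the code from the book (which works in R).

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Single-response PLS via NIPALS-style deflation (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # center predictors and response
    n_pred = X.shape[1]
    W = np.zeros((n_pred, n_components))     # weight vectors
    P = np.zeros((n_pred, n_components))     # predictor loadings
    q = np.zeros(n_components)               # response loadings
    for k in range(n_components):
        w = Xk.T @ yk                        # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                           # scores for this component
        tt = t @ t
        p_k = Xk.T @ t / tt
        q_k = yk @ t / tt
        Xk = Xk - np.outer(t, p_k)           # deflate the predictors
        yk = yk - q_k * t                    # deflate the response
        W[:, k], P[:, k], q[k] = w, p_k, q_k
    beta = W @ np.linalg.solve(P.T @ W, q)   # coefficients on the original predictors
    intercept = y_mean - x_mean @ beta
    return beta, intercept
```

Each pass repeats matrix-vector products over the full deflated data, which hints at why the cost grows as the number of samples and predictors climbs.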
Kernel approach: improves the speed of the algorithm.
SIMPLS: deflates the covariance matrix between the predictors and the response, rather than deflating the predictor matrix and the response separately.
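Covariance: for centered predictors $X$ and a centered response $y$ with $n$ samples, the sample covariance between them is $X^\top y / (n - 1)$.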
GIFI approach: splits each predictor that is thought to have a nonlinear relationship with the response into two or more bins. Cut points for the bins are selected by the user and are based on either prior knowledge or characteristics of the data.
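A small sketch of what the binning step might look like. The helper name `bin_predictor` and the 0/1 indicator encoding are my own assumptions for illustration; the cut points are whatever the user supplies.

```python
from bisect import bisect_right

def bin_predictor(values, cut_points):
    """Replace one numeric predictor with 0/1 indicator columns, one per bin."""
    cuts = sorted(cut_points)
    n_bins = len(cuts) + 1
    rows = []
    for v in values:
        idx = bisect_right(cuts, v)  # a value equal to a cut point goes to the upper bin
        row = [0] * n_bins
        row[idx] = 1
        rows.append(row)
    return rows
```

For example, `bin_predictor([1, 5, 10], [5])` gives two indicator columns, one for values below 5 and one for values of 5 or more. The binned columns would then take the place of the original predictor before fitting PLS.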
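Penalized models: add a penalty on the size of the coefficients to the sum of squared errors. Ridge regression penalizes the squared coefficients and the lasso penalizes their absolute values:

$$SSE_{L_2} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{P} \beta_j^2$$

$$SSE_{L_1} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{P} |\beta_j|$$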

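A generalization of the lasso model is the elastic net, which combines a ridge-type penalty (weight $\lambda_1$) and a lasso-type penalty (weight $\lambda_2$):

$$SSE_{Enet} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda_1 \sum_{j=1}^{P} \beta_j^2 + \lambda_2 \sum_{j=1}^{P} |\beta_j|$$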

The elastic net deals more effectively with groups of highly correlated predictors.
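As a quick check on how the two penalties combine, here is a sketch that evaluates the elastic net objective for a given coefficient vector: the sum of squared errors plus a squared (ridge-type) penalty and an absolute-value (lasso-type) penalty. The function name `elastic_net_sse` and the argument names `lambda1` and `lambda2` are my own, and the intercept is left out for simplicity.

```python
import numpy as np

def elastic_net_sse(X, y, beta, lambda1, lambda2):
    """Penalized SSE: squared error + lambda1 * sum(beta^2) + lambda2 * sum(|beta|)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    residuals = y - X @ beta                       # no intercept term (assumption)
    sse = np.sum(residuals ** 2)
    ridge_penalty = lambda1 * np.sum(beta ** 2)
    lasso_penalty = lambda2 * np.sum(np.abs(beta))
    return sse + ridge_penalty + lasso_penalty
```

Setting `lambda1 = 0` leaves only the lasso penalty, and setting `lambda2 = 0` leaves only the ridge penalty, which is the sense in which the elastic net generalizes both.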


Tomorrow, I will begin the Computing section of Chapter 6.
