10/27/2016

Chapter 7 (7.1, 7.2)

Today, I read Chapter 7 (7.1, 7.2).

Summary
As the regularization value λ increases, the fitted model becomes smoother and less likely to over-fit the training set. Reasonable values of λ range between 0 and 0.1. Since the regression coefficients are summed in the penalty, they should be on the same scale; hence the predictors should be centered and scaled prior to modeling.
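A minimal sketch of this setup, assuming scikit-learn (the data, network size, and penalty value are all illustrative): the predictors are centered and scaled with StandardScaler, and the alpha parameter of MLPRegressor plays the role of the weight-decay penalty.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # illustrative predictors
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Center and scale the predictors, then fit a single-hidden-layer
# network; alpha is the L2 (weight decay) penalty, and values in the
# 0-to-0.1 range mentioned above are typical starting points.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(5,), alpha=0.01,
                 max_iter=2000, random_state=0),
)
model.fit(X, y)
```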
There are many other kinds of neural networks, such as models with more than one layer of hidden units (i.e., a layer of hidden units that models other hidden units). Other architectures have loops going in both directions between layers.
A model similar to neural networks is the self-organizing map. It can be used as an unsupervised, exploratory technique or in a supervised fashion for prediction.
The resulting parameter estimates are unlikely to be the globally optimal estimates. As an alternative, several models can be created using different starting values, and the results of these models averaged to produce a more stable prediction.
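A sketch of that averaging idea, again assuming scikit-learn (seeds, data, and settings are illustrative): several networks are fitted from different random starting weights, and their predictions are averaged.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # illustrative data
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Fit several networks that differ only in their random starting
# weights, then average their predictions for a more stable result.
nets = [
    make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(5,), alpha=0.01,
                     max_iter=2000, random_state=seed),
    ).fit(X, y)
    for seed in range(5)
]

X_new = rng.normal(size=(10, 5))
y_avg = np.mean([net.predict(X_new) for net in nets], axis=0)
```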

Multivariate Adaptive Regression Splines (MARS)
Once the full set of features has been created, the algorithm sequentially removes individual features that do not contribute significantly to the model equation.
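The features MARS creates are paired hinge functions of a predictor, h(x - t) = max(0, x - t) and h(t - x) = max(0, t - x), split at a knot t. A minimal numpy sketch of one such pair (the knot value and helper name are mine):

```python
import numpy as np

def hinge_pair(x, knot):
    """Return the paired MARS basis features h(x - t) and h(t - x)."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

x = np.linspace(-2.0, 2.0, 9)
right, left = hinge_pair(x, knot=0.5)   # hinges split at the knot t = 0.5
```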
GCV: generalized cross-validation
There are two tuning parameters associated with the MARS model: the degree of the features that are added to the model and the number of retained terms. The latter parameter can be automatically determined using the default pruning procedure (using GCV), set by the user, or determined using an external resampling technique.
Since the GCV estimate does not reflect the uncertainty from feature selection, it suffers from selection bias.
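For reference, the GCV statistic penalizes the residual sum of squares by the effective number of parameters M charged to the retained terms, GCV = (RSS / N) / (1 - M/N)^2. A small numpy sketch of that formula (the helper name is mine):

```python
import numpy as np

def gcv(residuals, n_params):
    """GCV = (RSS / N) / (1 - M / N)**2 for M effective parameters."""
    residuals = np.asarray(residuals)
    n = residuals.size
    rss = np.sum(residuals ** 2)
    return (rss / n) / (1.0 - n_params / n) ** 2
```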
Two advantages of MARS:

1. The model automatically conducts feature selection.
2. The model equation is interpretable.

Tomorrow, I will continue to read Chapter 7.
