Summary
Over-fitting
Model tuning: choose an appropriate value for each tuning parameter
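As a concrete illustration of tuning, here is a minimal pure-Python sketch (the toy k-nearest-neighbor regressor and the data are invented for this example): each candidate value of the tuning parameter k is evaluated on a held-out validation set, and the value with the lowest validation error is kept.

```python
import random

random.seed(0)

def knn_predict(train, x, k):
    # average the y-values of the k training points closest to x
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def validation_mse(train, valid, k):
    # mean squared error of the k-NN predictions on the validation set
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)

# toy data: y = x^2 plus noise
data = [(x, x * x + random.gauss(0, 0.1)) for x in [i / 20 for i in range(100)]]
random.shuffle(data)
train, valid = data[:70], data[30:40] and data[70:]

# evaluate several candidate values of the tuning parameter k
scores = {k: validation_mse(train, valid, k) for k in (1, 3, 5, 9, 15)}
best_k = min(scores, key=scores.get)
```

The same loop generalizes to any model with a complexity parameter; in practice the single validation split would be replaced by one of the resampling techniques listed below.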
Data splitting: maximum dissimilarity sampling
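A minimal sketch of the idea behind maximum dissimilarity sampling (the greedy farthest-point variant; the point set and Euclidean dissimilarity measure are chosen for illustration): seed with one point, then repeatedly add the candidate whose distance to its nearest already-selected point is largest.

```python
def dissimilarity(a, b):
    # Euclidean distance between two points
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def max_dissimilarity_sample(points, n_select):
    selected = [points[0]]              # seed with an arbitrary point
    remaining = list(points[1:])
    while len(selected) < n_select and remaining:
        # score each candidate by its distance to the closest selected point,
        # and pick the candidate that maximizes that minimum distance
        best = max(remaining,
                   key=lambda p: min(dissimilarity(p, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

points = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
subset = max_dissimilarity_sample(points, 3)
```

Points near ones already chosen (such as `(0.1, 0.1)` next to the seed `(0.0, 0.0)`) are passed over, so the selected subset spreads out over the predictor space.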
Common steps in model building:
1. Pre-processing the predictor data
2. Estimating model parameters
3. Selecting predictors for the model
4. Evaluating model performance
5. Fine-tuning class prediction rules (via ROC curves, etc.)
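Several of the steps above can be tied together in a small pure-Python sketch (the one-predictor linear model and toy data are invented for illustration): pre-process by centering and scaling using training-set statistics only, estimate the parameters by least squares, then evaluate performance on a held-out test split.

```python
def standardize(xs, mean, sd):
    # center and scale using statistics supplied by the caller
    return [(x - mean) / sd for x in xs]

def fit_simple_ols(xs, ys):
    # least-squares slope and intercept for a one-predictor linear model
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# toy data: y = 2x + 1, split into training and test sets
xs = [float(i) for i in range(10)]
ys = [2 * x + 1 for x in xs]
train_x, test_x = xs[:7], xs[7:]
train_y, test_y = ys[:7], ys[7:]

# step 1: pre-process using training-set statistics only (no test leakage)
mean = sum(train_x) / len(train_x)
sd = (sum((x - mean) ** 2 for x in train_x) / len(train_x)) ** 0.5
zx_train = standardize(train_x, mean, sd)
zx_test = standardize(test_x, mean, sd)

# step 2: estimate model parameters
slope, intercept = fit_simple_ols(zx_train, train_y)

# step 4: evaluate performance (RMSE) on the held-out test split
preds = [slope * z + intercept for z in zx_test]
rmse = (sum((p - y) ** 2 for p, y in zip(preds, test_y)) / len(preds)) ** 0.5
```

The key design point is that the pre-processing statistics (mean, sd) come from the training data alone and are then applied unchanged to the test data.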
Resampling techniques:
1. K-fold cross-validation
2. Generalized cross-validation (approximates the leave-one-out error rate). For example, in the earlier leave-one-out cross-validation example, df = 1 and n = 12.
3. Repeated training/testing splits
4. Bootstrap
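Two of these schemes can be sketched in a few lines of pure Python (the helper names are my own): K-fold cross-validation partitions the indices into K held-out folds, while the bootstrap draws n indices with replacement and treats the points never drawn as the out-of-bag assessment set.

```python
import random

random.seed(0)

def kfold_indices(n, k):
    # partition indices 0..n-1 into k roughly equal held-out folds,
    # returning (train_indices, test_indices) pairs
    idx = list(range(n))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]

def bootstrap_sample(n):
    # draw n indices with replacement; on average about 63.2% of the
    # original points appear at least once, the rest are "out of bag"
    in_bag = [random.randrange(n) for _ in range(n)]
    out_of_bag = sorted(set(range(n)) - set(in_bag))
    return in_bag, out_of_bag

splits = kfold_indices(12, 3)        # with n = 12 and k = 3, each fold holds out 4 points
in_bag, oob = bootstrap_sample(100)
```

Repeated training/testing splits are the same idea as one fold of `kfold_indices`, redrawn many times with a fixed held-out fraction.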
Choose the final tuning parameter values by weighing several factors, not just the raw resampled performance estimate.
Choosing between models:
1. Start with several models that are the least interpretable and most flexible.
2. Investigate simpler models that are less opaque.
3. Consider using the simplest model that reasonably approximates the performance of the more complex methods.