Today I read the book “Applied Predictive Modeling”. I practiced Exercise 3 and also read Chapter 4 (Sections 4.1 and 4.2).
Exercise 3
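library(caret)      # provides the BloodBrain data set
data(BloodBrain)    # loads logBBB (outcome) and bbbDescr (predictors)
head(logBBB)        # first few outcome values
dim(bbbDescr)       # number of samples and predictors
library(e1071)      # provides skewness() for checking predictor distributions
# pairwise correlations between the predictors
correlations <- cor(bbbDescr)
library(corrplot)
corrplot(correlations)   # visualize the correlation matrix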
Chapter 4 (4.1, 4.2)
Over-fitting is a concern for any predictive model regardless of field of research.
In addition to learning the general patterns in the data, the model has also learned the characteristics of each sample’s unique noise. This type of model is said to be over-fit and will usually have poor accuracy when predicting a new sample.
In a K-nearest neighbors model, choosing too few neighbors may over-fit the individual points of the training set, while choosing too many may make the model too insensitive to yield reasonable performance. A parameter like the number of neighbors is called a tuning parameter because there is no analytical formula available to calculate an appropriate value.
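To see how the number of neighbors changes the result, here is a minimal sketch using k-nearest neighbors. It is not from the book: the built-in iris data, the 100/50 split, the seed and the candidate values of k are all my own assumptions, chosen only for illustration.

```r
library(class)  # knn(); the class package ships with standard R installations

set.seed(42)                                  # arbitrary seed, for reproducibility
train_idx <- sample(nrow(iris), size = 100)   # assumed 100/50 train/test split
train_x <- iris[train_idx, 1:4]
test_x  <- iris[-train_idx, 1:4]
train_y <- iris$Species[train_idx]
test_y  <- iris$Species[-train_idx]

k_values <- c(1, 5, 15, 50, 99)               # illustrative candidate values
# For each k, fit on the training set and measure the error on held-out samples
test_error <- sapply(k_values, function(k) {
  pred <- knn(train = train_x, test = test_x, cl = train_y, k = k)
  mean(pred != test_y)
})
data.frame(k = k_values, test_error = test_error)
```

Since no formula gives the best k, the only way to choose is to try several values and compare them on data the model has not seen.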
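Another example is the ‘cost’ parameter of a support vector machine:
‘cost’ parameter large: complicated model that works hard to classify every training sample correctly (risk of over-fitting)
‘cost’ parameter small: simple model that tolerates more training errors (risk of under-fitting)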
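A rough way to check the effect of ‘cost’ is to fit a support vector machine with svm() from e1071 at a few values and compare the results. This is only a sketch under my own assumptions: the iris data, the radial kernel, the split and the three cost values are not from the book.

```r
library(e1071)  # svm()

set.seed(42)                                  # arbitrary seed
train_idx <- sample(nrow(iris), size = 100)   # assumed 100/50 train/test split
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

cost_values <- c(0.01, 1, 100)                # illustrative values only
# One row per cost value: support vector count, training error, held-out error
results <- t(sapply(cost_values, function(cost) {
  fit <- svm(Species ~ ., data = train, kernel = "radial", cost = cost)
  c(cost = cost,
    n_support_vectors = fit$tot.nSV,
    train_error = mean(predict(fit, train) != train$Species),
    test_error  = mean(predict(fit, test)  != test$Species))
}))
results
```

Comparing the train_error and test_error columns across the rows gives a feel for how aggressively each cost value fits the training set.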
The apparent error rate can produce extremely optimistic performance estimates. A better approach is to test the model on samples that were not used for training.
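The gap between the apparent error rate and the error on new samples can be sketched as follows. The example is my own, not the book's: it assumes the iris data, a 100/50 split, and a classification tree from rpart that is deliberately allowed to grow very deep so that it can memorize the training set.

```r
library(rpart)  # rpart(); ships with standard R installations

set.seed(42)                                  # arbitrary seed
train_idx <- sample(nrow(iris), size = 100)   # assumed 100/50 train/test split
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Deliberately flexible tree: no complexity penalty, splits down to 2 samples
fit <- rpart(Species ~ ., data = train, method = "class",
             control = rpart.control(cp = 0, minsplit = 2))

# Apparent error: the model is evaluated on the same samples it was trained on
apparent_error <- mean(predict(fit, train, type = "class") != train$Species)
# Held-out error: the model is evaluated on samples it has never seen
holdout_error  <- mean(predict(fit, test, type = "class") != test$Species)
c(apparent = apparent_error, holdout = holdout_error)
```

The held-out number is the more honest estimate, because the tree never saw those samples during training.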
Tomorrow, I will continue to read Chapter 4.