11/10/2016

Chapter 8 Exercises

Today, I finished the exercises for Chapter 8.

# 8.5
library(caret)
data(tecator)
absorp
# endpoints columns 1-3: moisture, fat, protein
endpoints
absorpchange=absorp
dim(absorpchange)
absorpchange2=cbind(absorpchange, moisture=endpoints[,1])
absorpchange2=data.frame(absorpchange2)
library(RWeka)
treemodel1=M5P(moisture~., data=absorpchange2)
plot(treemodel1)
set.seed(100)
treemodel2=train(absorpchange2[,1:100], absorpchange2$moisture, method="M5", trControl = trainControl(method = "cv"),
             control= Weka_control(M=10))
treemodel2 # RMSE min = 6.5
plot(treemodel2)
toohigh=findCorrelation(cor(absorpchange2[,1:100]), cutoff = 0.999)
toohigh
trainabsorp=absorpchange2[,-toohigh]
treemodel3=M5P(moisture~., data=trainabsorp)
plot(treemodel3)
set.seed(100)
treemodel4=train(trainabsorp[,1:4], trainabsorp$moisture, method="M5", trControl = trainControl(method = "cv"),
                 control= Weka_control(M=10))
treemodel4 # RMSE min = 6.0
plot(treemodel4)
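
A side note on the findCorrelation() step above: it returns the column indices it suggests dropping. Here is a minimal sketch on simulated data. The names demo, drop_idx and demo_filtered are made up for illustration and are not part of the exercise. The guard matters because negative indexing with an empty vector drops every column instead of none.

```r
library(caret)

# Simulated stand-in data: b is nearly a copy of a, c is unrelated
set.seed(100)
base <- rnorm(100)
demo <- data.frame(
  a = base,
  b = base + rnorm(100, sd = 0.001),
  c = rnorm(100)
)

# Column indices that findCorrelation() recommends removing
drop_idx <- findCorrelation(cor(demo), cutoff = 0.999)

# Only index with -drop_idx when something was flagged;
# demo[, -integer(0)] would return a data frame with zero columns
demo_filtered <- if (length(drop_idx) > 0) {
  demo[, -drop_idx, drop = FALSE]
} else {
  demo
}
names(demo_filtered)
```

The same guard could be put around absorpchange2[,-toohigh] if the cutoff is ever raised far enough that nothing gets flagged.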
 

# 8.6
library(AppliedPredictiveModeling)
data(permeability)
dim(fingerprints)
permeability
library(caret)
lowfrequencies=nearZeroVar(fingerprints, freqCut = 20, uniqueCut = 10)
fingerfiltered=fingerprints[, -lowfrequencies]
dim(fingerfiltered)
library(randomForest)
rfmodel=randomForest(permeability~., data=fingerfiltered)
plot(rfmodel)
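# the argument is ntree; a misspelled ntrees is silently ignored and the default of 500 trees is used
rfmodel2=randomForest(fingerfiltered, permeability, importance = TRUE, ntree=1000)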
plot(rfmodel2)
imp=importance(rfmodel2)
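# %IncMSE: increase in out-of-bag MSE after permuting the predictor (averaged over trees, scaled by its SD); higher = more important
# IncNodePurity: total decrease in residual sum of squares from splits on the predictor, averaged over trees; higher = more useful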
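
To make the importance matrix easier to read, one option is to sort it by %IncMSE and look at the top rows. Here is a minimal sketch on simulated data. The data and the names x, y, rf_demo, imp_demo and imp_sorted are made up for illustration and are not part of the exercise.

```r
library(randomForest)

# Simulated stand-in data: only x1 and x2 actually drive the response
set.seed(100)
x <- data.frame(matrix(rnorm(200 * 5), ncol = 5))
names(x) <- paste0("x", 1:5)
y <- 3 * x$x1 - 2 * x$x2 + rnorm(200, sd = 0.5)

rf_demo <- randomForest(x, y, importance = TRUE, ntree = 500)

# For regression with importance = TRUE, importance() returns a matrix
# with the columns "%IncMSE" and "IncNodePurity"
imp_demo <- importance(rf_demo)

# Order predictors from most to least important by permutation importance
imp_sorted <- imp_demo[order(imp_demo[, "%IncMSE"], decreasing = TRUE), ]
head(imp_sorted, 3)

# varImpPlot() draws both measures as dot charts
varImpPlot(rf_demo)
```

The same ordering should work on imp from the permeability model, where the row names are the fingerprint columns.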



Tomorrow, I will read Chapter 9.
