Today, I finished the Computing section and worked on the exercises of Chapter 8.
Computing
# 5. Boosted Trees (gbm: gradient boosting machines)
library(gbm)
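# gbm can be fit through the matrix interface ...
gbmmodel=gbm.fit(solTrainXtrans, solTrainY, distribution = "gaussian")
# ... or through the formula interface (this overwrites the previous fit)
gbmmodel=gbm(y~., data=trainData, distribution = "gaussian")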
# The furthest you can go is to split each node until there is only 1 observation in each terminal node.
# This would correspond to n.minobsinnode=1.
# recent caret versions expect tuning-grid column names without the leading dot
gbmgrid=expand.grid(interaction.depth=seq(1, 7, by=2), n.minobsinnode=10,
n.trees=seq(100, 1000, by=50), shrinkage=c(0.01, 0.1))
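To get a feel for how large this tuning grid is before handing it to train, the candidate values can be expanded and counted in base R. This is only a sketch: the name grid_sketch is made up for illustration, and the column names are written without leading dots on the assumption that a recent caret version is installed.

```r
# Same candidate values as the gbm tuning grid, expanded into all combinations
grid_sketch <- expand.grid(interaction.depth = seq(1, 7, by = 2),
                           n.minobsinnode = 10,
                           n.trees = seq(100, 1000, by = 50),
                           shrinkage = c(0.01, 0.1))

# 4 depths * 1 node size * 19 tree counts * 2 shrinkage values
n_combos <- nrow(grid_sketch)
n_combos

# Combinations that differ only in n.trees share one boosting sequence,
# so the number of distinct (depth, node size, shrinkage) settings is smaller
n_settings <- nrow(unique(grid_sketch[, c("interaction.depth",
                                          "n.minobsinnode",
                                          "shrinkage")]))
n_settings
```

As far as I understand, caret fits the largest n.trees for each of these settings and derives the predictions for smaller tree counts from the same fit, which is why tuning takes less time than the full combination count suggests.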
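library(caret)
set.seed(100)
# system.time() has to wrap the call being timed; timing the already fitted
# object afterwards measures nothing useful
gbmtime=system.time(gbmtune <- train(solTrainXtrans, solTrainY, method="gbm", tuneGrid = gbmgrid, verbose=FALSE))
gbmtune
gbmtime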
# 6. Cubist
library(Cubist)
# the committees argument sets how many models are fit in the ensemble
cubistmodel=cubist(solTrainXtrans, solTrainY, committees = 5)
cubistmodel
# the neighbors argument of predict() takes a single integer (0-9) and adjusts
# the rule-based predictions using nearby training-set points; the default, used here, is 0
cubistpred=predict(cubistmodel, solTestXtrans)
summary(cubistpred)
head(cubistpred)
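The effect of neighbors is easier to see when the same model is predicted with and without the adjustment. The sketch below uses the built-in mtcars data instead of the solubility data, purely so that it runs on its own; the object names (cubistdemo, pred_rules, pred_nn) are made up for illustration.

```r
library(Cubist)

# Stand-in data: predict mpg from the other mtcars columns
demo_x <- mtcars[, -1]
demo_y <- mtcars$mpg

cubistdemo <- cubist(demo_x, demo_y, committees = 5)

# Rule-based predictions only (neighbors = 0 is the default)
pred_rules <- predict(cubistdemo, demo_x)

# Predictions adjusted with the 5 nearest training-set points
pred_nn <- predict(cubistdemo, demo_x, neighbors = 5)

# Side-by-side view of the first few predictions
head(data.frame(rules = pred_rules, with_neighbors = pred_nn))
```

Whether the adjustment helps depends on the data, which is why it is usually tuned by resampling rather than fixed in advance.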
# the train function in the caret package can tune the model over values of
# committees and neighbors through resampling
cubisttuned=train(solTrainXtrans, solTrainY, method="cubist")
cubisttuned
Exercises
library(mlbench)
set.seed(200)
simulated=mlbench.friedman1(200, sd=1)
simulated=cbind(simulated$x, simulated$y)
simulated=as.data.frame(simulated)
colnames(simulated)[ncol(simulated)]="y"
head(simulated)
library(randomForest)
library(caret)
model1=randomForest(y~., data=simulated, importance=TRUE, ntree=1000)
rfimp1=varImp(model1, scale=FALSE)
rfimp1
simulated$duplicate1=simulated$V1+rnorm(200)*0.1
cor(simulated$duplicate1, simulated$V1)
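The size of this correlation can be worked out in advance. The predictors from mlbench.friedman1 are uniform on [0, 1], so V1 has variance 1/12, and the added noise has variance 0.1^2. The correlation between V1 and V1 plus independent noise is therefore sqrt(var(V1) / (var(V1) + 0.01)). A quick check in base R, with a large simulated sample standing in for the 200 rows above (all names here are made up for illustration):

```r
set.seed(1)
n <- 1e5

# Stand-ins for V1 and duplicate1
v1 <- runif(n)
dup <- v1 + rnorm(n) * 0.1

# Theoretical correlation from the two variances
var_v1 <- 1 / 12
var_noise <- 0.1^2
expected <- sqrt(var_v1 / (var_v1 + var_noise))

# Empirical correlation from the simulated sample
observed <- cor(v1, dup)

round(c(expected = expected, observed = observed), 3)
```

The expected value is about 0.945, so duplicate1 carries almost the same information as V1.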
model2=randomForest(y~., data=simulated, importance=TRUE, ntree=1000)
rfimp2=varImp(model2, scale=FALSE)
rfimp2
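To compare the two sets of importance scores directly, it helps to put them in one table. The sketch below rebuilds the data so that it runs on its own, uses randomForest's importance function instead of caret's varImp, and uses fewer trees to keep it quick. The object names are made up for illustration, and the predictors are called X1 to X10 here because of how the data frame is built.

```r
library(mlbench)
library(randomForest)

set.seed(200)
sim <- mlbench.friedman1(200, sd = 1)
dat <- data.frame(sim$x, y = sim$y)  # columns X1..X10 and y

# Forest on the original predictors
fit_before <- randomForest(y ~ ., data = dat, importance = TRUE, ntree = 500)

# Forest after adding a noisy copy of X1
dat_dup <- dat
dat_dup$duplicate1 <- dat_dup$X1 + rnorm(200) * 0.1
fit_after <- randomForest(y ~ ., data = dat_dup, importance = TRUE, ntree = 500)

# Unscaled permutation importance (type = 1) as named vectors
imp_before <- importance(fit_before, type = 1, scale = FALSE)[, 1]
imp_after <- importance(fit_after, type = 1, scale = FALSE)[, 1]

# One row per predictor; duplicate1 has no "before" score, so it shows NA
comparison <- data.frame(before = unname(imp_before[names(imp_after)]),
                         after = unname(imp_after),
                         row.names = names(imp_after))
comparison
```

The rows for X1 and duplicate1 are the ones to look at: a highly correlated copy can take over some of the splits that would otherwise go to the original predictor.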
library(party)
model3ctr=cforest_control(mtry=ncol(simulated)-1)
model3tree=cforest(y~., data=simulated, controls = model3ctr)
model3tree
cfimp=varimp(model3tree)
cfimp
Tomorrow, I will continue with the exercises of Chapter 8.
How many more exercises are pending?
There are five left, seven in total. I will finish them tomorrow.