10/06/2016

Chapter 4 (Computing)

Today I worked through the Computing section of Chapter 4, which covers data splitting and resampling.

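library(AppliedPredictiveModeling)
library(caret)
# twoClassData provides `predictors` (data frame) and `classes` (factor)
data(twoClassData)
str(predictors)
str(classes)
# stratified 80/20 split; list=FALSE returns a matrix of row indices
set.seed(1)
trainingrows=createDataPartition(classes, p=0.8, list=FALSE)
head(trainingrows)
# rows in trainingrows form the training set; the remaining rows form the test set
trainpredictors=predictors[trainingrows, ]
trainclasses=classes[trainingrows]
testpredictors=predictors[-trainingrows, ]
testclasses=classes[-trainingrows]
trainpredictors
trainclasses
testpredictors
testclasses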
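# maxDissim() in caret does maximum dissimilarity sampling (not run here)
# resampling: repeated stratified splits of the training set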
set.seed(1)
repeatedsplits=createDataPartition(trainclasses, p=0.8, times=3)
str(repeatedsplits)
#to create indicators for 10-fold cross-validation
set.seed(1)
cvsplits=createFolds(trainclasses, k=10, returnTrain = TRUE)
str(cvsplits)
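# with returnTrain=TRUE, each element holds the rows kept for model fitting
fold1=cvsplits[[1]]
cvpredictors1=trainpredictors[fold1,]
cvclasses1=trainclasses[fold1]
# the first fold should keep roughly 90% of the training rows
nrow(trainpredictors)
nrow(cvpredictors1)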
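
Two checks seemed worth adding after the code above. This is only a minimal sketch, assuming the same packages and seed as before. It first compares the class proportions in the full, training, and test sets, since createDataPartition samples within each class. Then, because returnTrain = TRUE returns the rows kept for fitting, it recovers the held-out rows of each fold as the complement. The names `heldout` and `foldsizes` are my own and do not come from the chapter.

```r
library(AppliedPredictiveModeling)
library(caret)

data(twoClassData)  # loads `predictors` (data frame) and `classes` (factor)

set.seed(1)
trainingrows <- createDataPartition(classes, p = 0.8, list = FALSE)
trainclasses <- classes[trainingrows]
testclasses  <- classes[-trainingrows]

# Stratified sampling should keep the class proportions close across the sets
round(prop.table(table(classes)), 3)
round(prop.table(table(trainclasses)), 3)
round(prop.table(table(testclasses)), 3)

set.seed(1)
cvsplits <- createFolds(trainclasses, k = 10, returnTrain = TRUE)

# Held-out rows for a fold = all training rows not listed in that fold
# (`heldout` is an illustrative name, not part of caret)
heldout <- lapply(cvsplits, function(idx) setdiff(seq_along(trainclasses), idx))

# Number of fitting and held-out rows per fold
foldsizes <- data.frame(fit = lengths(cvsplits), heldout = lengths(heldout))
print(foldsizes)
```

Each row of `foldsizes` should add up to the size of the training set, and every training row should appear in exactly one held-out set.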

Tomorrow, I will do more on Computing.
