Tuesday, 5 January 2021

Why does my for loop over cross-validation fold counts keep returning the same predicted values on the test data?

I want to see which number of cross-validation folds gives the lowest MSE on the test data using OLS. I wrote a for loop that fits the model with each fold count from 5 to 10 and stores the test-set predictions of each fit as one column of a matrix. However, every iteration returns exactly the same set of predictions, whereas I expected each set to be slightly different. Even when I fit the models by hand, outside the loop, predict returns the same values for every number of folds. What am I missing? Any information would be greatly appreciated.

library(caret)  # provides trainControl() and train()

set.seed(30122020)
# 80/20 train/test split
idtrain <- sample(seq_len(nrow(data_clean)), round(0.8 * nrow(data_clean)))

train <- data_clean[idtrain, ]
test <- data_clean[-idtrain, ]
cv_min <- 5
cv_max <- 10
# One column of test-set predictions per number of folds
cv_predict <- matrix(NA_real_, nrow = nrow(test), ncol = cv_max - cv_min + 1)
for (i in cv_min:cv_max) {
  train.control <- trainControl(method = 'cv', number = i)
  OLS.log.cv <- train(logSalePrice ~ ., data = train,
                      method = 'lm', trControl = train.control)

  cv_predict[, i - cv_min + 1] <- predict(OLS.log.cv, newdata = test)
}
cv_predict  # the original printed ols_predict, which is never defined
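
A likely explanation, offered as a sketch rather than a definitive diagnosis: with method = 'lm' there is nothing for caret to tune, and after resampling train() refits the final model on the whole training set. The number of folds therefore only changes the resampled error estimate (reported in OLS.log.cv$results), not the fitted coefficients or the predictions. The base-R example below imitates this without caret. The built-in mtcars data, the formula mpg ~ wt + hp, and the helper cv_rmse are all assumptions made for illustration and are not part of the original code.

```r
# Illustrative data: the built-in mtcars set stands in for data_clean (assumption).
set.seed(30122020)
idtrain <- sample(seq_len(nrow(mtcars)), round(0.8 * nrow(mtcars)))
train <- mtcars[idtrain, ]
test  <- mtcars[-idtrain, ]

# Hypothetical helper: k-fold cross-validated RMSE of an lm fit.
cv_rmse <- function(k, data) {
  # Randomly assign each row to one of k folds
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  sq_err <- unlist(lapply(seq_len(k), function(f) {
    fit <- lm(mpg ~ wt + hp, data = data[folds != f, ])
    held_out <- data[folds == f, ]
    (held_out$mpg - predict(fit, newdata = held_out))^2
  }))
  sqrt(mean(sq_err))
}

ks <- 5:10
cv_error <- sapply(ks, cv_rmse, data = train)  # this estimate depends on k
names(cv_error) <- ks

# The model used for prediction is refit on the whole training set,
# so it is the same model whatever k was used for resampling.
final_fit <- lm(mpg ~ wt + hp, data = train)
test_pred <- sapply(ks, function(k) predict(final_fit, newdata = test))

cv_error                           # one RMSE estimate per fold count
all(test_pred == test_pred[, 1])   # every column of predictions is identical
```

If the goal is to compare fold counts, the quantity that actually varies is the cross-validated error, not the test-set predictions. To obtain different predictions, the models themselves have to differ, for example in their predictors or in the method used.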
