Hello, I hope you can help me.
I want to find the best possible model for a data frame I have. As an example, I can use iris. With a function I defined, I have generated 100 random train/test splits of this data frame, each using 70% of the rows for training and the remaining 30% for testing.
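data(iris)
# Split a data frame into a random training part and the remaining test part
foo <- function(dat, train_percent = 0.7) {
  n <- seq_len(nrow(dat))
  train <- sample(n, floor(train_percent * max(n)))
  test <- sample(setdiff(n, train))
  list(train = dat[train, ], test = dat[test, ])
}
# Keep the 100 splits in a list so they can be reused later
splits <- replicate(100, foo(iris), simplify = FALSE)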
Ideally, I would like to fit the model on each training subset and then test it on the remaining rows of each of the 100 splits.
As a model example, I am trying to find out whether my final model should include an interaction or not. How can I apply this model to the 100 subsets I generated?
# Bare column names, so that the data argument decides which rows are used
model <- glm(Sepal.Length ~ Sepal.Width * Species,
             family = poisson, data = iris)
My understanding is that if the interaction is significant across the 100 subsets, my model should include it. However, I also understand that model construction should follow a stepwise method, and it seems complicated to do this 100 times, so I don't fully understand whether this is possible in R. I have seen this kind of procedure in SPSS and Maxent, but those programs test models in other ways.
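To make the idea concrete, here is a minimal sketch of what such a loop over the splits might look like. It is not a recommended procedure, only an illustration under some assumptions: the helper name compare_split is hypothetical, the 0.05 threshold is arbitrary, and a gaussian family is used because Sepal.Length is continuous (the model above uses poisson). The splitting function is repeated so the block runs on its own.

```r
data(iris)
set.seed(1)  # assumption: fixed seed so the splits are reproducible

# Same splitting idea as foo() in the question, repeated for self-containment
foo <- function(dat, train_percent = 0.7) {
  n <- seq_len(nrow(dat))
  train <- sample(n, floor(train_percent * nrow(dat)))
  test <- setdiff(n, train)
  list(train = dat[train, ], test = dat[test, ])
}
splits <- replicate(100, foo(iris), simplify = FALSE)

# Hypothetical helper: fit the model with and without the interaction on one
# training set, then score both on the matching test set
compare_split <- function(s, family = gaussian()) {
  m_add <- glm(Sepal.Length ~ Sepal.Width + Species, family = family, data = s$train)
  m_int <- glm(Sepal.Length ~ Sepal.Width * Species, family = family, data = s$train)
  # Root mean squared prediction error on the held-out rows
  rmse <- function(m) {
    pred <- predict(m, newdata = s$test, type = "response")
    sqrt(mean((s$test$Sepal.Length - pred)^2))
  }
  # Test of the interaction terms; the p-value is the last column of the table
  lrt <- anova(m_add, m_int, test = "Chisq")
  c(rmse_add = rmse(m_add), rmse_int = rmse(m_int), p_interaction = lrt[2, ncol(lrt)])
}

# One row per split: test error of each model and the interaction p-value
results <- t(sapply(splits, compare_split))

# Share of splits where the interaction is significant at the 0.05 level
mean(results[, "p_interaction"] < 0.05)
# Share of splits where the interaction model predicts the test rows better
mean(results[, "rmse_int"] < results[, "rmse_add"])
```

The two summaries at the end show how often the interaction is significant on the training rows and how often it actually improves prediction on the held-out rows. The two can disagree, which is part of what I am unsure how to interpret.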
I hope you can help me or give me some advice.