I have been using a machine learning model to forecast a variable from time-series data. My question concerns how I create the training and test sets, which I do with the following script:
> library(caret)
> ind <- createDataPartition(Data$variable, p = 2/3, list = FALSE)
> train <- Data[ind, ]
> test <- Data[-ind, ]
The two sets are drawn at random from the whole data set, with 2/3 of the rows for training and 1/3 for testing.
Is this technique correct? In my view, the predictions on the test set will show a high r^2 simply because this is a time-series dataset (highly correlated): each randomly chosen test row has close neighbours in time in the training set. Would it be better to hold out the last 1/3 of the data for testing instead (an ordered, chronological split)?
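For reference, the alternative of holding out the last 1/3 could be sketched roughly as below. This is only a minimal sketch: the small Data frame and its time column are hypothetical stand-ins for my real data, and it assumes the rows are already sorted by time.

```r
# Hypothetical stand-in for my data; rows are assumed to be sorted by time
set.seed(1)
Data <- data.frame(time = 1:90, variable = cumsum(rnorm(90)))

n <- nrow(Data)
n_train <- (2 * n) %/% 3            # number of rows in the first 2/3

train <- Data[seq_len(n_train), ]   # earliest observations, used for fitting
test  <- Data[(n_train + 1):n, ]    # most recent 1/3, held out for testing
```

With this split every test observation comes after every training observation, so the model is never evaluated on points that sit between values it was trained on.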
Thanks,
Regards.