Monday, September 4, 2017

Does feature selection on test data result in overfitting to it?

I have a regression problem with 336 features, 45,000 training data points, and 22,000 test data points. Because of the large number of features, I applied backward elimination and reduced the feature set to 61. I selected the features on the basis of test scores and got these results: train score 0.42, test score 0.28 (these numbers are normal for my problem because of its complexity). With 339 features I get a train score of 0.45 and a test score of 0.24. With backward elimination based on train scores instead, I get a train score of about 0.50 and a test score of 0.25 with the same number of features (61). My question is: does feature selection on the test data result in overfitting to it? Only linear SVM models were used in this problem. Thanks.
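To make the setup concrete, here is a minimal sketch of greedy backward elimination in which the score used for selection comes from a validation split carved out of the training data, so the test set is scored only once at the end. It is not the exact code behind the numbers above. The names `backward_elimination`, `score_fn`, and `toy_validation_score` are hypothetical, and the toy scorer is only a stand-in for "fit a linear SVM on the training part, score it on the validation part".

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def backward_elimination(
    features: Iterable[str],
    score_fn: Callable[[FrozenSet[str]], float],
    n_keep: int,
) -> Tuple[List[str], float]:
    """Greedily drop one feature at a time until n_keep remain.

    score_fn(subset) is a hypothetical callback that returns a
    higher-is-better score measured on a validation split. It should
    never look at the test set.
    """
    selected = frozenset(features)
    if not 1 <= n_keep <= len(selected):
        raise ValueError("n_keep must be between 1 and the number of features")

    best_score = score_fn(selected)
    while len(selected) > n_keep:
        # Score every subset obtained by removing exactly one feature,
        # then keep the removal that leaves the best validation score.
        candidates = [(score_fn(selected - {f}), f) for f in sorted(selected)]
        best_score, dropped = max(candidates)
        selected = selected - {dropped}
    return sorted(selected), best_score


# Toy stand-in for a real validation score (assumption, for illustration):
# useful features add to the score, every other feature costs one point.
USEFUL = {"x1": 30, "x2": 10}


def toy_validation_score(subset: FrozenSet[str]) -> int:
    return sum(USEFUL.get(f, -1) for f in subset)


kept, score = backward_elimination(
    ["x1", "x2", "noise_a", "noise_b"], toy_validation_score, n_keep=2
)
print(kept, score)
```

In the setting described here, `score_fn` would train the linear SVM on, say, 80% of the 45,000 training points and score it on the remaining 20%. The 22,000 test points would then be scored a single time with the final 61 features.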
