Let's say I have a CNN model to classify the handwritten numbers 1 to 10. I am using a dataset with 20,000 samples and I make a 50:50 train/test split.
That leaves me with 10,000 samples each for training and testing. Will it automatically pick exactly 1,000 images from each class for testing and training, or will it only approximate that?
I am working on a similar problem (with different numbers of samples and classes), and I noticed that the testing data is not evenly split across classes. For example, the test set has 1,010 number ones but only 990 number twos.
Is this normal? I couldn't find any documentation verifying this. My dataset is large enough that the small discrepancy is irrelevant, but I still would like to confirm.
Thanks!
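For reference, here is a minimal sketch of the two behaviours in question, using only the Python standard library. It assumes the split is a plain random shuffle followed by a single cut, which the question does not confirm. The helper names `random_split` and `stratified_split` and the synthetic label list are hypothetical and exist only for this illustration. Under that assumption, per-class counts in the test set are left to chance and come out only approximately equal, whereas splitting each class separately gives exactly 1,000 per class.

```python
import random
from collections import Counter, defaultdict


def random_split(labels, test_fraction, seed):
    """Shuffle all indices and cut once; class counts are left to chance."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def stratified_split(labels, test_fraction, seed):
    """Split each class separately so every class keeps the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = int(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return train, test


# Hypothetical dataset: 10 classes (1 to 10), 2,000 samples each, 20,000 total.
labels = [c for c in range(1, 11) for _ in range(2000)]

_, test_random = random_split(labels, 0.5, seed=0)
_, test_strat = stratified_split(labels, 0.5, seed=0)

random_counts = Counter(labels[i] for i in test_random)
strat_counts = Counter(labels[i] for i in test_strat)

print("random:    ", sorted(random_counts.items()))
print("stratified:", sorted(strat_counts.items()))
```

If the split is done with scikit-learn's `train_test_split` (an assumption, since the library is not named above), passing `stratify=y` requests the per-class behaviour; without it, the split is a plain shuffle and small differences such as 1,010 versus 990 are expected.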