I use a CRNN (CNN + RNN + CTC loss) for my OCR. My dataset is 10,000 single-word image-text pairs (7,000 for training and validation, 3,000 for testing).
My architecture is inspired by VGG-16: I use 13 convolutional layers followed by 3 bidirectional LSTM layers, trained with CTC loss. Here is my architecture (training):
I don't know why, but my model predicts only the character 'p' for every image in my dataset.
I can't put all my code here, so you can see it at: https://colab.research.google.com/drive/1Mio5i6ySlSPnSs1o1sc-WpgfYLu9i9iy?usp=sharing
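One way to narrow this down might be to look at the raw per-timestep predictions before the CTC collapse, and to count which characters the model actually emits across the dataset. The sketch below is not taken from the linked notebook: `greedy_ctc_decode`, `prediction_histogram`, `alphabet`, and `blank_index` are hypothetical names, and it assumes the CTC blank is the last class index (the convention differs between frameworks).

```python
from collections import Counter


def greedy_ctc_decode(frame_scores, alphabet, blank_index):
    """Greedy CTC decoding: argmax per timestep, merge repeats, drop blanks.

    frame_scores: list of per-timestep score lists (one score per class).
    alphabet:     string whose i-th character is the label for class i.
    blank_index:  class index reserved for the CTC blank (assumed, see above).
    Returns the decoded text and the raw best path for inspection.
    """
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a character only when the class changes and is not the blank.
        if index != previous and index != blank_index:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded), best_path


def prediction_histogram(predictions):
    """Count how often each character appears across all decoded strings."""
    return Counter(ch for text in predictions for ch in text)


# Toy example with a two-character alphabet and the blank at index 2.
alphabet = "ap"
frames = [
    [0.1, 0.8, 0.1],  # 'p'
    [0.1, 0.8, 0.1],  # 'p' again (repeat, merged)
    [0.1, 0.1, 0.8],  # blank
    [0.1, 0.7, 0.2],  # 'p' (new, because a blank separated it)
    [0.6, 0.3, 0.1],  # 'a'
]
text, path = greedy_ctc_decode(frames, alphabet, blank_index=2)
histogram = prediction_histogram(["p", "p", text])
```

If the best path is the same class at every timestep for every image, the issue is upstream of decoding (for example in the loss inputs or label encoding) rather than in the collapse step itself.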