RNN training is mostly done with teacher forcing:
- [Training] y_{pred}^{i+1} = RNN(y_{real}^{i})
- [Training error] Error_{train} = abs(y_{real}^{i+1} - y_{pred}^{i+1})
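Below is a minimal sketch of one teacher-forced training step, assuming a PyTorch `nn.RNN` on a one-step-ahead regression task; the names (`rnn`, `readout`, `teacher_forced_step`, the layer sizes) are illustrative, not a prescribed setup. At every step the network receives the *real* previous value as input, exactly as in the training equations above.

```python
import torch
import torch.nn as nn

# Illustrative model: a vanilla RNN plus a linear readout to one output value.
rnn = nn.RNN(input_size=1, hidden_size=32, batch_first=True)
readout = nn.Linear(32, 1)
loss_fn = nn.L1Loss()  # mean abs(y_real - y_pred), matching Error_train above
optimizer = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()))

def teacher_forced_step(y_real):
    # y_real: (batch, T, 1) ground-truth sequence.
    # Inputs are the real values y_real^{i}; targets are y_real^{i+1}.
    inputs, targets = y_real[:, :-1, :], y_real[:, 1:, :]
    hidden_seq, _ = rnn(inputs)      # every step sees the real previous value
    y_pred = readout(hidden_seq)     # y_pred^{i+1} = RNN(y_real^{i})
    loss = loss_fn(y_pred, targets)  # Error_train
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```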
However, at application time the RNN has to use the outputs of its own previous prediction steps, because the correct outputs are not known:
- [Real prediction] y_{pred}^{i+1} = RNN(y_{pred}^{i})
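For contrast, here is a sketch of that free-running (autoregressive) mode, continuing the assumptions of the snippet above (`rnn` and `readout` as defined there; `free_run` and `y_prefix` are illustrative names). After a warm-up on observed history, each step feeds back the model's own previous output instead of the real value.

```python
import torch

@torch.no_grad()
def free_run(rnn, readout, y_prefix, n_steps):
    # y_prefix: (batch, T0, 1) observed history used to warm up the hidden state.
    _, h = rnn(y_prefix)
    y_prev = y_prefix[:, -1:, :]      # last known real value
    outputs = []
    for _ in range(n_steps):
        out, h = rnn(y_prev, h)       # input is the previous *prediction*
        y_prev = readout(out)         # y_pred^{i+1} = RNN(y_pred^{i})
        outputs.append(y_prev)
    return torch.cat(outputs, dim=1)  # (batch, n_steps, 1)
```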
So here is my question: what about the validation/testing process? The real outputs are known during validation and testing, so the error could be calculated in the same way as during training.
- Teacher forcing for the test error calculation
- [Test error] Error_{test} = abs(y_{real}^{i+1} - RNN(y_{real}^{i}))
- Test error would have a value similar to the training one (since both of them use teacher forcing). Training error can be used as a proxy for the test error.
- However, the error during real application can differ greatly from this test error.
- Prediction outputs for the test error calculation
- [Test error] Error_{test} = abs(y_{real}^{i+1} - RNN(y_{pred}^{i}))
- The test error can differ greatly from the training one: a good training error does not guarantee a good test error.
- The test error would be similar to the real-application error (since both use prediction outputs).
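The sketch below contrasts the two test-error definitions from the lists above, under the same assumptions as the earlier snippets (`rnn`, `readout`, and `free_run` as defined there; `y_test` and `warmup` are illustrative names). The only difference is whether the inputs after the warm-up are real values or the model's own predictions.

```python
import torch

@torch.no_grad()
def test_error_teacher_forced(rnn, readout, y_test):
    # Error_test = abs(y_real^{i+1} - RNN(y_real^{i})): every input is real.
    inputs, targets = y_test[:, :-1, :], y_test[:, 1:, :]
    y_pred = readout(rnn(inputs)[0])
    return (targets - y_pred).abs().mean().item()

@torch.no_grad()
def test_error_free_running(rnn, readout, y_test, warmup):
    # Error_test = abs(y_real^{i+1} - RNN(y_pred^{i})): inputs after the
    # warm-up are the model's own previous predictions, as in deployment.
    y_pred = free_run(rnn, readout, y_test[:, :warmup, :],
                      y_test.size(1) - warmup)
    return (y_test[:, warmup:, :] - y_pred).abs().mean().item()
```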
I know there are other techniques, such as professor forcing, that make the test error closer to the real-application one, but I am not considering such further techniques here.
Is it okay to use teacher forcing with real outputs for validation/testing, or should I use prediction outputs instead?