Monday, March 16, 2020

RNN test error: teacher forcing or prediction data?

RNN training is mostly done with teacher forcing, where each step is fed the real previous output (sketched in code after the list below):

  • [Training] y_{pred}^{i+1} = RNN(y_{real}^{i})
  • [Training error] Error_{train} = abs(y_{real}^{i+1} - y_{pred}^{i+1})
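
As a minimal sketch (not real training code), this is what the teacher-forced error computation looks like in Python with NumPy. The tiny rnn_step function and its random weights are placeholders standing in for a trained RNN; all names here are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Placeholder weights standing in for a trained RNN (hidden size 4).
W_h, W_y, W_out = rng.normal(size=(4, 4)), rng.normal(size=4), rng.normal(size=4)

def rnn_step(y_in, h):
    """One recurrent step: consume y_in, update the hidden state, emit a prediction."""
    h = np.tanh(W_h @ h + W_y * y_in)
    return W_out @ h, h

y_real = np.sin(np.linspace(0, 3, 20))   # toy target sequence

# Teacher forcing: every step is fed the REAL previous value.
h = np.zeros(4)
train_errors = []
for i in range(len(y_real) - 1):
    y_pred_next, h = rnn_step(y_real[i], h)          # y_pred^{i+1} = RNN(y_real^{i})
    train_errors.append(abs(y_real[i + 1] - y_pred_next))

print("mean teacher-forced (training) error:", np.mean(train_errors))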

However, at inference time an RNN application has to use the outputs of previous prediction steps, because the correct outputs are not known (see the sketch after the bullet below):

  • [Real prediction] y_{pred}^{i+1} = RNN(y_{pred}^{i})
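
The same kind of sketch for the free-running case. The placeholder rnn_step is repeated so the snippet runs on its own, and the seed value is an assumption of this example (in practice it would be the last observed value).

import numpy as np

rng = np.random.default_rng(0)
W_h, W_y, W_out = rng.normal(size=(4, 4)), rng.normal(size=4), rng.normal(size=4)

def rnn_step(y_in, h):
    # Same placeholder one-step RNN as in the previous sketch.
    h = np.tanh(W_h @ h + W_y * y_in)
    return W_out @ h, h

# Free-running prediction: after a seed value, every step is fed the
# model's OWN previous output, because the real future values are unknown.
y_prev = 0.0                               # seed, e.g. the last observed value
h = np.zeros(4)
rollout = []
for _ in range(19):
    y_prev, h = rnn_step(y_prev, h)        # y_pred^{i+1} = RNN(y_pred^{i})
    rollout.append(y_prev)

print("free-running predictions:", np.round(rollout, 3))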

So here is my question: what about the validation/testing process? The real outputs are known during validation and testing, so the errors could be calculated in the same way as during training. The two options are compared in a sketch after the list below.

  • Teacher forcing for the test error calculation
    • [Test error] Error_{test} = abs(y_{real}^{i+1} - RNN(y_{real}^{i}))
    • The test error would be similar to the training error (since both use teacher forcing), so the training error can serve as a proxy for the test error.
    • The error in the real application, however, can differ greatly from this test error.
  • Prediction outputs for the test error calculation
    • [Test error] Error_{test} = abs(y_{real}^{i+1} - RNN(y_{pred}^{i}))
    • The test error can differ greatly from the training error; a good training error does not guarantee a good test error.
    • The test error would be similar to the error in the real application (since both use prediction outputs).
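
To make the two options concrete, here is a hedged sketch that computes both test errors on the same held-out sequence. The untrained toy rnn_step again stands in for a trained model, and the seed for the free-running loop (the first real value) is an assumption of this example.

import numpy as np

rng = np.random.default_rng(0)
W_h, W_y, W_out = rng.normal(size=(4, 4)), rng.normal(size=4), rng.normal(size=4)

def rnn_step(y_in, h):
    """Toy one-step RNN with random (untrained) weights, standing in for a trained model."""
    h = np.tanh(W_h @ h + W_y * y_in)
    return W_out @ h, h

y_real = np.sin(np.linspace(0, 3, 20))            # held-out test sequence (toy data)

# Option 1: teacher-forced test error -- feed the REAL previous value.
h = np.zeros(4)
err_tf = []
for i in range(len(y_real) - 1):
    y_next, h = rnn_step(y_real[i], h)            # RNN(y_real^{i})
    err_tf.append(abs(y_real[i + 1] - y_next))

# Option 2: free-running test error -- feed the model's OWN previous output.
h = np.zeros(4)
y_prev = y_real[0]                                # seed with the first real value
err_fr = []
for i in range(len(y_real) - 1):
    y_next, h = rnn_step(y_prev, h)               # RNN(y_pred^{i})
    err_fr.append(abs(y_real[i + 1] - y_next))
    y_prev = y_next                               # mistakes are fed back and can compound

print("teacher-forced test error:", np.mean(err_tf))
print("free-running  test error:", np.mean(err_fr))

With a trained model, the gap between the two numbers reflects exactly the effect described in the bullets above: in Option 2 the one-step mistakes are fed back in and can compound over the rollout.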

I know there are other techniques, such as professor forcing, intended to make the test error closer to the real application error. However, I do not consider such further techniques here.

Is it okay to use teacher forcing with real outputs for validation/testing? Or should I use prediction outputs instead?
