I have a MapReduce job written in Python. Before I put it on EMR, I'd like to test it locally.
Currently the only way I know to test is to run the command:
cat input_file | python mapper.py | sort -k 1,1 | python reducer.py > output_file
But the pipe is a little scary to me, because if anything in it breaks I wouldn't know (other than by checking the exit code of this command).
Is there a more elegant/Pythonic way to run the MapReduce job and check that it runs successfully (so I can catch a specific exception and handle it)?
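To make the question concrete, here is a rough sketch of the kind of thing I have in mind: drive each stage from Python with the standard subprocess module, so a failing stage raises an exception instead of being hidden inside the pipe. The names run_stage and run_local_job are made up for illustration, and the in-Python sort only approximates `sort -k 1,1` (it compares raw bytes of the first whitespace-separated field, ignores locale, and keeps ties in their original order).

```python
import subprocess
import sys


def run_stage(cmd, data):
    """Run one stage (hypothetical helper): feed `data` (bytes) on stdin, return stdout.

    check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
    """
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


def first_field(line):
    """Sort key: the first whitespace-separated field, or b"" for blank lines."""
    return line.split(None, 1)[0] if line.strip() else b""


def run_local_job(input_path, output_path, mapper_cmd, reducer_cmd):
    """Map, sort by key, reduce; any failing stage raises CalledProcessError."""
    with open(input_path, "rb") as f:
        raw = f.read()

    mapped = run_stage(mapper_cmd, raw)

    # Rough stand-in for `sort -k 1,1` (assumption: byte-wise order is good enough).
    lines = mapped.splitlines(keepends=True)
    lines.sort(key=first_field)

    reduced = run_stage(reducer_cmd, b"".join(lines))

    with open(output_path, "wb") as f:
        f.write(reduced)
```

With this, the call would be something like run_local_job("input_file", "output_file", [sys.executable, "mapper.py"], [sys.executable, "reducer.py"]) wrapped in try/except subprocess.CalledProcessError, where the exception's cmd and returncode attributes say which stage failed. It holds all data in memory, so it would only suit small test inputs.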
Thank you