I'm working in a molecular biology lab where we implemented some degree of lab automation using robotic systems. In particular, we have a measurement robot that produces plain text files containing biological data. In order for lab members to further process and analyse the data, I have written a Python application that converts the plain text files into some more useful tabular format and performs basic statistical analyses. I am reaching the point where I am about to leave the lab. I want to clean up my code, so I can be (more or less) easily maintained and expended on by future scientists working with it. It is kind of a mess, since I grew with my knowledge and proficiency in coding over a few years now. While I am refactoring and rewriting parts of the software, I want to make sure that stuff is still working properly. I have found that automated unit testing is the most solid strategy here (rather than to manually check every time an analysis runs).
I found this to be easy for the statistical functions, as I can simply come up mock data and know what to expect and how to handle it. My question now is: How do I mock several hundred files in a highly specific format being copied correctly?
What I have done so far is to just take input data from previous runs that I know are correct to run on my current build. I then compared the output file to the corresponding output file from the previous build. To me as a self-taught coder this seems like I am missing something here. How would you recommend me to test my program? Or is this strategy actually something that would be acceptable?
Aucun commentaire:
Enregistrer un commentaire