A very large number of applications can be summarized as taking an initial dataset A and transforming it into some target dataset B.
There's a very effective model of testing for these applications: store a serialized representation of a known dataset A along with its desired transformed product B. Then, in your test, deserialize A, run the processing application on it, and compare the result to B.
I'm not sure if this model of testing has a formal name; I call it a functional regression test.
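As a rough illustration, here is a minimal sketch of such a functional regression test in Python. The transform() function is a stand-in for the real processing application, and the file names dataset_a.pkl and dataset_b.pkl are made up for the example; any serialization format would do in place of pickle.

import pickle

def transform(dataset):
    # Stand-in for the real processing application under test.
    return sorted(dataset)

def test_functional_regression():
    # Load the known input dataset A and its expected output B
    # from their serialized (golden) files.
    with open("dataset_a.pkl", "rb") as f:
        dataset_a = pickle.load(f)
    with open("dataset_b.pkl", "rb") as f:
        expected_b = pickle.load(f)

    # Run the processing application on A and compare the result to B.
    result = transform(dataset_a)
    assert result == expected_b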
This model works great when datasets A and B are small. However, once they get very large, it becomes impractical. There may not be enough storage space in the system to store two more copies of a huge dataset, and restoring them from serialized form, then doing a full comparison at the end, will often be impractically time-intensive as well.
Is there an approach to make this model practical again for big data applications?