I am working on some data work recently. When I try to do some one-time population script, I find it is hard to write tests over this script. In another word, I am struggling to write tests covering all lines of code running quickly. I attach an example below,
I would like to do some id match from source A and source B involving Django models here.
class Matching:
def __init__(self):
self.mapping = {}
def update_mapping_with_id_set(self, id_set, B_record):
# update the self.mapping
def match_id(self):
# some logic to get a matching set
# Using unmatched set to get the partial matched set
for each in matching_set:
query = B.objects.only("id", "first_name", "last_name")
id_set = set()
for q in query.all(): # it iterates over all records in db, so it takes time here
if each.first_name == q.first_name or each.last_name == q.last_name:
id_set.add(each.id)
self.update_mapping_with_id_set(id_set, q)
def upsert_id(self):
for a in A.objects.all():
if a.id in self.mapping.keys():
a.match_id = self.mapping[a.id]
a.save()
So the question I would like to re-declare is that how I can add test piece by piece over this structure of code design instead of run the whole class in the test case(it iterates over all records so the test time will last long). Or is there any good design pattern for such a one-time population script?
I would be of appreciation if there is any good general suggestions for my case. Thanks in advance!
Aucun commentaire:
Enregistrer un commentaire