jeudi 23 août 2018

How do you test more complicated functions?

I'm a total amateur/hobbyist developer trying to learn more about testing the software I write. While I understand the core concept of testing, as the functions get more complicated, I feel as though it's a rabbit hole of varations, outcomes, conditions etc. For example...

The function below reads files from a directory into a Pandas DataFrame. A few columns adjustments are made before the data is passed to a different function that ultimately imports the data to our database.

I've already coded a test for the convert_date_string function. But what about this entire function as as whole - how do I write a test for it? In my mind, much of the Pandas library is already tested - thus making sure core functionality there works with my setup seems like a waste. But, maybe it isn't. Or, maybe this is a refactoring question to break this down into smaller parts?

Anyway, here is the code... any insight would be appreciated!

def process_file(import_id=None):
    all_files = glob.glob(config.IMPORT_DIRECTORY + "*.txt")

    if len(all_files) == 0:
        return []

    import_data = (pd.read_csv(f, sep='~', encoding='latin-1',
                               warn_bad_lines=True, error_bad_lines=False,
                               low_memory=False) for f in all_files)

    data = pd.concat(import_data, ignore_index=True, sort=False)
    data.columns = [col.lower() for col in data.columns]
    data = data.where((pd.notnull(data)), None)

    data['import_id'] = import_id
    data['date'] = data['date'].apply(lambda x: convert_date_string(x))

    insert_data_into_database(data=data, table='sales')
    return all_files

Aucun commentaire:

Enregistrer un commentaire