Thursday, September 24, 2020

Python Hypothesis package: can I ensure that certain values are used?

Problem Statement

Below is a toy example that is close to what I am trying to do.

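from hypothesis import given
from hypothesis.strategies import integers

# calc_max is the function under test (from my package).

@given(
    idx_start=integers(min_value=0, max_value=100000),
    idx_window=integers(min_value=0, max_value=100000),
)
def test_calc_max(conftest_df, idx_start, idx_window):
    # Select a window of rows; a slice past the end of the index is simply truncated.
    row_idxs = conftest_df.index[idx_start : (idx_start + idx_window)]
    assert calc_max(conftest_df.loc[row_idxs, "my_column"]) >= 0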

conftest_df is a pandas DataFrame that I expose as a fixture in my conftest.py file. It holds a portion of the real data that I use for my package.

This dataframe has very few NaN values in it. I want to use hypothesis because, well, it's awesome and I strongly believe it is the right way to do things.

But I also want to make sure that the methods and functions under test work for NaNs. I don't want to limit this to NaNs, though: something else might come up in the future (say, a number that uses a comma instead of a period as the decimal separator).


Ideal Solution via hypothesis

I would rather be able to do something like this:

# Hypothetical: integers() has no `includes` argument; this is what I wish existed.
@given(
    idx_start=integers(min_value=0, max_value=100000, includes=[5, 4000, 80000]),
    idx_window=integers(min_value=0, max_value=100000, includes=[20]),
)
...

And have a way to ensure that certain values are considered via the includes argument.
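
One way to approximate the wished-for `includes` argument is to combine strategies. This is a minimal sketch, assuming a hypothetical helper named `integers_including` (it is not part of Hypothesis). It unions `sampled_from` with `integers`, which makes the listed values far more likely to be drawn but does not guarantee that every one of them appears in a given run.

```python
from hypothesis import given, settings
from hypothesis.strategies import integers, sampled_from


def integers_including(min_value, max_value, includes):
    """Hypothetical helper (not a Hypothesis API): bias draws toward `includes`."""
    # The | operator builds a one_of strategy that draws from either branch.
    return sampled_from(includes) | integers(min_value=min_value, max_value=max_value)


@settings(max_examples=50)
@given(
    idx_start=integers_including(0, 100000, includes=[5, 4000, 80000]),
    idx_window=integers_including(0, 100000, includes=[20]),
)
def test_bounds(idx_start, idx_window):
    # Every drawn value still respects the original bounds.
    assert 0 <= idx_start <= 100000
    assert 0 <= idx_window <= 100000
```

Because this only changes the distribution, it suits "values like these should show up often" more than "these exact values must always be tested".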

I know that Hypothesis keeps track of failing values, but in my experience it does not seem to guarantee that they are reused.

Is there a way to do what I would like?
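
Hypothesis does provide a way to pin specific inputs: the `@example` decorator, which runs the listed inputs on every execution in addition to the generated ones. The sketch below uses stand-ins for everything project-specific, all of which are assumptions: a plain list `DATA` with a NaN at index 5 replaces `conftest_df`, `calc_max` is a hypothetical NaN-ignoring maximum, and the `seen` list exists only to make the guarantee visible.

```python
import math

from hypothesis import example, given, settings
from hypothesis.strategies import integers

# Stand-in for conftest_df["my_column"]: non-negative floats with one NaN at index 5.
DATA = [float(i) for i in range(100)]
DATA[5] = math.nan

seen = []  # illustration only: records every input pair the test receives


def calc_max(values):
    """Hypothetical stand-in for the function under test: max ignoring NaNs."""
    finite = [v for v in values if not math.isnan(v)]
    return max(finite, default=0.0)  # an empty window yields 0.0


@settings(max_examples=50)
@given(
    idx_start=integers(min_value=0, max_value=100000),
    idx_window=integers(min_value=0, max_value=100000),
)
@example(idx_start=5, idx_window=20)     # window that starts on the NaN
@example(idx_start=4000, idx_window=20)  # window entirely past the end of the data
def test_calc_max(idx_start, idx_window):
    seen.append((idx_start, idx_window))
    window = DATA[idx_start : idx_start + idx_window]
    assert calc_max(window) >= 0
```

With pytest and the real fixture, the same `@example` lines would go on the original test; `seen` and the stand-ins would be dropped.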
