NPI-4453 Framework for DataFrame hashing & test baselining#110
…used by pipeline), and tests subdir (common in development)
… to support any object type, not just DataFrames
ronaldmaj
left a comment
Did a quick read through the code. My high-level understanding is that this introduces a framework for storing data which we can regression test against: if answers change, we need to investigate.
I think this is a nice framework, as it allows us to update the hashes (and pickled versions of the data) in case the new values are actually better. Having to do a manual overwrite is a feature.
Ran the tests locally as well, and in its current form it works fine.
I peppered in some comments, but it's nothing major and I think it is good to go in its current form 👍
| try: | ||
| df = DataFrame(["a", "b", "c"]) | ||
|
|
||
| # Baseline (do not commit uncommented!) Note: every function needs its own baseline, because the |
| # likely fail). | ||
|
|
||
| # We're only testing it with the verify function below, but both verify and baseline functions use the same | ||
| # caller check logic, and store the caller record statically in a class variable. ? |
Is the question mark here for a reason?
I wasn't sure on the specific terminology. Checked and updated with a clarification.
| "DF / object list verification should succeed here (unless baseline files are missing, or baselining has been turned on)", | ||
| ) | ||
|
|
||
| # The local variable df still points to the same DF, so now the list contains [a,b,b]. This should be an error. |
Not sure where the dataframe [a,b,b] came from? You mean ["b", "c", "d"] which is what the df var now points at?
That was intended to be shorthand for the different DataFrame objects, rather than their content.
This check doesn't care about the data the object stores, just the object's memory address.
|
|
||
| df = DataFrame(["a", "b", "c"]) | ||
|
|
||
| # Baseline (every function needs its own baseline, because the function name determines the filename, |
| ) | ||
|
|
||
| # The local variable df still points to the same DF, so now the list contains [a,b,b]. This should be an error. | ||
| objects_to_hash.extend([df]) |
You are trying to add the same dataframe here to objects_to_hash and that is what is going to cause the error when verifying right? Because you have a duplicate?
Yep. This checks for duplicate references to the same objects at the top level (as a safety check). It's not recursive, but the top-level check is arguably the most important.
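To illustrate the kind of top-level identity check described above, here is a minimal sketch. It is not the code from this PR: the function name `find_duplicate_references` is hypothetical, and plain lists stand in for DataFrames so the example has no dependencies.

```python
def find_duplicate_references(objects: list[object]) -> list[int]:
    """Return the indexes of items that are the *same object* as an earlier item.

    Hypothetical helper: compares identity via id(), never content, and only
    looks at the top level of the list (it does not recurse into the items).
    """
    seen_ids: set[int] = set()
    duplicate_indexes: list[int] = []
    for index, obj in enumerate(objects):
        obj_id = id(obj)  # identity of the object, independent of what it stores
        if obj_id in seen_ids:
            duplicate_indexes.append(index)
        else:
            seen_ids.add(obj_id)
    return duplicate_indexes


first = ["a", "b", "c"]
second = ["a", "b", "c"]  # equal content, but a distinct object

print(find_duplicate_references([first, second]))         # no duplicates reported
print(find_duplicate_references([first, second, first]))  # index 2 repeats `first`
```

Two objects with identical content pass this check, while the same object added twice does not, which matches the behaviour discussed in this thread.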
| self.fail("DF / object list verification should fail on *second*/repeated calls from a function.") | ||
|
|
||
| def test_duplicate_object_rejection(self): | ||
|
|
No description of the test here, whereas the previous test has an intro to what is being tested.
|
|
||
| class TestUnitTestBaseliner(unittest.TestCase): | ||
|
|
||
| def test_verify_refusal_in_wrong_mode(self): |
Short intro to what is being tested would be good here (kinda implied in the name, but could be good to add in)
|
The minor changes suggested above are now on: #111
Introduces a framework for baselining lists of `DataFrame`s (and in future, other object types) produced by unit tests, then checking for regression against these in subsequent runs.
Workflow
Baselining mode
The exact `DataFrame` outputs of a unit test can be 'baselined' within unit tests.
The baseline is comprised of two files (a sha256 hash, and a pickled `list[object]`) which can be committed along with relevant changes.
To prevent accidental baselines being created, `UnitTestBaseliner.mode = "baseline"` must be set, which turns off `verify` mode and raises warnings to reduce the risk of this state being committed.
Verification mode
Subsequent unit test runs can call `UnitTestBaseliner.verify()`, passing a `list[object]` of the outputs they have produced.
This compares the current unit test output against the baseline on file (using just the hash for detection).
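The two-file baseline and the hash-only comparison described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the function names `create_baseline_sketch` and `verify_sketch` are hypothetical, the mode guard is omitted, and hashing the pickled bytes directly is an assumption (the real framework may hash differently). Pickle output can also vary between Python versions and pickle protocols, so a hash computed this way is only comparable within a consistent environment.

```python
import hashlib
import pickle
from pathlib import Path


def create_baseline_sketch(objects: list[object], base_path: Path) -> None:
    """Write the pickled list and its sha256 hash next to each other (hypothetical helper)."""
    payload = pickle.dumps(objects)
    base_path.with_suffix(".pickledlist").write_bytes(payload)
    digest = hashlib.sha256(payload).hexdigest()
    base_path.with_suffix(".pickledlistsha256").write_text(digest)


def verify_sketch(objects: list[object], base_path: Path) -> bool:
    """Compare the hash of the current output against the stored hash only.

    The pickle file is not read here; it would only be needed to diff the
    baselined objects against the current ones after a mismatch.
    """
    expected = base_path.with_suffix(".pickledlistsha256").read_text().strip()
    actual = hashlib.sha256(pickle.dumps(objects)).hexdigest()
    return actual == expected
```

Keeping detection to a hash comparison means a passing run never has to unpickle the stored baseline.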
Troubleshooting regressions with DataFrame diffs
When a unit test invokes `verify()` and it fails (hash not valid), the `verify()` function can load the baselined `list[object]` from the pickle file, and print diffs between these and the current output of the unit test.
Note: diffing supports only `DataFrame`, and will fail if other data types are included.
Baseline file storage
Baseline files are stored at:
`gnssanalysis/tests/unittest_baselines/<class_name>/<unittest_function_name>.{pickledlist,pickledlistsha256}`
Note: When `create_baseline()` or `verify()` is invoked, the names of the calling class and function are determined automatically using frame inspection.
This means simply invoking them from within a unit test will cause a corresponding directory and baseline files to be written or read at the path noted above.
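As a rough illustration of how frame inspection can yield the class and function names used in that path, consider the sketch below. It is an assumption-laden example, not the PR's code: the helper name `baseline_stem_for_caller`, the lookup of `self` in the caller's locals, and the `UnknownClass` fallback are all hypothetical. Only the root directory is taken from the description above.

```python
import inspect
from pathlib import Path

# Root directory as given in the PR description.
BASELINE_ROOT = Path("gnssanalysis/tests/unittest_baselines")


def baseline_stem_for_caller() -> Path:
    """Build <root>/<class_name>/<function_name> for whoever called this function.

    Hypothetical helper: the class name is taken from the caller's `self`
    local, which assumes the caller is an instance method (as unittest
    test methods are).
    """
    caller = inspect.stack()[1]  # index 0 is this function, index 1 is its caller
    function_name = caller.function
    self_obj = caller.frame.f_locals.get("self")
    class_name = type(self_obj).__name__ if self_obj is not None else "UnknownClass"
    return BASELINE_ROOT / class_name / function_name
```

Called from a method `test_something` on a class `TestExample`, this would return `gnssanalysis/tests/unittest_baselines/TestExample/test_something`, to which the two file extensions could then be appended. Because the function name determines the filename, every test function ends up with its own baseline.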