NPI-4453 Framework for DataFrame hashing & test baselining by treefern · Pull Request #110 · GeoscienceAustralia/gnssanalysis

treefern · 2026-02-05T17:20:19Z

Introduces a framework for baselining lists of DataFrames (and in future, other object types) produced by unit tests, then checking for regression against these in subsequent runs.

Workflow

Baselining mode

The exact DataFrame outputs of a unit test can be 'baselined' from within the test itself:

df_1 = load_some_data()
df_2 = transform_something()
self.assertEqual(df_1, some_value)
...
df_list = [df_1, df_2]
UnitTestBaseliner.mode = "baseline"
UnitTestBaseliner.create_baseline(df_list)

The baseline consists of two files (a SHA-256 hash and a pickled list[object]), which can be committed along with the relevant changes.
To prevent baselines being created accidentally, UnitTestBaseliner.mode = "baseline" must be set explicitly. This turns off verify mode and raises warnings, to reduce the risk of this state being committed.

Verification mode

Subsequent unit test runs can call UnitTestBaseliner.verify(), passing a list[object] of the outputs they have produced.

df_1 = load_some_data()
df_2 = transform_something()
self.assertEqual(df_1, some_value)
...
df_list = [df_1, df_2]
UnitTestBaseliner.verify(df_list)

This compares the current unit test output against the baseline on file (using only the hash for detection).

This will allow detection of regressions too subtle to be found by our existing unit tests.
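The hash-only comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (hash_objects and the module-level create_baseline and verify), the choice of pickle protocol, and the use of plain Python lists and dicts in place of DataFrames are all assumptions made for the example.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def hash_objects(objects: list[object]) -> str:
    """Return the SHA-256 hex digest of the pickled list (hypothetical helper)."""
    # Pin the protocol so the byte stream does not depend on the interpreter default.
    payload = pickle.dumps(objects, protocol=4)
    return hashlib.sha256(payload).hexdigest()


def create_baseline(objects: list[object], hash_path: Path) -> None:
    """Write the digest of the current outputs to the baseline hash file."""
    hash_path.write_text(hash_objects(objects))


def verify(objects: list[object], hash_path: Path) -> bool:
    """Return True if the current outputs hash to the stored baseline digest."""
    return hash_objects(objects) == hash_path.read_text().strip()


with tempfile.TemporaryDirectory() as tmp:
    hash_file = Path(tmp) / "example.pickledlistsha256"
    create_baseline([[1.0, 2.0], {"a": 1}], hash_file)
    unchanged_ok = verify([[1.0, 2.0], {"a": 1}], hash_file)   # same content
    regression_ok = verify([[1.0, 2.5], {"a": 1}], hash_file)  # one value changed
```

Here unchanged_ok is True and regression_ok is False. One caveat with this approach: pickled bytes are not guaranteed to be identical across library or Python versions, so a hash mismatch can indicate an environment change rather than a change in the data.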

Troubleshooting regressions with DataFrame diffs

When a unit test invokes verify() and it fails (hash not valid), the verify() function can load the baselined list[object] from the pickle file and print diffs between it and the current output of the unit test.

Currently this is only supported for DataFrames and will fail if other data types are included.

⚠️ NOTE: Due to the security implications of deserialization, it must be explicitly enabled when needed, with UnitTestBaseliner.enable_unpickling = True
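An opt-in gate of this kind might look like the sketch below. Only the flag name enable_unpickling comes from this PR; the BaselineLoader class, its load_baseline method, and the choice of PermissionError are hypothetical and used purely for illustration.

```python
import pickle


class BaselineLoader:
    """Hypothetical stand-in, not the UnitTestBaseliner implementation."""

    enable_unpickling = False  # off by default; the caller must opt in

    @classmethod
    def load_baseline(cls, payload: bytes) -> list[object]:
        if not cls.enable_unpickling:
            raise PermissionError("Unpickling is disabled; set enable_unpickling = True to allow it.")
        # pickle.loads can execute arbitrary code, so only trusted files should reach this point.
        return pickle.loads(payload)


payload = pickle.dumps(["baseline", "objects"])

try:
    BaselineLoader.load_baseline(payload)
    refused = False
except PermissionError:
    refused = True

BaselineLoader.enable_unpickling = True
loaded = BaselineLoader.load_baseline(payload)
```

With the flag left at its default, the load is refused; after opting in, the same payload is returned as a list.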

Baseline file storage

Baseline files are stored at:
gnssanalysis/tests/unittest_baselines/<class_name>/<unittest_function_name>.{pickledlist,pickledlistsha256}

Note: When create_baseline() or verify() is invoked, the names of the calling class and function are determined automatically using frame inspection.

This means that simply invoking them from within a unit test will cause a corresponding directory and baseline files to be written or read at the path noted above.
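As a rough sketch of how frame inspection can recover the caller's class and function names, the example below uses the standard inspect module. The helper caller_baseline_stem and the "no_class" fallback are hypothetical; the actual implementation in this PR may walk the stack differently (for example, by skipping an extra frame when called via create_baseline() or verify()).

```python
import inspect
from pathlib import Path

# Directory taken from the PR description; everything else here is illustrative.
BASELINE_ROOT = Path("gnssanalysis/tests/unittest_baselines")


def caller_baseline_stem() -> Path:
    """Build <root>/<class_name>/<function_name> for whoever called this helper."""
    # Index 0 is this helper's own frame; index 1 is the direct caller.
    caller = inspect.stack()[1]
    function_name = caller.function
    # Bound methods expose their instance as the local variable 'self'.
    instance = caller.frame.f_locals.get("self")
    class_name = type(instance).__name__ if instance is not None else "no_class"
    return BASELINE_ROOT / class_name / function_name


class TestExample:
    def test_something(self) -> Path:
        return caller_baseline_stem()


stem = TestExample().test_something()
```

In this sketch, stem ends in TestExample/test_something, to which the two file extensions would then be appended.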


ronaldmaj

Did a quick read through the code. My high-level understanding is that this introduces a framework for storing data which we can regression test against: if answers change, we need to investigate.

I think this is a nice framework, as it allows us to update the hashes (and pickled versions of the data) in case the new values are actually better - having to do manual overwrite is a feature.

Ran the tests locally as well, and in its current form it works fine.
I peppered in some comments, but it's nothing major and I think it is good to go in its current form 👍

ronaldmaj · 2026-02-13T02:31:59Z

tests/test_utils.py

+        try:
+            df = DataFrame(["a", "b", "c"])
+
+            # Baseline (do not commit uncommented!) Note: every function needs its own baseline, becuase the


typo: becuase

ronaldmaj · 2026-02-13T02:32:22Z

tests/test_utils.py

+        #   likely fail).
+
+        # We're only testing it with the verify function below, but both verify and baseline functions use the same
+        # caller check logic, and store the caller record statically in a class variable. ?


Is the question mark here for a reason?

I wasn't sure on the specific terminology. Checked and updated with a clarification.

ronaldmaj · 2026-02-13T02:33:03Z

tests/test_utils.py

+            "DF / object list verification should succeed here (unless baseline files are missing, or baselining has been turned on)",
+        )
+
+        # The local variable df still points to the same DF, so now the list contains [a,b,b]. This should be an error.


Not sure where the dataframe [a,b,b] came from? You mean ["b", "c", "d"] which is what the df var now points at?

That was intended to be shorthand for the different DataFrame objects, rather than their content.
This check doesn't care about the data the object stores, just the object's memory address.

ronaldmaj · 2026-02-13T02:33:30Z

tests/test_utils.py

+
+        df = DataFrame(["a", "b", "c"])
+
+        # Baseline (every function needs its own baseline, becuase the function name determines the filename,


typo: becuase

ronaldmaj · 2026-02-13T02:35:49Z

tests/test_utils.py

+        )
+
+        # The local variable df still points to the same DF, so now the list contains [a,b,b]. This should be an error.
+        objects_to_hash.extend([df])


You are trying to add the same dataframe here to objects_to_hash and that is what is going to cause the error when verifying right? Because you have a duplicate?

Yep. This checks for duplicate references to the same objects at the top level (as a safety check). It's not recursive, but the top-level check is arguably the most important.
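For illustration, a top-level identity check of the kind described in this reply could be sketched as below. The function find_duplicate_references is hypothetical, and plain lists stand in for DataFrames; the point is that the comparison uses id() (object identity), not content equality.

```python
def find_duplicate_references(objects: list[object]) -> list[int]:
    """Return the indices of items that are the same object as an earlier item."""
    seen: set[int] = set()
    duplicates: list[int] = []
    for index, obj in enumerate(objects):
        # id() identifies the object itself, regardless of the data it holds.
        if id(obj) in seen:
            duplicates.append(index)
        seen.add(id(obj))
    return duplicates


first = ["a", "b", "c"]
second = ["a", "b", "c"]  # equal content, but a distinct object

objects_to_hash = [first, second]
no_dupes = find_duplicate_references(objects_to_hash)

objects_to_hash.extend([first])  # second reference to the same object
dupes = find_duplicate_references(objects_to_hash)
```

Two objects with identical content pass the check, while the repeated reference at index 2 is flagged. As noted above, this only looks at the top level of the list and does not recurse into nested containers.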

ronaldmaj · 2026-02-13T02:38:33Z

tests/test_utils.py

+            self.fail("DF / object list verification should fail on *second*/repeated calls from a function.")
+
+    def test_duplicate_object_rejection(self):
+


No description of the test here, whereas the other previous has an intro to what is being tested

ronaldmaj · 2026-02-13T02:39:52Z

tests/test_utils.py

+
+class TestUnitTestBaseliner(unittest.TestCase):
+
+    def test_verify_refusal_in_wrong_mode(self):


Short intro to what is being tested would be good here (kinda implied in the name, but could be good to add in)

treefern · 2026-02-19T08:23:48Z

The minor changes suggested above are now on #111.

treefern added 4 commits February 5, 2026 15:34

NPI-4453 introduce DataFrame hashing and test baselining functionality

89ae8b9

NPI-4453 improvements to handle launches from both project root dir (…

17a0b97

…used by pipeine), and tests subdir (common in development)

NPI-4453 move comment for clarity

5d94a26

NPI-4453 restructure into single class for consistency

c429792

treefern requested a review from ronaldmaj February 5, 2026 17:20

treefern self-assigned this Feb 5, 2026

NPI-4453 clean up, refactor and rename things to allow easy extention…

ee4fae0

… to support any object type, not just DataFrames

This was referenced Feb 9, 2026

NPI-4443 Pandas 3 compatibility fixes #109

Draft

NPI-4377 Small revisions to NANU processing utility functions #106

Draft

treefern changed the title ~~NPI-4453 DataFrame hashing & test baselining~~ NPI-4453 Framework for DataFrame hashing & test baselining Feb 9, 2026

ronaldmaj approved these changes Feb 13, 2026


treefern merged commit 9bafcae into main Feb 18, 2026
4 checks passed

treefern deleted the NPI-4453-implement-hash-baselining-unit-tests branch February 18, 2026 06:09

ronaldmaj mentioned this pull request Feb 25, 2026

NPI-4485 Cleanup and extension of unittest baseliner utility #111

Draft

treefern merged 5 commits into main from NPI-4453-implement-hash-baselining-unit-tests
