Add group-wise inference risks #53

itrajanovska · 2025-12-16T18:30:46Z

We need group-wise risks to assess the fairness of the assigned risks within groups.
This PR is the successor of #48 and implements the computation of the group-wise risks (as well as passing a custom ML model).

itrajanovska · 2025-12-16T18:31:04Z

@MatteoGiomi After rebasing I deleted the branch because of some conflicts, and #52 was automatically closed.

MatteoGiomi

Thanks for this second contribution @itrajanovska, I left a few comments for your consideration.

src/anonymeter/evaluators/inference_evaluator.py

tests/test_inference_evaluator.py


MatteoGiomi · 2026-01-08T09:20:27Z

tests/test_inference_evaluator.py

+    with pytest.raises(Exception) as runtime_error:
+        evaluate_inference_guesses(guesses=pd.Series(guesses), secrets=secrets, regression=False)
+    assert runtime_error.type is RuntimeError


pytest lets you check for specific exceptions, so that you don't need a separate assert for the exception type.

Suggested change

-    with pytest.raises(Exception) as runtime_error:
-        evaluate_inference_guesses(guesses=pd.Series(guesses), secrets=secrets, regression=False)
-    assert runtime_error.type is RuntimeError
+    with pytest.raises(RuntimeError) as runtime_error:
+        evaluate_inference_guesses(guesses=pd.Series(guesses), secrets=secrets, regression=False)

MatteoGiomi · 2026-01-08T09:20:59Z

tests/test_inference_evaluator.py

+    with pytest.raises(Exception) as runtime_error:
+        evaluate_inference_guesses(guesses=pd.Series(guesses), secrets=secrets, regression=False)
+    assert runtime_error.type is RuntimeError
+    assert "The predictions indices do not match the target indices" in str(runtime_error.value)


Maybe move this check inside the `with pytest.raises` context.
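One way to fold the message check into the `pytest.raises` context is its `match` argument, since an `assert` placed after the raising call inside the `with` body would never run. The sketch below is only illustrative: `check_indices` is a hypothetical stand-in for the index check performed by `evaluate_inference_guesses`, not the library's actual code.

```python
import re

import pytest


def check_indices(guesses_index, secrets_index):
    """Hypothetical stand-in for the index validation under test."""
    if list(guesses_index) != list(secrets_index):
        raise RuntimeError("The predictions indices do not match the target indices")


def test_mismatched_indices_raise():
    expected = "The predictions indices do not match the target indices"
    # `match` is checked with re.search against str(exception);
    # re.escape guards against regex metacharacters in the message.
    with pytest.raises(RuntimeError, match=re.escape(expected)):
        check_indices([0, 1, 2], [2, 1, 0])
```

With `match`, both the exception type and the message are verified by the context manager itself, so no trailing `assert` statements are needed.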

MatteoGiomi · 2026-01-08T09:23:05Z

tests/fixtures.py

+    samples = pd.read_csv(os.path.join(TEST_DIR_PATH, "datasets", fname), nrows=n_samples)
+    return samples.drop_duplicates(subset=deduplicate_on) if deduplicate_on else samples


In this case, the number of returned samples is not always n_samples, because duplicates are dropped only after n_samples rows have been read. To fix this, one can read the whole dataframe, drop the duplicates, and then return the first n_samples rows.
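A minimal sketch of the suggested order of operations, assuming a generic CSV source. The name `load_samples` and the toy data are hypothetical and are not the fixture code from this PR.

```python
import io
from typing import List, Optional

import pandas as pd


def load_samples(
    source,
    n_samples: Optional[int] = None,
    deduplicate_on: Optional[List[str]] = None,
) -> pd.DataFrame:
    """Read everything, drop duplicates first, then truncate to n_samples."""
    samples = pd.read_csv(source)
    if deduplicate_on:
        samples = samples.drop_duplicates(subset=deduplicate_on)
    # Truncating last means duplicates no longer shrink the result.
    return samples.head(n_samples) if n_samples is not None else samples


# Toy data (made up for illustration): the first two rows are duplicates.
CSV_TEXT = "age,sex\n30,F\n30,F\n41,M\n52,F\n"

fixed = load_samples(io.StringIO(CSV_TEXT), n_samples=3, deduplicate_on=["age", "sex"])

# The original order (nrows first, then drop_duplicates) for comparison.
original = pd.read_csv(io.StringIO(CSV_TEXT), nrows=3).drop_duplicates(subset=["age", "sex"])
```

Here `fixed` has three rows while `original` has only two. The trade-off is that the whole file is read even when few samples are requested, and the result can still be shorter than `n_samples` if the file contains fewer unique rows.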

MatteoGiomi · 2026-01-08T09:24:16Z

tests/test_inference_evaluator.py


 def test_evaluator_not_evaluated():
-    df = get_adult("ori", n_samples=10)
+    df = get_adult("ori", deduplicate_on=None, n_samples=10)


I'd stick to implicitly using the default in case no deduplication is needed.

Suggested change

-    df = get_adult("ori", deduplicate_on=None, n_samples=10)
+    df = get_adult("ori", n_samples=10)

MatteoGiomi · 2026-01-08T09:25:28Z

tests/test_inference_evaluator.py

+
+    group_wise = evaluator.risk_for_groups(confidence_level=0)
+
+    for _, results in group_wise.items():


Can you add a test on the value of the risk? That way we can be sure that everything works end to end.

MatteoGiomi · 2026-01-08T09:25:45Z

tests/test_mixed_types_kneigbors.py


 def test_mixed_type_kNN():
-    df = get_adult("ori", n_samples=10)
+    df = get_adult("ori", deduplicate_on=None, n_samples=10)


As mentioned above:

Suggested change

-    df = get_adult("ori", deduplicate_on=None, n_samples=10)
+    df = get_adult("ori", n_samples=10)

MatteoGiomi · 2026-01-08T09:25:55Z

tests/test_mixed_types_kneigbors.py

 @pytest.mark.parametrize("n_neighbors, n_queries", [(1, 10), (3, 5)])
 def test_mixed_type_kNN_shape(n_neighbors, n_queries):
-    df = get_adult("ori", n_samples=10)
+    df = get_adult("ori", deduplicate_on=None, n_samples=10)


MatteoGiomi · 2026-01-08T09:33:22Z

Hi @itrajanovska, I took a final look and left a few minor comments. It's almost ready to go!

Add group-wise inference risks

ffc3d5f

MatteoGiomi reviewed Dec 17, 2025


itrajanovska added 2 commits December 17, 2025 15:34

Address code refactoring comments; Make test fixture code drop duplicates; Remove old comments; Add RuntimeError test.

cd88767

Address code refactoring comments.

3b01d5a

MatteoGiomi reviewed Jan 8, 2026


itrajanovska added 2 commits January 8, 2026 11:48

Add risk check test; Cleanup get_adult call.

291c1f5

Add risk check test; Cleanup get_adult call.

030fbfb
