Conversation
Codecov Report ❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1727      +/-   ##
==========================================
- Coverage   87.94%   87.88%    -0.07%
==========================================
  Files         140      140
  Lines       12845    12962     +117
==========================================
+ Hits        11297    11392      +95
- Misses       1548     1570      +22
```

Flags with carried forward coverage won't be shown.
…tionality
- Updated parameter names in LC2ST initialization for consistency (`thetas` -> `prior_samples`).
- Modified `get_scores` and `get_statistics_under_null_hypothesis` methods to return `LC2STScores` objects, encapsulating probabilities and scores.
- Adjusted usage of `get_scores` in the tutorial and tests to reflect the new return type.
- Added input validation in LC2ST to prevent indexing errors.
- Updated tests to assert the presence of scores in the returned null statistics.
Resolved conflicts integrating GPU support (PR #1715) into the refactored LC2ST code:
- Added `device` parameter to `LC2ST.__init__` with GPU detection
- Integrated `NeuralNetBinaryClassifier` for GPU-accelerated training
- Added skorch `ValidSplit` and `EarlyStopping` configuration
- Fixed target dtype for skorch (`float32`) vs sklearn (`int64`)
- Updated tests to use refactored `cal_data` fixture

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
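The skorch-vs-sklearn target dtype fix mentioned in the commit message can be sketched as follows. This is a hypothetical helper, not the repo's actual code: skorch's `NeuralNetBinaryClassifier` trains against a binary cross-entropy loss and expects `float32` targets, while sklearn classifiers expect integer class labels.

```python
import numpy as np


def prepare_targets(labels: np.ndarray, backend: str) -> np.ndarray:
    """Cast binary class labels to the dtype the training backend expects.

    Hypothetical helper illustrating the dtype conflict described above:
    skorch's NeuralNetBinaryClassifier needs float32 targets for its
    BCE-style loss, whereas sklearn estimators take integer labels.
    """
    if backend == "skorch":
        return labels.astype(np.float32)
    return labels.astype(np.int64)
```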
manuelgloeckler left a comment:
Great, looks overall good thanks for refactoring.
I do have some minor remarks, but otherwise looks ready to merge from my point of view.
I did run slow tests and they pass.
The current version of the notebook however fails, i.e. in the Quantitative diagnostics chapter it fails on `quantiles = np.quantile(T_null, [0, 1 - conf_alpha])` because `T_null` is no longer an array but an `LC2STScores` object.
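A minimal, self-contained sketch of the failing cell and its fix, using a stand-in dataclass (the follow-up commit states the fixed notebook cells extract `.scores` from the returned object):

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class LC2STScores:  # stand-in for the refactor's return type
    scores: np.ndarray


# Pretend this came from get_statistics_under_null_hypothesis(...):
t_null = LC2STScores(scores=np.random.default_rng(0).normal(size=100))
conf_alpha = 0.05

# Old notebook cell -- fails, since t_null is no longer an ndarray:
#   quantiles = np.quantile(t_null, [0, 1 - conf_alpha])
# Fixed cell -- extract the underlying array first:
quantiles = np.quantile(t_null.scores, [0, 1 - conf_alpha])
```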
```python
# Set the parameters for the null hypothesis testing
self.null_distribution = flow_base_dist
self.permutation = False
self.trained_clfs_null = trained_clfs_null
```
Mhh, not entirely sure if `LC2ST_NF(trained_clfs_null=...)` will work. Mostly because `__init__` is called normally, hence `_state` is left as `INITIALIZED`, no? If a pretrained model is given, shouldn't it be `READY`?
```python
)
self.trained_clfs = trained_clfs

# Update state
```
Mhh, the unconditional else can be dangerous here, no? For example:

```python
lc2st.train_under_null_hypothesis()
lc2st.train_on_observed_data()
lc2st.p_value(...)  # works
lc2st.train_on_observed_data()  # retrain observed classifier for some reason (a bit artificial)
lc2st.p_value(...)  # now raises
```

If it's ready, shouldn't it stay ready?
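The concern above can be sketched with a toy version of the state update (hypothetical `LC2STState` enum and transition function, assuming the refactor's `INITIALIZED → OBSERVED_TRAINED/NULL_TRAINED → READY` flow): once the object is `READY`, retraining the observed classifier should keep it `READY` rather than downgrading it.

```python
from enum import Enum, auto


class LC2STState(Enum):
    INITIALIZED = auto()
    OBSERVED_TRAINED = auto()
    NULL_TRAINED = auto()
    READY = auto()


def after_observed_training(state: LC2STState) -> LC2STState:
    """Guarded state update for train_on_observed_data (sketch only).

    If the null classifiers are already trained (NULL_TRAINED or READY),
    the object becomes/stays READY; an unconditional `else` branch would
    instead downgrade READY back to OBSERVED_TRAINED on retrain.
    """
    if state in (LC2STState.NULL_TRAINED, LC2STState.READY):
        return LC2STState.READY
    return LC2STState.OBSERVED_TRAINED
```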
```diff
         trained_clfs: List[BaseEstimator],
         return_probs: bool = False,
-    ) -> Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]:
+    ) -> Union[LC2STScores, np.ndarray, Tuple[np.ndarray, np.ndarray]]:
```
np.ndarray type is never returned, no?
Also not sure if it's necessary to add this backward compatibility here. Returning the structured object will already break it (i.e. `get_score(...).mean()` or so will no longer work).
True, this will break anyway. I changed it to `FutureWarning` to make sure the warnings will be raised; we can then remove this in the next release after the coming one.
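The deprecation pattern being discussed can be sketched like this (hypothetical, simplified signature; the real method takes more parameters and the later commit states the warning mentions the `(probs, scores)` tuple order and the eventual removal):

```python
import warnings

import numpy as np


def get_scores(probs: np.ndarray, scores: np.ndarray, return_probs: bool = False):
    """Sketch of a FutureWarning-based deprecation for return_probs."""
    if return_probs:
        warnings.warn(
            "return_probs is deprecated and will be removed in a future "
            "release; a (probs, scores) tuple is returned for now. "
            "Use the structured return value instead.",
            FutureWarning,
            stacklevel=2,
        )
        return probs, scores
    return scores
```

`FutureWarning` (unlike `DeprecationWarning`) is shown to end users by default, which matches the intent of making sure the warnings are actually raised.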
```
@@ -499,19 +809,28 @@
            defaults to False.
        verbosity: Verbosity level, defaults to 1.

        Normalized theta if z_score is enabled, otherwise unchanged theta.
        """
        if self.z_score:
            return (theta - self.theta_p_mean) / self.theta_p_std
```
It may make sense to make this more robust, i.e. this could run into a divide-by-zero on constant params/features.
```
        Normalized x if z_score is enabled, otherwise unchanged x.
        """
        if self.z_score:
            return (x - self.x_p_mean) / self.x_p_std
```
Fixes three merge blockers from review:
- `LC2ST_NF(trained_clfs_null=...)` was deadlocked: `__init__` left `_state` at `INITIALIZED` when pretrained null classifiers were passed, so `train_on_observed_data` advanced to `OBSERVED_TRAINED` (not `READY`) and `p_value` raised. It now advances to `NULL_TRAINED` when pretrained classifiers are supplied.
- `train_on_observed_data` downgraded `READY` -> `OBSERVED_TRAINED` on retrain, breaking the documented loop-over-seeds workflow from the tutorial. `READY` is now preserved.
- The advanced-tutorial notebook called `np.quantile` / `axes.hist` on the new `LC2STScores` return value. Both failing cells now extract `.scores`.
- Narrow the `get_scores` / `get_statistics_under_null_hypothesis` return type to `Union[LC2STScores, Tuple[np.ndarray, np.ndarray]]`; the bare `np.ndarray` branch was never returned.
- Clarify the `return_probs` deprecation warning to mention the `(probs, scores)` tuple order and the eventual removal.
- Guard z-score normalization against constant feature dimensions: `std == 0` is replaced by 1.0, so constant columns become pass-through (mean-centered) instead of producing NaN/Inf.
- Rewrite the error message raised when re-entering `train_under_null_hypothesis` so it applies cleanly to both LC2ST (permutation, data-dependent) and LC2ST_NF (analytical, reusable).
- Fix docstring: `verbosity` in `get_statistics_under_null_hypothesis` defaults to 0, not 1.
- Add regression tests for single normalization in null training and constant-dim normalization robustness; document the `lc2st_instance` fixture scope.
Good catches @manuelgloeckler, thanks! I fixed them all and will merge.
LC2ST Module Refactoring
Summary
This PR refactors the LC2ST diagnostics module to improve API clarity, error handling, and maintainability while preserving full backward compatibility.
Note 1: I used Claude Code with this refactoring. Every commit was reviewed by me and a very critical Gemini Pro reviewing agent. I believe the changes here are very useful and correct, but the reviewer should please take extra care and be extra skeptical when reviewing this.
Note 2: I started this in parallel with, or motivated by, the flaky test spotted in #1715, so we need to resolve potential conflicts between these PRs down the line.
Motivation
The LC2ST implementation had accumulated several issues:
- `assert` statements, which are stripped in optimized builds
- `get_scores()` returned different types based on a boolean flag

Design Choices
State Machine (`LC2STState` enum): Rather than documenting method call order, the refactoring enforces it through explicit state transitions. The LC2ST object progresses through `INITIALIZED → OBSERVED_TRAINED/NULL_TRAINED → READY`, with methods checking state before execution. Both training orders are valid and reach `READY`.

Structured Returns (`LC2STScores` dataclass): Replaces the `Union[array, Tuple[array, array]]` return type with a dataclass containing `scores` and optional `probabilities`. The old `return_probs=True` behavior is preserved but deprecated.

Parameter Rename with Deprecation: `thetas` → `prior_samples` for clarity (it's samples from the prior, not parameters). The old parameter works via a keyword-only argument with a `DeprecationWarning`.

DRY Normalization: Extracted `_normalize_theta()` and `_normalize_x()` helpers. Normalization now happens exactly once in `_train()` and `get_scores()`, fixing the double-normalization bug.

Validation via Exceptions: All `assert` statements replaced with `ValueError`/`TypeError` carrying actionable messages that cite actual vs. expected values.

Backward Compatibility
- `thetas=...` keyword works with a deprecation warning
- `get_scores(..., return_probs=True)` works with a deprecation warning

Test Improvements
Reduced parametrization from 48 combinations to 28 focused tests using pytest patterns (dataclass fixtures, composable fixture hierarchy, parametrized validation tests).
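One of the patterns referred to can be sketched like this (hypothetical field names, loosely modeled on the `cal_data` fixture mentioned in the merge commit): a dataclass bundles related calibration arrays behind a single fixture, so tests take one argument instead of parametrizing each array separately.

```python
from dataclasses import dataclass

import numpy as np
import pytest


@dataclass
class CalData:
    """Hypothetical bundle of calibration inputs for LC2ST tests."""

    thetas: np.ndarray
    xs: np.ndarray
    posterior_samples: np.ndarray


@pytest.fixture
def cal_data() -> CalData:
    # One seeded bundle replaces several independently parametrized arrays.
    rng = np.random.default_rng(0)
    return CalData(
        thetas=rng.normal(size=(100, 2)),
        xs=rng.normal(size=(100, 3)),
        posterior_samples=rng.normal(size=(100, 2)),
    )


def test_shapes_match(cal_data: CalData) -> None:
    assert cal_data.thetas.shape[0] == cal_data.xs.shape[0]
```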