test(plotting): regenerate stale baselines on Linux; un-xfail (#1328) #1360
Merged
Follow-up to #1359, which exposed (by removing the resize fallback) seven committed baselines it had been masking and marked them xfail(strict=False) for #1328 to regenerate. With the #1359 dpi fix now in the tree, regenerate all seven from the Linux `rendered-figures` CI artifact (py3.12-stable) so they match the renderer CI validates against, and remove the xfail markers + `_STALE_*` reason constants:

- 5 GPCCA macrostate-scatter: test_gpcca_meta_states{,_discrete,_no_same_plot,_time}, test_gpcca_final_states (scvelo->scanpy content + the oversized dpi).
- test_proj_default_ordering, test_msc_default (layout/figsize).

Reviewed each regenerated figure against its predecessor: same data/content, correct size -- benign drift, not a rendering regression. All seven pass at STRICT_TOL=50 on macOS locally too (deterministic UMAP-positioned renders).

Also bump test_paga_pie to tol=75. Removing the resize fallback surfaced that the paga pie graph -- the most rasterization-sensitive plot in the suite (a dense web of edges + pie-wedge nodes) -- drifts to ~55 RMS on pre-release matplotlib; its layout is deterministic (basis="umap"), so this is pure cross-version rendering jitter. A modestly looser per-test tolerance absorbs it on every platform (still far tighter than this plot's old tol=250, and still catches real regressions), which is cleaner than a version-conditional xfail.

Full plotting suite: 311 passed, 7 skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
7ddb5c4 to 4652878
Coordination follow-up to #1359, which (by removing the bilinear-resize fallback) exposed 7 committed baselines it had been masking and marked them `xfail(strict=False)` for #1328 to regenerate. The #1359 dpi fix is now in the tree, so the macrostate renders are correctly sized.

Regenerated 7 baselines on Linux

Pulled the Linux `py3.12-stable` `rendered-figures` artifact from the post-#1359 main run (dpi fix in place) and promoted them, then removed the 7 `xfail` markers + `_STALE_*` reason constants:

- `test_gpcca_meta_states{,_discrete,_no_same_plot,_time}`, `test_gpcca_final_states`
- `test_proj_default_ordering`, `test_msc_default`

Reviewed each regenerated figure against its predecessor: same data/content, correct size (the old ones were pre-dpi-fix oversized / pre-#1302 scvelo renders). All 7 also pass at `STRICT_TOL=50` on macOS locally (deterministic UMAP-positioned + correctly-sized).

`paga_pie`: `tol=75`, not a conditional xfail

#1359 also surfaced (on the non-blocking pre-release job) that `test_paga_pie` drifts on pre-release matplotlib, with a measured RMS of 55.3 vs the strict 50. Its node positions are deterministic (`basis="umap"`), so this is pure cross-version rasterization jitter: `paga_pie` is the most pixel-sensitive plot in the suite (a dense web of overlapping edges + pie-wedge nodes), and matplotlib rendering isn't version-stable.

Rather than a version-conditional `xfail` (which would ignore the result on pre-release and bake in dependency-channel detection), this gives the one sensitive plot a modestly looser per-test tolerance: `tol=75` absorbs the ~55 jitter on every platform, stays 3.3× tighter than this plot's old `tol=250`, and still catches a real regression everywhere.

Full plotting suite: 311 passed, 7 skipped locally.
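For reviewers who want to sanity-check the tolerance arithmetic, here is a minimal sketch of an RMS-style image comparison. It assumes the suite scores renders by root-mean-square pixel difference and passes when the score is at most the tolerance. `rms_difference` and `within_tolerance` are hypothetical names for illustration, not the suite's actual helpers, and the real comparison may differ in detail.

```python
import math
from typing import Sequence

STRICT_TOL = 50  # suite-wide default quoted in this description


def rms_difference(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Root-mean-square difference between two equally sized pixel buffers."""
    if len(expected) != len(actual):
        # Mirrors the post-#1359 behaviour in spirit: no resize fallback.
        raise ValueError("images must have the same size")
    if not expected:
        raise ValueError("images must not be empty")
    squared = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(squared / len(expected))


def within_tolerance(rms: float, tol: float = STRICT_TOL) -> bool:
    """Assumed pass rule: a render passes when its RMS is at most tol."""
    return rms <= tol


# Measured pre-release drift for test_paga_pie, as quoted above.
PAGA_PIE_RMS = 55.3

print(within_tolerance(PAGA_PIE_RMS))          # False: fails the strict default
print(within_tolerance(PAGA_PIE_RMS, tol=75))  # True: passes the per-test override
print(round(250 / 75, 1))                      # 3.3: old tolerance relative to new
```

Under these assumptions, a size mismatch raises instead of being silently resized, which is the kind of masking the removed fallback allowed.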
🤖 Generated with Claude Code