Track: Track1; Team name: r2; Model: PolyFilter GNN by aniervs · Pull Request #354 · geometric-intelligence/topobench

aniervs · 2026-06-14T01:24:13Z

Checklist

My pull request has a clear and explanatory title.
My pull request passes the Linting test.
I added appropriate unit tests and I made sure the code passes all unit tests.
My PR follows PEP8 guidelines.
My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
I linked to issues and PRs that are relevant to this PR.

Description

This PR introduces the PolynomialFilterGNN backbone, which implements the single-polynomial-filter pattern with a swappable basis registry, and registers seven variable-basis spectral filters from the literature.

Thesis. A large family of spectral GNNs are the same architecture, a polynomial filter y = Σ_k θ_k T_k(L̃) x in the normalized Laplacian, differing only in the polynomial sequence {T_k}. This PR operationalizes that view: one backbone owns the propagation loop, coefficients, accumulation, pre/post MLPs, and Laplacian normalization; a Basis module owns only the recurrence. Adding a basis is a single new file plus a Hydra _target_ swap, with zero backbone changes.
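As a worked instance of the thesis, take K = 2 with the first-kind Chebyshev sequence, whose standard recurrence is T_0(z) = 1, T_1(z) = z, T_k(z) = 2z T_{k-1}(z) - T_{k-2}(z). Substituting z = L̃ as in the formula above (an assumption for illustration; the shipped code may shift or rescale the operator) gives:

```latex
\begin{aligned}
u_0 &= x, \qquad u_1 = \tilde L\,u_0, \qquad u_2 = 2\tilde L\,u_1 - u_0,\\
y   &= \theta_0 u_0 + \theta_1 u_1 + \theta_2 u_2
     = \theta_0\,x + \theta_1\,\tilde L\,x + \theta_2\,(2\tilde L^{2} - I)\,x .
\end{aligned}
```

Swapping the basis changes only how u_1 and u_2 are produced on the first line; the accumulation on the second line is the same for every basis.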

What's added

PolynomialFilterGNN backbone + the Basis protocol (init / effective_thetas / forward, one uniform signature for signal-dependent and signal-independent bases).
Seven registered bases, each in its own file with the recurrence transcribed from Liao Appendix B and the primary paper.
One Hydra config per basis (graph/polynomial_filter_gnn_<basis>) plus the default graph/polynomial_filter_gnn (Chebyshev).
70 unit tests (abstraction + per-basis algebraic correctness + Hydra composition) and pipeline integration on MUTAG.

Taxonomy placement (Liao et al. 2024, Table 1 / Appendix B, "Variable Basis"). All seven are O(KmF) three-term recurrences in T_{k-1}, T_{k-2}:

Basis	Liao entry	Primary reference
Monomial	Monomial (GPR-GNN)	Chien et al. 2021, arXiv:2006.07988
Chebyshev	Chebyshev (ChebNet/ChebBase)	Defferrard et al. 2016 (1606.09375); He, Wei & Wen 2022 (2202.03580)
ChebNetII	Chebyshev Interpolation	He, Wei & Wen 2022 (2202.03580)
Jacobi	Jacobi (JacobiConv)	Wang & Zhang 2022, arXiv:2205.11172
Legendre	Legendre	Chen & Xu 2023 (IEEE Access)
FavardGNN	Favard	Guo & Wei 2023, arXiv:2302.12432
OptBasisGNN	OptBasis	Guo & Wei 2023, arXiv:2302.12432

Design rationale (why one backbone, not seven). The abstraction is stress-tested by two cases that would normally force bespoke code. OptBasisGNN is signal-dependent (its coefficients come from inner products on the running signal), and it rides the same code path as every stateless basis because signal is passed to all bases uniformly. FavardGNN and ChebNetII own learnable parameters of their own (so Basis is an nn.Module, not a function); ChebNetII reparameterizes the accumulator coefficients via the one protocol hook, effective_thetas. Both land with no backbone branching.
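To make the signal-dependence point concrete, here is a minimal sketch of a plain Lanczos-style orthonormal recurrence in dependency-free Python. It is not the PR's OptBasisGNN code: the helper names, the `alpha`/`beta` notation, and the unnormalized 4-node path-graph Laplacian are all assumptions chosen for illustration. Because the coefficients come from inner products on the running vectors, rescaling the input leaves the basis unchanged, whereas a signal-independent basis such as Chebyshev is linear in x.

```python
import math

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lanczos_basis(A, x, K):
    """Return [u_0, ..., u_K] from a Lanczos-style three-term recurrence.

    Assumes A is symmetric and the Krylov space of (A, x) has dimension > K,
    so no normalization constant is zero.
    """
    norm = math.sqrt(dot(x, x))
    u_prev, u_prev_prev, beta_prev = [a / norm for a in x], None, 0.0
    basis = [u_prev]
    for _ in range(K):
        w = matvec(A, u_prev)
        alpha = dot(w, u_prev)                      # coefficient from the running signal
        w = [wi - alpha * ui for wi, ui in zip(w, u_prev)]
        if u_prev_prev is not None:
            w = [wi - beta_prev * ui for wi, ui in zip(w, u_prev_prev)]
        beta = math.sqrt(dot(w, w))                 # normalization, also signal-derived
        u_next = [wi / beta for wi in w]
        basis.append(u_next)
        u_prev_prev, u_prev, beta_prev = u_prev, u_next, beta
    return basis

# Hypothetical toy operator: unnormalized Laplacian of a 4-node path graph.
L = [[1.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 1.0]]
x = [1.0, 2.0, 0.5, -1.0]

basis = lanczos_basis(L, x, K=2)
basis_scaled = lanczos_basis(L, [5.0 * v for v in x], K=2)  # same basis as for x
```

The two properties this illustrates (orthonormality of the u_k and invariance under x -> c*x) are the ones the TestOptBasisGNN description lists.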

Scope (explicitly out)

Bernstein omitted: Liao Appendix B gives it in closed form per k (O(K²mF), the only variable basis not of the three-term O(KmF) shape), so it does not fit the protocol without stretching it.
Clenshaw / Horner deferred: they fit the protocol but are variants of Chebyshev / Monomial and add no new stress on the abstraction.
Specformer / wavelets / framelets out: not polynomial-in-L̃ (eigendecomposition + attention), a different mathematical pattern.

Evaluation note. All seven bases were run through the full GraphUniverse challenge grid: 12 settings (3 homophily × 2 degree × 2 power-law) × 3 seeds × 2 tasks = 72 runs per basis, 504 training runs in total, with in-distribution and OOD (train on one setting, test on the other 11) evaluation. Headline numbers (mean over the non-homophily axes and seeds):

Community detection - accuracy by homophily (higher better):

basis	heterophilous (0–0.1)	mid (0.4–0.6)	homophilous (0.9–1.0)
favard	0.354	0.453	0.685
legendre	0.352	0.449	0.685
jacobi	0.354	0.452	0.684
optbasis	0.348	0.441	0.680
monomial	0.348	0.434	0.652
chebyshev	0.348	0.430	0.629
chebnetii	0.327	0.331	0.373

Triangle counting - test MSE by homophily (lower better; raw counts grow with homophily, so compare within a column):

basis	heterophilous	mid	homophilous
chebyshev	411	11330	83328
monomial	432	14272	86452
optbasis	945	12602	94923
chebnetii	384	12486	95814
jacobi	691	8180	113770
legendre	743	7207	115034
favard	704	18622	119921

Three findings, echoing the survey's themes of homophily sensitivity and basis expressiveness (Liao et al. 2024, RQ3 / RQ7):

Basis choice matters most under homophily; filters converge under heterophily. On community detection, every basis except ChebNetII is far stronger on homophilous graphs (0.63–0.69) than on heterophilous ones (~0.35), while ChebNetII stays at 0.33–0.37 throughout. The spread between bases is widest in the homophilous regime and nearly vanishes in the heterophilous one: when the structural signal is weak, the choice of polynomial basis cannot recover it.
Flexible orthogonal / learnable bases lead on classification. Favard, Jacobi, Legendre, and OptBasis (the adaptive-weight families) consistently top the table, with Monomial and Chebyshev a few points behind - consistent with the expressiveness argument that a tunable/learned weight function fits the community-separating frequency response better than a fixed one.
A task-dependent inversion, and the ChebNetII outlier. On triangle counting the order flips in the homophilous regime: the fixed/simpler bases (Chebyshev, Monomial) and OptBasis have the lowest MSE, while Favard/Jacobi/Legendre have the highest, which suggests that a smooth global count rewards a less flexible filter. The pattern is not uniform across columns: in the mid regime Legendre and Jacobi have the lowest MSE. ChebNetII is the sharpest structural signal: it sits at the heterophily floor across all homophily levels on community detection (0.33–0.37), yet is fully competitive on triangle counting (best at low homophily, mid-pack at high). Its interpolation reparameterization induces generally decaying, low-pass-leaning coefficients (He, Wei & Wen 2022), which suit a smooth regression target but are poorly matched to the band-discrimination community-detection task.

OOD evaluation preserves the in-distribution ordering (e.g. community-detection OOD accuracy: favard 0.444, jacobi 0.443, legendre 0.442 vs chebyshev 0.403, chebnetii 0.305), indicating the ranking reflects basis inductive bias rather than overfitting to a single structural regime.

Issue

Track 1 (GNN) submission to the TDL Challenge 2026. Spectral polynomial filters with tunable frequency response are exactly the architectures whose behavior on the challenge's central axis (structural sensitivity: homophily/heterophily, degree distribution) is most interesting and most explainable, which is why this family is a useful contribution to the benchmark.

Additional context

Draft: benchmark results are not yet included; the code, configs, and tests are complete and the evaluation note will follow.
Reference: the intellectual spine is Liao et al. 2024, "A Comprehensive Benchmark on Spectral GNNs" (SIGMOD '26, arXiv:2406.09675); Appendix B is the canonical recurrence source, cross-checked against each primary paper.
A follow-up PR is planned for the FilterBankGNN (multi-channel) backbone, which reuses this basis registry.

Add a single-polynomial-filter graph backbone that implements

    y = post(sum_{k=0..K} theta_k * T_k(L_norm) * pre(x))

where {T_k} is a polynomial sequence produced by a swappable basis. The backbone owns the propagation loop, the coefficients theta_k, the accumulation, the pre/post MLPs, the Laplacian normalization convention, and the (x, edge_index, batch, edge_weight) interface expected by GNNWrapper. The basis owns the recurrence and any parameters the recurrence needs.

The basis interface (poly_filter/basis.py) is:

    class Basis(nn.Module):
        def init(self, x, L_apply) -> u_0
        def effective_thetas(self, backbone_theta) -> theta_eff
        def forward(self, u_prev, u_prev_prev, L_apply, signal, k) -> u_k

with a single uniform forward signature shared by signal-independent and signal-dependent bases. Bases are nn.Module subclasses so they can own learnable parameters. The backbone treats every basis as opaque and never branches on the concrete class. Adding a new basis is a single new file plus a Hydra _target_ swap.

Reference: Liao et al. (2024) "A Comprehensive Benchmark on Spectral GNNs", SIGMOD '26, arXiv:2406.09675 -- survey unifying every variable-basis spectral GNN under this recurrence-in-L_norm template.

Concrete bases land in a follow-up commit.
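The division of labour described in this commit message can be sketched without any dependencies. The following is an illustrative reduction, not the PR's implementation: it uses plain Python lists instead of tensors and nn.Module, omits the pre/post MLPs and Laplacian construction, and the `polynomial_filter` function name and the choice to pass the input as `signal` are assumptions. Only the three method names and the forward signature are taken from the text above.

```python
class Basis:
    """Protocol sketch: init / effective_thetas / forward."""

    def init(self, x, L_apply):
        return x                          # default: u_0 is the signal itself

    def effective_thetas(self, backbone_theta):
        return backbone_theta             # default: use the backbone's coefficients

    def forward(self, u_prev, u_prev_prev, L_apply, signal, k):
        raise NotImplementedError


class MonomialBasis(Basis):
    def forward(self, u_prev, u_prev_prev, L_apply, signal, k):
        return L_apply(u_prev)            # u_k = L u_{k-1}


class ChebyshevBasis(Basis):
    def forward(self, u_prev, u_prev_prev, L_apply, signal, k):
        Lu = L_apply(u_prev)
        if u_prev_prev is None:           # k = 1 boundary: T_1(L) x = L x
            return Lu
        return [2.0 * a - b for a, b in zip(Lu, u_prev_prev)]


def polynomial_filter(x, L_apply, theta, basis):
    """Backbone loop: owns the accumulation, never inspects the basis class."""
    thetas = basis.effective_thetas(theta)
    u_prev_prev, u_prev = None, basis.init(x, L_apply)
    y = [thetas[0] * v for v in u_prev]
    for k in range(1, len(thetas)):
        u_k = basis.forward(u_prev, u_prev_prev, L_apply, x, k)
        y = [acc + thetas[k] * v for acc, v in zip(y, u_k)]
        u_prev_prev, u_prev = u_prev, u_k
    return y


# Toy operator L = 0.5 * I, so each basis collapses to a scalar polynomial in 0.5.
half_identity = lambda u: [0.5 * v for v in u]
x = [1.0, -2.0]
y_cheb = polynomial_filter(x, half_identity, [1.0, 1.0, 1.0], ChebyshevBasis())
y_mono = polynomial_filter(x, half_identity, [1.0, 1.0, 1.0], MonomialBasis())
```

With L = 0.5·I the Chebyshev filter evaluates to (T_0 + T_1 + T_2)(0.5)·x = (1 + 0.5 - 0.5)·x and the monomial filter to (1 + 0.5 + 0.25)·x, which mirrors the L = alpha·I collapse check described for the unit tests.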

Register seven bases under topobench.nn.backbones.graph.poly_filter.bases, each a Basis subclass with the recurrence transcribed from Liao Appendix B and the corresponding primary paper:

- Monomial: u_k = L_norm * u_{k-1} (GPR-GNN family; Chien, Peng, Li & Milenkovic 2021, arXiv:2006.07988).
- Chebyshev (first kind): three-term recurrence; boundary at k=1 handled inside the basis via u_prev_prev=None. Covers ChebNet (Defferrard et al. 2016, arXiv:1606.09375) and ChebBase (He, Wei & Wen 2022, arXiv:2202.03580).
- ChebNetII: Chebyshev recurrence + interpolation reparameterization of theta via the discrete Chebyshev transform at Chebyshev nodes (He, Wei & Wen 2022). The only basis here that uses the effective_thetas protocol hook; the backbone's theta is inert by design.
- Jacobi(alpha, beta): three-term recurrence with k-dependent coefficients (Wang & Zhang 2022, arXiv:2205.11172). First basis where the k argument is genuinely consumed.
- Legendre: shipped as Jacobi(alpha=0, beta=0). Liao's standalone Legendre recurrence uses z=L_norm directly, which evaluates P_k outside its [-1, 1] orthogonality interval (L_norm has eigenvalues in [0, 2]) and grows as Theta(3^k/sqrt(k)) at the spectrum boundary. The Jacobi reparameterization shifts to z=I-L_norm and is uniformly bounded. Documented as the deviation from Liao's literal formula.
- FavardGNN: three-term recurrence with learnable coefficients a_k = sqrt(alpha_k) (parameterized via softplus to guarantee positivity) and beta_k (Guo & Wei 2023, arXiv:2302.12432). First basis owning learnable parameters of its own.
- OptBasisGNN: Lanczos-style orthonormal recurrence where alpha, gamma are derived from inner products on the running signal u_prev (Guo & Wei 2023, arXiv:2302.12432, Theorem 4.1). The signal-dependent basis in the registry; demonstrates the uniform-signature protocol survives the load-bearing case.

Add the Hydra config configs/model/graph/polynomial_filter_gnn.yaml defaulting to Chebyshev; bases are swappable via the CLI override model.backbone.basis._target_=...<Name>.

Bernstein is deliberately omitted: Liao Appendix B presents it in closed form per k (O(K^2 m F) vs O(K m F) for every other variable basis), so it does not fit the three-term recurrence protocol without stretching the abstraction. Deferred.
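The motivation for the Legendre deviation can be checked numerically with a few lines. This sketch uses the textbook Bonnet recurrence for Legendre polynomials, which is an assumption standing in for whatever form Liao's appendix or the shipped Jacobi(0, 0) code actually uses; it only compares the two choices of argument over a grid of eigenvalues in [0, 2] and does not attempt to verify the stated growth rate.

```python
def legendre(k, z):
    """P_k(z) via Bonnet's recurrence: n P_n = (2n - 1) z P_{n-1} - (n - 1) P_{n-2}."""
    p_prev, p = 1.0, z
    if k == 0:
        return p_prev
    for n in range(2, k + 1):
        p_prev, p = p, ((2 * n - 1) * z * p - (n - 1) * p_prev) / n
    return p

K = 10
eigenvalues = [i / 50 for i in range(101)]   # grid over the L_norm spectrum [0, 2]

# z = lambda: evaluates P_K outside [-1, 1] for lambda > 1, so values blow up.
direct_max = max(abs(legendre(K, lam)) for lam in eigenvalues)

# z = 1 - lambda: maps [0, 2] onto [-1, 1], where |P_K| <= 1.
shifted_max = max(abs(legendre(K, 1.0 - lam)) for lam in eigenvalues)
```

For K = 10 the direct evaluation is dominated by the value at lambda = 2 and is several orders of magnitude above 1, while the shifted evaluation stays at 1 up to rounding.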

Add 70 unit tests under test/nn/backbones/graph/test_polynomial_filter_gnn.py covering:

- TestPolynomialFilterGNN: backbone propagation loop is basis-agnostic; basis receives the uniform (u_prev, u_prev_prev, L_apply, signal, k) signature on every step; default Basis.init returns the signal untouched; backbone runs under sym/rw/none Laplacian normalizations; the _build_laplacian_apply closure matches a hand-computed symmetric Laplacian on a path graph; K=0 and invalid-K guard.
- TestMonomialBasis: single-step recurrence applies L_apply once; stateless w.r.t. signal and k.
- TestChebyshevBasis: k=1 boundary returns L u_0 (not 2L u_0); for L = alpha*I the basis collapses to T_k(alpha)*x for classical first-kind T_k.
- TestJacobiBasis: hyperparameter validation; k=1 closed form at L=0; for L = gamma*I the basis collapses to P_k^{(alpha,beta)}(1-gamma)*x computed from Liao's own recurrence; symmetric (alpha=beta) case kills the middle delta'_k term cleanly.
- TestLegendreBasis: literal equivalence to Jacobi(0, 0) on every k; uniform boundedness |u_k| <= 1 across the full L eigenvalue range [0, 2] -- the property that motivated shipping Jacobi(0, 0) rather than Liao's standalone z=L Legendre formula.
- TestChebNetIIBasis: M[k, kappa] = (2/(K+1)) * T_k(x_kappa) matches hand computation for K=2; effective_thetas returns M @ theta_interp ignoring backbone_theta; K mismatch raises ValueError; gradient reaches basis.theta_interp and backbone.theta.grad is None by design.
- TestFavardGNNBasis: 2(K+1) learnable parameters; softplus keeps a_k strictly positive at extreme raw values; k=1 and k=2 closed forms; gradients flow to both a_raw and beta.
- TestOptBasisGNN: init normalizes per-channel and resets state; OptBasis(c*x) = OptBasis(x) scale invariance (the literal signature of signal-dependence), contrasted with Chebyshev(c*x) = c*Chebyshev(x); Lanczos orthonormality <u_k, u_j> approx delta_{kj}; repeated forward passes do not leak state across each other.
- TestPolynomialFilterGNNHydraConfig: full run.yaml composes with graph/polynomial_filter_gnn + graph/MUTAG; basis swaps via model.backbone.basis._target_ overrides for Monomial, Jacobi (with hyperparameters), Legendre, ChebNetII (with K interpolation from the backbone), FavardGNN (with K interpolation), and OptBasisGNN.

Add graph/polynomial_filter_gnn to MODELS in test/pipeline/test_pipeline.py so the backbone trains 2 epochs on MUTAG as part of the standard wire-up gate.
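The K=2 hand computation mentioned for TestChebNetIIBasis can be reproduced in a few lines. This is a sketch under one stated assumption: the Chebyshev nodes are taken as x_kappa = cos((kappa + 1/2) * pi / (K + 1)), the usual first-kind nodes, which the text above does not spell out. The matrix entries follow the M[k, kappa] = (2/(K+1)) * T_k(x_kappa) formula quoted there; the function names are hypothetical.

```python
import math

def chebyshev_T(k, x):
    """First-kind Chebyshev polynomial T_k(x) via the three-term recurrence."""
    t_prev, t = 1.0, x
    if k == 0:
        return t_prev
    for _ in range(2, k + 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

def interpolation_matrix(K):
    # Assumed node convention: x_kappa = cos((kappa + 1/2) * pi / (K + 1)).
    nodes = [math.cos((kappa + 0.5) * math.pi / (K + 1)) for kappa in range(K + 1)]
    return [[2.0 / (K + 1) * chebyshev_T(k, x) for x in nodes] for k in range(K + 1)]

def effective_thetas(M, theta_interp):
    """theta_eff = M @ theta_interp (the backbone's own theta is not used)."""
    return [sum(m * t for m, t in zip(row, theta_interp)) for row in M]

M = interpolation_matrix(2)
# Nodes are sqrt(3)/2, 0, -sqrt(3)/2, so up to rounding:
#   M[0] = [2/3, 2/3, 2/3], M[2] = [1/3, -2/3, 1/3]
theta_eff = effective_thetas(M, [1.0, 1.0, 1.0])
```

With equal interpolation values [1, 1, 1], rows 1 and 2 of M sum to zero, so theta_eff is approximately [2, 0, 0]: a constant response at the nodes puts all weight on T_0.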

Three style consistency fixes across the new poly_filter/ subpackage and its tests:

- Switch docstrings that carry .. math:: blocks to raw strings (r"""..."""). The runtime content is unchanged, but source code now shows \tilde L, \sum, \frac directly instead of \\tilde L, \\sum, \\frac. Easier to read and edit.
- Pick one form for the normalized Laplacian per context: \tilde L inside .. math:: blocks (where Sphinx renders it), and the Unicode character L_tilde everywhere else in prose. Previously mixed across the same file.
- Drop em dashes throughout in favour of ASCII punctuation (colons for explanatory clauses, semicolons or parentheses elsewhere). Matches the rest of the project's plain-ASCII docstring style.

No behaviour change. All pre-commit hooks (ruff, ruff-format, numpydoc-validation) pass; 70/70 polynomial-filter tests pass.

Add one standalone model config per registered basis so each can be selected in the TDL challenge evaluation notebook via a single MODEL_CONFIG string (the notebook is unmodifiable and takes only a config path, not basis overrides):

    graph/polynomial_filter_gnn_monomial
    graph/polynomial_filter_gnn_chebyshev
    graph/polynomial_filter_gnn_chebnetii
    graph/polynomial_filter_gnn_jacobi
    graph/polynomial_filter_gnn_legendre
    graph/polynomial_filter_gnn_favard
    graph/polynomial_filter_gnn_optbasis

Each file mirrors polynomial_filter_gnn.yaml; only the basis block and model_name differ. The hyperparameterized bases pin their extra args: ChebNetII and FavardGNN set K: ${model.backbone.K} so the basis size tracks the backbone degree, and Jacobi sets explicit alpha=beta=1.0. Distinct model_name values keep per-basis output dirs and results.json from colliding.

All seven verified end-to-end through the challenge harness (topobench.run.run) on both tasks (community_detection node-level, triangle_counting graph-level), 1 epoch each: 14/14 configs compose, instantiate the correct basis, and train without error.

The default polynomial_filter_gnn.yaml (Chebyshev) is unchanged and remains the config exercised by the unit and pipeline tests.

One results.json per registered PolynomialFilterGNN basis, produced by the challenge harness over the full 12-setting x 3-seed grid on both tasks (community detection, triangle counting): 72 runs per basis, 504 total. Includes comparison_summary.csv (in-distribution + OOD aggregate per basis). Tracked under 2026_tdl_challenge/outputs/ as the challenge submission files.
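The run counts quoted above follow directly from the grid sizes. The short enumeration below reproduces the arithmetic; the axis value labels for degree and power-law are placeholders (assumptions), since only the number of levels per axis is stated in this PR.

```python
from itertools import product

# Only the axis sizes come from the PR text; the labels are placeholders.
homophily = ["heterophilous", "mid", "homophilous"]
degree = ["degree_a", "degree_b"]
power_law = ["power_law_a", "power_law_b"]
seeds = [0, 1, 2]
tasks = ["community_detection", "triangle_counting"]
bases = ["monomial", "chebyshev", "chebnetii", "jacobi",
         "legendre", "favard", "optbasis"]

settings = list(product(homophily, degree, power_law))          # 12 settings
runs_per_basis = len(settings) * len(seeds) * len(tasks)        # 72
total_runs = runs_per_basis * len(bases)                        # 504
ood_targets_per_setting = len(settings) - 1                     # 11
```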

aniervs and others added 7 commits June 10, 2026 07:57

Merge branch 'main' into track1-polyfilter-gnn

d711d56

aniervs marked this pull request as ready for review June 14, 2026 18:32

gbg141 added the track-1-gnn 2026 Topological Deep Learning Challenge -- Track 1 GNNs label Jun 15, 2026

aniervs wants to merge 7 commits into geometric-intelligence:main from aniervs:track1-polyfilter-gnn
