Skip to content

Add the DC-EGM solver (discrete-continuous endogenous grid method)#390

Draft
hmgaudecker wants to merge 25 commits into
mainfrom
feat/dcegm
Draft

Add the DC-EGM solver (discrete-continuous endogenous grid method)#390
hmgaudecker wants to merge 25 commits into
mainfrom
feat/dcegm

Conversation

@hmgaudecker

Copy link
Copy Markdown
Member

Draft. Stacked on #388 (feat/solver-seam) → #389main.

Adds the DC-EGM solver (discrete-continuous endogenous grid method) behind the
Solver ABC seam introduced in #388. It inverts the Euler equation on an exogenous
end-of-period grid instead of grid-searching the continuous action, with a FUES upper
envelope to drop dominated candidates. Selected per regime via solver=DCEGM(...);
GridSearch (the default) is untouched, and the engine dispatches polymorphically — no
fork on solver type.

Commit structure (reviewable in steps)

  1. DC-EGM solver implementation (squashed) — the kernel: Euler inversion, the FUES
    envelope, the closed-form constrained segment, multi-target continuation, and
    asset-row mode; wired through the ABC DCEGM.validate / build_period_kernels.
  2. Stochastic-node expectation in lax.scan blocks — a memory knob for the child
    stochastic-node mesh.
  3. FUES provenance + credits (3 commits) — NOTICE/license for the FUES derivation and
    an acknowledgments page.
  4. Modular split of step.py (5 pure-move commits) — regime_introspection
    kernel_scopecontinuationstep_coreasset_row, taking the monolith from
    ~2900 to ~800 lines so the simple single-post-state path reads cleanly apart from the
    asset-row and continuation machinery.
  5. bind_continuation — collapse the continuation behind one bound
    savings → (expected_value, expected_marginal) callable (ContinuationPlan).
  6. Beartype + test fixes (2 commits) — the contract dataclasses are beartyped by the
    package claw, so cyclic field types use the two-definition alias pattern (precise for
    ty, bare container at runtime); plus a valid-DCEGM seam build test.
  7. Seam Layer 2 — a uniform PeriodKernel/KernelResult adapter; the solve loop
    calls one kernel and branches only on optional generic outputs, not on solver type.
  8. Streaming refine-to-query — fold the asset-row single-query interpolation into the
    FUES scan (refine_to_bracket + publish_node_from_bracket), removing the n_pad
    intra-node envelope scratch. Single-post-state still publishes the full refined
    envelope as its carry, unchanged.

Validation

  • Full gpu-01 oracle battery (Tesla V100, -n 4, on a free GPU): 1405 passed, 15
    skipped, 2 xfailed
    . Covers the DC-EGM-vs-brute value-function-parity battery, the
    19-case refine-to-query unit equivalence (streamed ≡ row-then-interp), and the Layer-2
    distinct-compiled-cores regression.
  • ty and prek clean.

Pending (does not block reviewing the engine)

  • ACA-scale memory (Gate C): refine-to-query's n_pad runtime win is only visible at
    ACA scale — the in-suite oracles are too small to exercise it; handed to the aca side.
  • Throughput vs the brute-force solver at matched accuracy (Gate E) is a downstream
    experiment this work unlocks, not settled here.

🤖 Generated with Claude Code

hmgaudecker and others added 3 commits June 17, 2026 15:01
Route the existing brute-force grid search through a per-regime solver
configuration and a builder registry, with no change to the numerics.

- `Regime.solver: BruteForce | DCEGM` (default `BruteForce()`), exported from
  `lcm` alongside the `DCEGM` configuration class.
- `_lcm.solution.registry`: `SolverBuildContext`, `SolverKernels`, the
  `SolverKernelBuilder` protocol, and `_build_brute_force_kernels` (the former
  `_build_max_Q_over_a_per_period`), dispatched on `type(regime.solver)` via
  `SOLVER_KERNEL_BUILDERS`.
- `DCEGM` is published as the final configuration surface (fields + field
  validation), but its engine is not yet wired in; a regime requesting it is
  rejected at model build with `NotImplementedError`.

Behavior-preserving: the full test suite passes unchanged and an explicit
`BruteForce()` yields the same value function as the default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Points the benchmark feature's aca-model at feat/dcegm-solver's tip — main's
audit fixes + the DC-EGM solver + smooth-share eligibility — the version the
solver-seam work is developed against.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the `type(solver)`-keyed builder registry with a polymorphic `Solver`
ABC. The engine now calls `solver.validate(context)` then
`solver.build_period_kernels(context)` — no `SOLVER_KERNEL_BUILDERS` dict, no
`BruteForce | DCEGM` union, no standalone DC-EGM guard.

- `_lcm/solution/contract.py` (new): the `Solver` ABC (abstract
  `build_period_kernels`, default no-op `validate`), `SolverBuildContext`, and
  `SolverKernels`. An engine leaf — imports nothing that reaches `lcm.solvers`,
  so the façade can re-export it without an import cycle.
- `_lcm/solution/solvers.py` (new): `GridSearch(Solver)` (the relocated
  grid-search builder, with function-local `jax`/`get_max_Q_over_a` imports) and
  `DCEGM(Solver)` (the published config; `validate` raises the not-yet-available
  guard, so a regime requesting it is rejected at model build).
- `lcm/solvers.py` → thin re-export façade; `registry.py` deleted; the
  `processing` dispatch and `Regime.solver` field updated.
- Rename the default solver `BruteForce` → `GridSearch` (more descriptive;
  alpha permits the break).

Faithful to dcegm-solver-seam-abc-design.md (ABC, not Protocol). Layer 2
(generic KernelResult) does not apply here — the stub seam has no EGM fork.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@review-notebook-app

Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@read-the-docs-community

read-the-docs-community Bot commented Jun 19, 2026

Copy link
Copy Markdown

@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown

Benchmark comparison (main → HEAD)

Comparing 3e29f916 (main) → b85d8249 (HEAD)

Benchmark Statistic before after Ratio Alert
aca-baseline execution time 14.348 s 14.350 s 1.00
peak GPU mem 588 MB 588 MB 1.00
compilation time 430.02 s 372.17 s 0.87
peak CPU mem 7.59 GB 7.46 GB 0.98
aca-baseline-debug execution time 82.696 s 83.318 s 1.01
peak GPU mem 587 MB 588 MB 1.00
compilation time 492.19 s 522.76 s 1.06
peak CPU mem 8.24 GB 8.19 GB 0.99
Mahler-Yum execution time 4.793 s 4.538 s 0.95
peak GPU mem 520 MB 520 MB 1.00
compilation time 11.17 s 11.16 s 1.00
peak CPU mem 1.58 GB 1.58 GB 1.00
Precautionary Savings - Solve execution time 23.6 ms 24.8 ms 1.05
peak GPU mem 8 MB 8 MB 1.00
compilation time 1.56 s 1.58 s 1.01
peak CPU mem 1.16 GB 1.16 GB 1.00
Precautionary Savings - Simulate execution time 61.7 ms 59.2 ms 0.96
peak GPU mem 157 MB 157 MB 1.00
compilation time 3.61 s 3.39 s 0.94
peak CPU mem 1.33 GB 1.33 GB 1.00
Precautionary Savings - Solve & Simulate execution time 86.5 ms 86.6 ms 1.00
peak GPU mem 566 MB 566 MB 1.00
compilation time 4.76 s 4.74 s 1.00
peak CPU mem 1.30 GB 1.31 GB 1.01
Precautionary Savings - Solve & Simulate (irreg) execution time 202.5 ms 199.0 ms 0.98
peak GPU mem 2.18 GB 2.18 GB 1.00
compilation time 4.95 s 5.16 s 1.04
peak CPU mem 1.37 GB 1.37 GB 1.00
IskhakovEtAl2017DCEGMSimulate execution time 217.1 ms
compilation time 4.20 s
peak CPU mem 1.61 GB
IskhakovEtAl2017DCEGMSolve execution time 1.849 s
compilation time 6.11 s
peak CPU mem 1.51 GB
IskhakovEtAl2017Simulate execution time 221.9 ms 191.9 ms 0.86
compilation time 4.31 s 4.10 s 0.95
peak CPU mem 1.29 GB 1.30 GB 1.01
IskhakovEtAl2017Solve execution time 47.5 ms 47.3 ms 1.00
compilation time 0.67 s 0.71 s 1.06
peak CPU mem 1.15 GB 1.15 GB 1.00
IskhakovEtAl2017DCEGMSimulateGpuPeakMem peak GPU mem 282 MB
IskhakovEtAl2017DCEGMSolveGpuPeakMem peak GPU mem 0 MB
IskhakovEtAl2017SimulateGpuPeakMem peak GPU mem 281 MB 281 MB 1.00
IskhakovEtAl2017SolveGpuPeakMem peak GPU mem 67 MB 67 MB 1.00

hmgaudecker and others added 17 commits June 21, 2026 18:09
The full discrete-continuous endogenous grid method, re-rooted as a single
commit on top of the solver-selection seam (PR #388). The tree reproduces the
reviewed feat/dcegm tip exactly, minus two root-level audit scratch files.

On top of the seam this adds the `_lcm.egm` engine, DC-EGM build-time
validation, the `_build_dcegm_kernels` builder and the EGM-specific
SolverBuildContext / SolverKernels / SolutionPhase fields, and replaces the
seam's build-time `NotImplementedError` guard with the real solver. To be
split into reviewable sub-PRs at review time.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The child stochastic-node expectation built the full node-stacked
`node_values`/`node_marginals` arrays and reduced them with a weighted sum
downstream of the map. So `stochastic_node_batch_size` (a `lax.map`) shed only
the per-node gather working-set while still materialising the full
`(..., n_nodes)` output — half a fix, and at large scale the residual stack can
bind and even raise peak.

Fold the weighted sum into a `lax.scan` carry instead: each block reads only its
nodes and accumulates the running expectation, so neither the per-node gather
nor the full node-stacked output is materialised. `0` (or a size >= the mesh)
keeps the single fused vmap + reduction.

The block reduction reorders the floating-point adds, so the value function
matches the fused solve to tight numerical tolerance rather than bit-identically;
the node-batch invariance test asserts `allclose` at rtol/atol 1e-9.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- New `docs/credits.md` (linked into the docs nav and the README), crediting
  the methods pylcm builds on (Carroll EGM; Iskhakov–Jørgensen–Rust–Schjerning
  DC-EGM; Dobrescu–Shanker FUES; Fella generalized EGM; the Druedahl /
  Druedahl–Jørgensen / Dobrescu–Shanker multidimensional-EGM methods behind the
  reserved upper-envelope backend), the discretization methods (Tauchen,
  Rouwenhorst, Kopecky–Suen, Fella–Gallipoli–Pan), the replicated example models
  shipped in pylcm (IJRS 2017, Mahler–Yum 2024), QuantEcon, and the open-source
  ecosystem (OSE, NumEconCopenhagen, akshayshanker, JeppeDruedahl, fediskhakov).
- Add the missing BibTeX entries (Carroll 2006, Fella 2014, Dobrescu–Shanker
  FUES 2022 and RFC 2024, Druedahl 2021, Druedahl–Jørgensen 2017).
- Fix the Mahler & Yum (2024) page range in references.bib: 1697–1733
  (Econometrica 92(5)), not 1307–1343.

Every citation verified against the journal/DOI or the authors' repositories.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pylcm's Fast Upper-Envelope Scan was adapted from the
`OpenSourceEconomics/upper-envelope` package (Apache-2.0, © The Upper-Envelope
Authors) and since substantially modified. Both projects are Apache-2.0, so the
derivation is permitted; this records the upstream attribution and the
modification notice (Apache-2.0 §4(b)/(c)) in the `fues.py` header and the
Credits page, which previously credited only the method's paper.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pylcm's Fast Upper-Envelope Scan is adapted from OpenSourceEconomics/upper-envelope
(Apache-2.0). Add a root `NOTICE` recording that derivation and the upstream
copyright, and vendor the upstream license text verbatim at
`licenses/upper-envelope-LICENSE`. The in-source attribution in `fues.py` (which
ships in the wheel via `src/`) already carries the modification notice; these
files add the conventional repo/sdist-level attribution.

No other part of the codebase derives from third-party source — the DC-EGM kernel
and the NEGM Phase-0 toys are written against pylcm's own API — so only the
upper-envelope derivation is recorded.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the pure spec-introspection helpers (discrete/passive/process state
names, child state name, child discrete actions, child resources function and
its arg names, regime-function concatenation) out of step.py into
egm/regime_introspection.py. These read only a regime's static spec and hold
no kernel state, so they form the dependency leaf that the kernel build, the
continuation subsystem, and the kernel-scope checks all import from — breaking
the otherwise-circular dependency between step.py and the scope checks that
move next. Behavior-preserving pure move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the four `_find_unsupported_*` functions — the build-time checks that
name features outside the kernel's current scope — out of step.py into
egm/kernel_scope.py. They read the processed kernel-build context (period
carry targets, qualified flat params, the regime's VInterpolationInfo), which
the model-time `validation` module does not have, so they belong with the
kernel build rather than with the model-construction validators. step.py keeps
calling `_find_unsupported_feature` to install the raising step. Behavior-
preserving pure move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the multi-target continuation machinery — target splitting, the per-child
carry reader, per-row resources queries and gradients, the stochastic-node
expectation, passive-state blending, discrete-choice aggregation (hard max or
EV1 logsum), and the child-read build — out of step.py into egm/continuation.py.
The core EGM step and the asset-row step keep calling `_get_child_carry_reader`
/ `_build_child_reads` / `get_egm_continuation_targets`, so step.py shrinks by
~870 lines and the textbook Euler inversion no longer sits in the same file as
the multi-target, passive-state, taste-shock, and stochastic-node logic.
Behavior-preserving pure move; the bound continuation contract is a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the certifiable core algorithm — _EgmKernelPieces, _get_solve_one_combo,
_compute_nodes_over_savings, _get_compute_node, _compute_constrained_candidates,
_publish_V_and_carry_rows, and the constrained-offset constant — out of step.py
into egm/step_core.py. It reads the expected continuation through
continuation's per-target carry reader and depends on nothing in asset-row mode
or the kernel orchestration. step.py keeps the orchestration (build, dispatch,
combo map) and the asset-row path (extracted next). The one test importing
_publish_V_and_carry_rows from step is repointed to step_core. Behavior-
preserving pure move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the per-Euler-node solve — _get_solve_one_combo_asset_rows, its
differentiated continuation closure _get_expected_continuation_value, and the
scalar-query publisher _publish_node_V_and_policy — out of step.py into
egm/asset_row.py. It maps step_core's single-post-state pipeline over the Euler
grid and consumes continuation's per-target reader; step.py keeps only the
orchestration (build, dispatch, combo map, feasibility, raising step). This
completes the split: the textbook core (step_core) and the asset-row extension
(asset_row) are now separate, certifiable modules. Behavior-preserving pure
move; the streaming refine-to-query feature lands here next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collapse the multi-target continuation aggregation into `ContinuationPlan`
(build-time statics) plus `bind_continuation`, which binds the plan to one
combo pool and the next period's carries and returns a
`savings -> (expected value, expected marginal)` callable. The textbook step
(`_get_compute_node`) and the asset-row step (`_get_expected_continuation_value`)
read only that bound callable instead of each re-deriving the regime-transition
probabilities, child carry readers, and per-target aggregation.

Behavior-preserving: the bound callable performs the identical aggregation, and
it evaluates the probabilities and child reads from the combo pool *inside* the
builder, so the asset-row Euler-slot gradient still carries the direct
dP/da . EV terms (precomputing them outside the differentiated closure would
silently drop them).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…the GridSearch rename.

The re-rooted `SolverBuildContext` / `SolverKernels` are beartyped by the package
claw, so under PEP 649 their field annotations resolve to real objects when a
context is constructed at model build. `UserRegime` and `VInterpolationInfo`
(whose module imports `lcm.regime`) close an import cycle through the
`lcm.solvers` façade, so each is referenced via a two-form alias — precise for ty
under `TYPE_CHECKING`, a bare container for the beartype claw at runtime. The
other engine types import normally.

The `BruteForce` → `GridSearch` rename had not reached the DC-EGM oracle test
modules; propagate it, including the validation-message `match=` string.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`test_model_with_dcegm_solver_builds` asserts that selecting the DC-EGM solver
builds the model rather than being rejected, but it placed the solver on the
stock iskhakov `working_life` regime, which lacks the resources, post-decision,
and inverse-marginal-utility functions the DC-EGM contract requires — so
validation correctly rejected it. Point the test at `build_dcegm_model`, whose
regimes satisfy the contract.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…r 2)

Remove the residual solver-type fork in the backward-induction loop. The
loop no longer branches `if regime.solution.egm_step is not None`; every
regime — grid search or DC-EGM — exposes one uniform per-period adapter that
wraps its shared jitted core, calls it with the solver's own argument layout,
and assembles a `KernelResult(V_arr, carry, sim_policy)` outside JIT. The only
branches left in the loop are on the optional generic outputs
(`result.carry is not None` / `result.sim_policy is not None`), never on
solver type.

Contract (`_lcm/solution/contract.py`):
- Add `KernelResult` (frozen dataclass) and `PeriodKernel` (runtime_checkable
  structural Protocol exposing `core`, `build_lower_args`, `with_fixed_params`,
  `__call__`).
- Reshape `SolverKernels` → `SolutionKernels(period_kernels, continuation_template)`,
  dropping the `max_Q_over_a` / `egm_step` / `egm_carry_template` /
  `egm_reachable_targets` quartet.
- Name the continuation channel solver-agnostically via `ContinuationPayload`
  (= `EGMCarry` today).

Solvers (`_lcm/solution/solvers.py`):
- `GridSearch`/`DCEGM.build_period_kernels` build per-period adapter closures
  (`_GridSearchPeriodKernel` / `_DCEGMPeriodKernel`) over the existing jitted
  core; the EGM core's reachable-carry filter and param projection move onto
  the DC-EGM adapter.

Engine wiring:
- `SolutionPhase` carries `period_kernels` + `continuation_template` (plus a
  `solves_via_dcegm` property); the cyclic `PeriodKernel` annotation uses the
  TYPE_CHECKING-precise / runtime-widened two-definition pattern so the
  beartype claw resolves the dataclass field at construction.
- The terminal-continuation publisher is composed engine-side
  (`_TerminalCarryPeriodKernel` in regime_building/processing.py) as an
  output decorator, not inside `GridSearch`.
- `model_processing._partial_fixed_params_into_regimes` binds fixed params via
  each adapter's `with_fixed_params`; `simulation/additional_targets.py` reads
  `solves_via_dcegm`.

Compilation reuse and the loop's memory/host-offload/gc discipline are
preserved: only the shared core is jitted and identity-deduped (by
`id(Q_and_F)` / function identity), no adapter is jitted per period, and the
solve loop's device eviction / carry-buffer release / reachable-carry subset
logic is untouched (dispatch re-routed only). Add
`test_period_kernels_sharing_a_config_reuse_one_compiled_core` asserting the
distinct-compiled-core count.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In asset-row mode the per-node solve refines a full NaN-padded upper
envelope of static length n_pad and interpolates it at exactly one
query (resources_at_node) to publish a scalar (V_node, policy_node),
then discards it. Batched over combos x Euler-nodes, that n_pad scratch
axis is the binder that OOMs large solves.

Fold the single-query interpolation into the upper-envelope scan so the
n_pad rows never materialize:

- Extract `_interp_between_nodes` in interp.py — the pure two-bracket-node
  arithmetic plus the Hermite correction, shared by `interp_on_prepared_grid`
  and the streamed path. The existing interp unit tests pin it (behavior-
  preserving). This guarantees the streamed value cannot diverge from
  row-then-interp: only which two nodes differs.
- Add `refine_to_bracket` + `QueryBracket` in fues.py — a new scan driver
  reusing `_inspect_candidate` verbatim, changing only the emission sink: each
  step's up-to-3 emitted points fold into an O(1) bracket-capture carry
  (first/second, rolling prev/last, lo = latest emitted with grid <= q, hi =
  first with grid > q, running counts), no [n_input,3] stack and no [n_pad]
  rows. Post-scan, the bracket reproduces
  clip(searchsorted(side="right"), 1, max(n_kept-1, 1)) node-for-node: the
  side="right" tie-break at a duplicated kink (right copy wins the lower slot),
  the below-first clamp (first, second), and the at-or-above-last clamp
  (second-last, last; single-live falls back to the NaN-padded slot like the
  row path). Geometry only — no utility, borrowing limit, or floor.
- Add `publish_node_from_bracket` in asset_row.py — the scalar-bracket
  counterpart of `_publish_node_V_and_policy`: the shared `_interp_between_nodes`
  with the value Hermite slope = grad(utility) at the two bracket policies only,
  plus the constrained floor and the n_kept > n_pad overflow poison, preserved
  identically. Wire the asset-row node solve to refine_to_bracket ->
  publish_node_from_bracket and drop the n_pad envelope from that path.

Scope: asset-row mode only. Single-post-state mode publishes the refined
envelope AS its inter-period carry (queried later at many parent points), so
step_core's single-post-state path keeps calling `refine` unchanged.

The new equivalence test pins the streamed pair against refine +
_publish_node_V_and_policy for the same candidates + query, to fp tolerance:
smooth, kinked, multi-crossing, all-dead, single-live, overflow, and query
below-first / above-last / exactly on a duplicated kink abscissa.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The backward-induction loop forced a `gc.collect()` every period to return the
device pool from a rolled-off continuation carry (a registered pytree CPython
frees only on a cyclic collection). A grid-search period rolls no such carry, so
the collection bought nothing there while its fixed cost dominated the warm solve
of small / many-period models — deteriorating those benchmarks by up to ~25x
(the Iskhakov-et-al grid-search warm solve: 1.01s -> 0.35s).

Gate the collection on the loop's own per-period carry output
(`period_egm_carries`), empty exactly when no carry rolled. The backward-
induction loop stays backend-agnostic — it does not fork on solver type or read
`solution.solves_via_dcegm`; the clean separation of the solve backends is a
separate design task.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The splay/batch-size invariance assertions hard-coded float64 tolerances
(rtol/atol 1e-12 for the combo axes, 1e-9 for the stochastic-node reduction),
which the 32-bit CI job cannot meet: rescheduling the lax.map / reordering the
weighted-sum adds shifts results at single-precision scale. Key the tolerance
off conftest's X64_ENABLED, matching the established pattern in
test_economic_validation / test_grids.

Also stop test_egm_interp forcing dtype=jnp.float64 for its interpolation grid;
under --precision=32 that raises a float64-truncated-to-float32 UserWarning that
CI treats as an error. Use the canonical default float dtype instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
hmgaudecker and others added 2 commits June 21, 2026 18:48
The solve module's job is backward induction; name it for that. Renames
src/_lcm/solution/solve_brute.py → backward_induction.py and
tests/solution/test_solve_brute.py → test_backward_induction.py, updates
every import (model.py, simulation/compile.py, test_beartype_claw.py),
the two test-function names, and the doc/docstring references.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A module-level jnp.array constant triggers the PREALLOCATE=true pool
reservation on device 0 at import time — before any solve, and even in
processes that only import the model code to schedule work. A second such
process then OOMs on device 0 at startup. Document the host-array (numpy)
pattern for module-level constants in the tuning guide's memory section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Base automatically changed from feat/solver-seam to main June 21, 2026 16:57
Port the Rooftop-Cut (RFC) upper-envelope backend into the current DC-EGM
solver and wire it as a prime-time option alongside FUES.

- src/_lcm/egm/upper_envelope/rfc.py: the RFC backend (refine_envelope +
  _dominated_within_radius + _has_policy_jump), a parallel per-point dominance
  test that only deletes candidates (no crossing insertion). Drop-in: same
  refined-row contract as fues.refine_envelope, plus the candidate supgradient
  mu = dv/dR it reads to build each point's tangent.
- get_upper_envelope dispatches RFC when solver.upper_envelope == "rfc",
  mirroring the FUES branch and passing rfc_jump_thresh / rfc_search_radius.
  The UpperEnvelopeBackend Protocol now carries marginal_utility (FUES drops
  it; RFC uses it); step_core computes the candidate supgradient via the new
  _candidate_supgradient helper and passes it to pieces.refine.
- DCEGM config: upper_envelope Literal gains "rfc", plus rfc_jump_thresh and
  rfc_search_radius fields with field validators matching the fues_* style.
- Asset-row + RFC: get_bracket_finder gains a non-streamed RFC path. RFC has no
  streamed bracket finder, so the finder materializes the full refined envelope
  via refine_envelope_rfc and reads the same searchsorted(side="right")-clamped
  bracket the row path uses (_bracket_from_refined_row), so the published
  (V, policy) is identical to full-envelope-then-interpolate. RFC's asset-row
  path does NOT yet get refine-to-query's n_pad memory win (FUES-specific
  streaming; a streamed RFC finder is future work). asset_row passes the
  candidate supgradient through.
- Tests: tests/solution/test_rfc.py (RFC kernel spec + the new non-streamed
  bracket-vs-full-envelope parity cases) and tests/solution/test_rfc_fues_parity.py
  (the RFC == FUES V-parity oracle on the DC-EGM solve battery).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant