Add the DC-EGM solver (discrete-continuous endogenous grid method)#390
Draft
hmgaudecker wants to merge 25 commits into
Draft
Add the DC-EGM solver (discrete-continuous endogenous grid method)#390hmgaudecker wants to merge 25 commits into
hmgaudecker wants to merge 25 commits into
Conversation
Route the existing brute-force grid search through a per-regime solver configuration and a builder registry, with no change to the numerics. - `Regime.solver: BruteForce | DCEGM` (default `BruteForce()`), exported from `lcm` alongside the `DCEGM` configuration class. - `_lcm.solution.registry`: `SolverBuildContext`, `SolverKernels`, the `SolverKernelBuilder` protocol, and `_build_brute_force_kernels` (the former `_build_max_Q_over_a_per_period`), dispatched on `type(regime.solver)` via `SOLVER_KERNEL_BUILDERS`. - `DCEGM` is published as the final configuration surface (fields + field validation), but its engine is not yet wired in; a regime requesting it is rejected at model build with `NotImplementedError`. Behavior-preserving: the full test suite passes unchanged and an explicit `BruteForce()` yields the same value function as the default. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Points the benchmark feature's aca-model at feat/dcegm-solver's tip — main's audit fixes + the DC-EGM solver + smooth-share eligibility — the version the solver-seam work is developed against. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the `type(solver)`-keyed builder registry with a polymorphic `Solver` ABC. The engine now calls `solver.validate(context)` then `solver.build_period_kernels(context)` — no `SOLVER_KERNEL_BUILDERS` dict, no `BruteForce | DCEGM` union, no standalone DC-EGM guard. - `_lcm/solution/contract.py` (new): the `Solver` ABC (abstract `build_period_kernels`, default no-op `validate`), `SolverBuildContext`, and `SolverKernels`. An engine leaf — imports nothing that reaches `lcm.solvers`, so the façade can re-export it without an import cycle. - `_lcm/solution/solvers.py` (new): `GridSearch(Solver)` (the relocated grid-search builder, with function-local `jax`/`get_max_Q_over_a` imports) and `DCEGM(Solver)` (the published config; `validate` raises the not-yet-available guard, so a regime requesting it is rejected at model build). - `lcm/solvers.py` → thin re-export façade; `registry.py` deleted; the `processing` dispatch and `Regime.solver` field updated. - Rename the default solver `BruteForce` → `GridSearch` (more descriptive; alpha permits the break). Faithful to dcegm-solver-seam-abc-design.md (ABC, not Protocol). Layer 2 (generic KernelResult) does not apply here — the stub seam has no EGM fork. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Benchmark comparison (main → HEAD)Comparing
|
e0fecdd to
44e8522
Compare
The full discrete-continuous endogenous grid method, re-rooted as a single commit on top of the solver-selection seam (PR #388). The tree reproduces the reviewed feat/dcegm tip exactly, minus two root-level audit scratch files. On top of the seam this adds the `_lcm.egm` engine, DC-EGM build-time validation, the `_build_dcegm_kernels` builder and the EGM-specific SolverBuildContext / SolverKernels / SolutionPhase fields, and replaces the seam's build-time `NotImplementedError` guard with the real solver. To be split into reviewable sub-PRs at review time. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The child stochastic-node expectation built the full node-stacked `node_values`/`node_marginals` arrays and reduced them with a weighted sum downstream of the map. So `stochastic_node_batch_size` (a `lax.map`) shed only the per-node gather working-set while still materialising the full `(..., n_nodes)` output — half a fix, and at large scale the residual stack can bind and even raise peak. Fold the weighted sum into a `lax.scan` carry instead: each block reads only its nodes and accumulates the running expectation, so neither the per-node gather nor the full node-stacked output is materialised. `0` (or a size >= the mesh) keeps the single fused vmap + reduction. The block reduction reorders the floating-point adds, so the value function matches the fused solve to tight numerical tolerance rather than bit-identically; the node-batch invariance test asserts `allclose` at rtol/atol 1e-9. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- New `docs/credits.md` (linked into the docs nav and the README), crediting the methods pylcm builds on (Carroll EGM; Iskhakov–Jørgensen–Rust–Schjerning DC-EGM; Dobrescu–Shanker FUES; Fella generalized EGM; the Druedahl / Druedahl–Jørgensen / Dobrescu–Shanker multidimensional-EGM methods behind the reserved upper-envelope backend), the discretization methods (Tauchen, Rouwenhorst, Kopecky–Suen, Fella–Gallipoli–Pan), the replicated example models shipped in pylcm (IJRS 2017, Mahler–Yum 2024), QuantEcon, and the open-source ecosystem (OSE, NumEconCopenhagen, akshayshanker, JeppeDruedahl, fediskhakov). - Add the missing BibTeX entries (Carroll 2006, Fella 2014, Dobrescu–Shanker FUES 2022 and RFC 2024, Druedahl 2021, Druedahl–Jørgensen 2017). - Fix the Mahler & Yum (2024) page range in references.bib: 1697–1733 (Econometrica 92(5)), not 1307–1343. Every citation verified against the journal/DOI or the authors' repositories. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pylcm's Fast Upper-Envelope Scan was adapted from the `OpenSourceEconomics/upper-envelope` package (Apache-2.0, © The Upper-Envelope Authors) and since substantially modified. Both projects are Apache-2.0, so the derivation is permitted; this records the upstream attribution and the modification notice (Apache-2.0 §4(b)/(c)) in the `fues.py` header and the Credits page, which previously credited only the method's paper. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pylcm's Fast Upper-Envelope Scan is adapted from OpenSourceEconomics/upper-envelope (Apache-2.0). Add a root `NOTICE` recording that derivation and the upstream copyright, and vendor the upstream license text verbatim at `licenses/upper-envelope-LICENSE`. The in-source attribution in `fues.py` (which ships in the wheel via `src/`) already carries the modification notice; these files add the conventional repo/sdist-level attribution. No other part of the codebase derives from third-party source — the DC-EGM kernel and the NEGM Phase-0 toys are written against pylcm's own API — so only the upper-envelope derivation is recorded. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the pure spec-introspection helpers (discrete/passive/process state names, child state name, child discrete actions, child resources function and its arg names, regime-function concatenation) out of step.py into egm/regime_introspection.py. These read only a regime's static spec and hold no kernel state, so they form the dependency leaf that the kernel build, the continuation subsystem, and the kernel-scope checks all import from — breaking the otherwise-circular dependency between step.py and the scope checks that move next. Behavior-preserving pure move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the four `_find_unsupported_*` functions — the build-time checks that name features outside the kernel's current scope — out of step.py into egm/kernel_scope.py. They read the processed kernel-build context (period carry targets, qualified flat params, the regime's VInterpolationInfo), which the model-time `validation` module does not have, so they belong with the kernel build rather than with the model-construction validators. step.py keeps calling `_find_unsupported_feature` to install the raising step. Behavior- preserving pure move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the multi-target continuation machinery — target splitting, the per-child carry reader, per-row resources queries and gradients, the stochastic-node expectation, passive-state blending, discrete-choice aggregation (hard max or EV1 logsum), and the child-read build — out of step.py into egm/continuation.py. The core EGM step and the asset-row step keep calling `_get_child_carry_reader` / `_build_child_reads` / `get_egm_continuation_targets`, so step.py shrinks by ~870 lines and the textbook Euler inversion no longer sits in the same file as the multi-target, passive-state, taste-shock, and stochastic-node logic. Behavior-preserving pure move; the bound continuation contract is a follow-up. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the certifiable core algorithm — _EgmKernelPieces, _get_solve_one_combo, _compute_nodes_over_savings, _get_compute_node, _compute_constrained_candidates, _publish_V_and_carry_rows, and the constrained-offset constant — out of step.py into egm/step_core.py. It reads the expected continuation through continuation's per-target carry reader and depends on nothing in asset-row mode or the kernel orchestration. step.py keeps the orchestration (build, dispatch, combo map) and the asset-row path (extracted next). The one test importing _publish_V_and_carry_rows from step is repointed to step_core. Behavior- preserving pure move. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the per-Euler-node solve — _get_solve_one_combo_asset_rows, its differentiated continuation closure _get_expected_continuation_value, and the scalar-query publisher _publish_node_V_and_policy — out of step.py into egm/asset_row.py. It maps step_core's single-post-state pipeline over the Euler grid and consumes continuation's per-target reader; step.py keeps only the orchestration (build, dispatch, combo map, feasibility, raising step). This completes the split: the textbook core (step_core) and the asset-row extension (asset_row) are now separate, certifiable modules. Behavior-preserving pure move; the streaming refine-to-query feature lands here next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collapse the multi-target continuation aggregation into `ContinuationPlan` (build-time statics) plus `bind_continuation`, which binds the plan to one combo pool and the next period's carries and returns a `savings -> (expected value, expected marginal)` callable. The textbook step (`_get_compute_node`) and the asset-row step (`_get_expected_continuation_value`) read only that bound callable instead of each re-deriving the regime-transition probabilities, child carry readers, and per-target aggregation. Behavior-preserving: the bound callable performs the identical aggregation, and it evaluates the probabilities and child reads from the combo pool *inside* the builder, so the asset-row Euler-slot gradient still carries the direct dP/da . EV terms (precomputing them outside the differentiated closure would silently drop them). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…the GridSearch rename. The re-rooted `SolverBuildContext` / `SolverKernels` are beartyped by the package claw, so under PEP 649 their field annotations resolve to real objects when a context is constructed at model build. `UserRegime` and `VInterpolationInfo` (whose module imports `lcm.regime`) close an import cycle through the `lcm.solvers` façade, so each is referenced via a two-form alias — precise for ty under `TYPE_CHECKING`, a bare container for the beartype claw at runtime. The other engine types import normally. The `BruteForce` → `GridSearch` rename had not reached the DC-EGM oracle test modules; propagate it, including the validation-message `match=` string. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`test_model_with_dcegm_solver_builds` asserts that selecting the DC-EGM solver builds the model rather than being rejected, but it placed the solver on the stock iskhakov `working_life` regime, which lacks the resources, post-decision, and inverse-marginal-utility functions the DC-EGM contract requires — so validation correctly rejected it. Point the test at `build_dcegm_model`, whose regimes satisfy the contract. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…r 2) Remove the residual solver-type fork in the backward-induction loop. The loop no longer branches `if regime.solution.egm_step is not None`; every regime — grid search or DC-EGM — exposes one uniform per-period adapter that wraps its shared jitted core, calls it with the solver's own argument layout, and assembles a `KernelResult(V_arr, carry, sim_policy)` outside JIT. The only branches left in the loop are on the optional generic outputs (`result.carry is not None` / `result.sim_policy is not None`), never on solver type. Contract (`_lcm/solution/contract.py`): - Add `KernelResult` (frozen dataclass) and `PeriodKernel` (runtime_checkable structural Protocol exposing `core`, `build_lower_args`, `with_fixed_params`, `__call__`). - Reshape `SolverKernels` → `SolutionKernels(period_kernels, continuation_template)`, dropping the `max_Q_over_a` / `egm_step` / `egm_carry_template` / `egm_reachable_targets` quartet. - Name the continuation channel solver-agnostically via `ContinuationPayload` (= `EGMCarry` today). Solvers (`_lcm/solution/solvers.py`): - `GridSearch`/`DCEGM.build_period_kernels` build per-period adapter closures (`_GridSearchPeriodKernel` / `_DCEGMPeriodKernel`) over the existing jitted core; the EGM core's reachable-carry filter and param projection move onto the DC-EGM adapter. Engine wiring: - `SolutionPhase` carries `period_kernels` + `continuation_template` (plus a `solves_via_dcegm` property); the cyclic `PeriodKernel` annotation uses the TYPE_CHECKING-precise / runtime-widened two-definition pattern so the beartype claw resolves the dataclass field at construction. - The terminal-continuation publisher is composed engine-side (`_TerminalCarryPeriodKernel` in regime_building/processing.py) as an output decorator, not inside `GridSearch`. - `model_processing._partial_fixed_params_into_regimes` binds fixed params via each adapter's `with_fixed_params`; `simulation/additional_targets.py` reads `solves_via_dcegm`. Compilation reuse and the loop's memory/host-offload/gc discipline are preserved: only the shared core is jitted and identity-deduped (by `id(Q_and_F)` / function identity), no adapter is jitted per period, and the solve loop's device eviction / carry-buffer release / reachable-carry subset logic is untouched (dispatch re-routed only). Add `test_period_kernels_sharing_a_config_reuse_one_compiled_core` asserting the distinct-compiled-core count. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In asset-row mode the per-node solve refines a full NaN-padded upper envelope of static length n_pad and interpolates it at exactly one query (resources_at_node) to publish a scalar (V_node, policy_node), then discards it. Batched over combos x Euler-nodes, that n_pad scratch axis is the binder that OOMs large solves. Fold the single-query interpolation into the upper-envelope scan so the n_pad rows never materialize: - Extract `_interp_between_nodes` in interp.py — the pure two-bracket-node arithmetic plus the Hermite correction, shared by `interp_on_prepared_grid` and the streamed path. The existing interp unit tests pin it (behavior- preserving). This guarantees the streamed value cannot diverge from row-then-interp: only which two nodes differs. - Add `refine_to_bracket` + `QueryBracket` in fues.py — a new scan driver reusing `_inspect_candidate` verbatim, changing only the emission sink: each step's up-to-3 emitted points fold into an O(1) bracket-capture carry (first/second, rolling prev/last, lo = latest emitted with grid <= q, hi = first with grid > q, running counts), no [n_input,3] stack and no [n_pad] rows. Post-scan, the bracket reproduces clip(searchsorted(side="right"), 1, max(n_kept-1, 1)) node-for-node: the side="right" tie-break at a duplicated kink (right copy wins the lower slot), the below-first clamp (first, second), and the at-or-above-last clamp (second-last, last; single-live falls back to the NaN-padded slot like the row path). Geometry only — no utility, borrowing limit, or floor. - Add `publish_node_from_bracket` in asset_row.py — the scalar-bracket counterpart of `_publish_node_V_and_policy`: the shared `_interp_between_nodes` with the value Hermite slope = grad(utility) at the two bracket policies only, plus the constrained floor and the n_kept > n_pad overflow poison, preserved identically. Wire the asset-row node solve to refine_to_bracket -> publish_node_from_bracket and drop the n_pad envelope from that path. Scope: asset-row mode only. Single-post-state mode publishes the refined envelope AS its inter-period carry (queried later at many parent points), so step_core's single-post-state path keeps calling `refine` unchanged. The new equivalence test pins the streamed pair against refine + _publish_node_V_and_policy for the same candidates + query, to fp tolerance: smooth, kinked, multi-crossing, all-dead, single-live, overflow, and query below-first / above-last / exactly on a duplicated kink abscissa. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The backward-induction loop forced a `gc.collect()` every period to return the device pool from a rolled-off continuation carry (a registered pytree CPython frees only on a cyclic collection). A grid-search period rolls no such carry, so the collection bought nothing there while its fixed cost dominated the warm solve of small / many-period models — deteriorating those benchmarks by up to ~25x (the Iskhakov-et-al grid-search warm solve: 1.01s -> 0.35s). Gate the collection on the loop's own per-period carry output (`period_egm_carries`), empty exactly when no carry rolled. The backward- induction loop stays backend-agnostic — it does not fork on solver type or read `solution.solves_via_dcegm`; the clean separation of the solve backends is a separate design task. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The splay/batch-size invariance assertions hard-coded float64 tolerances (rtol/atol 1e-12 for the combo axes, 1e-9 for the stochastic-node reduction), which the 32-bit CI job cannot meet: rescheduling the lax.map / reordering the weighted-sum adds shifts results at single-precision scale. Key the tolerance off conftest's X64_ENABLED, matching the established pattern in test_economic_validation / test_grids. Also stop test_egm_interp forcing dtype=jnp.float64 for its interpolation grid; under --precision=32 that raises a float64-truncated-to-float32 UserWarning that CI treats as an error. Use the canonical default float dtype instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The solve module's job is backward induction; name it for that. Renames src/_lcm/solution/solve_brute.py → backward_induction.py and tests/solution/test_solve_brute.py → test_backward_induction.py, updates every import (model.py, simulation/compile.py, test_beartype_claw.py), the two test-function names, and the doc/docstring references. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A module-level jnp.array constant triggers the PREALLOCATE=true pool reservation on device 0 at import time — before any solve, and even in processes that only import the model code to schedule work. A second such process then OOMs on device 0 at startup. Document the host-array (numpy) pattern for module-level constants in the tuning guide's memory section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# Conflicts: # tests/solution/test_backward_induction.py
Port the Rooftop-Cut (RFC) upper-envelope backend into the current DC-EGM solver and wire it as a prime-time option alongside FUES. - src/_lcm/egm/upper_envelope/rfc.py: the RFC backend (refine_envelope + _dominated_within_radius + _has_policy_jump), a parallel per-point dominance test that only deletes candidates (no crossing insertion). Drop-in: same refined-row contract as fues.refine_envelope, plus the candidate supgradient mu = dv/dR it reads to build each point's tangent. - get_upper_envelope dispatches RFC when solver.upper_envelope == "rfc", mirroring the FUES branch and passing rfc_jump_thresh / rfc_search_radius. The UpperEnvelopeBackend Protocol now carries marginal_utility (FUES drops it; RFC uses it); step_core computes the candidate supgradient via the new _candidate_supgradient helper and passes it to pieces.refine. - DCEGM config: upper_envelope Literal gains "rfc", plus rfc_jump_thresh and rfc_search_radius fields with field validators matching the fues_* style. - Asset-row + RFC: get_bracket_finder gains a non-streamed RFC path. RFC has no streamed bracket finder, so the finder materializes the full refined envelope via refine_envelope_rfc and reads the same searchsorted(side="right")-clamped bracket the row path uses (_bracket_from_refined_row), so the published (V, policy) is identical to full-envelope-then-interpolate. RFC's asset-row path does NOT yet get refine-to-query's n_pad memory win (FUES-specific streaming; a streamed RFC finder is future work). asset_row passes the candidate supgradient through. - Tests: tests/solution/test_rfc.py (RFC kernel spec + the new non-streamed bracket-vs-full-envelope parity cases) and tests/solution/test_rfc_fues_parity.py (the RFC == FUES V-parity oracle on the DC-EGM solve battery). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds the DC-EGM solver (discrete-continuous endogenous grid method) behind the
SolverABC seam introduced in #388. It inverts the Euler equation on an exogenousend-of-period grid instead of grid-searching the continuous action, with a FUES upper
envelope to drop dominated candidates. Selected per regime via
solver=DCEGM(...);GridSearch(the default) is untouched, and the engine dispatches polymorphically — nofork on solver type.
Commit structure (reviewable in steps)
envelope, the closed-form constrained segment, multi-target continuation, and
asset-row mode; wired through the ABC
DCEGM.validate/build_period_kernels.lax.scanblocks — a memory knob for the childstochastic-node mesh.
an acknowledgments page.
step.py(5 pure-move commits) —regime_introspection→kernel_scope→continuation→step_core→asset_row, taking the monolith from~2900 to ~800 lines so the simple single-post-state path reads cleanly apart from the
asset-row and continuation machinery.
savings → (expected_value, expected_marginal)callable (ContinuationPlan).package claw, so cyclic field types use the two-definition alias pattern (precise for
ty, bare container at runtime); plus a valid-DCEGM seam build test.
PeriodKernel/KernelResultadapter; the solve loopcalls one kernel and branches only on optional generic outputs, not on solver type.
FUES scan (
refine_to_bracket+publish_node_from_bracket), removing then_padintra-node envelope scratch. Single-post-state still publishes the full refined
envelope as its carry, unchanged.
Validation
-n 4, on a free GPU): 1405 passed, 15skipped, 2 xfailed. Covers the DC-EGM-vs-brute value-function-parity battery, the
19-case refine-to-query unit equivalence (streamed ≡ row-then-interp), and the Layer-2
distinct-compiled-cores regression.
tyandprekclean.Pending (does not block reviewing the engine)
n_padruntime win is only visible atACA scale — the in-suite oracles are too small to exercise it; handed to the aca side.
experiment this work unlocks, not settled here.
🤖 Generated with Claude Code