(WIP) Add per-link DEM error and velocity estimation #97
Open · scottstanie wants to merge 19 commits into isce-framework:main
Conversation
- Add docs/theory/dem-error-velocity.md covering the linear phase model, design matrix, temporal coherence objective, grid search optimization, model-guided wrapping, and edge-to-pixel integration
- Add docs/notebooks/dem_error_estimation.ipynb tutorial using Mexico City Capella X-band data to visualize DEM error estimation end-to-end
- Add Ferretti2001 PSI reference to docs/references.bib
- Update mkdocs.yml nav with Notebooks and Background Theory sections
- Update background-theory.md and tutorials.md index pages
- Add docs/notebooks/data/ to .gitignore for symlinked datasets
- Fix mypy error in _output.py by adding assert for Optional narrowing
- Fix ruff RUF059 in merge.py by renaming unused `info` to `_info`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename incidence_rad/incidence_deg to look_angle_rad/look_angle_deg throughout (design matrix, settings, CLI, docs, tests) and change sin(theta) to cos(theta) to use the look angle convention
- Remove velocity_slice/dem_error_slice properties from LinkModelSettings; inline slice(*range) at the call site
- Make link_params/link_coherence public attributes on EMCFSolver instead of private attrs behind @property

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The DEM error phase sensitivity is B_perp / (R * sin(theta)), where theta is the look angle measured from nadir, per Ferretti et al. (2001). The code was incorrectly using cos(theta), which introduces a ~24% scaling error at typical Sentinel-1 incidence angles (~39 deg). Updated the formula in the implementation, docstring, test, and theory documentation. https://claude.ai/code/session_01XDTewmAhQC41xXnTVs9tbw
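The corrected sensitivity can be sketched as a small helper. The function name and argument layout here are illustrative, not the PR's actual API:

```python
import numpy as np


def dem_error_phase_sensitivity(
    b_perp: float, slant_range: float, look_angle_deg: float, wavelength: float
) -> float:
    """Radians of interferometric phase per meter of DEM error.

    Sketch of the Ferretti et al. (2001) sensitivity described above.
    """
    theta = np.radians(look_angle_deg)
    # sin(theta), not cos(theta): theta is the look angle from nadir
    return (4 * np.pi / wavelength) * b_perp / (slant_range * np.sin(theta))
```

At 39 degrees, sin vs. cos differ by a factor of cot(39°) ≈ 1.24, which is where the ~24% scaling error comes from.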
The brute-force grid search called neg_temporal_coherence once per grid point, resulting in thousands of small numpy operations per link. Replace with a single batched evaluation that computes residuals and coherence for all grid points at once via matrix multiplication and vectorized np.exp, giving ~10-50x speedup on the grid search portion. https://claude.ai/code/session_01XDTewmAhQC41xXnTVs9tbw
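A minimal sketch of that batched evaluation (shapes, names, and the weighting scheme here are illustrative; the actual design matrix and objective live in the solver):

```python
import numpy as np


def batched_grid_search(matrix, grid, wrapped, weights):
    """Evaluate temporal coherence for every grid point in one shot.

    Illustrative shapes: matrix (nifg, 2) design matrix, grid (ngrid, 2)
    candidate (velocity, dem_error) pairs, wrapped (nifg,) wrapped phases,
    weights (nifg,) per-interferogram weights.
    """
    pred = grid @ matrix.T                       # (ngrid, nifg) model phases
    resid = np.exp(1j * (wrapped - pred))        # residuals on the unit circle
    coh = np.abs(resid @ weights) / weights.sum()  # (ngrid,) temporal coherence
    return grid[np.argmax(coh)], coh.max()
```

The per-grid-point loop collapses into one matrix multiply plus one vectorized `np.exp`, which is where the ~10-50x speedup comes from.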
Three key performance optimizations:

1. Precompute grid + forward model + complex exponential (lines 44-68)
   - `_grid_flat`, `_pred = matrix @ grid_flat.T`, and `_E = exp(1j * pred)` are now computed once in `__post_init__`
   - Previously, the grid was rebuilt inside every `_vectorized_grid_search` call (once per spatial link, so tens of thousands of times)
2. Batched grid search via a single BLAS matmul (lines 196-223)
   - Instead of evaluating each link individually with multiprocessing, all links are evaluated simultaneously: `coh_grid = E.T @ (wts * exp(-1j * wrapdata))  # (ngrid, nlinks)`
   - This is a single matrix multiply that NumPy's BLAS handles with internal threading
   - Eliminates all multiprocessing overhead (pickling matrix/ranges per link, IPC send/recv)
3. Quadratic refinement replaces Nelder-Mead (lines 225-289)
   - For each link, extracts the 3x3 coherence stencil around the best grid point
   - Fits a quadratic surface (9 points, 6 coefficients) via a precomputed pseudoinverse
   - Solves a 2x2 system analytically for the peak
   - Fully vectorized across all links: microseconds per link vs. milliseconds for scipy.fmin
   - Falls back to grid-only for non-concave stencils

Removed dead code:
- `solve()`, `wrap_solve()`, `_bounded_fmin()`: no longer called
- scipy.optimize, multiprocessing, neg_temporal_coherence imports
- Kept `_vectorized_grid_search` (directly tested)

Expected speedup, based on the profiling data (4512 s total model estimation across 12 processes):
- Grid search construction overhead: eliminated entirely
- Nelder-Mead (~2744 s aggregated): replaced with vectorized quadratic refinement
- Multiprocessing IPC overhead: eliminated
- The batched matmul replaces N individual grid searches with one BLAS call
Switch all multiprocessing from fork to forkserver context so worker processes don't inherit the parent's full address space. With fork, each worker got a copy of the SLC data, gradients, and link model arrays — Python's reference counting dirtied COW pages, causing 8 workers to balloon to 108 GB on macOS. Use Pool/Executor initializers to share constant data (dual graph arrays, solver, cost) once per worker instead of per task. Also chunk the grid search matmul in estimate_model_many so the intermediate (ngrid, nlinks) coherence matrix stays under ~512 MB instead of allocating a single 49 GB array for large grids. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
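The chunking of the grid-search matmul can be sketched like this. This is a standalone illustration, not the actual `estimate_model_many` code; the function name and the coherence expression are simplified:

```python
import numpy as np


def chunked_best_grid_index(E, data, max_bytes=512 * 2**20):
    """argmax over grid points of |E.conj().T @ data|, chunked over links.

    E: (nifg, ngrid) complex model phasors; data: (nifg, nlinks) complex
    observations. The (ngrid, chunk) intermediate is kept under
    ~max_bytes instead of materializing the full (ngrid, nlinks) array.
    """
    ngrid, nlinks = E.shape[1], data.shape[1]
    # complex128 is 16 bytes: choose chunk so ngrid * chunk * 16 <= max_bytes
    chunk = max(1, min(nlinks, max_bytes // (ngrid * 16)))
    best = np.empty(nlinks, dtype=np.int64)
    for start in range(0, nlinks, chunk):
        sl = slice(start, start + chunk)
        coh = np.abs(E.conj().T @ data[:, sl])  # (ngrid, chunk)
        best[sl] = coh.argmax(axis=0)
    return best
```

Each chunk is still a single BLAS call, so the speedup from batching is preserved while peak memory stays bounded.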
write_link_params was always re-integrating tile gradients and overwriting velocity/DEM error/coherence rasters even on reruns. Now checks if all expected output files exist before doing any expensive tile integration work. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
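The existence check amounts to something like the following sketch. The raster filenames here are hypothetical, not necessarily what `write_link_params` actually writes:

```python
from pathlib import Path

# Hypothetical output names; the actual rasters may differ.
_EXPECTED = ("velocity.tif", "dem_error.tif", "temporal_coherence.tif")


def outputs_exist(out_dir, names=_EXPECTED) -> bool:
    """True when every expected raster already exists, so the expensive
    tile integration can be skipped on reruns."""
    return all((Path(out_dir) / name).exists() for name in names)
```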
(Opening to make others aware, but some cleanup should probably be done)
I've been playing with an addition to spurt that @piyushrpt described a long time ago, but we never added: during the temporal unwrapping stage, a per-link phase velocity and DEM error can be estimated and removed before spatially unwrapping. This makes a noticeable difference on some high-res urban scenes with large sections of layover and big jumps due to DEM errors (e.g. the edge of the top of a skyscraper will be coherent, but its phase is totally unrelated to the next pixel over on the ground).
This has a lot in it, but the main things are:
A `--baseline-csv` file listing the perpendicular baseline terms. This is the part most open to reworking, since I just sort of picked a format that made sense and matched my own current processor. The fullest way to test would be to use my dolphin branch here, which added the plumbing to dolphin's unwrapper options.
OK, now the Claude summary:
What's new
Feature:
SLC[0] as reference.
sensitivity (rad per m). Uses sin(θ) for the DEM-error term (previously reviewed & corrected on-branch).
with a quadratic fit over the 3×3 coherence stencil around the best grid point. Fully vectorized across links.
New CLI args (all additive, all default to current behavior):
New settings type: LinkModelSettings with validation (wavelength > 0, look angle ∈ (0, 90), etc.).
Docs:
Tests added (~1100 lines):
Bundled fixes (called out so reviewers know they aren't DEM-error-specific)
solver state) are shared once per worker instead of inheriting the parent's entire address space. On macOS, COW sharing breaks down under Python refcounting, causing per-worker RSS to balloon. This is a
general-purpose fix and would be valuable independent of this PR.
How to use
```
python -m spurt.workflows.emcf \
    -i /path/to/phase-linked-stack \
    --baseline-csv /path/to/baselines.csv \
    --wavelength 0.031 \
    --look-angle-deg 35
```
Outputs appear in the output folder:
Things reviewers should look at
_solver.py:178-180.