Bake numba JIT cache into model-worker #1398
Merged
`NUMBA_CPU_FEATURES=` is unlikely to slow execution (I hope).

`NUMBA_CPU_NAME=x86-64-v3` tells LLVM to compile targeting the x86-64-v3 microarchitecture level, which already implies AVX2, FMA, BMI1/BMI2, F16C, LZCNT and MOVBE, including the SIMD instructions (AVX2, FMA) that matter for numeric/financial workloads. Those are still active.

What `NUMBA_CPU_FEATURES=` suppresses is the host-specific extras that auto-detection adds on top:

| Machine | Auto-detected extras |
| --- | --- |
| AMD Zen3 | sha, sse4a, clzero, rdpru, mwaitx, wbnoinvd |
| Intel (typical) | avx512f, avx512bw, various Intel extensions |

For oasislmf's pytools workloads (array arithmetic, loss calculations, stream I/O), none of those vendor-specific extensions are in the hot path. Numba wouldn't generate SHA instructions for financial model maths regardless. The meaningful speedup from AVX2 and FMA is preserved.

The only scenario where you'd see a slowdown is if a future oasislmf function were explicitly written to exploit avx512f on Intel. At that point you'd want to reconsider the portability trade-off, but for now there's no practical impact.
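As a rough illustration of the settings discussed in this comment, the sketch below builds an environment with the two variables pinned. `NUMBA_CPU_NAME` and `NUMBA_CPU_FEATURES` are Numba's documented environment variables; the helper name `portable_numba_env` and the sample `PATH` value are hypothetical and not part of this PR.

```python
import os


def portable_numba_env(base_env=None, cpu_name="x86-64-v3"):
    """Return a copy of the environment with Numba's CPU target pinned.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NUMBA_CPU_NAME"] = cpu_name  # baseline microarchitecture level
    env["NUMBA_CPU_FEATURES"] = ""    # blank: no host-specific extras on top
    return env


# Example with a minimal stand-in environment.
env = portable_numba_env({"PATH": "/usr/bin"})
```

A dict like this could be passed as `env=` to `subprocess.run` when launching a warm-up process. The variables need to be in place before Numba is imported, since Numba reads its configuration at import time.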
sstruzik
approved these changes
Jun 1, 2026
Bake numba JIT cache into model-worker
Notes:
- `src/utils/warmup-serial.py` is needed to avoid a race condition by calling the oasislmf warmup with `warmup(max_workers=1)`. A possible fix in "Fixes for warm up code" (OasisLMF#2001) could mean it's no longer needed.
- The cache is stored in `/home/worker/.numba_jit_cache`, set using `NUMBA_CACHE_DIR`.
- `NUMBA_CPU_FEATURES` is set to blank.
- `scripts/test-jit-cache.sh` checks that running an image will use the cache files, by re-running the warm-up and checking the cache for regenerated files.
- `*.py` files in oasislmf MUST BE timestamp-normalized for this to work. This happens automatically when building on an image pulled from a registry. To apply the same locally, we used `touch` to zero the timestamps: `RUN find /root/.local -exec touch -d @0 {} +`

Numba JIT Cache: Baked into Docker Image
Problem
The `model_worker` image was not using pre-compiled Numba JIT cache files. Although `.nbi`/`.nbc` files were present in the image, Numba considered them invalid at runtime and recompiled everything from scratch on first use, adding significant startup latency.
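One way to observe this symptom is to snapshot the cache directory before and after a warm-up run and see which `.nbi`/`.nbc` files were rewritten. The sketch below is a minimal Python illustration of that idea, not the contents of `scripts/test-jit-cache.sh`. The helper names `snapshot_cache` and `regenerated_files` are hypothetical, and a temporary directory stands in for the real cache.

```python
import os
import tempfile


def snapshot_cache(cache_dir):
    """Map each .nbi/.nbc file under cache_dir to its (mtime_ns, size)."""
    snap = {}
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith((".nbi", ".nbc")):
                path = os.path.join(root, name)
                st = os.stat(path)
                snap[path] = (st.st_mtime_ns, st.st_size)
    return snap


def regenerated_files(before, after):
    """Return cache files that are new or whose stamp changed."""
    return sorted(p for p, stamp in after.items() if before.get(p) != stamp)


# Demo: a throwaway directory stands in for the real cache directory.
with tempfile.TemporaryDirectory() as cache_dir:
    kept = os.path.join(cache_dir, "kept.nbi")
    with open(kept, "wb") as fh:
        fh.write(b"index")
    before = snapshot_cache(cache_dir)

    # Simulate a recompile writing a fresh cache entry.
    fresh = os.path.join(cache_dir, "fresh.nbc")
    with open(fresh, "wb") as fh:
        fh.write(b"code")
    after = snapshot_cache(cache_dir)

    changed = [os.path.basename(p) for p in regenerated_files(before, after)]
```

If the baked cache is being honoured, re-running the warm-up should leave the list of regenerated files empty.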
Root Cause
Numba validates its cache by comparing the mtime of the source `.py` file recorded in the `.nbi` index against the actual mtime of the file at runtime. If they differ, the cache is considered stale.
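The check described above can be imitated with the standard library alone. This is a simplified sketch of an mtime-based staleness test, not Numba's actual implementation. The function names `record_stamp` and `cache_is_fresh` are hypothetical.

```python
import os
import tempfile


def record_stamp(source_path):
    """Record the source file's mtime, as a cache index might at compile time."""
    return os.stat(source_path).st_mtime


def cache_is_fresh(source_path, recorded_mtime):
    """Treat the cache as usable only if the source mtime still matches."""
    return os.stat(source_path).st_mtime == recorded_mtime


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "module.py")
    with open(src, "w") as fh:
        fh.write("x = 1\n")
    os.utime(src, (1_000_000, 1_000_000))  # pin atime/mtime to a known value

    stamp = record_stamp(src)
    fresh_before = cache_is_fresh(src, stamp)

    # Same file content, different mtime: the recorded stamp no longer matches.
    os.utime(src, (2_000_000, 2_000_000))
    fresh_after = cache_is_fresh(src, stamp)
```

The file content never changes in this demo. Only the timestamp does, which is enough for a check of this kind to reject the cache.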
In a multi-stage Docker build, `pip install` sets file timestamps to the current build time. When those files are copied between stages (e.g. `COPY --from=jit-warmup`), Docker can reset the mtimes or let them drift, causing a mismatch between what Numba recorded during warmup and what it sees at runtime.

Registry-pulled images (e.g. `coreoasis/model_worker:2.5.3`) don't have this problem because their layer timestamps are fixed at the time the image was originally built and pushed, so they never change.
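Pinning every timestamp to a fixed value gives locally built files the same stability. The sketch below is a Python analogue of the `find ... -exec touch -d @0 {} +` command from the notes, shown only to make the idea concrete. The helper name `normalise_mtimes` is hypothetical, and the PR itself uses `touch` in the Dockerfile.

```python
import os
import tempfile


def normalise_mtimes(root, timestamp=0):
    """Set atime and mtime of root and everything under it to timestamp.

    Returns the number of paths touched. Like plain `touch`, os.utime
    follows symlinks, so a dangling symlink would raise an error here.
    """
    count = 0
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            os.utime(os.path.join(dirpath, name), (timestamp, timestamp))
            count += 1
    os.utime(root, (timestamp, timestamp))
    return count + 1


# Demo on a throwaway tree: one package directory containing one module.
with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "pkg")
    os.mkdir(pkg)
    path = os.path.join(pkg, "module.py")
    with open(path, "w") as fh:
        fh.write("x = 1\n")

    touched = normalise_mtimes(root)
    mtime_after = os.stat(path).st_mtime
```

After normalisation the recorded and observed mtimes can only agree, as long as whatever copies the files afterwards preserves them.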
Solution
Two changes to `Dockerfile.model_worker`:

1. Normalise pip-installed file timestamps to epoch 0

Added to the `build-packages` stage, after all `pip install` steps: