Centralize IMAGE_TAG_PREFIX for Docker image tags#473
Conversation
All benchmark runners previously hardcoded SDK_SHORT_SHA for Docker
image tags, with each file duplicating an
os.getenv("SDK_SHORT_SHA", SDK_SHORT_SHA) override pattern. This
centralizes the tag prefix into a single constant.
- New IMAGE_TAG_PREFIX = os.getenv("IMAGE_TAG_PREFIX", SDK_SHORT_SHA)
in version.py
- All runners import and use IMAGE_TAG_PREFIX directly with no per-file
env var fallback
- EVAL_AGENT_SERVER_IMAGE in constants.py is now overridable via
OPENHANDS_EVAL_AGENT_SERVER_IMAGE env var
- modal_patches.py updated to propagate IMAGE_TAG_PREFIX (replaces
SDK_SHA/SDK_SHORT_SHA pair)
Extracted from #455.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable
The refactoring itself is solid - eliminating code duplication with a single source of truth is the right pattern. However, this breaks existing env var usage without a deprecation path.
[CRITICAL ISSUES]
Breaking Change: The env var name changed from SDK_SHORT_SHA to IMAGE_TAG_PREFIX. Anyone setting SDK_SHORT_SHA=custom in their environment to override image tags will find it no longer works. This violates the "never break userspace" principle.
Modal Environment: SDK_SHORT_SHA is no longer propagated to modal execution environments - any code inside modal reading this env var will break.
[IMPROVEMENT OPPORTUNITIES]
Add backward compatibility: Read SDK_SHORT_SHA env var as fallback if IMAGE_TAG_PREFIX is not set, with a deprecation warning.
Complete test plan: The two env var override tests are unchecked - verify they work before merging.
[VERDICT]
❌ Needs rework: Add backward compatibility layer or explicitly document the breaking change in PR description and migration guide.
KEY INSIGHT: Good refactoring pattern, but environment variable rename breaks existing users without migration path.
Address review feedback: the env var rename from SDK_SHORT_SHA to IMAGE_TAG_PREFIX is a breaking change for existing users. - version.py: fall back to SDK_SHORT_SHA env var when IMAGE_TAG_PREFIX is not set, emitting a DeprecationWarning - modal_patches.py: continue propagating SDK_SHORT_SHA to modal execution environments alongside IMAGE_TAG_PREFIX - modal_patches.py: _get_image_tag_prefix() fallback also checks SDK_SHORT_SHA when IMAGE_TAG_PREFIX is unset Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit 1d3f778.
Resolve conflict between IMAGE_TAG_PREFIX refactor (which removed the redundant remote_agent_image variable) and the image_exists → remote_image_exists rename from main. Keep both changes: use remote_image_exists with agent_server_image directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable
The refactoring eliminates real code duplication with a single source of truth - that's solid engineering. However, this breaks userspace without a proper deprecation path.
[CRITICAL ISSUES]
The previous review thread mentioned "Added a deprecation warning in the PR description" - but documentation is not the same as runtime warning. Users with SDK_SHORT_SHA set in their environment will silently fail with no indication their config is being ignored. The code should warn them, not just the PR description.
See inline comments for specific fixes.
- version.py: fall back to SDK_SHORT_SHA env var when IMAGE_TAG_PREFIX is not set, emitting a DeprecationWarning so users know to migrate - modal_patches.py: _get_image_tag_prefix() fallback also checks SDK_SHORT_SHA when IMAGE_TAG_PREFIX is unset Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit 80b38b2.
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable
The refactoring itself is solid - eliminating ~7 instances of duplicated env var fallback logic with a single constant is good engineering. However, two previous review threads were marked "resolved" but the requested changes are missing from the code.
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
- swebench/run_infer.py: change "image tag prefix:" to "tag prefix:" for consistency with all other benchmark runners - tests/test_version.py: add unit tests for IMAGE_TAG_PREFIX resolution covering default, env var override, deprecated SDK_SHORT_SHA fallback, and precedence when both env vars are set Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
85b48d2 to
44ff927
Compare
Regenerate lock file to match the current vendor/software-agent-sdk submodule version (1.10.0 → 1.11.5). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟡 Acceptable
The refactoring itself is solid engineering - eliminating ~7 instances of duplicated env var fallback logic with a single source of truth. Previous review concerns (missing tests, missing deprecation warning, inconsistent wording) have all been properly addressed.
However: breaking Modal userspace without a transition period violates the "never break userspace" principle, even if documented. One minor scope creep item noted below.
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
Three lines were accidentally left-aligned at column 0 instead of being indented inside the _inject_modal_sitecustomize() function body. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
all-hands-bot
left a comment
There was a problem hiding this comment.
Taste Rating: 🟢 Good taste
This is solid refactoring work. Eliminates ~7 instances of duplicated env var fallback logic with a single source of truth. Clean, pragmatic solution.
What's Right:
- Real problem solved (code duplication)
- Simple implementation, no unnecessary complexity
- Proper deprecation path with runtime warnings
- Backward compatibility maintained (Modal still gets SDK_SHORT_SHA)
- Good test coverage that validates real behavior
Verdict: ✅ Worth merging
Key Insight: This is how you refactor infrastructure code - eliminate duplication, add tests, don't break userspace. Previous review concerns were properly addressed.
Resolve conflicts after sub-PRs #471, #472, #473 were merged to main: - Take main's SDK_SHORT_SHA deprecation handling in version.py - Take main's backward-compat SDK_SHORT_SHA in modal_patches.py - Take main's log message wording in swebench/run_infer.py - Remove redundant prompt_dir setup (superseded by add_prompt_path_argument) - Keep lazy imports for git-dependent modules (build_utils.py, swtbench/run_infer.py) - Take main's comment cleanup in swtbench/eval_infer.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Extracted from #455 — this is the
IMAGE_TAG_PREFIXcentralization refactor, isolated for easier review.All benchmark runners previously hardcoded
SDK_SHORT_SHAfor Docker image tags, with each file duplicating anos.getenv("SDK_SHORT_SHA", SDK_SHORT_SHA)override pattern in the remote workspace path. This centralizes the tag prefix into a single constant.IMAGE_TAG_PREFIXinversion.py— single source of truth for the Docker image tag prefix, withSDK_SHORT_SHAenv var honored as a deprecated fallbackIMAGE_TAG_PREFIXdirectly — no per-file env var fallback, removing ~7 duplicatedos.getenv()callsEVAL_AGENT_SERVER_IMAGEinconstants.pyis now overridable viaOPENHANDS_EVAL_AGENT_SERVER_IMAGEenv var (enables custom registries)modal_patches.pyupdated to propagateIMAGE_TAG_PREFIX—SDK_SHAis no longer propagated;SDK_SHORT_SHAis still set (to the value ofIMAGE_TAG_PREFIX) for backward compatibilityIMAGE_TAG_PREFIXresolution covering default, env var override, deprecated fallback, and precedenceSDK_SHORT_SHAenv var deprecated in favor ofIMAGE_TAG_PREFIXThe environment variable used to override Docker image tag prefixes has been renamed. The old
SDK_SHORT_SHAenv var is still honored as a fallback but emits aDeprecationWarningand will be removed in a future version.SDK_SHORT_SHA=custom_valueIMAGE_TAG_PREFIX=custom_valueWho is affected: Anyone setting
export SDK_SHORT_SHA=...in their environment to override image tags at runtime. TheSDK_SHORT_SHAPython constant inversion.pyis unchanged — only the env var override mechanism is renamed.Migration: Replace
SDK_SHORT_SHAwithIMAGE_TAG_PREFIXin your environment, CI scripts, or.envfiles:Modal environments:
SDK_SHAis no longer propagated as an env var into Modal execution environments.SDK_SHORT_SHAis still propagated for backward compatibility (set to the value ofIMAGE_TAG_PREFIX), but code running inside Modal should migrate to readingos.getenv("IMAGE_TAG_PREFIX")instead. TheSDK_SHORT_SHApropagation will be removed in a future version.Rationale: The old name (
SDK_SHORT_SHA) was misleading — it implied a git SHA but was being used as a generic image tag prefix. The new name (IMAGE_TAG_PREFIX) accurately describes its purpose and is no longer coupled to the SDK submodule naming.Files changed
benchmarks/utils/version.pyIMAGE_TAG_PREFIXconstant with deprecatedSDK_SHORT_SHAfallbackbenchmarks/utils/constants.pyEVAL_AGENT_SERVER_IMAGEoverridable via env varbenchmarks/utils/modal_patches.pyIMAGE_TAG_PREFIX; keepSDK_SHORT_SHAfor backward compat; dropSDK_SHAbenchmarks/commit0/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallbackbenchmarks/gaia/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallbackbenchmarks/multiswebench/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallbackbenchmarks/swebench/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallbackbenchmarks/swebenchmultimodal/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallbackbenchmarks/swefficiency/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallback + eliminate redundantremote_agent_imagevariablebenchmarks/swtbench/run_infer.pyIMAGE_TAG_PREFIX, remove per-file env var fallbacktests/test_version.pyIMAGE_TAG_PREFIXresolutionuv.lockValidation
These changes were validated as part of #455 with CI benchmarks (
eval_limit=1). Results from this comment:Test plan
eval_limit=1across 5 benchmarks (see table above)IMAGE_TAG_PREFIXresolution (default, env override, deprecated fallback, precedence)🤖 Generated with Claude Code