Skip to content

Centralize IMAGE_TAG_PREFIX for Docker image tags#473

Merged
simonrosenberg merged 12 commits intomainfrom
centralize-image-tag-prefix
Mar 3, 2026
Merged

Centralize IMAGE_TAG_PREFIX for Docker image tags#473
simonrosenberg merged 12 commits intomainfrom
centralize-image-tag-prefix

Conversation

@simonrosenberg
Copy link
Collaborator

@simonrosenberg simonrosenberg commented Mar 2, 2026

Summary

Extracted from #455 — this is the IMAGE_TAG_PREFIX centralization refactor, isolated for easier review.

All benchmark runners previously hardcoded SDK_SHORT_SHA for Docker image tags, with each file duplicating an os.getenv("SDK_SHORT_SHA", SDK_SHORT_SHA) override pattern in the remote workspace path. This centralizes the tag prefix into a single constant.

  • New IMAGE_TAG_PREFIX in version.py — single source of truth for the Docker image tag prefix, with SDK_SHORT_SHA env var honored as a deprecated fallback
  • All runners import and use IMAGE_TAG_PREFIX directly — no per-file env var fallback, removing ~7 duplicated os.getenv() calls
  • EVAL_AGENT_SERVER_IMAGE in constants.py is now overridable via OPENHANDS_EVAL_AGENT_SERVER_IMAGE env var (enables custom registries)
  • modal_patches.py updated to propagate IMAGE_TAG_PREFIXSDK_SHA is no longer propagated; SDK_SHORT_SHA is still set (to the value of IMAGE_TAG_PREFIX) for backward compatibility
  • Unit tests for IMAGE_TAG_PREFIX resolution covering default, env var override, deprecated fallback, and precedence

⚠️ Breaking change: SDK_SHORT_SHA env var deprecated in favor of IMAGE_TAG_PREFIX

The environment variable used to override Docker image tag prefixes has been renamed. The old SDK_SHORT_SHA env var is still honored as a fallback but emits a DeprecationWarning and will be removed in a future version.

Before After
SDK_SHORT_SHA=custom_value IMAGE_TAG_PREFIX=custom_value

Who is affected: Anyone setting export SDK_SHORT_SHA=... in their environment to override image tags at runtime. The SDK_SHORT_SHA Python constant in version.py is unchanged — only the env var override mechanism is renamed.

Migration: Replace SDK_SHORT_SHA with IMAGE_TAG_PREFIX in your environment, CI scripts, or .env files:

- export SDK_SHORT_SHA=my-custom-tag
+ export IMAGE_TAG_PREFIX=my-custom-tag

Modal environments: SDK_SHA is no longer propagated as an env var into Modal execution environments. SDK_SHORT_SHA is still propagated for backward compatibility (set to the value of IMAGE_TAG_PREFIX), but code running inside Modal should migrate to reading os.getenv("IMAGE_TAG_PREFIX") instead. The SDK_SHORT_SHA propagation will be removed in a future version.

Rationale: The old name (SDK_SHORT_SHA) was misleading — it implied a git SHA but was being used as a generic image tag prefix. The new name (IMAGE_TAG_PREFIX) accurately describes its purpose and is no longer coupled to the SDK submodule naming.

Files changed

File Change
benchmarks/utils/version.py New IMAGE_TAG_PREFIX constant with deprecated SDK_SHORT_SHA fallback
benchmarks/utils/constants.py EVAL_AGENT_SERVER_IMAGE overridable via env var
benchmarks/utils/modal_patches.py Propagate IMAGE_TAG_PREFIX; keep SDK_SHORT_SHA for backward compat; drop SDK_SHA
benchmarks/commit0/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback
benchmarks/gaia/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback
benchmarks/multiswebench/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback
benchmarks/swebench/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback
benchmarks/swebenchmultimodal/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback
benchmarks/swefficiency/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback + eliminate redundant remote_agent_image variable
benchmarks/swtbench/run_infer.py Use IMAGE_TAG_PREFIX, remove per-file env var fallback
tests/test_version.py Unit tests for IMAGE_TAG_PREFIX resolution
uv.lock Regenerated for SDK submodule v1.11.5

Validation

These changes were validated as part of #455 with CI benchmarks (eval_limit=1). Results from this comment:

Benchmark Status Run
gaia actions/runs/22590769837
swebench actions/runs/22590769394
swtbench actions/runs/22590775404
commit0 actions/runs/22590778789
swebenchmultimodal actions/runs/22590785406

Test plan

  • CI validation with eval_limit=1 across 5 benchmarks (see table above)
  • Unit tests for IMAGE_TAG_PREFIX resolution (default, env override, deprecated fallback, precedence)
  • Pre-commit checks pass (ruff format, ruff lint, pycodestyle, pyright)

🤖 Generated with Claude Code

All benchmark runners previously hardcoded SDK_SHORT_SHA for Docker
image tags, with each file duplicating an
os.getenv("SDK_SHORT_SHA", SDK_SHORT_SHA) override pattern. This
centralizes the tag prefix into a single constant.

- New IMAGE_TAG_PREFIX = os.getenv("IMAGE_TAG_PREFIX", SDK_SHORT_SHA)
  in version.py
- All runners import and use IMAGE_TAG_PREFIX directly with no per-file
  env var fallback
- EVAL_AGENT_SERVER_IMAGE in constants.py is now overridable via
  OPENHANDS_EVAL_AGENT_SERVER_IMAGE env var
- modal_patches.py updated to propagate IMAGE_TAG_PREFIX (replaces
  SDK_SHA/SDK_SHORT_SHA pair)

Extracted from #455.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable

The refactoring itself is solid - eliminating code duplication with a single source of truth is the right pattern. However, this breaks existing env var usage without a deprecation path.

[CRITICAL ISSUES]

Breaking Change: The env var name changed from SDK_SHORT_SHA to IMAGE_TAG_PREFIX. Anyone setting SDK_SHORT_SHA=custom in their environment to override image tags will find it no longer works. This violates the "never break userspace" principle.

Modal Environment: SDK_SHORT_SHA is no longer propagated to modal execution environments - any code inside modal reading this env var will break.

[IMPROVEMENT OPPORTUNITIES]

Add backward compatibility: Read SDK_SHORT_SHA env var as fallback if IMAGE_TAG_PREFIX is not set, with a deprecation warning.

Complete test plan: The two env var override tests are unchecked - verify they work before merging.

[VERDICT]
Needs rework: Add backward compatibility layer or explicitly document the breaking change in PR description and migration guide.

KEY INSIGHT: Good refactoring pattern, but environment variable rename breaks existing users without migration path.

simonrosenberg and others added 3 commits March 3, 2026 06:36
Address review feedback: the env var rename from SDK_SHORT_SHA to
IMAGE_TAG_PREFIX is a breaking change for existing users.

- version.py: fall back to SDK_SHORT_SHA env var when IMAGE_TAG_PREFIX
  is not set, emitting a DeprecationWarning
- modal_patches.py: continue propagating SDK_SHORT_SHA to modal
  execution environments alongside IMAGE_TAG_PREFIX
- modal_patches.py: _get_image_tag_prefix() fallback also checks
  SDK_SHORT_SHA when IMAGE_TAG_PREFIX is unset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflict between IMAGE_TAG_PREFIX refactor (which removed the
redundant remote_agent_image variable) and the image_exists →
remote_image_exists rename from main. Keep both changes: use
remote_image_exists with agent_server_image directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable

The refactoring eliminates real code duplication with a single source of truth - that's solid engineering. However, this breaks userspace without a proper deprecation path.

[CRITICAL ISSUES]

The previous review thread mentioned "Added a deprecation warning in the PR description" - but documentation is not the same as runtime warning. Users with SDK_SHORT_SHA set in their environment will silently fail with no indication their config is being ignored. The code should warn them, not just the PR description.

See inline comments for specific fixes.

- version.py: fall back to SDK_SHORT_SHA env var when IMAGE_TAG_PREFIX
  is not set, emitting a DeprecationWarning so users know to migrate
- modal_patches.py: _get_image_tag_prefix() fallback also checks
  SDK_SHORT_SHA when IMAGE_TAG_PREFIX is unset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
all-hands-bot

This comment was marked as outdated.

Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable

The refactoring itself is solid - eliminating ~7 instances of duplicated env var fallback logic with a single constant is good engineering. However, two previous review threads were marked "resolved" but the requested changes are missing from the code.

simonrosenberg and others added 2 commits March 3, 2026 10:50
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
- swebench/run_infer.py: change "image tag prefix:" to "tag prefix:"
  for consistency with all other benchmark runners
- tests/test_version.py: add unit tests for IMAGE_TAG_PREFIX resolution
  covering default, env var override, deprecated SDK_SHORT_SHA fallback,
  and precedence when both env vars are set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@simonrosenberg simonrosenberg force-pushed the centralize-image-tag-prefix branch from 85b48d2 to 44ff927 Compare March 3, 2026 13:58
simonrosenberg and others added 2 commits March 3, 2026 11:01
Regenerate lock file to match the current vendor/software-agent-sdk
submodule version (1.10.0 → 1.11.5).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable

The refactoring itself is solid engineering - eliminating ~7 instances of duplicated env var fallback logic with a single source of truth. Previous review concerns (missing tests, missing deprecation warning, inconsistent wording) have all been properly addressed.

However: breaking Modal userspace without a transition period violates the "never break userspace" principle, even if documented. One minor scope creep item noted below.

Co-authored-by: OpenHands Bot <contact@all-hands.dev>
all-hands-bot

This comment was marked as outdated.

Three lines were accidentally left-aligned at column 0 instead of being
indented inside the _inject_modal_sitecustomize() function body.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟢 Good taste

This is solid refactoring work. Eliminates ~7 instances of duplicated env var fallback logic with a single source of truth. Clean, pragmatic solution.

What's Right:

  • Real problem solved (code duplication)
  • Simple implementation, no unnecessary complexity
  • Proper deprecation path with runtime warnings
  • Backward compatibility maintained (Modal still gets SDK_SHORT_SHA)
  • Good test coverage that validates real behavior

Verdict: ✅ Worth merging

Key Insight: This is how you refactor infrastructure code - eliminate duplication, add tests, don't break userspace. Previous review concerns were properly addressed.

@simonrosenberg simonrosenberg merged commit a61f388 into main Mar 3, 2026
3 checks passed
simonrosenberg added a commit that referenced this pull request Mar 3, 2026
Resolve conflicts after sub-PRs #471, #472, #473 were merged to main:
- Take main's SDK_SHORT_SHA deprecation handling in version.py
- Take main's backward-compat SDK_SHORT_SHA in modal_patches.py
- Take main's log message wording in swebench/run_infer.py
- Remove redundant prompt_dir setup (superseded by add_prompt_path_argument)
- Keep lazy imports for git-dependent modules (build_utils.py, swtbench/run_infer.py)
- Take main's comment cleanup in swtbench/eval_infer.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants