
Fix fork+threading deadlock in SWE-Bench image builds #486

Draft

juanmichelini wants to merge 1 commit into main from openhands/fix-image-build-deadlock


@juanmichelini (Collaborator)

Summary

This PR fixes the root cause of issue #476 where SWE-Bench image builds get stuck for 5+ hours during the batch build process.

Root Cause

Commit 2bfcc6c (#456) introduced the following import in benchmarks/utils/image_utils.py:

```python
from openhands.sdk import get_logger

logger = get_logger(__name__)
```

When this module is imported, the SDK logger auto-initializes with a RichHandler (from the rich library), which uses locks and potentially threads for console rendering. The issue occurs when:
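RichHandler is a standard logging.Handler subclass, and every handler allocates an internal lock that is held while a record is emitted; it is this lock state that a later fork() copies. A minimal stdlib illustration (StreamHandler is used as a stand-in, since the mechanism lives in the Handler base class):

```python
import logging

# Every logging.Handler (RichHandler included) creates an internal
# RLock via createLock(); emit() holds it while writing output.
handler = logging.StreamHandler()
lock_type = type(handler.lock).__name__
print(lock_type)  # RLock
```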

  1. build_utils.py imports image_utils, triggering logger initialization in the parent process
  2. build_all_images() creates a ProcessPoolExecutor using fork (default on Linux)
  3. Fork copies the parent process, including Rich logger's locks in their current state
  4. Child processes deadlock waiting for locks that will never be released

This is the same fork+threading deadlock that commit 744df225 (#459) fixed in evaluation.py; that fix, however, was applied only to evaluation.py and never reached build_utils.py.

Evidence

From commit 744df225 (#459):

Root cause: ProcessPoolExecutor uses fork() by default on Linux, which is unsafe when the parent process has threads. When fork() copies a process, it copies locks in their current state. If a thread holds a lock during fork(), the child process deadlocks waiting for that lock forever.

Evidence from Datadog logs:

  • Warning: 'This process is multi-threaded, use of fork() may lead to deadlocks'
  • Workers stuck in futex_wait_queue (mutex wait)
  • 84/500 instances succeeded before deadlock (timing-dependent)

The same pattern applies to image building:

  • Multiple workers (12 by default) get stuck in deadlock
  • The number of images built before deadlock is timing-dependent
  • Builds hang for hours without completing

Solution

Use spawn multiprocessing context instead of fork in build_all_images(), matching the fix from commit 744df225 for evaluation.py.

The spawn method starts fresh Python processes instead of forking, avoiding the fork+threads deadlock entirely.

Changes

  • Added import multiprocessing to build_utils.py
  • Modified ProcessPoolExecutor instantiation to use mp_context=multiprocessing.get_context("spawn")
  • Added explanatory comments matching the pattern from evaluation.py

Testing

✅ All pre-commit checks pass
✅ Type checking passes
✅ Linting passes

The fix should be verified by running a full 500-image build on GitHub Actions, which was previously getting stuck.

Related

Root cause: Commit 2bfcc6c added 'from openhands.sdk import get_logger'
to image_utils.py, which auto-initializes RichHandler with locks/threads.
When build_utils.py uses ProcessPoolExecutor with fork (default on Linux),
it copies the process with Rich logger's locks in their current state,
causing child processes to deadlock waiting for locks that will never be
released.

Solution: Use 'spawn' multiprocessing context instead of 'fork' in
build_all_images(), matching the fix in commit 744df22 for evaluation.py.

This prevents the fork+threading deadlock by starting fresh Python
processes instead of forking processes with inherited thread state.

Fixes #476

Co-authored-by: openhands <openhands@all-hands.dev>
