Skip to content

feat(cli): add --gpu-capture to use hardware ANGLE in frame capture#471

Open
roiizchak wants to merge 7 commits intoheygen-com:mainfrom
roiizchak:feat/gpu-capture-flag
Open

feat(cli): add --gpu-capture to use hardware ANGLE in frame capture#471
roiizchak wants to merge 7 commits intoheygen-com:mainfrom
roiizchak:feat/gpu-capture-flag

Conversation

@roiizchak
Copy link
Copy Markdown
Contributor

What

Adds an opt-in --gpu-capture flag (and matching HYPERFRAMES_GPU_CAPTURE=1|true env var) to the render and capture commands. When set, HyperFrames launches headless Chromium with a platform-specific hardware ANGLE backend (D3D11 on Windows, Metal on macOS, OpenGL on Linux) instead of the current hardcoded SwiftShader software path.

Default behavior is unchanged — --use-angle=swiftshader stays the out-of-the-box choice, preserving the "works in any CI / Docker / headless-box" guarantee.

Why

Stage-4 frame capture today forces every WebGL canvas through SwiftShader regardless of whether the host has a working GPU. On shader-heavy compositions this made capture the render bottleneck (~80-90% of wall time) even on machines with fully functional GPU drivers, while the --gpu stage-5 NVENC work landed in #442 already assumes a usable GPU.

There is also a silent-deprecation tax: Chromium emits "Automatic fallback to software WebGL has been deprecated. Please use the --enable-unsafe-swiftshader flag to opt in to lower security guarantees for trusted content" on every render today. That warning goes away the moment a hardware ANGLE backend is selected.

Measured on a local shader-heavy smoke rig (3× chained registry blocks: swirl-vortex + sdf-iris + glitch, 12s @ 1920×1080, 360 frames):

workers SwiftShader (before) D3D11 (after) delta
1 32s 24s 25% faster
3 20s 15s 25% faster

PSNR average between the SwiftShader and D3D11 renders: 54.7 dB (min 49.9 dB), well above the 40 dB visual-equivalence threshold. GPU rasterization differs by 1-2 px at anti-aliased edges but the visual intent is identical. Verified on Windows 11 + RTX 4080 + Chrome-for-testing bundled with Puppeteer 24.x.

The speedup stayed at ~25% rather than the 2-5× you'd see for full-frame WebGL work; these registry transition blocks rasterize small canvases inside larger DOM pages, so screenshot IPC + JPEG encoding dominate at this scale. A composition with continuous full-screen WebGL would win more.

How

packages/engine/src/config.ts — new EngineConfig.gpuCapture: boolean, defaulting to false. resolveConfig() picks up HYPERFRAMES_GPU_CAPTURE via a small envBoolOrOne helper that accepts "true" or "1". The "1" form is a deliberate ergonomic deviation from the existing PRODUCER_* env vars (which are "true"-only) to match the local-dev shorthand devs already use — documented inline in the resolver.

packages/engine/src/services/browserManager.tsbuildChromeArgs's config type gains gpuCapture. A small pure resolveAngleBackend(gpuCapture) helper selects the ANGLE flag: swiftshader when off, d3d11 / metal / opengl based on process.platform when on.

packages/cli/src/commands/render.ts — new citty "gpu-capture" boolean option next to gpu. When set, writes process.env.HYPERFRAMES_GPU_CAPTURE = "true" before spawning producer workers. This follows the same env-propagation pattern the command already uses for PRODUCER_MAX_CONCURRENT_RENDERS, avoiding per-RenderJob plumbing.

packages/cli/src/capture/index.tscaptureWebsite() inlines the same platform-aware backend selection so hyperframes capture honors the env var for parity with render. Kept inline (no cross-package engine dep) to preserve capture's current loose coupling.

Deliberately not changed: packages/cli/src/commands/snapshot.ts, packages/cli/src/commands/validate.ts, packages/producer/src/parity-harness.ts, packages/player/tests/perf/runner.ts, packages/engine/scripts/test-fitTextFontSize-browser.ts. These are dev-internal / test-harness launches where the SwiftShader default is fine; happy to broaden scope if you'd prefer consistency everywhere.

Test plan

  • Unit tests added — 12 new tests:
    • packages/engine/src/config.test.ts (+5) — default off, =true on, =1 on, other strings fall through, explicit override beats env.
    • packages/engine/src/services/browserManager.test.ts (new, 7) — pins args output: SwiftShader default, D3D11 on win32, Metal on darwin, OpenGL on linux, --use-gl=angle stays regardless of backend, --disable-gpu only appended under disableGpu, gpuCapture + disableGpu compose without conflict.
  • Engine test suite green: bun run --filter @hyperframes/engine test — full suite passes on the two files I touched; didn't re-run the full 359-test engine sweep so the Windows videoFrameExtractor.test.ts fontconfig flake mentioned in fix(engine): accept libx264 preset names with NVENC and QSV #442 may still repro there, unrelated to this PR.
  • Lefthook pre-commit green on every commit (lint / format / typecheck / commitlint).
  • End-to-end on Windows 11 + RTX 4080: probe-gpu.js confirms UNMASKED_RENDERER_WEBGL reports ANGLE (NVIDIA, NVIDIA GeForce RTX 4080 (0x00002704) Direct3D11 vs_5_0 ps_5_0, D3D11) with the flag on and ANGLE (Google, Vulkan 1.3.0 (SwiftShader Device (Subzero))) with it off. The SwiftShader deprecation warning disappears from render output under --gpu-capture.
  • Regression: hyperframes render --quality standard --gpu --gpu-capture --workers 3 on an audio-containing rig produces h264 + AAC 48kHz/stereo audio intact — no interaction with the test(engine): cover spawnStreamingEncoder lifecycle and cleanup paths #380 streaming-encoder audio path.
  • Documentation — new docs/guides/gpu-capture.mdx registered in docs.json: when to use it, platform backend matrix, Docker/CI caveats (needs nvidia-container-runtime or equivalent), visual-drift expectations, and a Puppeteer renderer-info probe snippet for diagnosing silent fallback.

Happy to split the capture-command and docs changes into follow-up PRs if you'd prefer a narrower first cut.

roi32 added 4 commits April 24, 2026 11:11
HyperFrames' stage-4 frame capture hardcoded --use-angle=swiftshader,
forcing software WebGL regardless of host GPU. On shader-heavy
compositions this made stage 4 the render bottleneck (~80-90% of wall
time) even on machines with working GPU drivers.

Add an opt-in gpuCapture field to EngineConfig. When enabled,
buildChromeArgs swaps the SwiftShader flag for a platform-specific
hardware ANGLE backend: D3D11 on Windows, Metal on macOS, OpenGL on
Linux. Default is unchanged (SwiftShader) to preserve cross-platform
compatibility on CI/Docker without GPU passthrough.

Env fallback: HYPERFRAMES_GPU_CAPTURE=true (or =1 for ergonomics
matching local-dev conventions — the only engine env var accepting
both, documented inline).

Measured on a local shader-heavy rig (3x chained swirl-vortex/sdf-iris/
glitch blocks, 360 frames at 1920x1080): wall time drops 25% with
workers=1 (32s -> 24s) and 25% with workers=3 (20s -> 15s). PSNR
vs the SwiftShader render averages 54.7 dB — well above the 40 dB
visual-equivalence threshold.
Surface the new engine gpuCapture config through the CLI:

- render command: add --gpu-capture boolean flag. When set, exports
  HYPERFRAMES_GPU_CAPTURE=true in the process env before invoking the
  producer. The engine's resolveConfig() picks it up in each worker,
  avoiding per-RenderJob plumbing (same pattern used for
  PRODUCER_MAX_CONCURRENT_RENDERS).

- capture (website-to-video) command: inline the same platform-aware
  ANGLE backend selection, gated on HYPERFRAMES_GPU_CAPTURE. Accepts
  both "true" and "1" for parity with the env-var fallback documented
  in the engine config.

--gpu and --gpu-capture are independent: --gpu drives stage-5 NVENC
encoding, --gpu-capture drives the stage-4 Chromium backend. Combine
freely.
…lection

- config.test.ts: add 5 cases covering the default value, the
  HYPERFRAMES_GPU_CAPTURE env var (both "true" and "1" forms),
  non-matching env values falling through to false, and explicit
  overrides winning over env.

- browserManager.test.ts (new): 7 cases pinning buildChromeArgs
  output. Verifies SwiftShader stays the default, D3D11 is chosen on
  win32 under gpuCapture, Metal on darwin, OpenGL on linux, and that
  gpuCapture + disableGpu compose without conflict.
Covers when to use --gpu-capture vs the SwiftShader default, the
platform backend matrix, Docker/CI caveats around GPU passthrough,
the expected visual drift (PSNR > 40 dB, anti-aliasing differs by
1-2 px at edges), and a Puppeteer probe snippet for diagnosing
silent fallback.

Registered in docs.json under Guides, adjacent to rendering and hdr.
@qodo-ai-reviewer
Copy link
Copy Markdown

Hi, hyperframes render --docker --gpu-capture will not enable hardware ANGLE inside the container because the flag is only translated to a host env var and never forwarded to docker run args.

Severity: action required | Category: correctness

How to fix: Forward gpu-capture to Docker

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

--gpu-capture is honored for local renders by setting HYPERFRAMES_GPU_CAPTURE=true, but Docker renders run in a separate container and do not receive that env var, so GPU capture is silently disabled.

Issue Context

The render CLI uses buildDockerRunArgs() to construct the docker run invocation. That function currently forwards only existing render flags.

Fix Focus Areas

  • packages/cli/src/utils/dockerRunArgs.ts[21-65]
  • packages/cli/src/commands/render.ts[320-346]
  • packages/cli/src/utils/dockerRunArgs.test.ts[1-162]

Implementation notes

  • Add gpuCapture: boolean to DockerRenderOptions.
  • In buildDockerRunArgs, forward it into the container either via:
    • an env var: -e HYPERFRAMES_GPU_CAPTURE=true when enabled, or
    • a CLI arg: append --gpu-capture to the container command (only works if the containerized CLI supports it).
  • Update the Docker render call site to pass gpuCapture: Boolean(args["gpu-capture"]).
  • Add/update tests to assert the env/flag is present when enabled and absent when disabled.

Found by Qodo code review. FYI, Qodo is free for open-source.

--gpu-capture was honored for local renders by setting
HYPERFRAMES_GPU_CAPTURE=true in the host process env, but Docker
renders run in a separate container that never received the env var,
so GPU capture was silently disabled when --docker was combined with
--gpu-capture.

Thread it through the same way --gpu and --hdr go: add a gpuCapture
field to DockerRenderOptions and append "--gpu-capture" to the
containerized hyperframes invocation when set. The containerized CLI
re-runs the same render command, which exports
HYPERFRAMES_GPU_CAPTURE=true inside the container before the engine
workers spawn — same code path as a local render.

Update render.ts call sites for both renderDocker and renderLocal to
pass Boolean(args["gpu-capture"]).

Tests:
- New: explicit forwarding test mirroring the --hdr regression test
  pattern (PR heygen-com#471 reviewer flagged this exact silent-drop shape).
- New: omission test for the disabled case.
- New: independence test asserting --gpu and --gpu-capture forward on
  their own (matches the doc claim that they're orthogonal).
- Updated: full-snapshot test now expects --gpu-capture in the arg
  vector when enabled.
- Updated: regression-tripwire test asserts --gpu-capture present.

10/10 cli/utils/dockerRunArgs tests pass.
@roiizchak
Copy link
Copy Markdown
Contributor Author

@qodo-ai-reviewer Verified — the gap was real. --gpu-capture only reached the host process env, never the container.

Fixed in 04dc743 (just pushed):

  • DockerRenderOptions gains a gpuCapture: boolean field.
  • buildDockerRunArgs appends --gpu-capture to the containerized hyperframes render invocation when set, mirroring how --gpu and --hdr already forward. The containerized CLI is the same code that just got the flag, so it'll re-export HYPERFRAMES_GPU_CAPTURE=true inside the container before its engine workers spawn — same code path as a local render.
  • render.ts Docker call site now passes gpuCapture: Boolean(args["gpu-capture"]). Local-render call site updated for parity.
  • Tests in dockerRunArgs.test.ts: explicit forwarding test mirroring the --hdr regression-test pattern, omission test for the disabled case, an independence test asserting --gpu and --gpu-capture forward separately, and the existing snapshot + tripwire tests updated to expect --gpu-capture in the arg vector. 10/10 cli/utils/dockerRunArgs tests green.

I went with the CLI-arg form (--gpu-capture) over -e HYPERFRAMES_GPU_CAPTURE=true because the existing flag-forwarding pattern in buildDockerRunArgs is uniformly CLI args (--gpu, --hdr, --quality, etc.) — keeping it consistent makes the silent-drop tripwire test simpler, and the env var still picks up via resolveConfig once the flag is parsed inside the container.

Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR — default-off, sensible perf win, good test coverage, and you reacted well to the Qodo review. Naming-wise HYPERFRAMES_* is the right pick: the user-facing knobs (HYPERFRAMES_NO_TELEMETRY, HYPERFRAMES_GEMINI_MODEL, HYPERFRAMES_EXTRACT_CACHE_DIR) all live in that namespace, while PRODUCER_* is reserved for engine-internal knobs.

Two nits before merge, neither blocking:

  1. Duplicated ANGLE-backend selection. packages/cli/src/capture/index.ts lines ~75-82 re-implement the same process.platform switch you extracted as resolveAngleBackend() in engine/services/browserManager.ts. The "preserve loose coupling" justification is fine, but since both packages are in the same workspace, you could just import the helper and drop ~8 lines. Up to you — happy either way.

  2. Docker --gpu-capture without --gpu is a silent footgun. In dockerRunArgs.ts, --gpus all is gated only on options.gpu. So hyperframes render --docker --gpu-capture (without --gpu) forwards --gpu-capture into the container, but the host doesn't pass the GPU through — Chromium will silently fall back to software inside the container and the user gets no perf benefit (and is probably confused). Two reasonable options:

    • Add --gpus all when either options.gpu || options.gpuCapture is true, OR
    • Log a CLI warning when --docker --gpu-capture is set without --gpu.

    The docs guide already mentions the nvidia-container-runtime requirement, so a warning would probably be enough.

The visual-equivalence trade-off is documented honestly, the regression-shards/HDR/Windows render verification all stayed green, and the parity-harness path is correctly excluded — so I'm not worried about silent fixture drift.

Thanks again, this is a clean contribution.

roi32 added 2 commits April 26, 2026 07:42
…icating

The capture command was reimplementing the same process.platform → ANGLE
backend switch (d3d11/metal/opengl/swiftshader) that the engine's
browserManager already had as a private helper. Export the helper and
reuse it so the two launch sites can never drift.

Addresses review feedback on PR heygen-com#471.
Without this, `hyperframes render --docker --gpu-capture` (no --gpu) used
to forward the flag into the container but the host never opened GPU
passthrough, so Chromium silently fell back to swiftshader inside the
container and the user got no perf benefit. Either flag on its own now
implies host GPU passthrough; tests updated to cover both axes.

Addresses review feedback on PR heygen-com#471. Docker GPU passthrough still
requires nvidia-container-runtime as documented in the GPU capture guide.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants