feat(cli): add --gpu-capture to use hardware ANGLE in frame capture#471
feat(cli): add --gpu-capture to use hardware ANGLE in frame capture#471roiizchak wants to merge 7 commits intoheygen-com:mainfrom
Conversation
HyperFrames' stage-4 frame capture hardcoded --use-angle=swiftshader, forcing software WebGL regardless of host GPU. On shader-heavy compositions this made stage 4 the render bottleneck (~80-90% of wall time) even on machines with working GPU drivers. Add an opt-in gpuCapture field to EngineConfig. When enabled, buildChromeArgs swaps the SwiftShader flag for a platform-specific hardware ANGLE backend: D3D11 on Windows, Metal on macOS, OpenGL on Linux. Default is unchanged (SwiftShader) to preserve cross-platform compatibility on CI/Docker without GPU passthrough. Env fallback: HYPERFRAMES_GPU_CAPTURE=true (or =1 for ergonomics matching local-dev conventions — the only engine env var accepting both, documented inline). Measured on a local shader-heavy rig (3x chained swirl-vortex/sdf-iris/ glitch blocks, 360 frames at 1920x1080): wall time drops 25% with workers=1 (32s -> 24s) and 25% with workers=3 (20s -> 15s). PSNR vs the SwiftShader render averages 54.7 dB — well above the 40 dB visual-equivalence threshold.
Surface the new engine gpuCapture config through the CLI: - render command: add --gpu-capture boolean flag. When set, exports HYPERFRAMES_GPU_CAPTURE=true in the process env before invoking the producer. The engine's resolveConfig() picks it up in each worker, avoiding per-RenderJob plumbing (same pattern used for PRODUCER_MAX_CONCURRENT_RENDERS). - capture (website-to-video) command: inline the same platform-aware ANGLE backend selection, gated on HYPERFRAMES_GPU_CAPTURE. Accepts both "true" and "1" for parity with the env-var fallback documented in the engine config. --gpu and --gpu-capture are independent: --gpu drives stage-5 NVENC encoding, --gpu-capture drives the stage-4 Chromium backend. Combine freely.
…lection - config.test.ts: add 5 cases covering the default value, the HYPERFRAMES_GPU_CAPTURE env var (both "true" and "1" forms), non-matching env values falling through to false, and explicit overrides winning over env. - browserManager.test.ts (new): 7 cases pinning buildChromeArgs output. Verifies SwiftShader stays the default, D3D11 is chosen on win32 under gpuCapture, Metal on darwin, OpenGL on linux, and that gpuCapture + disableGpu compose without conflict.
Covers when to use --gpu-capture vs the SwiftShader default, the platform backend matrix, Docker/CI caveats around GPU passthrough, the expected visual drift (PSNR > 40 dB, anti-aliasing differs by 1-2 px at edges), and a Puppeteer probe snippet for diagnosing silent fallback. Registered in docs.json under Guides, adjacent to rendering and hdr.
|
Hi, Severity: action required | Category: correctness How to fix: Forward gpu-capture to Docker Agent prompt to fix - you can give this to your LLM of choice:
Found by Qodo code review. FYI, Qodo is free for open-source. |
--gpu-capture was honored for local renders by setting HYPERFRAMES_GPU_CAPTURE=true in the host process env, but Docker renders run in a separate container that never received the env var, so GPU capture was silently disabled when --docker was combined with --gpu-capture. Thread it through the same way --gpu and --hdr go: add a gpuCapture field to DockerRenderOptions and append "--gpu-capture" to the containerized hyperframes invocation when set. The containerized CLI re-runs the same render command, which exports HYPERFRAMES_GPU_CAPTURE=true inside the container before the engine workers spawn — same code path as a local render. Update render.ts call sites for both renderDocker and renderLocal to pass Boolean(args["gpu-capture"]). Tests: - New: explicit forwarding test mirroring the --hdr regression test pattern (PR heygen-com#471 reviewer flagged this exact silent-drop shape). - New: omission test for the disabled case. - New: independence test asserting --gpu and --gpu-capture forward on their own (matches the doc claim that they're orthogonal). - Updated: full-snapshot test now expects --gpu-capture in the arg vector when enabled. - Updated: regression-tripwire test asserts --gpu-capture present. 10/10 cli/utils/dockerRunArgs tests pass.
|
@qodo-ai-reviewer Verified — the gap was real. Fixed in 04dc743 (just pushed):
I went with the CLI-arg form ( |
jrusso1020
left a comment
There was a problem hiding this comment.
Nice PR — default-off, sensible perf win, good test coverage, and you reacted well to the Qodo review. Naming-wise HYPERFRAMES_* is the right pick: the user-facing knobs (HYPERFRAMES_NO_TELEMETRY, HYPERFRAMES_GEMINI_MODEL, HYPERFRAMES_EXTRACT_CACHE_DIR) all live in that namespace, while PRODUCER_* is reserved for engine-internal knobs.
Two nits before merge, neither blocking:
-
Duplicated ANGLE-backend selection.
packages/cli/src/capture/index.tslines ~75-82 re-implement the sameprocess.platformswitch you extracted asresolveAngleBackend()inengine/services/browserManager.ts. The "preserve loose coupling" justification is fine, but since both packages are in the same workspace, you could just import the helper and drop ~8 lines. Up to you — happy either way. -
Docker
--gpu-capturewithout--gpuis a silent footgun. IndockerRunArgs.ts,--gpus allis gated only onoptions.gpu. Sohyperframes render --docker --gpu-capture(without--gpu) forwards--gpu-captureinto the container, but the host doesn't pass the GPU through — Chromium will silently fall back to software inside the container and the user gets no perf benefit (and is probably confused). Two reasonable options:- Add
--gpus allwhen eitheroptions.gpu || options.gpuCaptureis true, OR - Log a CLI warning when
--docker --gpu-captureis set without--gpu.
The docs guide already mentions the
nvidia-container-runtimerequirement, so a warning would probably be enough. - Add
The visual-equivalence trade-off is documented honestly, the regression-shards/HDR/Windows render verification all stayed green, and the parity-harness path is correctly excluded — so I'm not worried about silent fixture drift.
Thanks again, this is a clean contribution.
…icating The capture command was reimplementing the same process.platform → ANGLE backend switch (d3d11/metal/opengl/swiftshader) that the engine's browserManager already had as a private helper. Export the helper and reuse it so the two launch sites can never drift. Addresses review feedback on PR heygen-com#471.
Without this, `hyperframes render --docker --gpu-capture` (no --gpu) used to forward the flag into the container but the host never opened GPU passthrough, so Chromium silently fell back to swiftshader inside the container and the user got no perf benefit. Either flag on its own now implies host GPU passthrough; tests updated to cover both axes. Addresses review feedback on PR heygen-com#471. Docker GPU passthrough still requires nvidia-container-runtime as documented in the GPU capture guide.
What
Adds an opt-in
--gpu-captureflag (and matchingHYPERFRAMES_GPU_CAPTURE=1|trueenv var) to therenderandcapturecommands. When set, HyperFrames launches headless Chromium with a platform-specific hardware ANGLE backend (D3D11 on Windows, Metal on macOS, OpenGL on Linux) instead of the current hardcoded SwiftShader software path.Default behavior is unchanged —
--use-angle=swiftshaderstays the out-of-the-box choice, preserving the "works in any CI / Docker / headless-box" guarantee.Why
Stage-4 frame capture today forces every WebGL canvas through SwiftShader regardless of whether the host has a working GPU. On shader-heavy compositions this made capture the render bottleneck (~80-90% of wall time) even on machines with fully functional GPU drivers, while the
--gpustage-5 NVENC work landed in #442 already assumes a usable GPU.There is also a silent-deprecation tax: Chromium emits
"Automatic fallback to software WebGL has been deprecated. Please use the --enable-unsafe-swiftshader flag to opt in to lower security guarantees for trusted content"on every render today. That warning goes away the moment a hardware ANGLE backend is selected.Measured on a local shader-heavy smoke rig (3× chained registry blocks:
swirl-vortex+sdf-iris+glitch, 12s @ 1920×1080, 360 frames):PSNR average between the SwiftShader and D3D11 renders: 54.7 dB (min 49.9 dB), well above the 40 dB visual-equivalence threshold. GPU rasterization differs by 1-2 px at anti-aliased edges but the visual intent is identical. Verified on Windows 11 + RTX 4080 + Chrome-for-testing bundled with Puppeteer 24.x.
The speedup stayed at ~25% rather than the 2-5× you'd see for full-frame WebGL work; these registry transition blocks rasterize small canvases inside larger DOM pages, so screenshot IPC + JPEG encoding dominate at this scale. A composition with continuous full-screen WebGL would win more.
How
packages/engine/src/config.ts— newEngineConfig.gpuCapture: boolean, defaulting tofalse.resolveConfig()picks upHYPERFRAMES_GPU_CAPTUREvia a smallenvBoolOrOnehelper that accepts"true"or"1". The"1"form is a deliberate ergonomic deviation from the existingPRODUCER_*env vars (which are"true"-only) to match the local-dev shorthand devs already use — documented inline in the resolver.packages/engine/src/services/browserManager.ts—buildChromeArgs's config type gainsgpuCapture. A small pureresolveAngleBackend(gpuCapture)helper selects the ANGLE flag:swiftshaderwhen off,d3d11/metal/openglbased onprocess.platformwhen on.packages/cli/src/commands/render.ts— new citty"gpu-capture"boolean option next togpu. When set, writesprocess.env.HYPERFRAMES_GPU_CAPTURE = "true"before spawning producer workers. This follows the same env-propagation pattern the command already uses forPRODUCER_MAX_CONCURRENT_RENDERS, avoiding per-RenderJobplumbing.packages/cli/src/capture/index.ts—captureWebsite()inlines the same platform-aware backend selection sohyperframes capturehonors the env var for parity withrender. Kept inline (no cross-package engine dep) to preserve capture's current loose coupling.Deliberately not changed:
packages/cli/src/commands/snapshot.ts,packages/cli/src/commands/validate.ts,packages/producer/src/parity-harness.ts,packages/player/tests/perf/runner.ts,packages/engine/scripts/test-fitTextFontSize-browser.ts. These are dev-internal / test-harness launches where the SwiftShader default is fine; happy to broaden scope if you'd prefer consistency everywhere.Test plan
packages/engine/src/config.test.ts(+5) — default off,=trueon,=1on, other strings fall through, explicit override beats env.packages/engine/src/services/browserManager.test.ts(new, 7) — pins args output: SwiftShader default, D3D11 on win32, Metal on darwin, OpenGL on linux,--use-gl=anglestays regardless of backend,--disable-gpuonly appended underdisableGpu,gpuCapture + disableGpucompose without conflict.bun run --filter @hyperframes/engine test— full suite passes on the two files I touched; didn't re-run the full 359-test engine sweep so the WindowsvideoFrameExtractor.test.tsfontconfig flake mentioned in fix(engine): accept libx264 preset names with NVENC and QSV #442 may still repro there, unrelated to this PR.lint/format/typecheck/commitlint).probe-gpu.jsconfirmsUNMASKED_RENDERER_WEBGLreportsANGLE (NVIDIA, NVIDIA GeForce RTX 4080 (0x00002704) Direct3D11 vs_5_0 ps_5_0, D3D11)with the flag on andANGLE (Google, Vulkan 1.3.0 (SwiftShader Device (Subzero)))with it off. The SwiftShader deprecation warning disappears from render output under--gpu-capture.hyperframes render --quality standard --gpu --gpu-capture --workers 3on an audio-containing rig produces h264 + AAC 48kHz/stereo audio intact — no interaction with the test(engine): cover spawnStreamingEncoder lifecycle and cleanup paths #380 streaming-encoder audio path.docs/guides/gpu-capture.mdxregistered indocs.json: when to use it, platform backend matrix, Docker/CI caveats (needsnvidia-container-runtimeor equivalent), visual-drift expectations, and a Puppeteer renderer-info probe snippet for diagnosing silent fallback.Happy to split the capture-command and docs changes into follow-up PRs if you'd prefer a narrower first cut.