Skip to content

browse: make it the canonical Chromium for all agents (incl. offline local rendering — Excalidraw/card/og-image), so skills stop bundling their own puppeteer #1906

@garrytan-agents

Description

@garrytan-agents

Summary

Make browse the single, canonical Chromium for every agent on a box — including offline, local-render workloads, not just live web navigation. Today agents that need to rasterize their own HTML/JSON (Excalidraw diagrams, tweet/quote cards, og-image generation) each npm i puppeteer and download a second bundled Chromium, so a single machine ends up with 2+ Chromium installs that drift in version and break independently (see the install-revision pain in #1829, #1902, #913). browse is already a persistent Playwright Chromium with the right primitives; it should be the one install everything points at.

Concrete motivating case

A skill renders Excalidraw .excalidraw JSON → PNG, pixel-identical to excalidraw.com, by loading Excalidraw's own export bundle and calling its render function. Current implementation (standalone puppeteer):

const browser = await puppeteer.launch({ executablePath: PUPPETEER_EXECUTABLE_PATH, headless: 'new' });
const page = await browser.newPage();
await page.setContent(html, { waitUntil: 'networkidle0' });
await page.waitForFunction(() => document.getElementById('done')?.textContent === 'ready');
const dataUrl = await page.evaluate((s) => window.__render(s), scene); // returns a base64 PNG data URL
fs.writeFileSync(out, Buffer.from(dataUrl.split(',')[1], 'base64'));

This is 100% offline (zero network) — it has no business going through a SOCKS5/Xvfb/anti-bot headed path, but it also shouldn't need its own Chromium. browse headless is exactly right for it.

What browse already covers

Per the puppeteer→browse cheatsheet, the mapping is nearly 1:1:

needs browse today
page.setContent(html) load-html <file> (uses page.setContent()) ✅
page.goto(file://...) goto file://<abs>
viewport + 2x scale viewport WxH --scale 2
page.evaluate(...) js "<expr>"
element screenshot screenshot <path> --selector .x

So the migration is almost free. Two real gaps:

Gaps to close

  1. js returning large payloads. The Excalidraw path gets a multi-MB base64 data URL back from page.evaluate. js returns its result over the daemon protocol — confirm it handles large (multi-MB) string returns without truncation, or add a first-class render-to-file primitive (e.g. js --out <file> or eval-to-file) that writes the returned string/dataURL straight to disk and never serializes megabytes through the command channel. This is the one technical unknown.

  2. A documented offline render mode. A blessed path for "I just want to rasterize my own local HTML/JSON with no proxy, no Xvfb, no stealth" — i.e. plain headless, shared Chromium, deterministic wait-for-ready handshake. Today the offline render story is implicit; make it explicit so skill authors stop reaching for puppeteer.

The infra win

Once js/render-to-file covers the evaluate-and-capture pattern, every local-render skill drops its puppeteer dependency and its private Chromium, and browse becomes the one Chromium per box: one version to pin, one install to fix (kills the #1829/#1902 drift class), one daemon's lifecycle to manage. Agents stop each carrying their own browser.

Acceptance

  • A worked example in the browse SKILL: local HTML/JSON → load-htmljs/render-to-file → PNG on disk, offline, no proxy.
  • js documented (and tested) for multi-MB string returns, or a --out/eval-to-file primitive added.
  • Note in setup docs: skills should not bundle their own puppeteer/Chromium; route local rendering through browse.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions