Original Request
Set up E2E test suite with Playwright for critical review flows
Context: The CI pipeline has an e2e-tests job that currently just echoes "not yet configured". The project has 32 unit/integration test files but zero E2E coverage. Critical user flows that should be E2E tested include: (1) Web app review flow — load feed, view diff, submit judgment, (2) Extension popup — click Review Now, navigate to review page, (3) UserScript — inject review modal on Wikipedia article pages, (4) OAuth login flow and token refresh. The project already has Playwright as a dev dependency. Setting this up would catch the kinds of visual/functional regressions that were frequently found during deploy-debug cycles (per insights: 56 buggy_code friction instances).
Agent's Two Cents (could be wrong)
Everything below is the AI agent's best guess based on the current codebase.
Take with a grain of salt — the original request above is the only thing that came from a human.
Problem / Motivation
The repo has solid unit/integration coverage (32 test files across all 5 packages) but zero end-to-end tests. The CI pipeline already has a scaffolded e2e-tests job that installs Playwright browsers but then just echoes a placeholder message. Visual and functional regressions have been a recurring source of friction during deploy-debug cycles (56 reported instances), and E2E tests that drive the real UI are well placed to catch them before they reach production.
Proposed Solution
Wire up Playwright E2E tests that exercise the four critical user flows end-to-end: the web app review cycle (feed -> diff -> judgment), the browser extension popup flow, the userscript injection on Wikipedia pages, and the OAuth login/token-refresh sequence. Integrate these into the existing CI e2e-tests job so they run on every PR.
Dependencies & Potential Blockers
- Playwright ^1.58.2 is already a dev dependency, so no new installation is needed.
- The CI workflow already installs Playwright Chromium browsers (pnpm exec playwright install --with-deps chromium).
- OAuth E2E tests will need mock credentials or a test OAuth provider to avoid depending on live Wikipedia OAuth in CI.
- Extension popup tests may require Playwright's browser extension loading support (Chromium-based contexts with --load-extension).
- UserScript tests need a strategy for injecting the script into a page context (could use Playwright's addInitScript or load via a test extension).
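As a rough illustration of the extension-loading item, the sketch below builds the Chromium flags for an unpacked extension. The helper name extensionLaunchOptions and the example path are hypothetical; the two flags are the ones Playwright's extension documentation uses, and the returned object is meant to be passed as the options argument of chromium.launchPersistentContext. Whether the run can be headless depends on the Playwright version and channel, so that is left out here.

```typescript
// Sketch only: builds launch options for loading an unpacked extension.
// `extensionLaunchOptions` is a hypothetical helper, not existing repo code.
interface ExtensionLaunchOptions {
  args: string[];
}

function extensionLaunchOptions(extensionDir: string): ExtensionLaunchOptions {
  return {
    args: [
      // Disable every extension except the one under test...
      `--disable-extensions-except=${extensionDir}`,
      // ...and load the unpacked build from the same directory.
      `--load-extension=${extensionDir}`,
    ],
  };
}

// Intended use inside a spec or fixture (not executed here):
//   const context = await chromium.launchPersistentContext(
//     '', // empty string lets Playwright use a temporary profile directory
//     extensionLaunchOptions('/abs/path/to/packages/extension/dist'), // hypothetical path
//   );
```

Keeping the flag construction in a pure function makes it easy to unit test and to reuse across the extension specs.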
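How to Validate
- pnpm exec playwright test runs and passes locally
- e2e-tests job completes green (no more placeholder echo)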
Scope Estimate
Medium
Key Files/Modules Likely Involved
- .github/workflows/ci.yml (lines 110-130, the e2e-tests job)
- package.json (root: Playwright config, test scripts)
- packages/web/ (web app pages and components under test)
- packages/extension/src/ (popup and background scripts)
- packages/userscript/src/ (injection logic)
Rough Implementation Sketch
- Create a playwright.config.ts at the repo root with projects for web, extension, and userscript
- Add an e2e/ directory with subdirectories per flow (e2e/web/, e2e/extension/, e2e/userscript/, e2e/auth/)
- Write test specs for each of the four critical flows
- Mock external dependencies (MediaWiki API, OAuth provider) using Playwright's route interception (page.route())
- Update the CI e2e-tests job to run pnpm exec playwright test instead of the placeholder echo
- Add a test:e2e script to the root package.json
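The config step above might look something like the following sketch. It is written as a plain object so it stands alone; in a real playwright.config.ts the assumed final step would be to pass it through defineConfig from @playwright/test and make it the default export. The baseURL, the retry count, and the choice to run the auth specs inside the web project are all placeholder assumptions, not values taken from the repo.

```typescript
// Sketch of the object a root playwright.config.ts could export.
// Directory names follow the e2e/ layout proposed in the sketch above.
const config = {
  testDir: './e2e',
  retries: 1, // placeholder: one retry so 'on-first-retry' traces can be recorded
  use: {
    baseURL: 'http://localhost:3000', // hypothetical dev-server address
    trace: 'on-first-retry' as const, // collect traces only when a test is retried
  },
  projects: [
    // Assumption: OAuth specs run in the same project as the web app specs.
    { name: 'web', testMatch: ['**/e2e/web/**/*.spec.ts', '**/e2e/auth/**/*.spec.ts'] },
    { name: 'extension', testMatch: ['**/e2e/extension/**/*.spec.ts'] },
    { name: 'userscript', testMatch: ['**/e2e/userscript/**/*.spec.ts'] },
  ],
};

// In playwright.config.ts (not executed here):
//   import { defineConfig } from '@playwright/test';
//   export default defineConfig(config);
```

With a config along these lines, a single flow can be run with pnpm exec playwright test --project=web, which may help keep local iteration fast.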
Open Questions
- Should E2E tests run against a locally started dev server or a deployed preview URL? (Dev server is more self-contained; preview URL is closer to production.)
- How deep should the OAuth flow test go — full redirect-based flow with mocked provider, or just verify token handling with pre-seeded tokens?
- Should we test multiple browsers (Chromium + Firefox) from the start, or begin with Chromium-only to keep CI fast?
- For the extension popup, should we use Playwright's Chromium extension loading or test the popup HTML in isolation?
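For the OAuth question, one lightweight option is to intercept the token endpoint with page.route() and return canned tokens. The sketch below shows that idea under stated assumptions: PageLike and RouteLike are minimal structural stand-ins for the parts of Playwright's Page and Route used here (so the block does not import @playwright/test), and the FakeTokens fields and the URL pattern in the usage comment are hypothetical and would need to match the app's actual OAuth client.

```typescript
// Minimal structural stand-ins for Playwright's Page and Route.
interface RouteLike {
  fulfill(response: { status: number; contentType: string; body: string }): Promise<void>;
}
interface PageLike {
  route(url: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
}

// Hypothetical token payload; real field names depend on the OAuth client.
interface FakeTokens {
  access_token: string;
  refresh_token: string;
  expires_in: number;
}

// Registers a route handler that answers token requests with canned JSON,
// so the test never talks to a live OAuth provider.
async function mockTokenEndpoint(
  page: PageLike,
  tokenUrlPattern: string,
  tokens: FakeTokens,
): Promise<void> {
  await page.route(tokenUrlPattern, (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(tokens),
    }),
  );
}

// Intended use in a spec (not executed here; the pattern is an assumption):
//   await mockTokenEndpoint(page, '**/oauth2/access_token', {
//     access_token: 'test-access', refresh_token: 'test-refresh', expires_in: 3600,
//   });
```

Token refresh could be exercised the same way by returning a short expires_in first and a fresh token on the next intercepted request.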
Potential Risks or Gotchas
- E2E tests are inherently slower and flakier than unit tests, so they need careful use of waitFor and explicit assertions to avoid timing issues.
- The extension popup flow requires loading an unpacked extension in Chromium, which has specific Playwright configuration requirements (chromium.launchPersistentContext with --load-extension).
- UserScript injection in Playwright may behave differently than Tampermonkey/Greasemonkey — tests should verify the injection mechanism works in a vanilla browser context.
- CI runners have limited resources — Playwright tests with video/screenshots can be memory-heavy; may need to be selective about trace collection.
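On the userscript gotcha, a possible starting point is to register the built script with addInitScript. Init scripts run before the page's own scripts, when the DOM may not exist yet, so this sketch wraps the source in a DOMContentLoaded listener to approximate a later run time. That wrapper is an assumption and does not reproduce Tampermonkey's exact @run-at behaviour or its GM_* APIs. InitScriptTarget is a structural stand-in for Playwright's Page or BrowserContext, and both helper names are hypothetical.

```typescript
// Structural stand-in for the addInitScript method on Page / BrowserContext.
interface InitScriptTarget {
  addInitScript(script: { content: string }): Promise<void>;
}

// Defers the userscript body until the DOM has been parsed.
// Assumption: the built userscript is a plain script (no top-level import/export).
function wrapForDomReady(userscriptSource: string): string {
  return `window.addEventListener('DOMContentLoaded', () => {\n${userscriptSource}\n});`;
}

// Registers the wrapped userscript so it is evaluated on every navigation.
async function injectUserscript(
  target: InitScriptTarget,
  userscriptSource: string,
): Promise<void> {
  await target.addInitScript({ content: wrapForDomReady(userscriptSource) });
}

// Intended use in a spec (not executed here):
//   await injectUserscript(page, builtUserscriptText);
//   await page.goto(articleUrl); // then assert that the review modal appears
```

A spec built on this would still only show that the script works in a vanilla browser context, so a manual check under a real userscript manager remains worthwhile.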