perf: reuse Apple runner cache across version bumps by thymikee · Pull Request #900 · callstack/agent-device

thymikee · 2026-06-27T12:04:29Z

Summary

Reduce Apple runner cold-build and first-use work while keeping the cache reliability boundary intact.

| Area | Before: global `agent-device@0.18.0` | After: PR #900 cumulative |
| --- | --- | --- |
| npm/user runner build | Runtime command plus in-bundle Swift unit-test methods compiled in the same UI-test target | Runtime command only; Swift unit-test methods are behind `AGENT_DEVICE_RUNNER_UNIT_TESTS` |
| Maintainer Swift coverage | Coupled to `build:all` | Separate macOS CI compile job for the guarded Swift unit-test surface |
| Asset catalog | `actool` runs for tiny branding assets on every cold runner build | `Assets.xcassets` and bundled branding images are removed from repo/npm/project inputs |
| Cache key | Reusable across package version bumps; invalidates on toolchain/build metadata | Same, plus build metadata records unit-test Swift flags without asset-catalog-only settings |
| Maintainer/CI script destination | Defaulted to `generic/platform=iOS Simulator`, which built arm64 + x86_64 locally | Picks a concrete available iOS/tvOS simulator with a 3s fallback to generic, so `build:xcuitest` can build one active simulator arch |
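
To illustrate the "Cache key" row, here is a minimal sketch of a key that stays stable across package version bumps but changes when the toolchain or build settings change. The `RunnerBuildMetadata` shape, the `runnerCacheKey` helper, and the `0.19.0` / `26.3` values are hypothetical and only for illustration; this is not the project's actual metadata schema.

```typescript
// Hypothetical metadata shape; field names are illustrative assumptions.
interface RunnerBuildMetadata {
  packageVersion: string; // kept for diagnostics, deliberately left out of the key
  toolchain: string; // e.g. the Xcode version string
  buildSettings: Record<string, string>;
}

// Build a deterministic key from the inputs that affect the compiled runner.
function runnerCacheKey(meta: RunnerBuildMetadata): string {
  // Sort setting names so the key does not depend on object insertion order.
  const settings = Object.keys(meta.buildSettings)
    .sort()
    .map((name) => `${name}=${meta.buildSettings[name]}`);
  return JSON.stringify({ toolchain: meta.toolchain, settings });
}

const keyOld = runnerCacheKey({
  packageVersion: "0.18.0",
  toolchain: "Xcode 26.2",
  buildSettings: { ENABLE_TESTABILITY: "YES" },
});
const keyBumped = runnerCacheKey({
  packageVersion: "0.19.0", // hypothetical next version
  toolchain: "Xcode 26.2",
  buildSettings: { ENABLE_TESTABILITY: "YES" },
});
const keyNewToolchain = runnerCacheKey({
  packageVersion: "0.18.0",
  toolchain: "Xcode 26.3", // hypothetical toolchain change
  buildSettings: { ENABLE_TESTABILITY: "YES" },
});

console.log(keyOld === keyBumped); // true: a version bump reuses the cache
console.log(keyOld === keyNewToolchain); // false: a toolchain change invalidates it
```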

Fresh measurements on Xcode 26.2, iPhone 17 Pro Max simulator, alternating runs, fresh DerivedData per run:

| Scenario | Before runs | Before median / mean | After runs | After median / mean | Delta |
| --- | --- | --- | --- | --- | --- |
| `xcodebuild build-for-testing` runner build | 8.69s, 8.07s, 7.12s, 7.15s, 8.06s | 8.06s / 7.82s | 7.82s, 7.76s, 7.74s, 7.82s, 7.59s | 7.76s / 7.75s | -3.7% median |
| End-to-end first use: `open settings` then first `snapshot -i` with no runner cache | 31.5s | 31.5s | 21.4s | 21.4s | -32% wall time |

The build-only number is intentionally modest because Xcode warmup and shared compiler caches dominate after the first run, but the baseline still runs asset catalog work on every clean DerivedData build and the PR branch does not. The first-use CLI comparison is the more representative user path for "runner is not installed yet".
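
For anyone re-deriving the summary columns, the following sketch recomputes median, mean, and relative delta from the raw run times listed above. The helper names are mine and not part of the repo.

```typescript
// Median of a list of timings (seconds).
function median(values: number[]): number {
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  // Non-null assertions: indices are in range for any non-empty input.
  return sorted.length % 2 === 1 ? sorted[mid]! : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

// Arithmetic mean of a list of timings (seconds).
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Relative change from `before` to `after`, in percent (negative means faster).
function percentDelta(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

const beforeRuns = [8.69, 8.07, 7.12, 7.15, 8.06];
const afterRuns = [7.82, 7.76, 7.74, 7.82, 7.59];

console.log(median(beforeRuns), mean(beforeRuns).toFixed(2)); // 8.06 "7.82"
console.log(median(afterRuns), mean(afterRuns).toFixed(2)); // 7.76 "7.75"
console.log(percentDelta(median(beforeRuns), median(afterRuns)).toFixed(1)); // "-3.7"
console.log(percentDelta(31.5, 21.4).toFixed(0)); // "-32"
```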

I tested lazy-loading the screen recording Swift surface after these measurements. It saved about 0.54s median on default clean runner builds, but it required a second runner build variant, feature-specific cache keys, session reuse handling for fixed DerivedData paths, and a more surprising first record path. That tradeoff was not worth shipping, so the lazy-recording commit was reverted in b82124170 and recording remains part of the normal runner.

Latest additional A/B matrix for low-complexity build-setting levers, all run sequentially with fresh DerivedData per build on Xcode 26.2 / iPhone simulator:

| Variant | Runs | Wall median / mean | Result |
| --- | --- | --- | --- |
| Current PR baseline | 5 | 6.128s / 6.632s | Baseline for this run |
| `ENABLE_TESTABILITY=NO` | 5 | 5.888s / 5.904s | Reject for runtime: only ~0.24s median and changes Swift testability semantics |
| `SWIFT_SERIALIZE_DIAGNOSTICS=NO` | 5 | 6.042s / 6.073s | Reject: command-line override did not remove `-serialize-diagnostics` |
| `SWIFT_EMIT_MODULE_SEPARATELY=NO` | 5 | 6.016s / 5.971s | Reject: command-line override did not remove `-experimental-emit-module-separately` |
| `SWIFT_ENABLE_INCREMENTAL_COMPILATION=NO` | 5 | 6.110s / 6.023s | Reject: removed `-incremental`, no reliable wall-time win |
| Testability off + diagnostics off | 5 | 5.922s / 6.067s | Reject |
| Testability off + emit-module off | 5 | 5.975s / 5.976s | Reject |
| Testability off + incremental off | 5 | 5.876s / 5.897s | Reject: best median here, still only ~0.25s and changes testability semantics |

One missed lever did pay off for maintainer/CI script builds:

| Script destination | Runs | Wall median / mean | Archs compiled | Result |
| --- | --- | --- | --- | --- |
| `generic/platform=iOS Simulator` | 5 | 8.210s / 8.042s | arm64 + x86_64 | Old script default |
| Concrete iOS simulator UDID | 5 | 5.767s / 5.825s | arm64 only | -2.443s median / -29.8% |

Runtime/user runner builds already use concrete simulator destinations, so this does not change the npm ios-prepare path. The change applies to `scripts/build-xcuitest-apple.sh` (`pnpm build:xcuitest`), which now probes CoreSimulator for an available concrete iOS/tvOS simulator with a 3s timeout and falls back to the generic destination if discovery fails.
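
The selection logic can be sketched roughly as below. This is an illustration in TypeScript, not the shell script from this PR: `pickDestination` is a hypothetical helper, and it assumes the caller has already run `xcrun simctl list devices available --json` under its own 3s timeout and passes `undefined` when that probe failed or timed out.

```typescript
type Platform = "iOS" | "tvOS";

// Subset of the simctl JSON output that this sketch reads.
interface SimDevice {
  udid: string;
  name: string;
  isAvailable?: boolean;
}
interface SimctlList {
  devices?: Record<string, SimDevice[]>;
}

// Returns an xcodebuild -destination value: a concrete simulator when one is
// known, otherwise the generic destination for that simulator family.
function pickDestination(simctlJson: string | undefined, platform: Platform): string {
  const generic = `generic/platform=${platform} Simulator`;
  if (simctlJson === undefined) return generic; // probe failed or timed out

  let parsed: SimctlList | null;
  try {
    parsed = JSON.parse(simctlJson) as SimctlList | null;
  } catch {
    return generic; // unparseable output is treated like a failed probe
  }

  for (const [runtime, devices] of Object.entries(parsed?.devices ?? {})) {
    // Runtime identifiers look like com.apple.CoreSimulator.SimRuntime.iOS-18-0.
    if (!runtime.includes(`SimRuntime.${platform}-`)) continue;
    const device = devices.find((d) => d.isAvailable !== false);
    if (device) return `platform=${platform} Simulator,id=${device.udid}`;
  }
  return generic;
}

// Example input with a made-up UDID.
const sampleSimctl = JSON.stringify({
  devices: {
    "com.apple.CoreSimulator.SimRuntime.iOS-18-0": [
      { udid: "11111111-2222-3333-4444-555555555555", name: "iPhone", isAvailable: true },
    ],
  },
});

console.log(pickDestination(sampleSimctl, "iOS"));
console.log(pickDestination(undefined, "iOS"));
```

Keeping the probe result as plain input makes the fallback path easy to exercise without a simulator installed.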

Earlier profiling experiments that informed the patch:

| Experiment | Wall time | SwiftCompile | Asset catalog | Result |
| --- | --- | --- | --- | --- |
| Default single-file cold-ish | 10.62s | 7.22s | 5.57s | First `actool` run is noisy |
| Wholemodule after warmup | 6.43s | 1.53s | 0.67s | Less repeated Swift work, but not lower warm wall time |
| Single-file after warmup | 5.61s | 7.30s | 0.68s | Kept default; parallelism wins wall-clock |
| Asset baseline | 7.93s | 6.91s | 4.09s | `actool` still noisy |
| Asset catalog removed | 5.58s | 7.34s | n/a | Removes the unstable `actool` source input and drops packaged branding assets |

I tried running the guarded Swift unit tests via `xcodebuild test-without-building` on the macOS UI-test target. Even an allowlist of three device-free methods still launched the UI-test host and was slow/problematic locally, so this PR only compiles that surface in a separate CI job. Running those tests cheaply would require a future target split away from the UI-test runner.
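
As a rough sketch of how the opt-in compile could be wired, the helper below maps the `AGENT_DEVICE_XCUITEST_INCLUDE_UNIT_TESTS` variable from the validation notes to the `-D AGENT_DEVICE_RUNNER_UNIT_TESTS` Swift flag. The `unitTestBuildSettings` name is hypothetical, and passing the flag through `OTHER_SWIFT_FLAGS` is an assumption; the real script may set it differently.

```typescript
// Hypothetical helper: returns extra NAME=value build settings for xcodebuild.
// Assumption: the compile-time condition is passed through OTHER_SWIFT_FLAGS.
function unitTestBuildSettings(env: Record<string, string | undefined>): string[] {
  const enabled = env["AGENT_DEVICE_XCUITEST_INCLUDE_UNIT_TESTS"] === "1";
  // Runtime builds get no extra flags, so the guarded test methods are not compiled.
  return enabled ? ["OTHER_SWIFT_FLAGS=$(inherited) -D AGENT_DEVICE_RUNNER_UNIT_TESTS"] : [];
}

const runtimeSettings = unitTestBuildSettings({});
const unitTestSettings = unitTestBuildSettings({ AGENT_DEVICE_XCUITEST_INCLUDE_UNIT_TESTS: "1" });

console.log(runtimeSettings.length); // 0
console.log(unitTestSettings[0]); // OTHER_SWIFT_FLAGS=$(inherited) -D AGENT_DEVICE_RUNNER_UNIT_TESTS
```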

Validation

- `pnpm build`: rebuilt local dist before CLI measurements.
- `pnpm exec vitest run src/platforms/ios/__tests__/runner-client.test.ts src/platforms/ios/__tests__/runner-xctestrun.test.ts src/platforms/ios/__tests__/runner-icon.test.ts`: 79 tests passed after reverting lazy recording.
- `pnpm check:quick`: lint and typecheck passed after reverting lazy recording.
- `pnpm build:xcuitest`: passed for iOS and macOS after reverting lazy recording.
- `node ./node_modules/oxfmt/bin/oxfmt --write ...`: completed cleanly earlier; direct invocation used because `pnpm format` tried to verify/fetch pnpm@11.1.2 without network in this sandbox.
- `npm pack --dry-run --ignore-scripts --json --cache /private/tmp/agent-device-npm-cache`: package has 161 files, 568 KB packed / 1.97 MB unpacked, and no `Assets.xcassets`, `logo.jpg`, or `powered-by.png` entries.
- `xcodebuild build-for-testing` benchmark: 5 alternating clean DerivedData runs for global 0.18 and PR #900; baseline logs contain `Assets.xcassets`, PR logs do not.
- End-to-end first-use CLI benchmark: global 0.18 31.5s, PR #900 21.4s with isolated runner DerivedData/state.
- Local tweet video artifacts generated from the measured first-use timings:
  - .tmp/runner-install-comparison-20260627/videos/agent-device-0.18-first-snapshot.mp4
  - .tmp/runner-install-comparison-20260627/videos/agent-device-pr900-first-snapshot.mp4
  - .tmp/runner-install-comparison-20260627/videos/agent-device-runner-first-snapshot-comparison.mp4
- `AGENT_DEVICE_XCUITEST_INCLUDE_UNIT_TESTS=1 AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH=/private/tmp/agent-device-swift-unit-compile-derived pnpm build:xcuitest:macos`: passed earlier; command line showed `-D AGENT_DEVICE_RUNNER_UNIT_TESTS`.
- Latest pushed SHA bb032737a; CI is re-running for this push.
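
The `npm pack --dry-run` check in the list above was done by reading the output. A small sketch of automating it follows; it assumes the `--json` output is an array of objects with a `files` list of `{ path }` entries, and the `findForbiddenEntries` helper and sample paths are hypothetical.

```typescript
// Subset of the `npm pack --json` output that this sketch reads.
interface PackedFile {
  path: string;
}
interface PackResult {
  files: PackedFile[];
}

// File or directory names that should no longer ship in the tarball.
const forbiddenNames = ["Assets.xcassets", "logo.jpg", "powered-by.png"];

// Returns every packed path that contains a forbidden name as a path segment.
function findForbiddenEntries(packJson: string): string[] {
  const results = JSON.parse(packJson) as PackResult[];
  const hits: string[] = [];
  for (const result of results) {
    for (const file of result.files) {
      const segments = file.path.split("/");
      if (segments.some((segment) => forbiddenNames.includes(segment))) hits.push(file.path);
    }
  }
  return hits;
}

// Made-up sample outputs for illustration.
const cleanPack = JSON.stringify([{ files: [{ path: "package.json" }, { path: "dist/src/9722.js" }] }]);
const dirtyPack = JSON.stringify([{ files: [{ path: "runner/Assets.xcassets/Contents.json" }] }]);

console.log(findForbiddenEntries(cleanPack)); // []
console.log(findForbiddenEntries(dirtyPack)); // ["runner/Assets.xcassets/Contents.json"]
```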

github-actions · 2026-06-27T12:06:40Z

Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.4 MB | 1.4 MB | +167 B |
| JS gzip | 445.5 kB | 445.6 kB | +52 B |
| npm tarball | 584.7 kB | 545.9 kB | -38.8 kB |
| npm unpacked | 2.0 MB | 1.9 MB | -38.9 kB |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 23.5 ms | 24.8 ms | +1.3 ms |
| CLI --help | 42.8 ms | 43.4 ms | +0.6 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| `dist/src/9722.js` | +167 B | +52 B |

github-actions · 2026-06-27T12:53:53Z

PR Preview Action v1.8.1
🚀 View preview at https://callstack.github.io/agent-device/pr-preview/pr-900/
Built to branch `gh-pages` at 2026-06-27 12:53 UTC. Preview will be ready when the GitHub Pages deployment is complete.


thymikee · 2026-06-28T06:48:47Z

Reviewed the latest head, including the new concrete-simulator destination commit.

I do not see a blocker. The default iOS/tvOS script path now prefers an available concrete simulator id, but keeps the generic simulator fallback when simctl lookup fails. Cache metadata still normalizes the destination back to the simulator family, so choosing a specific UDID should not churn runner cache keys. The unit-test Swift flag is now explicitly represented in both the shell build path and metadata comparison, so runtime and unit-test runner variants stay separated. Current CI is green, including Swift Runner Unit Compile, typecheck, unit, integration, smoke, and iOS runner compatibility.
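
To make the cache-key reasoning in this review concrete, here is a minimal sketch of destination normalization plus a metadata comparison that keeps runtime and unit-test variants apart. `destinationFamily`, `RunnerVariant`, and `sameRunnerVariant` are hypothetical names, and the UDID is made up; the actual metadata code in the PR may be structured differently.

```typescript
// Reduce an xcodebuild destination to its simulator family, so that
// "platform=iOS Simulator,id=<UDID>" and "generic/platform=iOS Simulator"
// both normalize to "iOS Simulator".
function destinationFamily(destination: string): string {
  const match = /platform=([^,]+)/.exec(destination);
  return match?.[1]?.trim() ?? destination;
}

// Hypothetical slice of the build metadata relevant to this comparison.
interface RunnerVariant {
  destination: string;
  unitTests: boolean; // whether -D AGENT_DEVICE_RUNNER_UNIT_TESTS was set
}

// Two builds can share a cache entry only if family and unit-test flag match.
function sameRunnerVariant(a: RunnerVariant, b: RunnerVariant): boolean {
  return destinationFamily(a.destination) === destinationFamily(b.destination) && a.unitTests === b.unitTests;
}

const genericBuild: RunnerVariant = { destination: "generic/platform=iOS Simulator", unitTests: false };
const concreteBuild: RunnerVariant = {
  destination: "platform=iOS Simulator,id=11111111-2222-3333-4444-555555555555",
  unitTests: false,
};
const unitTestBuild: RunnerVariant = { destination: "generic/platform=iOS Simulator", unitTests: true };

console.log(sameRunnerVariant(genericBuild, concreteBuild)); // true: a UDID does not churn the key
console.log(sameRunnerVariant(genericBuild, unitTestBuild)); // false: variants stay separated
```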

Residual risk: this is still Apple runner build/cache behavior, so I would treat the PR body’s local Xcode measurements and first-use validation as the device-facing evidence rather than relying on fixture tests alone. With that evidence plus green CI, this is ready for maintainer merge judgment.

thymikee force-pushed the perf/apple-runner-build-time branch 2 times, most recently from e2897b2 to 1799d0d on June 27, 2026 13:37

thymikee added 6 commits June 27, 2026 21:04

- 237efc8 perf: reuse Apple runner cache across version bumps
- ff9fdfc perf: remove unused Apple runner symbols
- 4d24f3f perf: keep Swift runner unit tests out of runtime builds
- fd64667 perf: skip Apple runner asset catalog in runtime builds
- f7ba091 perf: lazy-load apple runner recording support
- 10ef1ef Revert "perf: lazy-load apple runner recording support" (This reverts commit 4ee1620.)

thymikee force-pushed the perf/apple-runner-build-time branch from b821241 to 10ef1ef on June 27, 2026 19:05

- bb03273 perf: use concrete simulator for xcuitest script builds

thymikee added the ready-for-human label ("Valid work that needs human implementation, judgment, or maintainer merge") on Jun 28, 2026

thymikee wants to merge 7 commits into main from perf/apple-runner-build-time
