CI: Re-land file-hash cache keys with fix for Intel macOS libunwind regression #1700

Draft
zlav wants to merge 3 commits into master from zach/reland-cache-keys-with-mac-fix
zlav commented Apr 23, 2026

Overview

Re-lands #1666 (file-hash + release-tag-rotation CI cache keys) together with a fix for the Intel macOS dynamic-linking regression that caused the original PR to be reverted in #1675 / v3.16.4.

Opening as a draft until the Intel-macOS artifact has been verified on a clean Mac without Homebrew LLVM.

What changed

Commit 1 — revert of the revert. Re-applies the PR #1666 diff verbatim: hashFiles(...)-based cabal-store keys, release-tag-rotated dist-newstyle keys, and deletion of compute_cache_key.sh. See the original #1666 description for motivation (cache-pressure reduction + build-time variance on master).

Commit 2 — three fixes carried forward:

  1. Git tag fallback under fetch-depth: 2 (build-all.yml, bench.yml, integrations-test.yml). git describe --tags --abbrev=0 requires the tagged commit to be reachable in local history, which fetch-depth: 2 breaks — it always resolved to "none", defeating tag-based cache rotation. Switched to git tag --sort=-v:refname | head -1 || true, which reads fetched tag refs regardless of shallow depth and stays compatible with bash pipefail.

  2. Normalize Intel macOS dylib paths before codesign (build-all.yml). On macos-latest-large, ghcup's GHC links the final binary against libunwind from Homebrew's llvm@18 at /usr/local/opt/llvm@18/lib/libunwind.1.dylib. End-user Macs without Homebrew LLVM don't have that path, so the v3.16.3 binary failed to launch with dyld: Library not loaded. This is the exact regression called out in the v3.16.4 changelog ("revert caching changes as they caused a problem with missing libs on macos"). The fix uses install_name_tool to rewrite the load command to /usr/lib/libunwind.1.dylib (system libunwind, present on every macOS 11+ install). The rewrite runs before codesign so that the re-sign covers the patched binary.

  3. Regression guard for Homebrew dylib deps (build-all.yml). New CI step runs otool -L on each macOS binary and fails the build if any LC_LOAD_DYLIB references /usr/local/opt/ or /opt/homebrew/. Runs on both Intel and arm64 matrix entries so future regressions on either architecture are caught in PR CI instead of reaching a release.
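
Fix 1 above can be sketched as a small bash helper. This is a minimal illustration, not the code in the workflow files: the function name `latest_tag_or_none` is hypothetical, and the "none" default is an assumption based on the fallback value mentioned in the description.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (the name is illustrative, not from the workflow files).
# Prints the highest version-sorted tag known to the local repo, or "none"
# when no tag refs are present. The "none" default is an assumption.
latest_tag_or_none() {
  local tag
  # head -1 may close the pipe early, so git can exit on SIGPIPE; under
  # pipefail that fails the whole pipeline, and `|| true` absorbs it.
  tag="$(git tag --sort=-v:refname | head -1 || true)"
  printf '%s\n' "${tag:-none}"
}
```

Because `git tag` lists tag refs rather than walking commit history, the result does not depend on how many commits the shallow clone contains, as long as the tag refs themselves were fetched.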

Evidence that the fix addresses the regression

Comparison of the released artifacts (no other build env differences):

$ otool -L fossa_3.16.3_darwin_amd64 | grep -E '/usr/local|/opt/homebrew'
  /usr/local/opt/llvm@18/lib/libunwind.1.dylib (compatibility version 1.0.0, current version 1.0.0)

$ otool -L fossa_3.16.4_darwin_amd64 | grep -E '/usr/local|/opt/homebrew'
  (no output)

Both binaries are otherwise identical (same size 112150704 bytes, same load commands, same SDK / minos). The libunwind entry is the only functional difference between them.

Acceptance criteria

  • Intel macOS artifact built from this branch has no /usr/local/opt/* or /opt/homebrew/* entries under otool -L.
  • arm64 macOS artifact is unchanged (already clean today).
  • The new "Assert no Homebrew dylib dependencies" CI step runs green on this branch and would fail if the rewrite step were removed.
  • Released binaries launch on a bare Intel Mac without Homebrew installed.
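
The first and third criteria could be checked with a guard along these lines. This is a sketch under stated assumptions: `homebrew_deps` and `assert_no_homebrew_deps` are hypothetical names, and the actual "Assert no Homebrew dylib dependencies" step in build-all.yml may be written differently.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reads `otool -L` output on stdin and prints every
# dependency line that points into a Homebrew prefix. Prints nothing if clean.
homebrew_deps() {
  grep -E '^[[:space:]]+(/usr/local/opt/|/opt/homebrew/)' || true
}

# Hypothetical helper: returns non-zero if the given binary links against a
# Homebrew-installed dylib, or if otool itself fails.
assert_no_homebrew_deps() {
  local bin="$1" hits
  hits="$(otool -L "$bin" | homebrew_deps)" || return 1
  if [ -n "$hits" ]; then
    echo "error: $bin links against Homebrew dylibs:" >&2
    echo "$hits" >&2
    return 1
  fi
}
```

Calling `assert_no_homebrew_deps fossa` in a CI step would fail the job on the v3.16.3-style binary shown in the evidence section and pass on the v3.16.4-style one.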

Testing plan

CI-side

  • CI runs on this branch should pass on both macOS matrix entries, exercising the new guard.
  • Temporarily remove the install_name_tool step on a scratch branch to confirm the guard catches the regression (can do this out-of-band to avoid polluting this PR's history).

Binary-side (required before un-drafting)

  1. Pull the macOS-intel-binaries artifact from this branch's build-all run.
  2. otool -L fossa | grep -E '/usr/local|/opt/homebrew' — expect no output.
  3. On a clean Intel Mac (VM or dev box with Homebrew uninstalled): ./fossa --version. Expect the normal version string, not a dyld: Library not loaded error.
  4. Repeat on the macOS-arm64-binaries artifact — same expectations.
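
Step 3 above could be scripted roughly as follows. The helper name `smoke_launch` is hypothetical; only `./fossa --version` comes from this plan, and whether the other shipped binaries accept `--version` is an assumption that would need checking.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: runs "<binary> --version" and fails if the process
# exits non-zero or the dynamic loader reports a missing library.
smoke_launch() {
  local bin="$1" out status=0
  out="$("$bin" --version 2>&1)" || status=$?
  if [ "$status" -ne 0 ] || [[ "$out" == *"Library not loaded"* ]]; then
    echo "error: $bin failed to launch (exit $status)" >&2
    echo "$out" >&2
    return 1
  fi
  echo "ok: $bin -> $out"
}
```

On the clean Intel Mac, `smoke_launch ./fossa` should print the version string; a binary that still references the Homebrew libunwind path would fail here with the dyld error.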

Regression history checks

  • Confirm the originally reported tag-fetch issue is gone: a release-tag workflow run should succeed end-to-end without the refs/tags/... fetch conflict (actions/checkout@v6 is already on master and carries forward).

Risks

  • libunwind ABI compatibility. We're rewriting the load command from LLVM's libunwind to Apple's system libunwind. Both adhere to the standard _Unwind_* C ABI that the Haskell RTS / GHC rely on. If GHC starts using LLVM-specific unwinding intrinsics, the rewrite would break at runtime. Mitigation: the bare-Mac smoke test would catch a failure that surfaces at launch; the otool guard checks only load-command paths and would not detect an ABI mismatch.
  • Cache behavior. Same risks as the original #1666 ("Replace plan-hash cache keys with file-hash and tag rotation"): staleness between releases, bounded by the 7-day cache TTL and release frequency. Unchanged from the original PR.
  • Runner image drift. If a future macos-latest-large image moves libunwind to a different Homebrew path (e.g., llvm@19), the install_name_tool -change won't match and the guard will fire. That is the intended behavior — we want to be notified, not silently ship a broken binary.
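
The rewrite step behind the runner-drift risk can be sketched as below. The two paths come from this description; the function name and the `INSTALL_NAME_TOOL` override are hypothetical, added only so the sketch can be dry-run on a machine without Xcode tools.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Paths taken from the PR description.
OLD_LIBUNWIND='/usr/local/opt/llvm@18/lib/libunwind.1.dylib'
NEW_LIBUNWIND='/usr/lib/libunwind.1.dylib'

# Hypothetical helper. INSTALL_NAME_TOOL is an illustrative override hook
# for dry runs; it is not an environment variable the real tool reads.
normalize_libunwind() {
  local bin="$1"
  "${INSTALL_NAME_TOOL:-install_name_tool}" \
    -change "$OLD_LIBUNWIND" "$NEW_LIBUNWIND" "$bin"
}
```

`install_name_tool -change` only rewrites an entry whose install name matches the old path exactly. If the runner image starts linking against a different path, the call leaves the binary untouched, which is why the separate otool guard is needed to surface the drift.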

Checklist

  • CI workflow change — validated via the workflow run on this branch.
  • No user-visible behavior change (internal CI + build packaging).
  • No .fossa.yml / fossa-deps / subcommand changes.
  • Bare-Intel-Mac launch verification (blocker for un-drafting).

zlav and others added 3 commits April 22, 2026 13:59
Three fixes carried forward from the post-#1666 CI churn so the
re-land doesn't re-introduce the same bugs.

1. Git tag fallback under fetch-depth: 2
   git describe --tags --abbrev=0 requires the tagged commit to be
   reachable in local history; with fetch-depth: 2 it always resolves
   to "none", defeating tag-based cache rotation. Switch to
   git tag --sort=-v:refname | head -1 || true, which reads fetched
   tag refs regardless of shallow clone depth and stays compatible
   with bash pipefail.

2. Normalize Intel macOS dylib paths before codesign
   On macos-latest-large, ghcup's GHC links the final binary against
   libunwind from Homebrew's llvm@18 (/usr/local/opt/llvm@18/lib/
   libunwind.1.dylib). End-user Macs without Homebrew LLVM don't
   have that path, so the v3.16.3 binary failed to launch with
   "dyld: Library not loaded" — this is the exact regression called
   out in the v3.16.4 changelog. Rewrite the load command to
   /usr/lib/libunwind.1.dylib (which ships with every macOS 11+
   install) using install_name_tool, before codesign runs, so the
   re-sign covers the patched binary.

3. Regression guard for Homebrew dylib deps
   New CI step runs otool -L on each macOS binary and fails the
   build if any LC_LOAD_DYLIB references /usr/local/opt/ or
   /opt/homebrew/. Runs on both Intel and arm64 matrix entries so
   future regressions on either architecture are caught in PR CI
   instead of reaching a release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Complements the otool Homebrew guard. The guard proves load commands
do not reference Homebrew paths; the smoke-launch proves the binary
actually boots (catching install_name_tool corruption, init-time
failures, etc.). rendergraph is skipped because it reads piped JSON
from stdin and errors without input — if fossa/diagnose/millhone
launch, rendergraph loads through the same dyld path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>