fix(gbrain): strip .git when the remote URL has a trailing slash#1896

Open
jbetala7 wants to merge 1 commit into
garrytan:mainfrom
jbetala7:oss/fix-canonicalize-remote-trailing-slash
Contributor

@jbetala7 jbetala7 commented Jun 7, 2026

Problem

canonicalizeRemote() (lib/gstack-memory-helpers.ts) produces the canonical host/org/repo dedup key: no scheme, no trailing .git. When a configured origin URL ends with a trailing slash (e.g. git remote add origin https://github.com/garrytan/gstack.git/, which is valid and clones fine), the .git suffix was not stripped:

canonicalizeRemote('https://github.com/garrytan/gstack.git/')
  => 'github.com/garrytan/gstack.git'   // expected 'github.com/garrytan/gstack'

Root cause

.git was stripped before trailing slashes, and the .git$ regex is anchored on end-of-string:

s = s.replace(/\.git$/i, "");   // string ends in "/", so this no-ops
s = s.replace(/\/+$/, "");      // slash removed → "...gstack.git" left behind
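To make the ordering problem concrete, here is a minimal sketch that traces the two steps in their pre-fix order. The helper name stripGitThenSlash is hypothetical and the input is assumed to already have its scheme removed; only the two replace() calls are taken from the snippet above.

```typescript
// Hypothetical helper: records the string after each of the two pre-fix steps.
// Only the two replace() calls come from the PR description.
function stripGitThenSlash(input: string): string[] {
  const trace: string[] = [input];
  let s = input;

  s = s.replace(/\.git$/i, ""); // no-op: the string ends in "/", not ".git"
  trace.push(s);

  s = s.replace(/\/+$/, ""); // removes the slash, exposing ".git" too late
  trace.push(s);

  return trace;
}

// Assumed input: the example remote with its scheme already stripped.
const buggyTrace = stripGitThenSlash("github.com/garrytan/gstack.git/");
console.log(buggyTrace);
```

The last entry of the trace still ends in .git, which matches the unexpected result reported in the Problem section.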

This key feeds gbrain code source-ids (deriveCodeSourceId / deriveLegacyCodeSourceId in bin/gstack-gbrain-sync.ts, and bin/gstack-memory-ingest.ts), which join the last two path segments. As a result, the same repo produced different source-ids across machines depending on whether origin carried a trailing slash: gstack-code-garrytan-gstack-<hash> vs gstack-code-garrytan-gstack-git-<hash> after id sanitization. That splits federated cross-source search, which is exactly what the canonical key exists to prevent.
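The divergence can be sketched roughly as follows. This is not the code in bin/gstack-gbrain-sync.ts: sketchSourceIdPrefix is a hypothetical stand-in, the sanitization rule (lowercase, collapse non-alphanumerics to "-") is an assumption, and the trailing hash is omitted entirely.

```typescript
// Hypothetical stand-in for the source-id derivation. The real functions
// also append a hash, which is left out of this sketch.
function sketchSourceIdPrefix(canonicalKey: string): string {
  // Join the last two path segments (org and repo), as the description states.
  const segments = canonicalKey.split("/").filter((seg) => seg.length > 0);
  const lastTwo = segments.slice(-2).join("-");

  // Assumed sanitization: lowercase, then collapse any run of characters
  // outside [a-z0-9] into a single "-" and trim leading/trailing dashes.
  const sanitized = lastTwo
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");

  return `gstack-code-${sanitized}`;
}

const cleanPrefix = sketchSourceIdPrefix("github.com/garrytan/gstack");
const splitPrefix = sketchSourceIdPrefix("github.com/garrytan/gstack.git");

console.log(cleanPrefix); // gstack-code-garrytan-gstack
console.log(splitPrefix); // gstack-code-garrytan-gstack-git
```

Under these assumptions the leftover .git becomes a -git segment in the id, so two machines pointing at the same repo end up with different source-ids.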

Fix

Strip trailing slash(es) first, then the .git suffix (a reorder of the two existing steps). No behavior change for any input that previously canonicalized correctly.
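As a rough illustration of the reorder, the sketch below runs both orderings over a handful of sample inputs. The function names oldOrder and newOrder and the sample list are hypothetical, and the inputs are assumed to have their scheme already removed; the real canonicalizeRemote() does more than these two steps.

```typescript
// Pre-fix order: ".git" first, then trailing slashes.
function oldOrder(s: string): string {
  return s.replace(/\.git$/i, "").replace(/\/+$/, "");
}

// Post-fix order: trailing slashes first, then ".git".
function newOrder(s: string): string {
  return s.replace(/\/+$/, "").replace(/\.git$/i, "");
}

// Hypothetical sample inputs covering the four slash/.git combinations.
const samples: string[] = [
  "github.com/garrytan/gstack",
  "github.com/garrytan/gstack.git",
  "github.com/garrytan/gstack/",
  "github.com/garrytan/gstack.git/",
  "github.com/garrytan/gstack.git//",
];

const comparison = samples.map((input) => ({
  input,
  before: oldOrder(input),
  after: newOrder(input),
}));

for (const row of comparison) {
  const changed = row.before === row.after ? "same" : "changed";
  console.log(`${row.input} | old: ${row.before} | new: ${row.after} | ${changed}`);
}
```

For these samples, the two orderings agree on the first three inputs and differ only where .git is followed by one or more slashes, where the new order yields the expected key.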

Testing

bun test test/gstack-memory-helpers.test.ts — 27 pass (was 25; +2 new cases).

Added regression tests for the .git/ combination, proven to fail on main before the fix:

✗ strips .git even when the URL has a trailing slash
    Expected: "github.com/garrytan/gstack"
    Received: "github.com/garrytan/gstack.git"
✗ produces the same key with or without a trailing slash

Also ran the downstream consumer suite bun test test/gstack-gbrain-sync.test.ts — 37 pass, 0 fail.

Fixes #1895

canonicalizeRemote() stripped the trailing `.git` suffix before stripping
trailing slashes. Because the `.git$` match is anchored on end-of-string,
a remote written with a trailing slash (e.g. `.../repo.git/`) skipped the
`.git` strip and canonicalized to `.../repo.git` instead of `.../repo`.

That key is the cross-machine dedup key and feeds gbrain code source-ids
(deriveCodeSourceId / deriveLegacyCodeSourceId, gstack-memory-ingest), so
the same repo produced different source-ids depending on whether origin had
a trailing slash, splitting federated cross-source search.

Reorder so trailing slash(es) are stripped first, then `.git`. Adds
regression tests for the `.git/` combination (proven to fail pre-fix).

Fixes garrytan#1895

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

trunk-io Bot commented Jun 7, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

Development

Successfully merging this pull request may close these issues.

canonicalizeRemote keeps the .git suffix when the remote URL has a trailing slash, splitting the dedup/source-id key
