Skip to content

emyann/repolore

Repository files navigation

repolore

Your codebase has lore — the decisions, gotchas, and why-things-are-the-way-they-are that live in nobody's head for long. repolore makes your agent write it down, cite it, and keep it honest.

It's Karpathy's llm-wiki pattern applied to a codebase — the immutable raw-source layer is the repo itself. A Claude Code plugin bootstraps and maintains an LLM-derived wiki: an orientation layer of concepts, architecture, feature histories, decision records, and hard-won gotchas, distilled from the code by your agent, for your agent (and the humans on your team).

What makes it different from DeepWiki / Code Wiki / memory banks:

  • In-repo, plain markdown, human-editable. The wiki is committed content you own and review — not a hosted artifact that rewrites itself overnight.
  • Per-claim grounding. Every concrete claim carries an inline citation to the source file it came from; unverifiable claims are demoted to > TODO-VERIFY: rather than asserted.
  • Mechanical drift detection. Each page records the git blob SHA of every file it covers; a stdlib-only script re-hashes and flags exactly which pages went stale, and which files made them stale. LLM judgment is never the staleness trigger — LLMs systematically miss implementation drift; hashes don't.
  • Coverage inversion. A second check enumerates in-scope source files no page covers — so a whole feature can't silently ship without a page.
  • Diff-driven refresh, reviewable commits. The refresh skill feeds the exact diff since the page was last verified, triages no-op / edit / rewrite, and produces a normal commit you review — never silent mutation.
  • Curated, not generated. A page plan you approve, a soft page budget, and a distillation rule ("if an agent can find it in 30 seconds, it doesn't belong here") keep it an orientation layer, not 400 pages of paraphrased code.
  • Minimal footprint. Outside the wiki itself, the tool's entire presence is one hidden .repolore/ directory (manifest + stdlib-only scripts) and a ≤10-line pointer appended to agent context files that already exist — init never creates a context file without asking.

Install

As a Claude Code plugin — namespaced commands, versioned updates:

/plugin marketplace add emyann/repolore     # or a local path while testing
/plugin install repolore@repolore

As a standalone skill, in any agent (Claude Code, Cursor, Copilot, Windsurf, …) via the skills CLI:

npx skills add emyann/repolore

That installs the single repolore umbrella skill; invoke it in plain words — "set up a repolore wiki here", "run repolore's check", "refresh the wiki". Both forms run the same procedures and scripts (see Packaging below).

Updating

Releases ship as version bumps of plugin.json (see the releases page).

Claude Code plugin — manually:

/plugin marketplace update repolore     # refresh the catalog
/plugin update repolore@repolore        # update the installed plugin

Claude Code plugin — automatically. Auto-update is off by default for third-party marketplaces; enable it once via /pluginMarketplaces → repolore → Enable auto-update, and new versions install at startup. (Launch-update semantics: the session that fetches a new version keeps running on the snapshot it already loaded — the update applies from the next session, or after /reload-plugins.) Better for teams: declare it in the project's .claude/settings.json and every teammate gets the plugin, pre-enabled and auto-updating, with zero setup:

{
  "extraKnownMarketplaces": {
    "repolore": {
      "source": { "source": "github", "repo": "emyann/repolore" },
      "autoUpdate": true
    }
  },
  "enabledPlugins": { "repolore@repolore": true }
}

Standalone skill: npx skills update re-pulls installed skills from their source repos.

The vendored layer is updated separately, on purpose. Updating the plugin/skill does not touch the .repolore/scripts/ copies committed to your repos — your wiki tooling never changes under you silently. The check workflow detects newer tooling and offers /repolore:update, which applies it safely: pristine files are regenerated, your locally-edited ones are skipped and reported (overwritten only with explicit --force consent), and the manifest SHAs stay truthful.

Use

Command What it does
/repolore:init One-time bootstrap: detect the stack → agree scope (with the in-scope file count shown up front) + a page plan with you → vendor the wiki skeleton, schema doc (AGENTS.md), check scripts and templates in one mechanical shot (bootstrap.mjs) → seed architecture/overview.md (default, declinable) → append pointer blocks to whichever agent context files already exist → optionally wire team auto-update into .claude/settings.json (see Updating) → finish with a single docs: commit (with your consent, asked once up front).
/repolore:check Health report: stale pages, uncovered «page-worthy» code clusters, index drift, page budget. Read-only, never blocks.
/repolore:refresh Bring stale pages back in line with the code: diff-driven triage, citation re-verification, re-stamp, regenerate index, reviewable commit.
/repolore:setup Activate optional capabilities with one consented question: the post-commit nudge (per-machine, or team-wide via husky/prepare script), team auto-update in project settings. Detects what is available-but-inactive so you never have to know to ask.
/repolore:update Bring the repo's vendored tooling up to the installed repolore version: deterministic classification (up-to-date / outdated-pristine / locally modified / missing / new), safe regeneration, manifest SHAs + version refreshed. Never overwrites your local edits without explicit --force consent.

With the standalone skill the same workflows are invoked in plain words ("set up a repolore wiki", "run repolore check/refresh") instead of slash commands.

Day-to-day, pages get drafted on demand ("draft features/refund-pipeline from the wiki plan") and updated as part of the change that altered the behaviour — the pointer block installed by init tells every agent session to do exactly that.

Packaging

One source of truth, two distributions: the procedures live in references/ and the assets in scripts/ + templates/. The root SKILL.md is the standalone umbrella skill the skills CLI installs (root-SKILL.md discovery means npx skills add sees exactly one skill), and the plugin's skills/init|check|refresh|update are thin shims that execute the same reference procedures — so the namespaced /repolore:* commands and the standalone skill can never drift apart.

What init vendors into your repo

docs/wiki/                  # location configurable
  index.md                  # GENERATED catalog (llms.txt-shaped)
  AGENTS.md                 # the schema doc: taxonomy, frontmatter, citation + refresh rules
  CLAUDE.md -> AGENTS.md    # symlink so Claude Code auto-loads it in the wiki tree
  GLOSSARY.md  log.md       # vocabulary + append-only journal
  wiki.config.yml           # machine-read config: scope globs, tunables, page plan
  architecture/ concepts/ features/ flows/ decisions/ gotchas/ howto/
  _templates/               # page.md + decision.md
.repolore/                  # the ONLY footprint outside the wiki — one hidden dir
  manifest.json             # wiki location + vendored-file hashes (for future safe updates)
  scripts/                  # vendored, stdlib-only (node + git), zero npm install
    wiki-check.mjs          # blob-SHA freshness: fresh / stale / unmanaged / malformed
    wiki-coverage.mjs       # in-scope files no page covers; --since <ref> new-page nudge
    wiki-stamp.mjs          # writes covers SHAs + generated_at_commit (never hand-compute)
    wiki-index.mjs          # regenerates index.md; --check for drift
    wiki-hook.mjs           # non-blocking post-commit nudge (always exit 0)
    wiki-install-hook.mjs   # one-liner per contributor; chains existing hooks

Pages are the product and always committed; tooling is regenerable; check state is never committed — freshness status is recomputed in under a second, so there is no status: field to dirty your tree or conflict in merges.

Why is tooling committed to my repo at all? Because the wiki's promise — anyone can verify any page's freshness in under a second — has to hold for people and pipelines with nothing installed: teammates without the plugin, CI on a bare checkout, other agents (Cursor, Copilot, Codex), and future maintainers if this project ever dies. Vendoring also keeps the schema doc, its parser, and your pages in version lockstep — a plugin-side checker auto-updates, and "fresh" must never change meaning under your repo overnight. The cost is ~30KB of dependency-free files and an occasional one-command, consent-gated /repolore:update. Full reasoning: ADR-006.

Design

The design is the synthesis of a multi-angle research pass over the auto-wiki landscape (DeepWiki, Google Code Wiki, mutable.ai, CodeWiki, RepoAgent…), agent memory-bank patterns (Cline et al.), doc-structure frameworks (Diátaxis, arc42, ADRs), packaging precedents (BMAD-METHOD, superpowers), and a production reference implementation that ran this system for months on a real integration platform. See docs/RESEARCH.md for the full report — landscape comparison, the five documented failure modes of auto-wikis, and the rationale behind every default.

Cost note: an agentic refresh run on a typical stale set costs on the order of $0.50–2.00 of tokens (published recipes); the deterministic checks are free and instant.

Testing

Two layers (see tests/README.md): deterministic node --test coverage of the bootstrap vendoring path (runs in CI), and an on-demand agentic workflow (.claude/workflows/test-init-ux.js) that executes /repolore:init end to end in three fixture repos, mechanically validates the end state, and grades the onboarding experience against a UX rubric.

Roadmap

The roadmap lives in docs/ROADMAP.md — what's in flight (flows v2: diff-scoped flow-refresh, the audit workflow + findings-inbox v2), the v0.5+ horizon (connector contract, llms.txt, Quartz site, per-harness emitters), and the full version history. Release notes per version are on the GitHub Releases page.

License

MIT

About

Code-derived LLM wiki for any repo — Claude Code plugin: cited claims, blob-SHA drift checking, coverage auditing, generated index

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors