A note-taking app where every feature is LLM-native. Obsidian-compatible vault + voice agent + task delegation + git sync + diff review — all in one tab.
Inspired by Andrej Karpathy's tweet on LLM knowledge bases. Built because I couldn't find a single tool that unified Obsidian, ChatGPT, Claude Code, and GitHub into one workflow.
Clicking the button provisions a Railway service running this repo's Dockerfile. You'll be prompted for the environment variables below before it boots.
file.edit.mp4
Create a note, see every edit land in a GitHub-style diff view, revert per file or push to your vault — all from the browser.
voice.agent.qa.mp4
"What's actor-critic in RL?" — the voice agent searches your vault, makes a couple of tool calls to pull the relevant passages, then explains it back with clickable quote cards that jump to source.
voice.agent.podcast.mp4
The voice agent reads your log.md, finds recently ingested topics, and synthesizes a podcast-style walk-through. Earbuds in, 20 minutes of review during the commute.
manual.knowledge.ingest.mp4
Drop a paper into the uploader. A Claude Code skill parses it, writes a summary note, and files it in the right folder of your vault — no manual organizing. You can also just ask the voice agent to research a topic online and ingest the findings.
neo4j.mp4
Neo4j-style force-directed view of every note + every wiki-link between them. See where your thinking is dense vs thin, and spot islands worth connecting.
- Obsidian-compatible vault — markdown, wiki-links, backlinks, tags, callouts. Your notes are plain files in a git repo.
- Git sync — point at any git repo, vault syncs across every device. GitHub-style diff review before anything writes back.
- Intelligent sidebar: `Cmd-K` to search, ask questions, run commands. Answers are grounded in your notes via grep retrieval.
- Voice agent: planner + OpenAI TTS. Walks you through recent notes podcast-style, answers with cited quote cards, handles interrupts. Karaoke transcript syncs words to audio playback.
- Task delegation + cron — voice agent or command bar can queue Claude Code tasks ("rewrite my Kubernetes notes with the latest mTLS guidance") and schedule recurring work (daily digests, weekly cleanup).
- Knowledge ingest — drop a PDF / DOCX / URL into the uploader; a Claude Code skill parses, summarizes, and files it into the right folder.
- Knowledge graph — Neo4j-style force-directed view of every note + every wiki-link. Click a node to jump to the source.
- Embedded Claude Code terminal — xterm.js piped through a server-side pty. Auto-approves tool permissions so work runs without interruption.
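To make "grounded via grep retrieval" concrete, here is a minimal sketch of a line-oriented, case-insensitive search over notes held in memory. The `Hit` shape and the `grepNotes` name are hypothetical and not taken from this repo; the real sidebar may shell out to `grep` instead.

```typescript
// Hypothetical shape for one retrieval hit (not from the repo).
interface Hit {
  path: string;
  line: number; // 1-based, like `grep -n`
  text: string;
}

// Case-insensitive substring match per line, similar to `grep -n -i`.
function grepNotes(notes: Record<string, string>, query: string): Hit[] {
  const needle = query.toLowerCase();
  const hits: Hit[] = [];
  for (const [path, body] of Object.entries(notes)) {
    body.split("\n").forEach((text, i) => {
      if (text.toLowerCase().includes(needle)) {
        hits.push({ path, line: i + 1, text: text.trim() });
      }
    });
  }
  return hits;
}

const demoHits = grepNotes(
  { "RL/wiki/actor-critic.md": "# Actor-Critic\nThe critic estimates value." },
  "critic",
);
console.log(demoHits.length); // 2
```

Each hit carries a path and line number, which is the kind of information a quote card would need to jump back to its source.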
See .env.example for the full list with inline documentation. The short version:
| Variable | Required | Purpose |
|---|---|---|
| `GOOGLE_CLIENT_ID` / `GOOGLE_CLIENT_SECRET` | Yes | Google OAuth |
| `ALLOWED_EMAIL` | Yes | Only this email can sign in |
| `ANTHROPIC_API_KEY` | Yes | Claude Code + text chat |
| `OPENAI_API_KEY` | Yes | STT + planner + TTS |
| `NOTES_REPO` | No | Git repo containing your vault |
| `GITHUB_TOKEN` | No | Credential for pushing to `NOTES_REPO` |
Single-user by design — ALLOWED_EMAIL is the only account that can sign in. Multi-tenant is not supported.
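As a rough illustration of what the required variables and the single-user rule imply, the sketch below checks an environment object for missing values and gates sign-in on `ALLOWED_EMAIL`. The helper names are hypothetical, and the case-insensitive comparison is an assumption; the actual checks in `auth.ts` may differ.

```typescript
// Variable names marked "Yes" in the table above.
const REQUIRED_VARS = [
  "GOOGLE_CLIENT_ID",
  "GOOGLE_CLIENT_SECRET",
  "ALLOWED_EMAIL",
  "ANTHROPIC_API_KEY",
  "OPENAI_API_KEY",
];

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => !(env[name] ?? "").trim());
}

// Single-user gate: only the configured address passes.
// Assumption: comparison ignores case and surrounding whitespace.
function isAllowedEmail(env: Env, email: string): boolean {
  const allowed = (env["ALLOWED_EMAIL"] ?? "").trim().toLowerCase();
  return allowed !== "" && email.trim().toLowerCase() === allowed;
}
```

In a Node process you would pass `process.env` as the `env` argument and refuse to boot if `missingVars` returns anything.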
```bash
# 1. Clone + install
git clone https://github.com/USER/REPO.git
cd REPO
npm install

# 2. Configure env
cp .env.example .env
# then edit .env with your credentials

# 3. Dev server with hot reload
npm run dev
# → http://localhost:3000
```

Prerequisites:
- Node 20+
- `@anthropic-ai/claude-code` installed globally: `npm install -g @anthropic-ai/claude-code`
- `git`, `grep`, `find` available on `PATH` (standard on macOS / Linux)
- Optional: `libreoffice` for in-browser PPTX/DOCX previews
- Optional: `poppler-utils` for PDF asset handling
Production build:
```bash
npm run build    # compiles TypeScript to dist/
npm start        # runs the compiled server
```

Browser (public/js/*)
├─ Voice pipeline — mic capture → WS → server → STT → planner → TTS → audio playback + karaoke
├─ Notes browser — tree, markdown rendering, tabs, TOC, inline edit, diff review
├─ Command bar — Cmd-K, grounded Q&A, /commands, graph search
├─ Knowledge graph — Neo4j-style force-directed canvas
├─ Tasks — queued + running + history of Claude Code tasks, cron jobs
└─ Terminal — xterm.js embedded in the intel panel
Server (src/*)
├─ voice/ — STT WebSocket, TTS streaming, planner (GPT-5.4 + tool loop), session orchestration
├─ routes/ — /api/notes/*, /api/chat, /api/tasks, /api/cron
├─ pty.ts — Claude Code pty + auto-onboarding
├─ ws-handler.ts — WebSocket upgrade, message dispatch, fs watcher
├─ auth.ts — Google OAuth + session cookies
└─ task-events.ts — task queue worker + cron scheduler wiring
Storage (mounted volume)
└─ WORKSPACE_DIR
├─ notes/ — your vault (git-backed if NOTES_REPO is set)
├─ .home/ — Claude Code config + sessions
├─ .sessions.json — logged-in session cookies
├─ .task-queue.json — queued + completed tasks
└─ .cron-jobs.json — scheduled recurring work
Single Node process. No database — everything is plain files on a mounted volume.
- Go to https://console.cloud.google.com/apis/credentials.
- Create an OAuth 2.0 Client (type: Web application).
- Under Authorized redirect URIs, add:
  - `https://<your-railway-domain>/auth/callback` (production)
  - `http://localhost:3000/auth/callback` (local dev)
- Copy the Client ID + Client Secret into `.env`.
- Set `ALLOWED_EMAIL` to the Google account you'll log in with.
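Google compares the redirect URI sent during sign-in against the registered values exactly, so a stray trailing slash on the base URL is enough to break login. The sketch below shows one way to derive the callback URL from an origin; the `/auth/callback` path comes from the steps above, while the `redirectUri` helper is hypothetical.

```typescript
// Builds the OAuth callback URL for a given origin.
// Trailing slashes are dropped so the result matches the registered URI.
function redirectUri(origin: string): string {
  return origin.replace(/\/+$/, "") + "/auth/callback";
}

console.log(redirectUri("http://localhost:3000/")); // http://localhost:3000/auth/callback
```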
Create a private GitHub repo for your vault (or point at an existing Obsidian vault repo). Generate a personal access token with repo scope. Set:
```bash
NOTES_REPO=https://github.com/you/your-vault.git
GITHUB_TOKEN=ghp_...
GIT_USER_NAME=Your Name
GIT_USER_EMAIL=you@example.com
```

On boot, the server clones the repo to `$WORKSPACE_DIR/notes`, auto-pulls every 5 min, and the diff review UI lets you commit + push from the browser.
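How the server hands `GITHUB_TOKEN` to git is not described here. One common approach, shown purely as an assumption, is to embed the token in the HTTPS remote URL; a git credential helper would be an equally valid alternative. The `authenticatedRemote` name and the `x-access-token` username are illustrative choices, not confirmed behaviour of this repo.

```typescript
// Assumption: token-in-URL authentication for HTTPS remotes.
// Non-HTTPS remotes (for example SSH) and missing tokens are returned unchanged.
function authenticatedRemote(repoUrl: string, token?: string): string {
  const scheme = "https://";
  if (!token || !repoUrl.startsWith(scheme)) return repoUrl;
  return scheme + "x-access-token:" + encodeURIComponent(token) + "@" + repoUrl.slice(scheme.length);
}

// Pull cadence stated above, expressed in milliseconds.
const PULL_INTERVAL_MS = 5 * 60 * 1000;
```

If you use this pattern yourself, keep in mind that the token ends up in `.git/config` of the clone, so the workspace volume should be treated as secret.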
The voice agent's planner and the ingest skill depend on a specific layout. If your vault doesn't follow it, the agent will still work (search / Q&A are just grep) but ingest quality, podcast mode, and "Daily Briefing" will degrade. The layout is an Obsidian-style three-layer design:
- Raw sources (`_raw/`): immutable source material. Papers, PDFs, clipped articles. The ingest skill reads from here.
- Wiki (`<Topic>/wiki/`): LLM-generated markdown articles. Concept pages, paper summaries, comparisons. This is what the voice agent searches and cites.
- Schema (`CLAUDE.md` at the vault root): operating instructions the ingest skill reads first, every time. Defines conventions and workflows.
Required at the vault root:
| File / folder | Role |
|---|---|
| `CLAUDE.md` | Schema: first thing the ingest skill reads |
| `index.md` | Master index of all knowledge areas |
| `log.md` | Reverse-chronological log of every ingest / update. The voice agent reads this for podcast + daily briefing modes |
| `_raw/<topic>/` | Where source files live |
| `<Topic>/index.md` | One-line summary of every page in the topic |
| `<Topic>/wiki/...` | The actual wiki articles |
| `_templates/` (optional) | Templates for new articles and topics |
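A quick way to see whether a vault satisfies the root-level part of this table is to check a path listing against the required entries. The sketch below is hypothetical and not part of the repo; it takes vault-relative paths with forward slashes and skips the per-topic entries, since topic names vary per vault.

```typescript
// Root-level entries from the table above. `_templates/` is optional and omitted.
const REQUIRED_ROOT = ["CLAUDE.md", "index.md", "log.md", "_raw/"];

// Returns the required root entries that are absent from `paths`.
// A directory entry counts as present if any path lives under it.
function missingRootEntries(paths: string[]): string[] {
  const present = new Set(paths);
  return REQUIRED_ROOT.filter((entry) => {
    if (present.has(entry)) return false;
    return !(entry.endsWith("/") && paths.some((p) => p.startsWith(entry)));
  });
}

console.log(missingRootEntries(["CLAUDE.md", "index.md", "_raw/rl/paper.pdf"])); // [ 'log.md' ]
```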
Starter files are in docs/:
- `docs/vault-schema.md`: copy to your vault root as `CLAUDE.md`. Defines layer boundaries, YAML frontmatter format, linking conventions, tag hierarchy, and operation workflows.
- `docs/templates/new-article.md`: copy to `_templates/new-article.md` in your vault. The ingest skill uses it to seed new pages.
- `docs/templates/new-topic.md`: copy to `_templates/new-topic.md` in your vault. Used when bootstrapping a new knowledge area.
Bootstrap an empty vault:
```bash
cd <your-vault>
mkdir -p _raw _templates
curl -o CLAUDE.md https://raw.githubusercontent.com/xd00099/pocket-intelligence/main/docs/vault-schema.md
curl -o _templates/new-article.md https://raw.githubusercontent.com/xd00099/pocket-intelligence/main/docs/templates/new-article.md
curl -o _templates/new-topic.md https://raw.githubusercontent.com/xd00099/pocket-intelligence/main/docs/templates/new-topic.md
echo "# Knowledge Base" > index.md
echo "# Log" > log.md
```

After that, list your knowledge areas at the bottom of `CLAUDE.md` (the "Current Knowledge Areas" section) so the ingest skill knows where to file new sources. Everything else the LLM maintains for you.
- The planner prompt references `log.md` + topic `index.md` → if those exist, "Daily Briefing" and "Podcast" modes know what you've been reading lately.
- The `ingest` skill reads `CLAUDE.md` at every ingest → it files new PDFs into the right `_raw/<topic>/` path and writes summaries into the right `<Topic>/wiki/` path.
- YAML frontmatter (`tags`, `sources`, `updated`) → what makes the knowledge graph meaningful and what the voice agent cites as quote cards.
- Obsidian `[[wikilinks]]` → the edges in the knowledge graph visualization and the backlinks rendered under every note.
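To illustrate how wiki-links become graph edges, here is a minimal sketch that extracts link targets from note bodies and pairs them with the linking note. It handles the Obsidian forms `[[Target]]`, `[[Target|alias]]` and `[[Target#heading]]`; the function names are hypothetical and the app's own parser may treat edge cases (such as `![[embeds]]`, which this regex also matches) differently.

```typescript
// Extracts wiki-link targets, dropping any "#heading" or "|alias" suffix.
function wikiLinks(markdown: string): string[] {
  const targets: string[] = [];
  const pattern = /\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    targets.push(match[1].trim());
  }
  return targets;
}

// Builds directed edges (linking note -> linked note) for a graph view.
function graphEdges(notes: Record<string, string>): Array<[string, string]> {
  const edges: Array<[string, string]> = [];
  for (const [name, body] of Object.entries(notes)) {
    for (const target of wikiLinks(body)) edges.push([name, target]);
  }
  return edges;
}

console.log(wikiLinks("See [[PPO#Clipping]] and [[Actor-Critic|AC]].")); // [ 'PPO', 'Actor-Critic' ]
```

Inverting the edge list (grouping by target instead of source) gives the backlinks shown under each note.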
You can deviate — edit CLAUDE.md to describe your own conventions. But whatever layout you pick must be documented there, because every ingest starts by reading it.