Pocket Intelligence

A note-taking app where every feature is LLM-native. Obsidian-compatible vault + voice agent + task delegation + git sync + diff review — all in one tab.

Inspired by Andrej Karpathy's tweet on LLM knowledge bases. Built because I couldn't find a single tool that unified Obsidian, ChatGPT, Claude Code, and GitHub into one workflow.

Deploy your own

Clicking the button provisions a Railway service running this repo's Dockerfile. You'll be prompted for the environment variables below before it boots.

Demos

Git-backed editing with diff review

file.edit.mp4

Create a note, see every edit land in a GitHub-style diff view, revert per file or push to your vault — all from the browser.

Voice Q&A grounded in your notes

voice.agent.qa.mp4

"What's actor-critic in RL?" — the voice agent searches your vault, makes a couple of tool calls to pull the relevant passages, then explains it back with clickable quote cards that jump to source.

Podcast mode for the commute

voice.agent.podcast.mp4

The voice agent reads your log.md, finds recently ingested topics, and synthesizes a podcast-style walk-through. Earbuds in, 20 minutes of review during the commute.

Automated knowledge ingest

manual.knowledge.ingest.mp4

Drop a paper into the uploader. A Claude Code skill parses it, writes a summary note, and files it in the right folder of your vault — no manual organizing. You can also just ask the voice agent to research a topic online and ingest the findings.

Knowledge graph

neo4j.mp4

Neo4j-style force-directed view of every note + every wiki-link between them. See where your thinking is dense vs thin, and spot islands worth connecting.

Features

Obsidian-compatible vault — markdown, wiki-links, backlinks, tags, callouts. Your notes are plain files in a git repo.
Git sync — point at any git repo, vault syncs across every device. GitHub-style diff review before anything writes back.
Intelligent sidebar — Cmd-K to search, ask questions, run commands. Answers are grounded in your notes via grep retrieval.
Voice agent — planner + OpenAI TTS. Walks you through recent notes podcast-style, answers with cited quote cards, handles interrupts. Karaoke transcript syncs words to audio playback.
Task delegation + cron — voice agent or command bar can queue Claude Code tasks ("rewrite my Kubernetes notes with the latest mTLS guidance") and schedule recurring work (daily digests, weekly cleanup).
Knowledge ingest — drop a PDF / DOCX / URL into the uploader; a Claude Code skill parses, summarizes, and files it into the right folder.
Knowledge graph — Neo4j-style force-directed view of every note + every wiki-link. Click a node to jump to the source.
Embedded Claude Code terminal — xterm.js piped through a server-side pty. Auto-approves tool permissions so work runs without interruption.
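
The grep-style retrieval behind the sidebar can be pictured roughly as follows. This is a minimal sketch, not the project's code: the `Note` and `Hit` types and the `grepNotes` function are hypothetical names, and it searches an in-memory list instead of shelling out to grep.

```typescript
// Hypothetical shapes for illustration; the real server works on files on disk.
interface Note {
  path: string;
  body: string;
}

interface Hit {
  path: string;
  line: number; // 1-based line number within the note
  text: string;
}

// Case-insensitive substring search, one hit per matching line (like `grep -in`).
function grepNotes(notes: Note[], query: string, maxHits = 20): Hit[] {
  const needle = query.toLowerCase();
  const hits: Hit[] = [];
  if (needle === "") return hits;
  for (const note of notes) {
    const lines = note.body.split("\n");
    for (let i = 0; i < lines.length; i++) {
      if (lines[i].toLowerCase().indexOf(needle) !== -1) {
        hits.push({ path: note.path, line: i + 1, text: lines[i].trim() });
        if (hits.length >= maxHits) return hits;
      }
    }
  }
  return hits;
}

const demoNotes: Note[] = [
  { path: "RL/wiki/actor-critic.md", body: "# Actor-Critic\nThe critic estimates value." },
  { path: "RL/wiki/ppo.md", body: "# PPO\nBuilds on actor-critic methods." },
];
const demoHits = grepNotes(demoNotes, "actor-critic");
```

Each hit carries a path and line number, which is the kind of information a quote card would need to jump back to its source.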

Environment variables

See .env.example for the full list with inline documentation. The short version:

Variable	Required	Purpose
`GOOGLE_CLIENT_ID` / `GOOGLE_CLIENT_SECRET`	Yes	Google OAuth
`ALLOWED_EMAIL`	Yes	Only this email can sign in
`ANTHROPIC_API_KEY`	Yes	Claude Code + text chat
`OPENAI_API_KEY`	Yes	STT + planner + TTS
`NOTES_REPO`	No	Git repo containing your vault
`GITHUB_TOKEN`	No	Credential for pushing to `NOTES_REPO`

Single-user by design: only the Google account whose address matches ALLOWED_EMAIL can sign in. Multi-tenant use is not supported.
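
A startup check for the required variables might look like the sketch below. The variable names come from the table above; the `missingEnvVars` function and the `Env` type are illustrative assumptions, not part of the repo.

```typescript
// Variable names come from the table above; the function itself is illustrative.
const REQUIRED_VARS = [
  "GOOGLE_CLIENT_ID",
  "GOOGLE_CLIENT_SECRET",
  "ALLOWED_EMAIL",
  "ANTHROPIC_API_KEY",
  "OPENAI_API_KEY",
];

type Env = { [name: string]: string | undefined };

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// In a Node process you would pass process.env and refuse to boot if the list is non-empty.
const exampleMissing = missingEnvVars({
  GOOGLE_CLIENT_ID: "id",
  GOOGLE_CLIENT_SECRET: "secret",
  ALLOWED_EMAIL: "you@example.com",
  ANTHROPIC_API_KEY: "",
});
```

Here `exampleMissing` lists `ANTHROPIC_API_KEY` (blank) and `OPENAI_API_KEY` (absent).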

Local development

# 1. Clone + install
git clone https://github.com/USER/REPO.git
cd REPO
npm install

# 2. Configure env
cp .env.example .env
# then edit .env with your credentials

# 3. Dev server with hot reload
npm run dev
# → http://localhost:3000

Prerequisites:

Node 20+
@anthropic-ai/claude-code installed globally: npm install -g @anthropic-ai/claude-code
git, grep, find available on PATH (standard on macOS / Linux)
Optional: libreoffice for in-browser PPTX/DOCX previews
Optional: poppler-utils for PDF asset handling

Production build:

npm run build      # compiles TypeScript to dist/
npm start          # runs the compiled server

Architecture

Browser (public/js/*)
  ├─ Voice pipeline       — mic capture → WS → server → STT → planner → TTS → audio playback + karaoke
  ├─ Notes browser        — tree, markdown rendering, tabs, TOC, inline edit, diff review
  ├─ Command bar          — Cmd-K, grounded Q&A, /commands, graph search
  ├─ Knowledge graph      — Neo4j-style force-directed canvas
  ├─ Tasks                — queued + running + history of Claude Code tasks, cron jobs
  └─ Terminal             — xterm.js embedded in the intel panel

Server (src/*)
  ├─ voice/               — STT WebSocket, TTS streaming, planner (GPT-5.4 + tool loop), session orchestration
  ├─ routes/              — /api/notes/*, /api/chat, /api/tasks, /api/cron
  ├─ pty.ts               — Claude Code pty + auto-onboarding
  ├─ ws-handler.ts        — WebSocket upgrade, message dispatch, fs watcher
  ├─ auth.ts              — Google OAuth + session cookies
  └─ task-events.ts       — task queue worker + cron scheduler wiring

Storage (mounted volume)
  └─ WORKSPACE_DIR
      ├─ notes/            — your vault (git-backed if NOTES_REPO is set)
      ├─ .home/            — Claude Code config + sessions
      ├─ .sessions.json    — logged-in session cookies
      ├─ .task-queue.json  — queued + completed tasks
      └─ .cron-jobs.json   — scheduled recurring work

Single Node process. No database — everything is plain files on a mounted volume.
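
To illustrate the "plain files instead of a database" idea, here is a minimal sketch of a task queue whose whole state round-trips through JSON. The `Task` and `QueueFile` shapes and both functions are hypothetical; the actual format of .task-queue.json is not documented here.

```typescript
// Hypothetical task shape; the real .task-queue.json format is not documented here.
type TaskStatus = "queued" | "running" | "done";

interface Task {
  id: number;
  prompt: string;
  status: TaskStatus;
}

interface QueueFile {
  nextId: number;
  tasks: Task[];
}

// Append a new queued task and return the updated queue (input is not mutated).
function enqueue(queue: QueueFile, prompt: string): QueueFile {
  const task: Task = { id: queue.nextId, prompt: prompt, status: "queued" };
  return { nextId: queue.nextId + 1, tasks: queue.tasks.concat([task]) };
}

// Mark the oldest queued task as running; if none is queued, the result is an equal copy.
function claimNext(queue: QueueFile): QueueFile {
  let claimed = false;
  const tasks: Task[] = queue.tasks.map((t) => {
    if (!claimed && t.status === "queued") {
      claimed = true;
      const running: Task = { id: t.id, prompt: t.prompt, status: "running" };
      return running;
    }
    return t;
  });
  return { nextId: queue.nextId, tasks: tasks };
}

// Persisting is one file write of `onDisk`; loading is one JSON.parse.
let queue: QueueFile = { nextId: 1, tasks: [] };
queue = enqueue(queue, "rewrite my Kubernetes notes");
queue = enqueue(queue, "daily digest");
queue = claimNext(queue);
const onDisk = JSON.stringify(queue, null, 2);
const reloaded: QueueFile = JSON.parse(onDisk);
```

Because the state is a single serializable object, a restart only needs to re-read the file to pick up where it left off.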

Google OAuth setup

1. Go to https://console.cloud.google.com/apis/credentials.
2. Create an OAuth 2.0 Client (type: Web application).
3. Under Authorized redirect URIs, add:
   - https://<your-railway-domain>/auth/callback (production)
   - http://localhost:3000/auth/callback (local dev)
4. Copy the Client ID + Client Secret into .env.
5. Set ALLOWED_EMAIL to the Google account you'll log in with.
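
The redirect URI and the single-user gate fit together roughly as sketched below. This is an illustration under assumptions, not the contents of auth.ts: `buildAuthUrl` and `isAllowedEmail` are hypothetical names, and the `openid email` scope is a guess at the minimum needed to learn the signed-in address.

```typescript
// Google's OAuth 2.0 authorization endpoint.
const GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth";

// Build the URL the browser is sent to at sign-in. `baseUrl` is your deployment's
// origin; the /auth/callback path must match an Authorized redirect URI exactly.
function buildAuthUrl(clientId: string, baseUrl: string, state: string): string {
  const params: { [key: string]: string } = {
    client_id: clientId,
    redirect_uri: baseUrl.replace(/\/+$/, "") + "/auth/callback",
    response_type: "code",
    scope: "openid email", // assumed scope
    state: state,
  };
  const query = Object.keys(params)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(params[k]))
    .join("&");
  return GOOGLE_AUTH_ENDPOINT + "?" + query;
}

// Single-user gate: compare the signed-in address with ALLOWED_EMAIL,
// ignoring case and surrounding whitespace.
function isAllowedEmail(email: string, allowedEmail: string): boolean {
  return email.trim().toLowerCase() === allowedEmail.trim().toLowerCase();
}

const authUrl = buildAuthUrl("my-client-id", "http://localhost:3000/", "random-state");
```

If the redirect URI sent here differs from what is registered in the console, even by a trailing slash, Google rejects the request with a redirect_uri_mismatch error, which is why the sketch normalizes `baseUrl` first.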

Git-synced vault

Create a private GitHub repo for your vault (or point at an existing Obsidian vault repo). Generate a personal access token with repo scope. Set:

NOTES_REPO=https://github.com/you/your-vault.git
GITHUB_TOKEN=ghp_...
GIT_USER_NAME=Your Name
GIT_USER_EMAIL=you@example.com

On boot, the server clones the repo to $WORKSPACE_DIR/notes and then auto-pulls every 5 minutes. The diff review UI lets you commit and push from the browser.
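
The clone-then-pull behavior can be sketched as a small decision function. Everything here is an assumption for illustration: `planSync` and `SyncConfig` are hypothetical names, the use of `--ff-only` is a guess, and how the server supplies GITHUB_TOKEN to git is not shown.

```typescript
// Five minutes, matching the auto-pull interval described above.
const PULL_INTERVAL_MS = 5 * 60 * 1000;

interface SyncConfig {
  notesRepo?: string; // value of NOTES_REPO, if set
  workspaceDir: string; // value of WORKSPACE_DIR
}

// Decide which git command to run. Returns null when no repo is configured,
// in which case the vault is just a local folder.
function planSync(config: SyncConfig, alreadyCloned: boolean): string[] | null {
  const repo = config.notesRepo;
  if (!repo) return null;
  const notesDir = config.workspaceDir.replace(/\/+$/, "") + "/notes";
  if (!alreadyCloned) {
    return ["git", "clone", repo, notesDir];
  }
  // --ff-only is an assumption: it makes the pull fail instead of creating a merge commit.
  return ["git", "-C", notesDir, "pull", "--ff-only"];
}

const cfg: SyncConfig = {
  notesRepo: "https://github.com/you/your-vault.git",
  workspaceDir: "/data/",
};
const bootCommand = planSync(cfg, false);
const timerCommand = planSync(cfg, true);
```

A server could run `bootCommand` once at startup and then run `timerCommand` on a timer using `PULL_INTERVAL_MS`.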

How your vault needs to be structured

The voice agent's planner and the ingest skill depend on a specific layout. If your vault doesn't follow it, the agent will still work (search / Q&A are just grep) but ingest quality, podcast mode, and "Daily Briefing" will degrade. The layout is an Obsidian-style three-layer design:

Raw sources (_raw/) — immutable source material. Papers, PDFs, clipped articles. The ingest skill reads from here.
Wiki (<Topic>/wiki/) — LLM-generated markdown articles. Concept pages, paper summaries, comparisons. This is what the voice agent searches and cites.
Schema (CLAUDE.md at the vault root) — operating instructions the ingest skill reads first, every time. Defines conventions and workflows.

Required at the vault root:

File / folder	Role
`CLAUDE.md`	Schema — first thing the ingest skill reads
`index.md`	Master index of all knowledge areas
`log.md`	Reverse-chronological log of every ingest / update. The voice agent reads this for podcast + daily briefing modes
`_raw/<topic>/`	Where source files live
`<Topic>/index.md`	One-line summary of every page in the topic
`<Topic>/wiki/...`	The actual wiki articles
`_templates/` (optional)	Templates for new articles and topics
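
A quick way to check a vault against the table above is sketched below. The `checkVaultLayout` function is hypothetical and only covers the three root files plus the per-topic index.md; it takes a list of vault-relative paths, however you choose to collect them.

```typescript
// Root files the table above lists as required.
const REQUIRED_ROOT_FILES = ["CLAUDE.md", "index.md", "log.md"];

// `paths` are vault-relative file paths using "/" separators, e.g. from a directory walk.
function checkVaultLayout(paths: string[]): string[] {
  const problems: string[] = [];
  for (const file of REQUIRED_ROOT_FILES) {
    if (paths.indexOf(file) === -1) problems.push("missing " + file);
  }
  // Every topic folder that has wiki pages should also have its own index.md.
  // Folders starting with "_" or "." (such as _raw or .git) are not topics.
  const topics: { [name: string]: boolean } = {};
  for (const p of paths) {
    const m = /^([^_.\/][^\/]*)\/wiki\//.exec(p);
    if (m) topics[m[1]] = true;
  }
  for (const topic of Object.keys(topics)) {
    if (paths.indexOf(topic + "/index.md") === -1) {
      problems.push("missing " + topic + "/index.md");
    }
  }
  return problems;
}

const layoutProblems = checkVaultLayout([
  "CLAUDE.md",
  "index.md",
  "RL/index.md",
  "RL/wiki/actor-critic.md",
  "K8s/wiki/mtls.md",
  "_raw/rl/paper.pdf",
]);
```

For this sample, `layoutProblems` reports the missing log.md and the missing K8s/index.md.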

Starter files are in docs/:

docs/vault-schema.md — copy to your vault root as CLAUDE.md. Defines layer boundaries, YAML frontmatter format, linking conventions, tag hierarchy, and operation workflows.
docs/templates/new-article.md — copy to _templates/new-article.md in your vault. The ingest skill uses it to seed new pages.
docs/templates/new-topic.md — copy to _templates/new-topic.md in your vault. Used when bootstrapping a new knowledge area.

Bootstrap an empty vault:

cd <your-vault>
mkdir -p _raw _templates
# -f makes curl fail on an HTTP error instead of saving the error page as your file
curl -fsSL -o CLAUDE.md https://raw.githubusercontent.com/xd00099/pocket-intelligence/main/docs/vault-schema.md
curl -fsSL -o _templates/new-article.md https://raw.githubusercontent.com/xd00099/pocket-intelligence/main/docs/templates/new-article.md
curl -fsSL -o _templates/new-topic.md https://raw.githubusercontent.com/xd00099/pocket-intelligence/main/docs/templates/new-topic.md
echo "# Knowledge Base" > index.md
echo "# Log" > log.md

After that, list your knowledge areas at the bottom of CLAUDE.md (the "Current Knowledge Areas" section) so the ingest skill knows where to file new sources. Everything else the LLM maintains for you.

What each piece actually drives

The planner prompt references log.md + topic index.md → if those exist, "Daily Briefing" and "Podcast" modes know what you've been reading lately.
The ingest skill reads CLAUDE.md at every ingest → it files new PDFs into the right _raw/<topic>/ path and writes summaries into the right <Topic>/wiki/ path.
YAML frontmatter (tags, sources, updated) → what makes the knowledge graph meaningful and what the voice agent cites as quote cards.
Obsidian [[wikilinks]] → the edges in the knowledge graph visualization and the backlinks rendered under every note.

You can deviate — edit CLAUDE.md to describe your own conventions. But whatever layout you pick must be documented there, because every ingest starts by reading it.
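
As an illustration of how wikilinks become graph edges, here is a minimal sketch. It is not the project's parser: `extractWikilinks` and `buildEdges` are hypothetical names, and the regex handles only the common `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` forms.

```typescript
interface Edge {
  from: string;
  to: string;
}

// Extract distinct link targets from [[Target]], [[Target|alias]] and [[Target#heading]].
function extractWikilinks(markdown: string): string[] {
  const pattern = /\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]/g;
  const targets: string[] = [];
  let match = pattern.exec(markdown);
  while (match !== null) {
    const target = match[1].trim();
    if (target !== "" && targets.indexOf(target) === -1) targets.push(target);
    match = pattern.exec(markdown);
  }
  return targets;
}

// One edge per (note, distinct target); `name` is the note title without ".md".
function buildEdges(notes: { name: string; body: string }[]): Edge[] {
  const edges: Edge[] = [];
  for (const note of notes) {
    for (const to of extractWikilinks(note.body)) {
      edges.push({ from: note.name, to: to });
    }
  }
  return edges;
}

const graphEdges = buildEdges([
  { name: "PPO", body: "Builds on [[Actor-Critic|actor-critic]] and [[Policy Gradient#Variance]]." },
  { name: "Actor-Critic", body: "See [[Policy Gradient]] and again [[Policy Gradient]]." },
]);
```

Reversing the same edge list (grouping by `to` instead of `from`) gives the backlinks for a note.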

License

MIT
