Cross-language dependency intelligence for polyglot codebases.
Catch silent breaks across SQL → Python → TypeScript → React — before they reach production.
Most dev tools understand one language. Your codebase speaks four.
When you rename a database column, your ORM model breaks. The Pydantic serializer breaks. The TypeScript interface breaks. The React component breaks. None of these failures happen in the same file, or even in the same language — and no linter, type-checker, or IDE is watching across that full chain.
DepGraph.ai is. It builds a unified knowledge graph of your entire polyglot stack, maps every cross-language dependency edge, scores the blast radius of any change, and generates a safe, ordered migration plan with exact before/after diffs.
- Clone the repo and start the backend
- Enter `./sample_repo` in the path input and click Analyze
- Watch the 6-layer pipeline run live in the terminal
- Click `user_email` in the left sidebar
- See: 4 languages affected, severity = CRITICAL
- Open the Migrate tab → type `email` → get a complete diff plan across every file
WebGL-powered, force-directed graph built on react-force-graph-3d and Three.js. The engine is fully 3D (real sphere meshes, camera orbit, WebGL rendering) but a Z-axis flattening force keeps nodes in a layered 2D plane — giving you a clean, readable layout while retaining depth, glow effects, and smooth camera controls. Nodes are color-coded by layer (amber = SQL, violet = Python, sky = TypeScript/React). Cross-language edges are rendered with distinct colors per relationship type — MAPS_TO, SERIALIZES_TO, BREAKS_IF_RENAMED, and more.
Every node gets a severity score using the formula:
ImpactScore = weighted_dependents × api_multiplier × coverage_multiplier
Tiers: CRITICAL ≥ 8 · HIGH ≥ 4 · MEDIUM ≥ 1 · LOW < 1
Click any node to see its animated severity gauge, full downstream impact chain, per-hop confidence scores, and whether each dependent node will BREAK on rename.
Ask questions about your codebase in plain English. The chat engine retrieves relevant subgraph context, builds a focused prompt, and returns precise answers grounded in your actual code — with file names, line numbers, and relationship edges cited. Sessions are persisted and resumable.
You: "What breaks if I delete the user_email field?"
Bot: "user_email maps to User.email in models.py:34 via ORM_MAP (confidence 1.0),
which serializes to UserSerializer.email in serializers.py:18, which exposes
as userEmail in /api/users (TypeScript type UserResponse:12), which is
destructured in UserProfile.tsx:47 and SettingsView.tsx:93. All four will break."
Select a node, type a new name, and get a complete rename plan in seconds:
- Exact file paths and line numbers
- Old code vs. new code diff for every affected file
- Safe execution order: SQL → Python → TypeScript → React
- Apply directly to your local repo (in-place patch) or download as a `.zip`
The planner uses a deterministic AST-based diff first, then optionally enriches with an LLM pass to catch casing rules (snake_case → camelCase at language boundaries).
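As a rough illustration of the two ideas above (casing changes at language boundaries and a safe execution order), here is a minimal sketch. `Patch`, `snake_to_camel`, `order_patches`, and `LAYER_ORDER` are hypothetical names chosen for this example, not the planner's actual API.

```python
from dataclasses import dataclass

# Assumed ranking, taken from the "SQL -> Python -> TypeScript -> React" rule.
LAYER_ORDER = {"sql": 0, "python": 1, "typescript": 2, "react": 3}


@dataclass(frozen=True)
class Patch:
    """One planned edit (hypothetical shape, not DepGraph.ai's real model)."""
    file: str
    line: int
    layer: str
    old: str
    new: str


def snake_to_camel(name: str) -> str:
    """Convert snake_case to camelCase, e.g. at a Python -> TypeScript boundary."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def order_patches(patches: list[Patch]) -> list[Patch]:
    """Sort patches so upstream layers are changed before their dependents."""
    return sorted(patches, key=lambda p: (LAYER_ORDER[p.layer], p.file, p.line))
```

Sorting by layer first means the schema change lands before the code that reads it, which is the point of the ordered plan.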
See the full data journey of any symbol: from the database column it originates from, through every ORM field, serializer, API route, TypeScript type, and React prop it flows through — with transformation annotations at each layer boundary.
Block commits that introduce silent cross-language breaks:
python scripts/install_hooks.py /path/to/your/repo

On every git commit, the hook checks staged files against the knowledge graph. If any change carries a break_risk = HIGH or CRITICAL edge, the commit is blocked with an explanation. Override with --no-verify.
JWT-based authentication with per-user knowledge graphs. Each user gets their own isolated graph stored on disk — analyze multiple repos simultaneously with different accounts.
┌─────────────────────────────────────────────────────────────┐
│                         Source Code                         │
│                 .sql .py .ts .tsx .js .jsx                  │
└───────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  Layer 1: AST Parser    │  tree-sitter + sqlglot
           │  sql_parser.py          │  Extracts CodeNode trees
           │  python_parser.py       │  from every file in repo
           │  typescript_parser.py   │
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  Layer 2: Structural    │  ORM_MAP, CONVENTION_MAP
           │  Graph Builder          │  IMPORTS, CALLS edges
           │  graph/builder.py       │  via name-match + AST
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  Layer 3: Boundary      │  AXA Language Detector
           │  Zone Detector          │  (ASE 2024 technique)
           │  graph/boundary.py      │  Finds cross-lang nodes
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  Layer 4: LLM Semantic  │  GLM-4.7 via Featherless
           │  Resolution             │  Batch-annotates boundary
           │  graph/llm_resolver.py  │  nodes in ≤4 API calls
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  Layer 5: Knowledge     │  NetworkX DiGraph
           │  Graph Assembly         │  Persisted as JSON
           │  graph/pipeline.py      │  Per-user on disk
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  Layer 6: Query Engine  │  Fast BFS impact query
           │  query/engine.py        │  LLM narration
           │                         │  Migration planner
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  FastAPI Backend        │  REST + WebSocket
           │  backend/main.py        │  JWT auth
           │  :8000                  │  Per-user graph store
           └────────────┬────────────┘
                        │
                        ▼
           ┌─────────────────────────┐
           │  React Frontend         │  Vite + TypeScript
           │  :5173                  │  Framer Motion
           │                         │  react-force-graph-3d
           └─────────────────────────┘
- Python 3.11+
- Node.js 18+
- A Featherless.ai API key (free tier available) — used for LLM annotation and RAG chat. The tool degrades gracefully to structural-only mode if no key is set.
git clone https://github.com/your-username/depgraph.git
cd depgraph
pip install -r requirements.txt

# Create .env in the project root
cp .env.example .env

Edit .env:
FEATHERLESS_API_KEY=your_key_here
FEATHERLESS_BASE_URL=https://api.featherless.ai/v1
FEATHERLESS_MODEL=zai-org/GLM-4.7
# JWT secret — change this in production
JWT_SECRET=change_me_in_production

python -m uvicorn backend.main:app --reload --port 8000

The API will be live at http://localhost:8000. Open http://localhost:8000/docs for the auto-generated Swagger UI.
cd frontend
npm install
npm run dev

Open http://localhost:5173.
- Register an account on the login screen
- Paste a local path (e.g. `/home/you/projects/myapp`) or a GitHub URL (e.g. `https://github.com/org/repo`) into the input
- Click Analyze and watch the 6-layer pipeline run in real time
- Explore the 2.5D graph, click nodes, chat with the codebase, and generate migration plans
| Variable | Required | Default | Description |
|---|---|---|---|
| `FEATHERLESS_API_KEY` | No* | (none) | API key for LLM features. Without this, the tool runs in structural-only mode (no RAG chat, no LLM migration enrichment). |
| `FEATHERLESS_BASE_URL` | No | `https://api.featherless.ai/v1` | LLM API base. Any OpenAI-compatible endpoint works (OpenRouter, local vLLM, etc). |
| `FEATHERLESS_MODEL` | No | `zai-org/GLM-4.7` | Model to use for annotation and chat. |
| `JWT_SECRET` | Yes | (none) | Secret for signing JWT tokens. Use a strong random string in production. |
| `VITE_API_BASE_URL` | No | `http://localhost:8000` | Frontend env var. Set to your deployed backend URL for production builds. |
All endpoints require a Bearer <token> header except /api/auth/login and /api/auth/register.
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/auth/register` | Create a new user account |
| POST | `/api/auth/login` | Login and receive a JWT token |
| GET | `/api/auth/me` | Get current user info |
| POST | `/api/analyze?repo_path=` | Trigger full repo analysis. Accepts local path or GitHub URL. Returns immediately; streams progress via WebSocket. |
| GET | `/api/graph` | Full knowledge graph in React Flow node/edge format |
| GET | `/api/impact/{node_id}` | Fast BFS impact analysis for a node (no LLM, instant) |
| GET | `/api/narrate/{node_id}` | LLM-narrated plain-English explanation of the impact chain |
| POST | `/api/chat` | RAG chat: ask a question about the graph, optionally anchored to a node |
| POST | `/api/migrate` | Generate a cross-language rename migration plan |
| POST | `/api/migrate/apply` | Apply migration patches directly to the local repo |
| POST | `/api/migrate/download` | Download migration patches as a .zip |
| GET | `/api/chains` | List all traced variable chains across language layers |
| GET | `/api/routes` | List all detected API routes with their boundary metadata |
| GET | `/api/sections` | Node counts per layer (DATABASE / BACKEND / FRONTEND) and cross-section edge count |
| GET | `/api/git/impact?mode=staged` | Map staged git changes to affected graph nodes |
| GET | `/api/repo-path` | Get the current repo path for the authenticated user |
| POST | `/api/repo-path` | Set or override the repo path |
| GET | `/api/user/status` | Check if a graph exists for the current user |
| GET | `/api/health` | Health check |
| WS | `/ws/progress` | WebSocket stream for real-time analysis progress (percent + log lines) |
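For reference, an authenticated call might be built as in the sketch below. It assumes the token travels in the standard `Authorization: Bearer <token>` header; `build_request` is a hypothetical helper, and the request is only constructed, not sent.

```python
import urllib.request


def build_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# "<token>" stands in for the JWT returned by /api/auth/login.
req = build_request("http://localhost:8000", "/api/health", token="<token>")
```

Passing `req` to `urllib.request.urlopen` would perform the call once the backend is running.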
depgraph/
├── backend/
│   ├── main.py  # FastAPI app: all endpoints, user state, analysis runner
│   ├── core/
│   │   └── models.py  # CodeNode dataclass (the AST node representation)
│   ├── parsers/
│   │   ├── dispatcher.py  # parse_repo(), routes files to correct parser
│   │   ├── sql_parser.py  # sqlglot-based SQL schema parser
│   │   ├── python_parser.py  # tree-sitter Python parser
│   │   └── typescript_parser.py  # tree-sitter TypeScript/TSX parser
│   ├── graph/
│   │   ├── builder.py  # Constructs DiGraph from parsed file nodes
│   │   ├── boundary.py  # Detects cross-language boundary nodes
│   │   ├── llm_resolver.py  # Batched LLM semantic annotation
│   │   ├── pipeline.py  # Orchestrates all 6 layers; chain/route detection
│   │   ├── schema.py  # Knowledge graph vocabulary (node/edge types)
│   │   └── serializer.py  # save_graph() / load_graph() (NetworkX ↔ JSON)
│   ├── query/
│   │   └── engine.py  # get_impact(), generate_migration(), graph_rag_chat()
│   ├── git/
│   │   ├── cloner.py  # GitHub repo cloner with SHA-based cache
│   │   ├── diff_reader.py  # Reads staged/PR/last-commit diffs
│   │   └── pre_commit_hook.py  # Hook entrypoint: blocks high-risk commits
│   └── auth/
│       └── db.py  # SQLite user store, bcrypt passwords, JWT issuance
│
├── scripts/
│   └── install_hooks.py  # Installs pre-commit hook into any target git repo
│
├── sample_repo/  # Demo polyglot project (SQL + Python + TypeScript)
│   ├── schema.sql
│   └── services/
│       ├── auth_service.py
│       ├── models.py
│       └── schemas.py
│
├── frontend/
│   └── src/
│       ├── api/client.ts  # Typed API client, axios interceptors, all request functions
│       ├── context/
│       │   ├── AppContext.tsx  # Global state: graph, impact, chains, WebSocket stream
│       │   └── AuthContext.tsx  # JWT auth state, login/logout
│       ├── pages/
│       │   ├── LoginPage.tsx  # Auth screen
│       │   ├── SetupPage.tsx  # Repo connect screen
│       │   ├── AnalyzingPage.tsx  # 6-layer pipeline progress UI
│       │   └── MainApp.tsx  # Root layout
│       └── components/app/
│           ├── TopBar.tsx
│           ├── LeftSidebar.tsx  # File tree + node list + search
│           ├── GraphCanvas.tsx  # 2.5D WebGL graph (Three.js + react-force-graph-3d, Z-flattened)
│           ├── RightPanel.tsx  # Impact / Chat / Migrate / Chains tabs
│           ├── Terminal.tsx  # Collapsible log terminal
│           ├── VariableChain.tsx
│           └── tabs/
│               ├── ImpactTab.tsx  # Severity gauge + dependency chain timeline
│               ├── ChatTab.tsx  # RAG chat with markdown renderer + session history
│               └── MigrateTab.tsx  # Rename planner + apply/download
│
├── requirements.txt
├── .env.example
└── README.md
Every .sql, .py, .ts, .tsx, .js, and .jsx file in the repo is parsed into a tree of CodeNode objects. SQL uses sqlglot; Python and TypeScript use tree-sitter grammars. Each node captures: name, type, language, file, line range, source lines, and parent/child relationships.
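To make the node shape concrete, here is a hypothetical sketch of such a dataclass. The field names follow the list above, but the real `CodeNode` in `core/models.py` may differ; `add_child` and `walk` are illustrative helpers.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison avoids recursing through parent links
class CodeNode:
    """Hypothetical shape of a parsed AST node (assumed, not the project's class)."""
    name: str
    type: str       # e.g. "table", "column", "class", "interface"
    language: str   # e.g. "sql", "python", "typescript"
    file: str
    start_line: int
    end_line: int
    source_lines: list[str] = field(default_factory=list)
    parent: "CodeNode | None" = field(default=None, repr=False)
    children: list["CodeNode"] = field(default_factory=list)

    def add_child(self, child: "CodeNode") -> "CodeNode":
        """Attach a child node and set its parent link."""
        child.parent = self
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and every descendant, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()


table = CodeNode("users", "table", "sql", "schema.sql", 1, 5)
column = table.add_child(CodeNode("user_email", "column", "sql", "schema.sql", 3, 3))
```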
The GraphBuilder walks all parsed nodes and creates directed edges based on deterministic rules:
- `ORM_MAP`: Python field name matches SQL column name in snake_case
- `CONVENTION_MAP`: camelCase TypeScript prop matches snake_case Python field
- `IMPORTS`: static import analysis from AST
- `CALLS`: function call detection within the same language
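The two name-matching rules could look roughly like the sketch below. The `sql:`/`py:`/`ts:` node ID prefixes and the function names are assumptions made for this example; the real builder works on `CodeNode` objects rather than bare strings.

```python
import re


def camel_to_snake(name: str) -> str:
    """userEmail -> user_email."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def structural_edges(sql_columns, python_fields, ts_props):
    """Return (source, target, edge_type) tuples using exact name matching."""
    edges = []
    py = set(python_fields)
    # ORM_MAP: a Python field with the same snake_case name as a SQL column.
    for col in sql_columns:
        if col in py:
            edges.append((f"sql:{col}", f"py:{col}", "ORM_MAP"))
    # CONVENTION_MAP: a camelCase TypeScript prop whose snake_case form is a Python field.
    for prop in ts_props:
        snake = camel_to_snake(prop)
        if snake in py:
            edges.append((f"py:{snake}", f"ts:{prop}", "CONVENTION_MAP"))
    return edges


edges = structural_edges(
    sql_columns=["user_email", "id"],
    python_fields=["user_email", "created_at"],
    ts_props=["userEmail", "createdAt", "isAdmin"],
)
```

Because the rules are pure string comparisons, the same input always yields the same edges, which is what makes this layer deterministic.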
Each node is tested against a set of regex patterns (BOUNDARY_PATTERNS) tuned per language — Pydantic BaseModel, FastAPI route decorators, DRF Serializer, TypeScript interface *DTO, Zod schemas, Mongoose models, React prop access patterns. Nodes that match are flagged as is_boundary = True. These become the candidates for LLM enrichment in the next layer.
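A minimal sketch of this kind of per-language pattern test follows. The regexes are illustrative guesses covering a few of the cases named above; the project's actual `BOUNDARY_PATTERNS` are not reproduced here.

```python
import re

# Illustrative patterns only (assumed, not the project's real list).
BOUNDARY_PATTERNS = {
    "python": [
        re.compile(r"class\s+\w+\((?:\w+\.)?BaseModel\)"),     # Pydantic model
        re.compile(r"@\w+\.(?:get|post|put|patch|delete)\("),  # FastAPI route decorator
    ],
    "typescript": [
        re.compile(r"interface\s+\w+DTO\b"),                   # interface *DTO
        re.compile(r"\bz\.object\("),                          # Zod schema
    ],
}


def is_boundary(language: str, source: str) -> bool:
    """True if any pattern registered for the language matches the node's source."""
    return any(p.search(source) for p in BOUNDARY_PATTERNS.get(language, []))
```

Languages with no registered patterns simply never match, so plain SQL nodes are not flagged in this sketch.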
Boundary nodes are sent to GLM-4.7 via Featherless in batches, using at most 4 API calls in total. The LLM annotates:
- Semantic relationship type (`MAPS_TO`, `SERIALIZES_TO`, `FLOWS_TO`, etc.)
- Transformation type (`snake_to_camel`, `direct`, `alias`)
- Break risk per edge (`CRITICAL`, `HIGH`, `MEDIUM`, `LOW`)
- Confidence score (0.0–1.0)
If the LLM API is unavailable, the pipeline continues with structural-only edges — no crash, no silent failure.
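One way to honor a fixed call budget and still degrade gracefully is sketched below. `make_batches`, `annotate`, and the `call_llm` callback are hypothetical; the real resolver's batching strategy and error handling may differ.

```python
import math
from typing import Callable

MAX_CALLS = 4  # the call budget described above


def make_batches(nodes: list[str], max_calls: int = MAX_CALLS) -> list[list[str]]:
    """Split nodes into at most `max_calls` roughly equal batches."""
    if not nodes:
        return []
    size = math.ceil(len(nodes) / max_calls)
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def annotate(nodes: list[str], call_llm: Callable[[list[str]], dict]) -> dict:
    """Annotate boundary nodes; on an LLM error, stop and keep what we have."""
    annotations: dict = {}
    for batch in make_batches(nodes):
        try:
            annotations.update(call_llm(batch))
        except Exception as exc:  # network error, missing key, rate limit, ...
            # Report the failure instead of hiding it, then fall back.
            print(f"LLM unavailable, keeping structural-only edges: {exc}")
            break
    return annotations
```

Sizing each batch as `ceil(n / 4)` guarantees the loop never issues more than four calls, whatever the number of boundary nodes.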
All nodes and edges are merged into a NetworkX DiGraph and persisted to disk as depgraph_<username>.json. The graph also detects variable chains (multi-hop data flows with named transformations) and API routes (FastAPI/Express endpoints with their input/output types).
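The per-user persistence idea can be sketched with the standard library alone. This is a stand-in that uses plain dicts in a simple node-link layout; the project itself uses NetworkX and `serializer.py`, whose exact JSON format and function signatures are not shown here, so everything below is an assumption.

```python
import json
from pathlib import Path


def graph_path(data_dir: Path, username: str) -> Path:
    """Per-user file name, following the depgraph_<username>.json pattern."""
    return data_dir / f"depgraph_{username}.json"


def save_graph(path: Path, nodes: dict, edges: list[dict]) -> None:
    """Write nodes and edges in a simple node-link layout (assumed format)."""
    payload = {
        "nodes": [{"id": node_id, **attrs} for node_id, attrs in nodes.items()],
        "links": edges,
    }
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_graph(path: Path) -> tuple[dict, list[dict]]:
    """Inverse of save_graph()."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    nodes = {n["id"]: {k: v for k, v in n.items() if k != "id"} for n in payload["nodes"]}
    return nodes, payload["links"]
```

Keeping one file per username is what isolates each account's graph from the others.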
Three query modes on the frozen graph:
- Fast BFS (`get_impact`): instant, no LLM, pure graph traversal. Returns the full downstream chain with per-hop confidence and break risk.
- LLM Narration (`narrate_impact`): feeds the BFS result into a prompt and returns a developer-readable plain-English explanation.
- Graph RAG Chat (`graph_rag_chat`): retrieves a relevant subgraph neighborhood for the user's question, builds a grounded system prompt, and runs a conversational Q&A loop with session history.
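The first mode is a plain breadth-first traversal, which can be sketched as follows. This version uses an adjacency dict instead of a NetworkX graph, and the node IDs and edge attributes in the toy data are invented for illustration; only the function name `get_impact` comes from the project.

```python
from collections import deque


def get_impact(graph: dict[str, dict[str, dict]], start: str) -> list[dict]:
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}
    queue = deque([(start, 0)])
    chain = []
    while queue:
        node, depth = queue.popleft()
        for target, edge in graph.get(node, {}).items():
            if target in seen:  # already reached via a path that is no longer
                continue
            seen.add(target)
            chain.append({"node": target, "hops": depth + 1, **edge})
            queue.append((target, depth + 1))
    return chain


# Toy graph: {source: {target: edge attributes}}.
GRAPH = {
    "sql:users.user_email": {
        "py:User.email": {"type": "MAPS_TO", "confidence": 1.0, "break_risk": "CRITICAL"},
    },
    "py:User.email": {
        "py:UserSerializer.email": {"type": "SERIALIZES_TO", "confidence": 0.9, "break_risk": "HIGH"},
    },
    "py:UserSerializer.email": {
        "ts:UserResponse.userEmail": {"type": "FLOWS_TO", "confidence": 0.8, "break_risk": "HIGH"},
    },
}

impact = get_impact(GRAPH, "sql:users.user_email")
```

Each entry in `impact` carries the hop count plus that edge's own confidence and break risk, so no LLM call is needed to list what a change touches.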
ImpactScore = Σ(dependents × edge_weight) × api_multiplier × coverage_multiplier
Where:
- `edge_weight`: 3.0 for `BREAKS_IF_RENAMED`, 2.5 for `MAPS_TO`/`SERIALIZES_TO`, 1.5 for `FLOWS_TO`, 1.0 for `IMPORTS`
- `api_multiplier`: 2.0 if the node is exposed via a public API route, else 1.0
- `coverage_multiplier`: 1.5 if the affected nodes have no test coverage signals, else 1.0
Tier thresholds: CRITICAL ≥ 8 · HIGH ≥ 4 · MEDIUM ≥ 1 · LOW < 1
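A worked example of the formula, as a sketch: it reads the sum as one weight per dependent edge, which is an interpretation of the formula rather than a confirmed detail, and it rejects edge types that have no documented weight. `impact_score` and `tier` are hypothetical names.

```python
EDGE_WEIGHTS = {
    "BREAKS_IF_RENAMED": 3.0,
    "MAPS_TO": 2.5,
    "SERIALIZES_TO": 2.5,
    "FLOWS_TO": 1.5,
    "IMPORTS": 1.0,
}


def impact_score(edge_types: list[str], public_api: bool, has_tests: bool) -> float:
    """Sum the weights of dependent edges, then apply both multipliers."""
    # Raises KeyError for edge types with no documented weight.
    weighted_dependents = sum(EDGE_WEIGHTS[t] for t in edge_types)
    api_multiplier = 2.0 if public_api else 1.0
    coverage_multiplier = 1.0 if has_tests else 1.5
    return weighted_dependents * api_multiplier * coverage_multiplier


def tier(score: float) -> str:
    """Map a score to its severity tier."""
    if score >= 8:
        return "CRITICAL"
    if score >= 4:
        return "HIGH"
    if score >= 1:
        return "MEDIUM"
    return "LOW"


# Four dependents, exposed through a public API route, no test coverage:
# (2.5 + 2.5 + 3.0 + 3.0) * 2.0 * 1.5 = 33.0
score = impact_score(
    ["MAPS_TO", "SERIALIZES_TO", "BREAKS_IF_RENAMED", "BREAKS_IF_RENAMED"],
    public_api=True,
    has_tests=False,
)
```

Here `tier(score)` is `"CRITICAL"`, while a single tested `IMPORTS` dependent on an internal node scores 1.0 and lands in `MEDIUM`.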
# Install the hook into any git repo
python scripts/install_hooks.py /absolute/path/to/your/repo
# The hook runs automatically on every commit
git add models.py
git commit -m "rename user_email to email"
# → DepGraph.ai: checking cross-language impact...
# → BLOCKED: user_email has CRITICAL cross-language break risk
# → Affected: auth_service.py:34, serializers.py:18, UserProfile.tsx:47
# → Use git commit --no-verify to skip this check.

The hook uses git diff --cached to get staged files, maps them to graph nodes, and traverses downstream edges. Only commits with break_risk = HIGH or CRITICAL are blocked; low/medium risk changes pass through with a warning.
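The decision logic described above can be sketched as follows. `staged_files` wraps the real `git diff --cached --name-only` command; `check_commit` and its two lookup tables are hypothetical stand-ins for the knowledge-graph queries the actual hook performs.

```python
import subprocess

BLOCKING_RISKS = {"HIGH", "CRITICAL"}


def staged_files(repo: str = ".") -> list[str]:
    """List staged paths using `git diff --cached --name-only`."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def check_commit(files, nodes_by_file, downstream_risks):
    """Return (exit_code, messages); a non-zero exit code aborts the commit."""
    messages = []
    blocked = False
    for path in files:
        for node in nodes_by_file.get(path, []):
            for target, risk in downstream_risks.get(node, []):
                if risk in BLOCKING_RISKS:
                    blocked = True
                    messages.append(f"BLOCKED: {node} -> {target} ({risk})")
                else:
                    messages.append(f"warning: {node} -> {target} ({risk})")
    return (1 if blocked else 0), messages
```

Git runs the pre-commit hook before creating the commit and aborts if it exits non-zero, which is why returning 1 is enough to block.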
Backend
- FastAPI + Uvicorn — async REST + WebSocket API
- NetworkX — directed graph engine
- tree-sitter — language-agnostic AST parsing (Python, TypeScript, JavaScript)
- sqlglot — SQL dialect-agnostic schema parsing
- OpenAI SDK (pointed at Featherless) — LLM annotation and RAG
- GitPython — repo cloning and git operations
- PyJWT + passlib — auth
Frontend
- React 18 + TypeScript + Vite
- react-force-graph-3d + Three.js — 2.5D WebGL knowledge graph (3D engine, Z-axis flattened for layered layout)
- Framer Motion — animations
- shadcn/ui + Radix UI — component library
- TanStack Query — server state
- Axios — HTTP client with auth interceptors
- Tailwind CSS — styling
- Analysis time scales with repo size. Large repos (>500 files) may take 2–5 minutes for the full LLM annotation pass. The structural-only mode (no API key) is always fast.
- LLM accuracy varies. The structural graph is derived deterministically from the AST; LLM-enriched edges are best-effort. Confidence scores are shown on every edge.
- Apply / in-place patching requires that the analyzed repo path is set and accessible on the same machine as the backend. Remote-only usage should use the download-as-zip workflow instead.
- Python 2 syntax is not supported. tree-sitter Python grammar targets Python 3.
- Monorepos with complex build systems (nx, Turborepo workspaces) may need to be analyzed from the relevant sub-package rather than the workspace root.
Pull requests are welcome. For major changes, open an issue first to discuss what you'd like to change.
# Run backend tests
pytest
# Run frontend tests
cd frontend && npm test
# Lint
cd frontend && npm run lint

License: MIT