
  ██████╗ ██╗       █████╗  ██╗   ██╗ ██████╗  ███████╗
 ██╔════╝ ██║      ██╔══██╗ ██║   ██║ ██╔══██╗ ██╔════╝
 ██║      ██║      ███████║ ██║   ██║ ██║  ██║ █████╗
 ██║      ██║      ██╔══██║ ██║   ██║ ██║  ██║ ██╔══╝
 ╚██████╗ ███████╗ ██║  ██║ ╚██████╔╝ ██████╔╝ ███████╗
  ╚═════╝ ╚══════╝ ╚═╝  ╚═╝  ╚═════╝  ╚═════╝  ╚══════╝

  ██████╗  ██████╗  ███╗   ██╗ ████████╗ ███████╗ ██╗  ██╗ ████████╗
 ██╔════╝ ██╔═══██╗ ████╗  ██║ ╚══██╔══╝ ██╔════╝ ╚██╗██╔╝ ╚══██╔══╝
 ██║      ██║   ██║ ██╔██╗ ██║    ██║    █████╗    ╚███╔╝     ██║
 ██║      ██║   ██║ ██║╚██╗██║    ██║    ██╔══╝    ██╔██╗     ██║
 ╚██████╗ ╚██████╔╝ ██║ ╚████║    ██║    ███████╗ ██╔╝ ██╗    ██║
  ╚═════╝  ╚═════╝  ╚═╝  ╚═══╝    ╚═╝    ╚══════╝ ╚═╝  ╚═╝    ╚═╝

 ██╗       ██████╗   ██████╗  █████╗  ██╗
 ██║      ██╔═══██╗ ██╔════╝ ██╔══██╗ ██║
 ██║      ██║   ██║ ██║      ███████║ ██║
 ██║      ██║   ██║ ██║      ██╔══██║ ██║
 ███████╗ ╚██████╔╝ ╚██████╗ ██║  ██║ ███████╗
 ╚══════╝  ╚═════╝   ╚═════╝ ╚═╝  ╚═╝ ╚══════╝

General-Purpose Semantic Code Search for Windows. Advanced hybrid search that combines semantic understanding with text matching, running 100% locally using EmbeddingGemma or BGE-M3. No API keys, no costs, your code never leaves your machine.

  • πŸ” Hybrid search: BM25 + semantic for best accuracy (44.4% precision, 100% MRR)
  • πŸ“ˆ Optimized search efficiency with sub-second response times (162-487ms)
  • πŸ”’ 100% local - completely private
  • πŸ’° Zero API costs - forever free
  • ⚑ 5-10x faster indexing with incremental updates
  • πŸͺŸ Windows-optimized for maximum performance and compatibility
  • πŸ”„ Instant model switching (<150ms) with per-model index storage
  • πŸ› οΈ 13 MCP tools for Claude Code integration (search, index, configure)

An intelligent code search system that uses Google's EmbeddingGemma or BAAI's BGE-M3 models and advanced multi-language chunking to provide semantic search capabilities across 22 file extensions and 11 programming languages, integrated with Claude Code via MCP (Model Context Protocol).

Status

  • 🚧 Active Development: This project is under active development. Some functionality may change as we continue to improve the system.
  • Core functionality fully operational
  • Windows-optimized installation with automated setup
  • All search modes working (semantic, BM25, hybrid)
  • Please report any issues!

Demo

Demo of local semantic code search

Features

πŸ” Advanced Search Capabilities

  • Hybrid search: BM25 + Semantic fusion combines text matching with semantic understanding
  • Three search modes: Semantic, BM25 text-based, and hybrid with RRF reranking
  • Proven search quality: 44.4% precision, 46.7% F1-score, 100% MRR (see benchmarks)
  • Sub-second performance: 162-487ms response times across all search modes
  • Configurable weights: Tune balance between text and semantic search
  • Auto-mode detection: System automatically chooses best search strategy
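
A minimal sketch of the fusion step, assuming rank-based RRF over the two result lists (function and variable names here are illustrative, not the project's internals):

```python
# Illustrative Reciprocal Rank Fusion (RRF) over two ranked result lists.
# The 0.4/0.6 defaults mirror the configurable BM25/dense weights above.
def rrf_fuse(bm25_ranking, dense_ranking, bm25_weight=0.4, dense_weight=0.6, k=60):
    """Each argument is a list of chunk ids, best match first."""
    scores = {}
    for weight, ranking in ((bm25_weight, bm25_ranking), (dense_weight, dense_ranking)):
        for rank, chunk_id in enumerate(ranking):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# A chunk ranked well by both retrievers rises to the top.
print(rrf_fuse(["auth.py:10", "db.py:3"], ["auth.py:10", "jwt.py:7"]))
```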

🚀 Core Features

  • Multi-language support: 11 programming languages with 22 file extensions
  • Intelligent chunking: AST-based (Python) + tree-sitter (JS/TS/JSX/TSX/Svelte/Go/Java/Rust/C/C++/C#/GLSL)
  • Semantic search: Natural language queries to find code across all languages
  • Rich metadata: File paths, folder structure, semantic tags, language-specific info
  • MCP integration: 13 tools for Claude Code - search, index, configure, and monitor
  • Local processing: All embeddings stored locally, no API calls required
  • Fast search: FAISS for efficient similarity search with GPU acceleration support
  • Incremental indexing: 5-10x faster updates with Merkle tree change detection
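
A simplified sketch of the change-detection idea behind incremental indexing (hypothetical helper names; the real logic lives in the merkle/ package): hash every file, diff against the previous snapshot, and re-embed only what changed.

```python
import hashlib
import json
from pathlib import Path

def snapshot(root: Path) -> dict:
    """Map each source file to a content hash (the leaf level of a Merkle DAG)."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in root.rglob("*.py")}

def changed_files(old: dict, new: dict) -> list:
    """Files added or modified since the previous snapshot."""
    return [path for path, digest in new.items() if old.get(path) != digest]

state = Path("snapshot.json")
old = json.loads(state.read_text()) if state.exists() else {}
new = snapshot(Path("."))
print(changed_files(old, new))  # only these need re-chunking and re-embedding
state.write_text(json.dumps(new))
```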

Why this

Claude's code context is powerful, but sending your code to the cloud costs tokens and raises privacy concerns. This project keeps semantic code search entirely on your machine. It integrates with Claude Code via MCP, so you keep the same workflow, just faster, cheaper, and private.

Requirements

  • Python 3.11+ (tested with Python 3.11 and 3.12)
  • RAM: 4GB minimum (8GB+ recommended for large codebases)
  • Disk: 2-4GB free space (model cache + embeddings + indexes)
    • EmbeddingGemma: ~1.2GB
    • BGE-M3: ~2.2GB (optional upgrade)
  • Windows: Windows 10/11 with PowerShell
  • PyTorch: 2.6.0+ (automatically installed)
    • Required for BGE-M3 model support
    • Includes security fixes
  • Optional GPU: NVIDIA GPU with CUDA 11.8/12.4/12.6 for accelerated indexing (8.6x faster)
    • PyTorch 2.6.0+ with CUDA 11.8/12.4/12.6 support
    • FAISS GPU acceleration for vector search
    • CUDA acceleration for embedding generation
    • Everything works on CPU if GPU unavailable

Install & Update

Windows Installation

# 1. Clone the repository
git clone https://github.com/forkni/claude-context-local.git
cd claude-context-local

# 2. Run the unified Windows installer (auto-detects CUDA)
install-windows.bat

# 3. Verify installation
verify-installation.bat

# 4. (Optional) Configure Claude Code MCP integration
.\scripts\batch\manual_configure.bat

⚠️ Important: The installer will prompt for HuggingFace authentication during setup. You'll need a HuggingFace token to access the EmbeddingGemma model. Get your token at https://huggingface.co/settings/tokens and accept terms at https://huggingface.co/google/embeddinggemma-300m.

Windows Installer Features:

  • Smart CUDA Detection: Automatically detects your CUDA version and installs appropriate PyTorch
  • One-Click Setup: Complete installation with single command
  • Built-in Verification: Comprehensive testing with verify-installation.bat
  • Professional Organization: Clean, streamlined script structure

Update existing installation

Update by pulling latest changes:

# Navigate to your project directory
cd claude-context-local
git pull

# Re-run the Windows installer to update dependencies
install-windows.bat

# Verify the update
verify-installation.bat

The Windows installer will:

  • Update the code and dependencies automatically
  • Preserve your embeddings and indexed projects in ~/.claude_code_search
  • Update only changed components with intelligent caching
  • Maintain your existing MCP server configuration

What the Windows installer does

  • Detects and installs uv package manager if missing
  • Creates and manages the project virtual environment
  • Installs Python dependencies with optimized resolution using uv sync
  • Downloads the EmbeddingGemma model (~1.2–1.3 GB) if not already cached
  • Automatically detects CUDA and installs PyTorch 2.6.0+ with appropriate CUDA version
  • Configures faiss-gpu if an NVIDIA GPU is detected
  • Preserves all your indexed projects and embeddings across updates

Quick Start

1) Install and Setup

# Windows - One-click installation
install-windows.bat

# Verify everything is working
verify-installation.bat

# The installer automatically:
# - Detects your hardware (CUDA/CPU)
# - Installs appropriate PyTorch version
# - Sets up all dependencies
# - Creates virtual environment

2) Start the MCP Server

# Main entry point - Interactive menu with 8 functional options
start_mcp_server.bat

# Alternative launchers:
# Debug mode with enhanced logging
scripts\batch\start_mcp_debug.bat

# Simple mode with minimal output
scripts\batch\start_mcp_simple.bat

Optional: Configure Claude Code Integration

# One-time setup to register MCP server with Claude Code
.\scripts\batch\manual_configure.bat

# Manual registration (alternative)
claude mcp add code-search --scope user -- "F:\path\to\claude-context-local\.venv\Scripts\python.exe" -m mcp_server.server

3) Use in Claude Code

Essential Workflow

# 1. Index your project (one-time setup)
/index_directory "C:\path\to\your\project"

# 2. Search your code with natural language
/search_code "authentication functions"
/search_code "error handling patterns"
/search_code "database connection setup"
/search_code "API endpoint handlers"
/search_code "configuration loading"

Advanced Search Examples

# Find similar code to existing implementations
/find_similar_code "project_file.py:123-145:function:authenticate_user"

# Check system status and performance
/get_index_status
/get_memory_status

# Configure search modes for specific needs
/configure_search_mode "hybrid" 0.4 0.6 true
/get_search_config_status

# Project management
/list_projects
/switch_project "C:\different\project\path"

Practical Usage Tips

  • Start simple: Use natural language queries like "error handling" or "database connection"
  • Be specific: "React component with useState hook" vs just "React"
  • Use context: "authentication middleware" vs "auth" for better results
  • Try different modes: Switch between semantic, hybrid, and text search as needed
  • Clean up: Use /cleanup_resources when switching between large projects

No manual configuration is needed: the system automatically chooses the best search mode for your queries.
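
The selection heuristics are internal to the server; as a rough illustration only, auto-detection could key off the shape of the query, for example:

```python
import re

def pick_search_mode(query: str) -> str:
    """Guess a mode from the query shape (illustrative heuristic, not the real logic)."""
    # Exact-looking tokens (quoted strings, FooError names, dotted calls) suit BM25.
    if re.search(r'"[^"]+"|\b\w+Error\b|[A-Za-z_]+\.[A-Za-z_]+\(', query):
        return "bm25"
    # Natural-language queries default to hybrid; long descriptions lean semantic.
    return "hybrid" if len(query.split()) <= 12 else "semantic"

print(pick_search_mode("KeyError in config loader"))    # -> bm25
print(pick_search_mode("how sessions are persisted"))   # -> hybrid
```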

4) Setting Up CLAUDE.md for Your Project (Optional but Recommended)

To maximize efficiency when using Claude Code with this MCP server, create a CLAUDE.md file in your project root. This file instructs Claude to prioritize semantic search over traditional file reading, ensuring optimal token usage.

Why CLAUDE.md?

  • 93% Token Reduction: Enforces search-first workflow (400 tokens vs 5,600 tokens)
  • 10x Faster: Semantic search (3-5s) vs traditional file reading (30-60s)
  • Immediate Access: MCP tools visible to Claude without explaining each time
  • Project-Specific: Customize instructions for your codebase

Minimal CLAUDE.md Template

Create a CLAUDE.md file in your project root with this content:

# Project Instructions for Claude Code

## 🔴 CRITICAL: Search-First Protocol

**MANDATORY**: For ALL codebase tasks, ALWAYS use semantic search FIRST before reading files.

### Workflow Sequence

1. **Index**: `/index_directory "C:\path\to\your\project"` - One-time setup
2. **Search**: `/search_code "natural language query"` - Find code instantly
3. **Edit**: Use `Read` tool ONLY after search identifies exact file

### Performance Impact

| Method | Tokens | Speed | Result |
|--------|--------|-------|--------|
| Traditional file reading | 5,600 tokens | 30-60s | Limited context |
| Semantic search | 400 tokens | 3-5s | Precision targeting |
| **Token savings** | **93%** | **10x faster** | **Cross-file relationships** |

### Critical Rules

- ✅ **ALWAYS**: `search_code()` for exploration/understanding
- ✅ **ALWAYS**: Index before searching: `index_directory(path)`
- ❌ **NEVER**: Read files without searching first
- ❌ **NEVER**: Use `Glob()` for code exploration
- ❌ **NEVER**: Grep manually for code patterns

**Every file read without search wastes 1,000+ tokens**

---

## Available MCP Tools (13)

| Tool | Priority | Purpose |
|------|----------|---------|
| **search_code** | 🔴 **ESSENTIAL** | Find code with natural language |
| **index_directory** | 🔴 **SETUP** | Index project (one-time) |
| find_similar_code | Secondary | Find alternative implementations |
| configure_search_mode | Config | Set search mode (hybrid/semantic/BM25) |
| get_search_config_status | Config | View current search configuration |
| get_index_status | Status | Check index health |
| get_memory_status | Monitor | Check RAM/VRAM usage |
| list_projects | Management | Show indexed projects |
| switch_project | Management | Change active project |
| clear_index | Reset | Delete current index |
| cleanup_resources | Cleanup | Free memory/caches |
| run_benchmark | Testing | Validate search quality |
| switch_embedding_model | Config | Switch between embedding models |

### Quick Examples

```bash
# Essential workflow
/index_directory "C:\Projects\MyApp"
/search_code "authentication functions"
/search_code "error handling patterns"

# Advanced usage
/find_similar_code "auth.py:15-42:function:login"
/configure_search_mode "hybrid" 0.4 0.6
/get_index_status
```

Search Modes

  • hybrid (default) - BM25 + semantic fusion (best accuracy)
  • semantic - Dense vector search only (best for concepts)
  • bm25 - Sparse keyword search only (best for exact terms)
  • auto - Adaptive mode selection

📚 Full Tool Reference: See docs/MCP_TOOLS_REFERENCE.md for complete documentation with all parameters and examples.


#### Customization Tips

1. **Copy the Template**: Save the content above to `CLAUDE.md` in your project root
2. **Adjust Paths**: Update the index_directory path to match your project
3. **Add Project Rules**: Include project-specific coding conventions, architecture notes, or common patterns
4. **Use Full Reference**: For complete tool documentation, copy content from `docs/MCP_TOOLS_REFERENCE.md`

#### How It Works

- Claude Code automatically reads `CLAUDE.md` from your project directory
- Instructions apply to all Claude sessions in that project
- MCP tools are immediately available without explanation
- Search-first workflow becomes automatic

#### Example Projects

This repository's own `CLAUDE.md` demonstrates advanced usage with:
- Comprehensive MCP tool documentation
- Project-specific architecture notes
- Model selection guidance
- Testing and benchmarking instructions

> **Note**: The `CLAUDE.md` in this repository is project-specific. Use the minimal template above for your own projects, then customize as needed.

## Running Benchmarks

The project includes comprehensive benchmarking tools to validate performance:

### Quick Start

```bash
# Windows - Interactive benchmark menu
run_benchmarks.bat
```

Available Options:

  1. Token Efficiency Benchmark (~10 seconds)

    • Validates 98.6% token reduction vs traditional file reading
    • Results saved to: benchmark_results/token_efficiency/
  2. Search Method Comparison (~2-3 minutes)

    • Automatically compares all 3 search methods (hybrid, BM25, semantic)
    • Uses current project directory for realistic evaluation
    • Results saved to: benchmark_results/method_comparison/
    • Generates comparison report with winner declaration
  3. Auto-Tune Search Parameters (~2 minutes)

    • Optimize BM25/Dense weights for your codebase
    • Tests 3 strategic configurations
    • Results saved to: benchmark_results/tuning/
  4. Run All Benchmarks (~4-5 minutes)

    • Complete test suite including auto-tuning
    • Comprehensive results across all metrics

Command Line Usage

# Method comparison (recommended)
.venv\Scripts\python.exe evaluation/run_evaluation.py method-comparison --project "." --k 5

# Token efficiency evaluation
.venv\Scripts\python.exe evaluation/run_evaluation.py token-efficiency

# Force CPU usage (if GPU issues)
.venv\Scripts\python.exe evaluation/run_evaluation.py token-efficiency --cpu

Results are saved to benchmark_results/ directory (gitignored for privacy). See docs/BENCHMARKS.md for detailed performance metrics.

Search Modes & Performance

Available Search Modes

| Mode | Description | Best For | Performance | Quality Metrics | Status |
|------|-------------|----------|-------------|-----------------|--------|
| hybrid | BM25 + Semantic with RRF reranking (default) | General use, balanced accuracy | 487ms, optimal accuracy | 44.4% precision, 100% MRR | ✅ Fully operational |
| semantic | Dense vector search only | Conceptual queries, code similarity | 487ms, semantic understanding | 38.9% precision, 100% MRR | ✅ Fixed 2025-09-25 |
| bm25 | Text-based sparse search only | Exact matches, error messages | 162ms, fastest | 33.3% precision, 61.1% MRR | ✅ Fully operational |
| auto | Automatically choose based on query | Let system optimize | Adaptive performance | Context-dependent | ✅ Fully operational |

For detailed configuration options, see Hybrid Search Configuration Guide.

📊 Performance benchmarks and detailed metrics: View Benchmarks

Architecture

claude-context-local/
β”œβ”€β”€ chunking/                         # Multi-language chunking (22 extensions)
β”‚   β”œβ”€β”€ multi_language_chunker.py     # Unified orchestrator (Python AST + tree-sitter)
β”‚   β”œβ”€β”€ python_ast_chunker.py         # Python-specific chunking (rich metadata)
β”‚   └── tree_sitter.py                # Tree-sitter: JS/TS/JSX/TSX/Svelte/Go/Java/Rust/C/C++/C#/GLSL
β”œβ”€β”€ embeddings/
β”‚   └── embedder.py                   # EmbeddingGemma; device=auto (CUDAβ†’MPSβ†’CPU); offline cache
β”œβ”€β”€ search/
β”‚   β”œβ”€β”€ indexer.py                    # FAISS index (CPU by default; GPU when available)
β”‚   β”œβ”€β”€ searcher.py                   # Intelligent ranking & filters
β”‚   β”œβ”€β”€ incremental_indexer.py        # Merkle-driven incremental indexing
β”‚   β”œβ”€β”€ hybrid_searcher.py            # BM25 + semantic fusion
β”‚   β”œβ”€β”€ bm25_index.py                 # BM25 text search implementation
β”‚   β”œβ”€β”€ reranker.py                   # RRF (Reciprocal Rank Fusion) reranking
β”‚   └── config.py                     # Search configuration management
β”œβ”€β”€ merkle/
β”‚   β”œβ”€β”€ merkle_dag.py                 # Content-hash DAG of the workspace
β”‚   β”œβ”€β”€ change_detector.py            # Diffs snapshots to find changed files
β”‚   └── snapshot_manager.py           # Snapshot persistence & stats
β”œβ”€β”€ mcp_server/
β”‚   └── server.py                     # MCP tools for Claude Code (stdio/HTTP)
β”œβ”€β”€ tools/                            # Development utilities
β”‚   β”œβ”€β”€ index_project.py              # Interactive project indexing
β”‚   β”œβ”€β”€ search_helper.py              # Standalone search interface
β”‚   └── auto_tune_search.py           # Parameter optimization tool
β”œβ”€β”€ evaluation/                       # Comprehensive evaluation framework
β”‚   β”œβ”€β”€ base_evaluator.py             # Base evaluation framework
β”‚   β”œβ”€β”€ semantic_evaluator.py         # Search quality evaluation
β”‚   β”œβ”€β”€ token_efficiency_evaluator.py # Token usage measurement
β”‚   β”œβ”€β”€ parameter_optimizer.py        # Search parameter optimization
β”‚   β”œβ”€β”€ run_evaluation.py             # Evaluation orchestrator
β”‚   β”œβ”€β”€ datasets/                     # Evaluation datasets
β”‚   β”‚   β”œβ”€β”€ debug_scenarios.json      # Debug test scenarios
β”‚   β”‚   └── token_efficiency_scenarios.json # Token efficiency tests
β”‚   └── README.md                     # Evaluation documentation
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ batch/                        # Windows batch scripts
β”‚   β”‚   β”œβ”€β”€ install_pytorch_cuda.bat  # PyTorch CUDA installation
β”‚   β”‚   β”œβ”€β”€ mcp_server_wrapper.bat    # MCP server wrapper script
β”‚   β”‚   β”œβ”€β”€ start_mcp_debug.bat       # Debug mode launcher
β”‚   β”‚   └── start_mcp_simple.bat      # Simple mode launcher
β”‚   β”œβ”€β”€ powershell/                   # Windows PowerShell scripts
β”‚   β”‚   β”œβ”€β”€ configure_claude_code.ps1 # Claude Code MCP configuration
β”‚   β”‚   β”œβ”€β”€ hf_auth.ps1               # HuggingFace authentication helper
β”‚   β”‚   β”œβ”€β”€ install-windows.ps1       # Windows automated installer
β”‚   β”‚   └── start_mcp_server.ps1      # PowerShell MCP server launcher
β”‚   β”œβ”€β”€ git/                          # Git workflow automation (13 scripts: 10 .bat + 3 .sh)
β”‚   β”‚   β”œβ”€β”€ commit.bat                # Privacy-protected commits
β”‚   β”‚   β”œβ”€β”€ sync_branches.bat         # Branch synchronization
β”‚   β”‚   β”œβ”€β”€ restore_local.bat         # Local file recovery
β”‚   β”‚   β”œβ”€β”€ merge_docs.bat            # Documentation-only merge
β”‚   β”‚   β”œβ”€β”€ cherry_pick_commits.bat   # Selective commit merging
β”‚   β”‚   β”œβ”€β”€ commit_enhanced.bat       # Enhanced commit with validations
β”‚   β”‚   β”œβ”€β”€ merge_with_validation.bat # Full merge with .gitattributes support
β”‚   β”‚   β”œβ”€β”€ validate_branches.bat     # Branch state validation (Windows cmd.exe)
β”‚   β”‚   β”œβ”€β”€ check_lint.bat/.sh        # Lint validation (cmd.exe + Git Bash)
β”‚   β”‚   β”œβ”€β”€ fix_lint.bat/.sh          # Auto-fix lint issues (cmd.exe + Git Bash)
β”‚   β”‚   β”œβ”€β”€ validate_branches.sh      # Branch validation (Git Bash/Linux/macOS)
β”‚   β”‚   β”œβ”€β”€ sync_status.bat           # Check synchronization status
β”‚   β”‚   └── rollback_merge.bat        # Rollback failed merges
β”‚   β”œβ”€β”€ verify_installation.py        # Python verification system
β”‚   └── verify_hf_auth.py             # HuggingFace auth verification
β”œβ”€β”€ .claude/
β”‚   └── commands/                     # Custom Claude Code commands
β”‚       β”œβ”€β”€ create-pr.md              # Automated PR creation
β”‚       β”œβ”€β”€ run-merge.md              # Guided merge workflow
β”‚       └── validate-changes.md       # Pre-commit validation
β”œβ”€β”€ .github/
β”‚   └── workflows/                    # GitHub Actions automation
β”‚       β”œβ”€β”€ branch-protection.yml     # Validation, testing, linting
β”‚       β”œβ”€β”€ claude.yml                # Interactive @claude mentions
β”‚       β”œβ”€β”€ docs-validation.yml       # Documentation quality checks
β”‚       └── merge-development-to-main.yml # Manual merge workflow
β”œβ”€β”€ docs/
β”‚   β”œβ”€β”€ BENCHMARKS.md                 # Performance benchmarks
β”‚   β”œβ”€β”€ HYBRID_SEARCH_CONFIGURATION_GUIDE.md # Search configuration
β”‚   β”œβ”€β”€ INSTALLATION_GUIDE.md         # Installation instructions
β”‚   └── claude_code_config.md         # Claude Code integration
β”œβ”€β”€ CHANGELOG.md                      # Version history
β”œβ”€β”€ start_mcp_server.bat              # Main launcher (Windows)
β”œβ”€β”€ install-windows.bat               # Primary installer (Windows)
β”œβ”€β”€ verify-installation.bat           # Installation verification
β”œβ”€β”€ verify-hf-auth.bat                # HuggingFace auth verification
└── run_benchmarks.bat                # Benchmark launcher

Data flow

```mermaid
graph TD
    A["Claude Code (MCP client)"] -->|index_directory| B["MCP Server"]
    B --> C{IncrementalIndexer}
    C --> D["ChangeDetector<br/>(Merkle DAG)"]
    C --> E["MultiLanguageChunker"]
    E --> F["Code Chunks"]
    C --> G["CodeEmbedder<br/>(EmbeddingGemma)"]
    G --> H["Embeddings"]
    C --> I["CodeIndexManager<br/>(FAISS CPU/GPU)"]
    H --> I
    D --> J["SnapshotManager"]
    C --> J
    B -->|search_code| K["Searcher"]
    K --> I
```
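
The search half of the diagram boils down to standard FAISS usage. A minimal stand-in, with random vectors in place of real chunk embeddings:

```python
import numpy as np
import faiss

d = 768                                           # EmbeddingGemma vector size
chunk_vectors = np.random.rand(100, d).astype("float32")  # stand-in embeddings
faiss.normalize_L2(chunk_vectors)                 # cosine similarity via inner product

index = faiss.IndexFlatIP(d)                      # CodeIndexManager's FAISS index
index.add(chunk_vectors)

query = np.random.rand(1, d).astype("float32")    # stand-in for an embedded query
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)              # top-5 chunks for search_code
print(ids[0], scores[0])
```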

Intelligent Chunking

The system uses advanced parsing to create semantically meaningful chunks across all supported languages:

Chunking Strategies

  • Python: AST-based parsing for rich metadata extraction
  • All other languages: Tree-sitter parsing with language-specific node type recognition
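
For Python, the AST route can be sketched in a few lines (a simplified illustration of the approach, not the actual code in chunking/python_ast_chunker.py):

```python
import ast

def chunk_python(source: str, path: str):
    """Yield one chunk per top-level function/class, with line-range metadata."""
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            yield {
                "file": path,
                "name": node.name,
                "kind": type(node).__name__,
                "lines": (node.lineno, node.end_lineno),
                "doc": ast.get_docstring(node),
                "text": ast.get_source_segment(source, node),
            }

src = 'def login(user):\n    """Check credentials."""\n    return True\n'
print(next(chunk_python(src, "auth.py"))["name"])  # -> login
```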

Chunk Types Extracted

  • Functions/Methods: Complete with signatures, docstrings, decorators
  • Classes/Structs: Full definitions with member functions as separate chunks
  • Interfaces/Traits: Type definitions and contracts
  • Enums/Constants: Value definitions and module-level declarations
  • Namespaces/Modules: Organizational structures
  • Templates/Generics: Parameterized type definitions
  • GLSL Shaders: Vertex, fragment, compute, geometry, tessellation shaders with uniforms and layouts

Rich Metadata for All Languages

  • File path and folder structure
  • Function/class/type names and relationships
  • Language-specific features (async, generics, modifiers, etc.)
  • Parent-child relationships (methods within classes)
  • Line numbers for precise code location
  • Semantic tags (component, export, async, etc.)

Configuration

Environment Variables

  • CODE_SEARCH_STORAGE: Custom storage directory (default: ~/.claude_code_search)

Embedding Models

The system supports multiple embedding models for different performance/accuracy trade-offs:

Available Models

| Model | Dimensions | VRAM | Context | Best For |
|-------|------------|------|---------|----------|
| EmbeddingGemma-300m (default) | 768 | 4-8GB | 2048 tokens | Fast, efficient, smaller projects |
| BGE-M3 | 1024 | 8-16GB | 8192 tokens | Higher accuracy (+13.6% F1), production systems |

Switching Models

Via Interactive Menu:

start_mcp_server.bat
# Navigate: 3 (Search Configuration) → 4 (Select Embedding Model)

Via Environment Variable:

set CLAUDE_EMBEDDING_MODEL=BAAI/bge-m3  # Switch to BGE-M3
set CLAUDE_EMBEDDING_MODEL=google/embeddinggemma-300m  # Switch to Gemma

See Model Migration Guide for detailed comparison and migration steps.

✨ Instant Model Switching

Zero re-indexing overhead when switching between models - switch in <150ms:

Performance:

  • First use: ~30-60s (indexing required)
  • Return to previous model: <150ms (instant!)
  • Time savings: 98% reduction (50-90s → <1s)

How It Works:

  • Per-dimension storage: {project}_{hash}_{768d|1024d}/
    • Gemma (768d): project_abc123_768d/
    • BGE-M3 (1024d): project_abc123_1024d/
  • Independent Merkle snapshots per model dimension
  • Instant activation of existing indices when switching back
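
The directory name can be derived deterministically from the project path plus the model's output dimension, roughly like this (a hypothetical sketch of the naming scheme, not the project's exact code):

```python
import hashlib
from pathlib import Path

def index_dir(project_path: str, model_dim: int,
              root: str = "~/.claude_code_search/index") -> Path:
    """Per-model index location: {project}_{hash}_{dim}d (illustrative)."""
    name = Path(project_path).name
    digest = hashlib.sha256(project_path.encode()).hexdigest()[:6]
    return Path(root).expanduser() / f"{name}_{digest}_{model_dim}d"

print(index_dir(r"C:\Projects\MyApp", 768))   # Gemma index
print(index_dir(r"C:\Projects\MyApp", 1024))  # BGE-M3 index, coexists on disk
```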

Example Workflow:

# Index with BGE-M3 (~30s first time)
/switch_embedding_model "BAAI/bge-m3"
/index_directory "C:\Projects\MyApp"

# Switch to Gemma (~20s first time)
/switch_embedding_model "google/embeddinggemma-300m"
/index_directory "C:\Projects\MyApp"

# Switch back to BGE-M3 (INSTANT - <150ms!)
/switch_embedding_model "BAAI/bge-m3"

# Compare search results instantly
/search_code "authentication"  # BGE-M3 results
/switch_embedding_model "google/embeddinggemma-300m"  # Instant switch!
/search_code "authentication"  # Gemma results

πŸ“š Technical details: See docs/PER_MODEL_INDICES_IMPLEMENTATION.md (development branch)

🚀 GPU Memory Optimization

Automatic memory cleanup keeps VRAM usage in check during indexing:

Performance Impact:

  • Before optimization: 1.4GB → 8GB during indexing (memory leak)
  • After optimization: 1.4GB → 3-4GB during indexing (72% reduction)
  • Memory cleanup: Drops to 1.4GB baseline on demand

How It Works:

The system implements comprehensive GPU memory management:

  1. Python garbage collection: gc.collect() frees wrapper objects first
  2. CUDA cache cleanup: torch.cuda.empty_cache() releases GPU tensors
  3. Automatic triggers: Runs after every indexing operation (full or incremental)
  4. Manual cleanup: Use /cleanup_resources MCP tool anytime
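
The cleanup sequence itself is small. A sketch of the pattern described above:

```python
import gc
import torch

def cleanup_gpu_memory(embedder=None):
    """Free wrapper objects first, then release cached CUDA blocks."""
    if embedder is not None:
        del embedder                      # drop Python references to the model
    gc.collect()                          # collect wrappers before touching CUDA
    if torch.cuda.is_available():
        torch.cuda.empty_cache()          # return cached blocks to the driver
        print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")

cleanup_gpu_memory()
```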

Memory Lifecycle:

Baseline (idle):        1.4GB
↓ Index with Gemma:     3.0GB  (model + embeddings)
↓ Index with BGE-M3:    4.0GB  (larger model)
↓ Manual cleanup:       1.4GB  (back to baseline)

When to Use Manual Cleanup:

  • After large indexing operations
  • When switching between multiple projects
  • Before intensive operations requiring GPU memory
  • If you notice high VRAM usage

Example:

# Index a large project
/index_directory "C:\LargeProject"

# Check memory usage
/get_memory_status

# Clean up GPU memory
/cleanup_resources
# Actions: Index cleared, Embedder cleaned, GPU cache freed, 7000+ objects collected

# Verify cleanup
/get_memory_status  # Should show baseline ~1.4GB

📚 Implementation details: Cleanup uses the gc.collect() + torch.cuda.empty_cache() pattern recommended by the PyTorch and ComfyUI communities for optimal memory management.

Model Configuration

The system supports two embedding models:

  • Default: google/embeddinggemma-300m (768 dimensions, 4-8GB VRAM)
  • Upgrade: BAAI/bge-m3 (1024 dimensions, 8-16GB VRAM, +13.6% F1-score)

Notes:

  • Download size: ~1.2GB (Gemma) or ~2.2GB (BGE-M3)
  • Device selection: auto (CUDA on NVIDIA, MPS on Apple Silicon, else CPU)
  • Models are cached after first download in ~/.cache/huggingface/hub
  • Cache detection implemented - models load instantly on subsequent uses
  • FAISS backend: CPU by default. If an NVIDIA GPU is detected, the installer attempts to install faiss-gpu-cu12 (or faiss-gpu-cu11) and the index will run on GPU automatically at runtime while saving as CPU for portability.
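
That GPU-at-runtime, CPU-on-disk behavior maps onto FAISS's standard conversion calls, sketched here under the assumption of a faiss-gpu build:

```python
import numpy as np
import faiss

index = faiss.IndexFlatIP(768)                    # built or loaded as a CPU index
index.add(np.random.rand(10, 768).astype("float32"))

if hasattr(faiss, "StandardGpuResources") and faiss.get_num_gpus() > 0:
    res = faiss.StandardGpuResources()
    gpu_index = faiss.index_cpu_to_gpu(res, 0, index)  # searches run on the GPU
    index = faiss.index_gpu_to_cpu(gpu_index)          # convert back before saving

faiss.write_index(index, "code.index")            # always stored as a CPU index
```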

Hugging Face authentication (if prompted)

The google/embeddinggemma-300m model is hosted on Hugging Face and may require accepting terms and/or authentication to download.

  1. Visit the model page and accept any terms: https://huggingface.co/google/embeddinggemma-300m

  2. Authenticate in one of the following ways:

    • CLI (recommended):

      uv run huggingface-cli login
      # Paste your token from https://huggingface.co/settings/tokens
    • Environment variable:

      export HUGGING_FACE_HUB_TOKEN=hf_XXXXXXXXXXXXXXXXXXXXXXXX

After the first successful download, we cache the model under ~/.claude_code_search/models and prefer offline loads for speed and reliability.

Model Caching

Once downloaded, models are cached locally for instant loading:

  • Cache location: ~/.cache/huggingface/hub/
  • Offline mode: Automatically enabled when cached models detected
  • Load time: 2-5 seconds from cache (vs minutes for download)
  • No internet required: After initial download
  • Cache detection: Implemented in embedder for both Gemma and BGE-M3

Hybrid Search Configuration

The system supports multiple search modes with configurable parameters:

Quick Configuration via MCP Tools

# Configure hybrid search (recommended)
/configure_search_mode "hybrid" 0.4 0.6 true

# Check current configuration
/get_search_config_status

# Switch to semantic-only mode
/configure_search_mode "semantic" 0.0 1.0 true

# Switch to text-only mode
/configure_search_mode "bm25" 1.0 0.0 true

Environment Variable Configuration

# Windows (PowerShell)
$env:CLAUDE_SEARCH_MODE="hybrid"
$env:CLAUDE_ENABLE_HYBRID="true"
$env:CLAUDE_BM25_WEIGHT="0.4"
$env:CLAUDE_DENSE_WEIGHT="0.6"

The available search modes and their benchmark numbers are listed under Search Modes & Performance above.

MCP Tools Reference

The following MCP tools are available in Claude Code:

Core Search Tools

  • /search_code - Main search with hybrid capabilities
  • /index_directory - Index a project for searching
  • /find_similar_code - Find code similar to a specific chunk

Configuration Tools

  • /configure_search_mode - Configure hybrid search parameters
  • /get_search_config_status - View current configuration

Management Tools

  • /get_index_status - Check index statistics
  • /get_memory_status - Monitor memory usage
  • /cleanup_resources - Free memory and cleanup
  • /clear_index - Reset search index
  • /list_projects - List indexed projects
  • /switch_project - Switch between projects

Supported Languages & Extensions

Fully Supported (22 extensions across 11 languages):

| Language | Extensions |
|----------|------------|
| Python | .py |
| JavaScript | .js, .jsx |
| TypeScript | .ts, .tsx |
| Java | .java |
| Go | .go |
| Rust | .rs |
| C | .c |
| C++ | .cpp, .cc, .cxx, .c++ |
| C# | .cs |
| Svelte | .svelte |
| GLSL | .glsl, .frag, .vert, .comp, .geom, .tesc, .tese |

Total: 22 file extensions across 11 programming languages

Storage

Data is stored in the configured storage directory:

~/.claude_code_search/
β”œβ”€β”€ models/          # Downloaded models
β”œβ”€β”€ index/           # FAISS indices and metadata
β”‚   β”œβ”€β”€ code.index   # Vector index
β”‚   β”œβ”€β”€ metadata.db  # Chunk metadata (SQLite)
β”‚   β”œβ”€β”€ stats.json   # Index statistics
β”‚   └── bm25/        # BM25 text search index
β”‚       β”œβ”€β”€ bm25.index      # BM25 sparse index
β”‚       β”œβ”€β”€ bm25_docs.json  # Document storage
β”‚       └── bm25_metadata.json # BM25 metadata

Performance

  • Model size: ~1.2GB (EmbeddingGemma-300m and caches)
  • Embedding dimension: 768 (can be reduced for speed)
  • Index types: Flat (exact) or IVF (approximate) based on dataset size
  • Batch processing: Configurable batch sizes for embedding generation
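
The Flat-vs-IVF choice can be expressed as a small builder, with illustrative thresholds (the project's actual cutoffs may differ):

```python
import numpy as np
import faiss

def build_index(vectors: np.ndarray) -> faiss.Index:
    """Exact Flat index for small sets, approximate IVF for large ones."""
    n, d = vectors.shape
    if n < 10_000:
        index = faiss.IndexFlatIP(d)          # exact inner-product search
    else:
        quantizer = faiss.IndexFlatIP(d)
        nlist = int(np.sqrt(n))               # number of coarse clusters
        index = faiss.IndexIVFFlat(quantizer, d, nlist,
                                   faiss.METRIC_INNER_PRODUCT)
        index.train(vectors)                  # IVF needs a training pass
    index.add(vectors)
    return index

print(type(build_index(np.random.rand(500, 768).astype("float32"))).__name__)
```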

Tips:

  • First index on a large repo will take time (model load + chunk + embed). Subsequent runs are incremental.
  • With GPU FAISS, searches on large indexes are significantly faster.
  • Embeddings automatically use CUDA (NVIDIA) or MPS (Apple) if available.
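
Device auto-selection follows the usual PyTorch probe order; a minimal sketch of the CUDA → MPS → CPU fallback mentioned above:

```python
import torch

def pick_device() -> str:
    """CUDA -> MPS -> CPU, matching the embedder's auto mode."""
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())
```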

Troubleshooting

Quick Diagnostics

Run automated verification to identify issues:

# Comprehensive system check
verify-installation.bat

# HuggingFace authentication check
verify-hf-auth.bat

# Repair tool - Fix common issues
scripts\batch\repair_installation.bat

Repair Tool Options:

  1. Clear all Merkle snapshots (fixes stale change detection)
  2. Clear project indexes (reset search state)
  3. Reconfigure Claude Code integration
  4. Verify dependencies
  5. Full system reset (indexes + snapshots)
  6. Return to main menu

Installation Issues

  1. Import errors: Ensure all dependencies are installed

    cd claude-context-local
    uv sync
  2. UV not found: Install UV package manager first

    install-windows.bat  # Automatically installs UV
  3. PyTorch CUDA version mismatch or BGE-M3 errors:

    BGE-M3 requires PyTorch 2.6.0+ due to security improvements. If you have an older installation, reinstall using:

    # Reinstall entire environment with correct PyTorch version
    install-windows.bat

    Or manually upgrade PyTorch only:

    .venv\Scripts\uv.exe pip install "torch==2.6.0" "torchvision==0.21.0" "torchaudio==2.6.0" --index-url https://download.pytorch.org/whl/cu118

Model and Authentication Issues

  1. Model download fails: Check internet, disk space, and HuggingFace authentication

  2. "401 Unauthorized" error: HuggingFace authentication required

    # Authenticate with HuggingFace
    .venv\Scripts\python.exe -m huggingface_hub.commands.huggingface_cli login
  3. Force offline mode: Use cached models without internet

    $env:HF_HUB_OFFLINE="1"

Search and Indexing Issues

  1. No search results: Verify the codebase was indexed successfully

    • Check index status: /get_index_status in Claude Code
    • Verify project path is correct
    • Reindex with /index_directory "C:\path\to\project"
  2. "No changes detected" but files were modified: Stale Merkle snapshot issue

    • Use force reindex to bypass snapshot checking
    • Via menu: start_mcp_server.bat → 5 (Project Management) → 2 (Force Reindex Project)
    • Via tool: .venv\Scripts\python.exe tools\index_project.py --force
    • Or use repair tool: scripts\batch\repair_installation.bat → Option 1
  3. Memory issues during indexing: System running out of RAM

    • Close other applications to free memory
    • Check available RAM: /get_memory_status
    • For large codebases (10,000+ files), ensure 8GB+ RAM available
  4. Indexing too slow: First-time indexing takes time

  • Expected: ~30-60 seconds for small projects (100 files)
  • Expected: ~5-10 minutes for large projects (10,000+ files)
  • GPU accelerates by 8.6x - verify CUDA available

GPU and Performance Issues

  1. FAISS GPU not used: Ensure CUDA drivers and nvidia-smi available

    # Check GPU availability
    nvidia-smi
    
    # Reinstall PyTorch with GPU support
    scripts\batch\install_pytorch_cuda.bat
    
    # Verify GPU detection
    .venv\Scripts\python.exe -c "import torch; print('CUDA:', torch.cuda.is_available())"
  2. "CUDA out of memory" error: GPU memory exhausted

    • Close other GPU applications
    • System will automatically fall back to CPU
    • Performance will be slower but functional

MCP Server Issues

  1. MCP server won't start: Check Python environment and dependencies

    # Test MCP server manually
    start_mcp_server.bat
    
    # Check for errors in output
  2. Claude Code can't find MCP tools: MCP server not registered

    # Register MCP server with Claude Code
    .\scripts\batch\manual_configure.bat
    
    # Verify configuration
    .\.venv\Scripts\python.exe scripts\manual_configure.py --validate-only
    
    # Run comprehensive MCP configuration validation (15 checks)
    .\tests\regression\test_mcp_configuration.ps1
  3. MCP server path verification fails: Invalid path in .claude.json

    • Verify configuration: .\.venv\Scripts\python.exe scripts\manual_configure.py --validate-only
    • Reconfigure if needed: .\scripts\batch\manual_configure.bat
    • Check that wrapper script exists at configured path
  4. MCP connection lost: Restart Claude Code and MCP server

    • Close Claude Code completely
    • Run start_mcp_server.bat in new terminal
    • Reopen Claude Code

Windows-Specific Issues

  1. "cannot be loaded because running scripts is disabled": PowerShell execution policy

    # Allow script execution (run as Administrator)
    Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
  2. Path too long errors: Windows path length limitation

    • Move project closer to drive root (e.g., C:\Projects\)

    • Enable long paths in Windows (requires admin):

      New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force

Still Having Issues?

Run the automated repair tool (scripts\batch\repair_installation.bat), and if the problem persists, please open an issue on the repository.

Ignored directories (for speed and noise reduction)

node_modules, .venv, venv, env, .env, .direnv, __pycache__, .pytest_cache, .mypy_cache, .ruff_cache, .pytype, .ipynb_checkpoints, build, dist, out, public, .next, .nuxt, .svelte-kit, .angular, .astro, .vite, .cache, .parcel-cache, .turbo, coverage, .coverage, .nyc_output, .gradle, .idea, .vscode, .docusaurus, .vercel, .serverless, .terraform, .mvn, .tox, target, bin, obj
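
Pruning these during the workspace walk is straightforward; a sketch (the IGNORED set is abbreviated here), with directories removed in place so os.walk never descends into them:

```python
import os

IGNORED = {"node_modules", ".venv", "__pycache__", "build", "dist", ".cache"}  # abbreviated

def source_files(root: str):
    """Yield file paths under root, skipping ignored directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in IGNORED]  # prune in place
        for name in filenames:
            yield os.path.join(dirpath, name)

print(sum(1 for _ in source_files(".")))
```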

Contributing

This is a research project focused on intelligent code chunking and search. Feel free to experiment with:

  • Different chunking strategies
  • Alternative embedding models
  • Enhanced metadata extraction
  • Performance optimizations

License

Licensed under the GNU General Public License v3.0 (GPL-3.0). See the LICENSE file for details.

Inspiration

This Windows-focused fork was adapted from FarhanAliRaza/claude-context-local, which provides cross-platform support for Linux and macOS.

Both projects draw inspiration from zilliztech/claude-context. We adapted the concepts to a Python implementation with fully local embeddings and Windows-specific optimizations.
