ragd is a local-first RAG (Retrieval-Augmented Generation) daemon built for code understanding. It provides semantic search, code analysis, and AI-generated summaries of your codebase. Indexing and vector storage run on your machine; the Gemini API is called to generate embeddings.
- Cross-language Support: Works with Python, JavaScript, TypeScript, Go, Rust, Java, C++, and more
- Semantic Search: Find code by meaning, not just keywords
- Smart Chunking: Intelligent code parsing that respects language structure
- Real-time Indexing: Automatic file watching and incremental updates
- AI-Powered Codex: Generate comprehensive codebase summaries
- HTTP API: RESTful API for integration with tools and IDEs
- Rich CLI: Beautiful command-line interface with progress indicators
- Privacy-First: All processing happens locally
- Vector Embeddings: Uses Google's Gemini for high-quality embeddings
- Persistent Storage: ChromaDB for efficient vector storage
- Auto File Watcher: Monitors changes and updates index automatically
- Metadata Indexing: Tracks file types, symbols, and relationships
- Configurable: Extensive configuration options
- Extensible: Plugin architecture for custom backends

- Clone the repository:
    git clone <repository-url>
    cd ragd/v1
- Install dependencies:
    pip install -r requirements.txt
- Set up your Gemini API key:
    export GEMINI_API_KEY="your-api-key-here"
  Or create a .env file:
    echo "GEMINI_API_KEY=your-api-key-here" > .env
- Initialize ragd in your project:
    cd /path/to/your/project
    python -m ragd init
- Start the daemon:
    python -m ragd start
- Query your code:
    python -m ragd query "authentication logic"
- Access the web interface: Open http://localhost:8000/api/v1/docs in your browser

ragd uses a .ragdconfig.yaml file for configuration. Here's an example:
# File processing
include_extensions:
- ".py"
- ".js"
- ".ts"
- ".go"
- ".rs"
- ".java"
- ".cpp"
- ".h"
- ".md"
ignore_directories:
- "node_modules"
- ".git"
- "__pycache__"
- "target"
- "build"
- "dist"
# Embedding settings
embedding_model: "models/embedding-001"
embedding_batch_size: 100
chunk_max_length: 1000
chunk_overlap: 200
# Vector database
vector_db_type: "chroma"
vector_db_path: ".ragdb/chroma"
# Codex generation
codex_update_frequency: 3600 # 1 hour
codex_model: "gemini-pro"
codex_max_tokens: 4000
# Server settings
server_host: "localhost"
server_port: 8000
api_prefix: "/api/v1"
# File watcher
watcher_debounce_seconds: 2.0
watcher_recursive: true
# Logging
log_level: "INFO"
log_file: ".ragdb/ragd.log"

python -m ragd init [PROJECT_PATH] [--force]
python -m ragd start [--config CONFIG_PATH] [--no-server] [--background]
python -m ragd query "search term" [--limit 10] [--format table|json|markdown]
python -m ragd status [--config CONFIG_PATH]
python -m ragd reindex [--force] [--config CONFIG_PATH]
python -m ragd codex [--regenerate] [--config CONFIG_PATH]
python -m ragd config [--show] [--config CONFIG_PATH]

When running, ragd exposes a REST API:
- POST /api/v1/query - Semantic search
- POST /api/v1/similar-to-code - Find similar code
- POST /api/v1/multi-query - Batch queries
- GET /api/v1/file-list - List indexed files
- GET /api/v1/file/{file_path} - Get file information
- POST /api/v1/compare-files - Compare files
- GET /api/v1/symbol/{symbol_name} - Find symbol definitions
- GET /api/v1/codex - Get project codex
- POST /api/v1/codex/regenerate - Regenerate codex
- GET /api/v1/stats - Get system statistics
- GET /api/v1/health - Health check
Visit http://localhost:8000/api/v1/docs for interactive API documentation.
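
As a rough illustration of calling the search endpoint from Python, the sketch below builds a POST request for /api/v1/query using only the standard library. The JSON field names `query` and `limit` are assumptions modeled on the CLI's search term and --limit option, and the helper names are hypothetical; the interactive docs at /api/v1/docs show the actual request schema.

```python
import json
import urllib.request

# Built from the default server_host, server_port, and api_prefix settings.
BASE_URL = "http://localhost:8000/api/v1"


def build_query_request(query: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST request for the semantic search endpoint."""
    # Assumed payload shape; check /api/v1/docs for the real schema.
    payload = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, limit: int = 10):
    """Send the request to a running daemon and return the decoded JSON body."""
    request = build_query_request(query, limit)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the daemon running, `search("authentication logic", limit=5)` would send the request; `build_query_request` can be inspected on its own without a server.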
ragd is built with a modular architecture:
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  File Watcher   │───▶│  Code Chunker   │───▶│    Embedder     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                       │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   HTTP Server   │◀───│     Indexer     │◀───│  Vector Store   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │
┌─────────────────┐    ┌─────────────────┐
│       CLI       │    │ Codex Generator │
└─────────────────┘    └─────────────────┘
- File Watcher: Monitors filesystem changes using watchdog
- Code Chunker: Parses code using tree-sitter for language-aware chunking
- Embedder: Generates embeddings using Google's Gemini API
- Vector Store: Persists embeddings using ChromaDB
- Indexer: Manages the vector database and search operations
- Codex Generator: Creates AI-powered codebase summaries
- HTTP Server: FastAPI-based REST API
- CLI: Rich command-line interface using Typer
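
To make the chunk_max_length and chunk_overlap settings more concrete, here is a minimal sketch of overlapping chunking. It is a plain character-based sliding window, not the tree-sitter-based chunker described above, and it assumes both settings are measured in characters, which the configuration does not state. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, max_length: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of at most max_length characters.

    Consecutive windows share `overlap` characters so that content near a
    boundary appears in both neighbours. This is a simplification: a
    language-aware chunker would cut at function or class boundaries instead.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must be non-negative and smaller than max_length")

    step = max_length - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks
```

Under these assumptions, the defaults of 1000 and 200 advance the window by 800 characters per chunk.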
ragd/
├── ragd/
│   ├── __init__.py          # Package initialization
│   ├── config.py            # Configuration management
│   ├── watcher.py           # File system watcher
│   ├── parser.py            # Code parsing and chunking
│   ├── embedder.py          # Embedding generation
│   ├── indexer.py           # Vector database management
│   ├── codex_gen.py         # Codex generation
│   ├── server.py            # HTTP API server
│   ├── daemon.py            # Main daemon orchestrator
│   ├── cli.py               # Command-line interface
│   └── __main__.py          # Entry point
├── requirements.txt         # Python dependencies
├── .ragdconfig.yaml         # Default configuration
└── README.md                # This file

# Install test dependencies
pip install pytest pytest-asyncio
# Run tests
pytest tests/

- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
- Local Processing: All code analysis happens on your machine
- Optional Cloud: Gemini API is only used for embeddings (configurable)
- No Code Upload: Apart from the text sent to the Gemini API for embedding, your source code stays in your environment
- Secure Storage: Local vector database with no external dependencies
- Incremental Updates: Only processes changed files
- Efficient Chunking: Smart code parsing respects language structure
- Batch Processing: Optimized embedding generation
- Fast Search: Vector similarity search with ChromaDB
- Memory Efficient: Streaming processing for large codebases
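
One common way to process only changed files is to compare content hashes between runs. The sketch below illustrates that idea; it is an assumption about how incremental updates could work, not a description of ragd's indexer, and the names `file_digest` and `changed_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in blocks to bound memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            hasher.update(block)
    return hasher.hexdigest()


def changed_files(paths, known_digests: dict[str, str]) -> list[Path]:
    """Return the paths whose content differs from the recorded digest.

    known_digests maps a path string to the digest seen on the previous run
    and is updated in place, so a second call with unchanged files returns [].
    """
    changed = []
    for path in paths:
        digest = file_digest(path)
        if known_digests.get(str(path)) != digest:
            known_digests[str(path)] = digest
            changed.append(path)
    return changed
```

Only the files returned by `changed_files` would then need to be re-chunked and re-embedded.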

"No Gemini API key found"
- Set the GEMINI_API_KEY environment variable
- Or add it to a .env file in your project root

"No files found to index"
- Check your include_extensions in .ragdconfig.yaml
- Ensure you're in the correct project directory
- Verify files aren't in ignore_directories
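
To check these settings outside of ragd, a small script can list the files that would pass both filters. This sketch assumes include_extensions matches on file suffix and ignore_directories matches a directory name at any depth; ragd's actual matching rules may differ, and `find_indexable_files` is a hypothetical helper.

```python
import os
from pathlib import Path


def find_indexable_files(root, include_extensions, ignore_directories):
    """List files under root with an included suffix, skipping ignored directories."""
    include = {ext.lower() for ext in include_extensions}
    ignored = set(ignore_directories)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into ignored directories.
        dirnames[:] = [name for name in dirnames if name not in ignored]
        for name in filenames:
            if Path(name).suffix.lower() in include:
                matches.append(Path(dirpath) / name)
    return sorted(matches)


if __name__ == "__main__":
    for path in find_indexable_files(".", [".py", ".md"], ["node_modules", ".git"]):
        print(path)
```

An empty result from the project root suggests that the extension list or the ignore list is excluding everything.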

"Server won't start"
- Check if port 8000 is already in use
- Modify server_port in configuration
- Check logs in .ragdb/ragd.log
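
A quick way to test whether the port is taken is to try connecting to it. The following is a minimal sketch using the standard socket module; `port_in_use` is a hypothetical helper, not part of ragd.

```python
import socket


def port_in_use(port: int, host: str = "localhost") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8000 in use:", port_in_use(8000))
```

If this reports that the port is in use while ragd is stopped, another process holds it and changing server_port is the simplest fix.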

"Search returns no results"
- Ensure indexing completed successfully
- Check if files were processed (use python -m ragd status)
- Try broader search terms

# Enable debug logging
export RAGD_LOG_LEVEL=DEBUG
python -m ragd start

Logs are stored in .ragdb/ragd.log by default.
- File watcher and indexing
- Language-agnostic chunking
- Embedding system
- Vector storage
- HTTP API
- CLI interface
- Codex generation
- Symbol graph analysis
- Advanced query types
- Pluggable vector backends
- Codex templating
- Performance optimizations
- VSCode extension
- Web dashboard
- Git integration
- Tool-call adapters for AI agents
- Auto-bootstrap scripts
MIT License - see LICENSE file for details.
- Email: [email protected]
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- tree-sitter for language parsing
- ChromaDB for vector storage
- FastAPI for the web framework
- Typer for the CLI
- Rich for beautiful terminal output
- Google Gemini for embeddings
Made with ❤️ for developers who love their code