🔍 ragd - Local-first RAG Daemon for Code Understanding

ragd is a local-first RAG (Retrieval-Augmented Generation) daemon for code understanding. It provides semantic search, code analysis, and AI-generated summaries of your codebase while keeping parsing, indexing, and search on your machine. Embeddings are generated through Google's Gemini API.

✨ Features

Core Capabilities

🌍 Cross-language Support: Works with Python, JavaScript, TypeScript, Go, Rust, Java, C++, and more
🔍 Semantic Search: Find code by meaning, not just keywords
📊 Smart Chunking: Intelligent code parsing that respects language structure
🚀 Real-time Indexing: Automatic file watching and incremental updates
🤖 AI-Powered Codex: Generate comprehensive codebase summaries
🌐 HTTP API: RESTful API for integration with tools and IDEs
💻 Rich CLI: Beautiful command-line interface with progress indicators
🔒 Privacy-First: Parsing, indexing, and search run locally; only Gemini API calls leave your machine

Advanced Features

📈 Vector Embeddings: Uses Google's Gemini for high-quality embeddings
💾 Persistent Storage: ChromaDB for efficient vector storage
🔄 Auto File Watcher: Monitors changes and updates index automatically
📝 Metadata Indexing: Tracks file types, symbols, and relationships
⚙️ Configurable: Extensive configuration options
🔌 Extensible: Plugin architecture for custom backends

🚀 Quick Start

Installation

Clone the repository:
```
git clone <repository-url>
cd ragd/v1
```
Install dependencies:
```
pip install -r requirements.txt
```

Set up your Gemini API key:
```
export GEMINI_API_KEY="your-api-key-here"
```

Or create a .env file:
```
echo "GEMINI_API_KEY=your-api-key-here" > .env
```

Basic Usage

Initialize ragd in your project:
```
cd /path/to/your/project
python -m ragd init
```

Start the daemon:
```
python -m ragd start
```

Query your code:
```
python -m ragd query "authentication logic"
```

Browse the interactive API documentation: open http://localhost:8000/api/v1/docs in your browser

📖 Documentation

Configuration

ragd uses a .ragdconfig.yaml file for configuration. Here's an example:

```
# File processing
include_extensions:
  - ".py"
  - ".js"
  - ".ts"
  - ".go"
  - ".rs"
  - ".java"
  - ".cpp"
  - ".h"
  - ".md"

ignore_directories:
  - "node_modules"
  - ".git"
  - "__pycache__"
  - "target"
  - "build"
  - "dist"

# Embedding settings
embedding_model: "models/embedding-001"
embedding_batch_size: 100
chunk_max_length: 1000
chunk_overlap: 200

# Vector database
vector_db_type: "chroma"
vector_db_path: ".ragdb/chroma"

# Codex generation
codex_update_frequency: 3600  # 1 hour
codex_model: "gemini-pro"
codex_max_tokens: 4000

# Server settings
server_host: "localhost"
server_port: 8000
api_prefix: "/api/v1"

# File watcher
watcher_debounce_seconds: 2.0
watcher_recursive: true

# Logging
log_level: "INFO"
log_file: ".ragdb/ragd.log"
```
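
To make `chunk_max_length` and `chunk_overlap` concrete, here is a minimal sketch of a sliding-window splitter. It assumes both values are measured in characters, which this README does not state, and it ignores the language-aware parsing that ragd performs. `split_with_overlap` is a hypothetical helper, not part of ragd.

```python
def split_with_overlap(text: str, max_length: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of at most max_length characters.

    Consecutive windows share `overlap` characters. Units are assumed to be
    characters; ragd's real chunker may count differently.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must satisfy 0 <= overlap < max_length")

    step = max_length - overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "x" * 2500
    for i, chunk in enumerate(split_with_overlap(sample)):
        print(i, len(chunk))
```

With the default values each window starts 800 characters after the previous one, so a larger overlap produces more chunks and more embedding requests for the same file.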

CLI Commands

Initialize a Project

```
python -m ragd init [PROJECT_PATH] [--force]
```

Start the Daemon

```
python -m ragd start [--config CONFIG_PATH] [--no-server] [--background]
```

Query the Index

```
python -m ragd query "search term" [--limit 10] [--format table|json|markdown]
```

Check Status

```
python -m ragd status [--config CONFIG_PATH]
```

Force Reindex

```
python -m ragd reindex [--force] [--config CONFIG_PATH]
```

Manage Codex

```
python -m ragd codex [--regenerate] [--config CONFIG_PATH]
```

View Configuration

```
python -m ragd config [--show] [--config CONFIG_PATH]
```
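
The commands above can also be driven from a script. The following is a minimal sketch that uses only the flags documented for `query`; `build_query_command` and `run_query` are hypothetical helpers, and because the shape of the `--format json` output is not documented here, the result is returned as raw text rather than parsed.

```python
import subprocess
import sys


def build_query_command(term: str, limit: int = 10, fmt: str = "json") -> list[str]:
    """Build the argv for `python -m ragd query` using only documented flags."""
    if fmt not in {"table", "json", "markdown"}:
        raise ValueError(f"unsupported format: {fmt}")
    return [sys.executable, "-m", "ragd", "query", term,
            "--limit", str(limit), "--format", fmt]


def run_query(term: str, limit: int = 10, fmt: str = "json") -> str:
    """Run the query and return stdout as text; raises if ragd exits non-zero."""
    result = subprocess.run(
        build_query_command(term, limit, fmt),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the search term contains spaces or quotes.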

HTTP API

When running, ragd provides a comprehensive REST API:

Search Endpoints

POST /api/v1/query - Semantic search
POST /api/v1/similar-to-code - Find similar code
POST /api/v1/multi-query - Batch queries
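
A search request could be issued from Python roughly as follows. This is a sketch under assumptions: the JSON field names `query` and `limit` are guesses, since the request schema is not given in this README (check the interactive documentation for the real one), and `build_query_request` and `search` are hypothetical helpers. The base URL uses the default `server_host`, `server_port`, and `api_prefix`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # default host, port, and prefix


def build_query_request(text: str, limit: int = 10) -> urllib.request.Request:
    """Build a POST request for /query.

    The JSON field names ("query", "limit") are assumed, not confirmed.
    """
    payload = json.dumps({"query": text, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(text: str, limit: int = 10):
    """Send the request to a running daemon and decode the JSON response."""
    with urllib.request.urlopen(build_query_request(text, limit), timeout=10) as resp:
        return json.load(resp)
```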

File Management

GET /api/v1/file-list - List indexed files
GET /api/v1/file/{file_path} - Get file information
POST /api/v1/compare-files - Compare files

Symbol Lookup

GET /api/v1/symbol/{symbol_name} - Find symbol definitions

Codex

GET /api/v1/codex - Get project codex
POST /api/v1/codex/regenerate - Regenerate codex

System

GET /api/v1/stats - Get system statistics
GET /api/v1/health - Health check

Interactive Documentation

Visit http://localhost:8000/api/v1/docs for interactive API documentation.

🏗️ Architecture

ragd is built with a modular architecture:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   File Watcher  │───▶│   Code Chunker  │───▶│    Embedder     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                        │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   HTTP Server   │◀───│     Indexer     │◀───│  Vector Store   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │
┌─────────────────┐    ┌─────────────────┐
│      CLI        │    │ Codex Generator │
└─────────────────┘    └─────────────────┘
```

Components

File Watcher: Monitors filesystem changes using watchdog
Code Chunker: Parses code using tree-sitter for language-aware chunking
Embedder: Generates embeddings using Google's Gemini API
Vector Store: Persists embeddings using ChromaDB
Indexer: Manages the vector database and search operations
Codex Generator: Creates AI-powered codebase summaries
HTTP Server: FastAPI-based REST API
CLI: Rich command-line interface using Typer
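
The embed, store, and query path through these components can be sketched with standard-library stand-ins. Everything here is illustrative: `toy_embed` is a bag-of-words counter, not a Gemini embedding, and `InMemoryIndex` is a plain list rather than ChromaDB, so matching is by shared tokens instead of meaning. The file watcher, chunker, and HTTP server are left out.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for the Embedder: a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity; returns 0.0 when either vector is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Stand-in for Vector Store + Indexer: keeps (chunk, vector) pairs in a list."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        self._rows.append((chunk, toy_embed(chunk)))

    def query(self, text: str, limit: int = 3) -> list[str]:
        """Return the stored chunks most similar to the query text."""
        qvec = toy_embed(text)
        ranked = sorted(self._rows, key=lambda row: cosine(qvec, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:limit]]


if __name__ == "__main__":
    index = InMemoryIndex()
    index.add("def login(user, password): check password hash")
    index.add("def render_chart(data): draw bars")
    print(index.query("password login", limit=1))
```

Swapping `toy_embed` for a real embedding model is what turns this token matching into the semantic search described above.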

🔧 Development

Project Structure

```
ragd/
├── ragd/
│   ├── __init__.py          # Package initialization
│   ├── config.py            # Configuration management
│   ├── watcher.py           # File system watcher
│   ├── parser.py            # Code parsing and chunking
│   ├── embedder.py          # Embedding generation
│   ├── indexer.py           # Vector database management
│   ├── codex_gen.py         # Codex generation
│   ├── server.py            # HTTP API server
│   ├── daemon.py            # Main daemon orchestrator
│   ├── cli.py               # Command-line interface
│   └── __main__.py          # Entry point
├── requirements.txt         # Python dependencies
├── .ragdconfig.yaml         # Default configuration
└── README.md                # This file
```

Running Tests

```
# Install test dependencies
pip install pytest pytest-asyncio

# Run tests
pytest tests/
```

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests
Submit a pull request

🔒 Privacy & Security

Local Processing: Parsing, chunking, indexing, and search run on your machine
Cloud Usage: The Gemini API is called for embeddings and for the model set in codex_model (configurable)
Limited Upload: Text sent to the Gemini API for those calls leaves your machine; nothing else is uploaded
Secure Storage: Local vector database with no external dependencies

🚀 Performance

Incremental Updates: Only processes changed files
Efficient Chunking: Smart code parsing respects language structure
Batch Processing: Optimized embedding generation
Fast Search: Vector similarity search with ChromaDB
Memory Efficient: Streaming processing for large codebases
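
Incremental updates depend on knowing which files changed between runs. One common way to decide that is to compare content hashes; the sketch below assumes that approach, which this README does not confirm ragd uses. `file_digest` and `changed_files` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in blocks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def changed_files(paths: list[Path], seen: dict[str, str]) -> list[Path]:
    """Return files whose digest differs from `seen`, updating `seen` in place."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if seen.get(str(path)) != digest:
            seen[str(path)] = digest
            changed.append(path)
    return changed
```

Only the files returned by `changed_files` would need to be re-chunked and re-embedded, which is where the saving in embedding requests comes from.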

🛠️ Troubleshooting

Common Issues

"No Gemini API key found"

Set the GEMINI_API_KEY environment variable
Or add it to a .env file in your project root
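
To see which of the two locations actually supplies a key, a small diagnostic like the following can help. It is a sketch: the lookup order (environment first, then `.env`) is an assumption about how ragd resolves the key, and `find_gemini_key` is a hypothetical helper that handles only simple `NAME=value` lines.

```python
import os
from pathlib import Path
from typing import Optional


def find_gemini_key(project_root: str = ".") -> Optional[str]:
    """Look for GEMINI_API_KEY in the environment, then in <project_root>/.env.

    The lookup order is an assumption for diagnostics; ragd may differ.
    """
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    env_file = Path(project_root) / ".env"
    if not env_file.is_file():
        return None
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not assignments
        name, _, value = line.partition("=")
        if name.strip() == "GEMINI_API_KEY":
            return value.strip().strip("\"'") or None
    return None


if __name__ == "__main__":
    print("key found" if find_gemini_key() else "no key found")
```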

"No files found to index"

Check your include_extensions in .ragdconfig.yaml
Ensure you're in the correct project directory
Verify files aren't in ignore_directories
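
The three checks above can be combined into a quick listing of the files that pass both filters. This sketch hard-codes the example values from the configuration section and assumes that `ignore_directories` matches directory names at any depth, which this README does not specify. `indexable_files` is a hypothetical helper.

```python
import os
from pathlib import Path

INCLUDE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs", ".java", ".cpp", ".h", ".md"}
IGNORE_DIRECTORIES = {"node_modules", ".git", "__pycache__", "target", "build", "dist"}


def indexable_files(root: str) -> list[Path]:
    """List files under root that pass the extension and directory filters."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORE_DIRECTORIES]
        for name in filenames:
            if Path(name).suffix in INCLUDE_EXTENSIONS:
                matches.append(Path(dirpath) / name)
    return sorted(matches)


if __name__ == "__main__":
    for path in indexable_files("."):
        print(path)
```

An empty listing points to a filter or working-directory problem rather than an indexing failure.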

"Server won't start"

Check if port 8000 is already in use
Modify server_port in configuration
Check logs in .ragdb/ragd.log
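
Whether the port is already taken can be checked from Python before starting the daemon. This is a minimal sketch; `port_in_use` is a hypothetical helper, and it probes IPv4 loopback only, so a process bound solely to the IPv6 address behind `localhost` would not be detected.

```python
import socket


def port_in_use(port: int = 8000, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8000 in use:", port_in_use(8000))
```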

"Search returns no results"

Ensure indexing completed successfully
Check whether files were processed (run python -m ragd status)
Try broader search terms

Debug Mode

```
# Enable debug logging
export RAGD_LOG_LEVEL=DEBUG
python -m ragd start
```

Logs

Logs are stored in .ragdb/ragd.log by default.

📋 Roadmap

Phase 1: Core Infrastructure ✅

Phase 2: Advanced Features (Coming Soon)

Phase 3: Developer Experience

📄 License

MIT License - see LICENSE file for details.

🤝 Support

🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions

🙏 Acknowledgments

tree-sitter for language parsing
ChromaDB for vector storage
FastAPI for the web framework
Typer for the CLI
Rich for beautiful terminal output
Google Gemini for embeddings

Made with ❤️ for developers who love their code
