dumko2001/ragd

🔍 ragd - Local-first RAG Daemon for Code Understanding

ragd is a local-first RAG (Retrieval-Augmented Generation) daemon designed for code understanding. It provides semantic search, code analysis, and AI-generated summaries of your codebase. Parsing, indexing, and vector storage run on your machine; embeddings are generated through Google's Gemini API, and the codex model defaults to gemini-pro.

✨ Features

Core Capabilities

  • 🌍 Cross-language Support: Works with Python, JavaScript, TypeScript, Go, Rust, Java, C++, and more
  • 🔍 Semantic Search: Find code by meaning, not just keywords
  • 📊 Smart Chunking: Intelligent code parsing that respects language structure
  • 🚀 Real-time Indexing: Automatic file watching and incremental updates
  • 🤖 AI-Powered Codex: Generate comprehensive codebase summaries
  • 🌐 HTTP API: RESTful API for integration with tools and IDEs
  • 💻 Rich CLI: Command-line interface with progress indicators
  • 🔒 Privacy-First: Parsing, indexing, and storage happen locally

Advanced Features

  • 📈 Vector Embeddings: Uses Google's Gemini for high-quality embeddings
  • 💾 Persistent Storage: ChromaDB for efficient vector storage
  • 🔄 Auto File Watcher: Monitors changes and updates index automatically
  • 📝 Metadata Indexing: Tracks file types, symbols, and relationships
  • ⚙️ Configurable: Extensive configuration options
  • 🔌 Extensible: Plugin architecture for custom backends

🚀 Quick Start

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd ragd/v1
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up your Gemini API key:

    export GEMINI_API_KEY="your-api-key-here"

    Or create a .env file:

    echo "GEMINI_API_KEY=your-api-key-here" > .env

Basic Usage

  1. Initialize ragd in your project:

    cd /path/to/your/project
    python -m ragd init
  2. Start the daemon:

    python -m ragd start
  3. Query your code:

    python -m ragd query "authentication logic"
  4. Browse the interactive API documentation: Open http://localhost:8000/api/v1/docs in your browser

📖 Documentation

Configuration

ragd uses a .ragdconfig.yaml file for configuration. Here's an example:

# File processing
include_extensions:
  - ".py"
  - ".js"
  - ".ts"
  - ".go"
  - ".rs"
  - ".java"
  - ".cpp"
  - ".h"
  - ".md"

ignore_directories:
  - "node_modules"
  - ".git"
  - "__pycache__"
  - "target"
  - "build"
  - "dist"

# Embedding settings
embedding_model: "models/embedding-001"
embedding_batch_size: 100
chunk_max_length: 1000
chunk_overlap: 200

# Vector database
vector_db_type: "chroma"
vector_db_path: ".ragdb/chroma"

# Codex generation
codex_update_frequency: 3600  # 1 hour
codex_model: "gemini-pro"
codex_max_tokens: 4000

# Server settings
server_host: "localhost"
server_port: 8000
api_prefix: "/api/v1"

# File watcher
watcher_debounce_seconds: 2.0
watcher_recursive: true

# Logging
log_level: "INFO"
log_file: ".ragdb/ragd.log"
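
How chunk_max_length and chunk_overlap interact can be illustrated with a simplified sliding-window sketch. It assumes both values are measured in characters, which the configuration does not state, and it ignores the language-aware boundaries that ragd's chunker respects. chunk_text is a hypothetical helper, not part of ragd.

```python
def chunk_text(text: str, max_length: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of at most max_length characters.

    Each window shares `overlap` characters with the previous one, so
    content cut at a boundary still appears intact in one chunk.
    Character units are an assumption; ragd may count differently.
    """
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must be non-negative and smaller than max_length")
    step = max_length - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

With the example values above, each window advances by 800 characters, so a larger chunk_overlap produces more chunks (and more embedding calls) for the same file.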

CLI Commands

Initialize a Project

python -m ragd init [PROJECT_PATH] [--force]

Start the Daemon

python -m ragd start [--config CONFIG_PATH] [--no-server] [--background]

Query the Index

python -m ragd query "search term" [--limit 10] [--format table|json|markdown]

Check Status

python -m ragd status [--config CONFIG_PATH]

Force Reindex

python -m ragd reindex [--force] [--config CONFIG_PATH]

Manage Codex

python -m ragd codex [--regenerate] [--config CONFIG_PATH]

View Configuration

python -m ragd config [--show] [--config CONFIG_PATH]

HTTP API

When running, ragd provides a comprehensive REST API:

Search Endpoints

  • POST /api/v1/query - Semantic search
  • POST /api/v1/similar-to-code - Find similar code
  • POST /api/v1/multi-query - Batch queries

File Management

  • GET /api/v1/file-list - List indexed files
  • GET /api/v1/file/{file_path} - Get file information
  • POST /api/v1/compare-files - Compare files

Symbol Lookup

  • GET /api/v1/symbol/{symbol_name} - Find symbol definitions

Codex

  • GET /api/v1/codex - Get project codex
  • POST /api/v1/codex/regenerate - Regenerate codex

System

  • GET /api/v1/stats - Get system statistics
  • GET /api/v1/health - Health check

Interactive Documentation

Visit http://localhost:8000/api/v1/docs for interactive API documentation.
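
As a rough sketch of calling the search endpoint from Python, the snippet below builds a POST request to /api/v1/query using only the standard library. The JSON field names "query" and "limit" are assumptions that mirror the CLI options; the actual request schema should be confirmed in the interactive documentation. build_query_request and run_query are hypothetical helpers.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # server_host, server_port, api_prefix


def build_query_request(text: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a semantic search request."""
    # "query" and "limit" are assumed field names, not confirmed by the README.
    body = json.dumps({"query": text, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(text: str, limit: int = 10):
    """Send the request to a running daemon and decode the JSON response."""
    request = build_query_request(text, limit)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling run_query("authentication logic") requires the daemon to be running with its HTTP server enabled.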

🏗️ Architecture

ragd is built with a modular architecture:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   File Watcher  │───▶│   Code Chunker  │───▶│    Embedder     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                        │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   HTTP Server   │◀───│     Indexer     │◀───│  Vector Store   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │
┌─────────────────┐    ┌─────────────────┐
│      CLI        │    │ Codex Generator │
└─────────────────┘    └─────────────────┘

Components

  • File Watcher: Monitors filesystem changes using watchdog
  • Code Chunker: Parses code using tree-sitter for language-aware chunking
  • Embedder: Generates embeddings using Google's Gemini API
  • Vector Store: Persists embeddings using ChromaDB
  • Indexer: Manages the vector database and search operations
  • Codex Generator: Creates AI-powered codebase summaries
  • HTTP Server: FastAPI-based REST API
  • CLI: Rich command-line interface using Typer
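
To illustrate what a setting like watcher_debounce_seconds typically controls, here is a minimal debounce sketch: a path is released for re-indexing only after it has been quiet for the configured delay, so a burst of saves triggers one update. This is an assumption about the general technique, not ragd's actual watcher code, and Debouncer is a hypothetical class.

```python
class Debouncer:
    """Hold back changed paths until they have been quiet for `delay` seconds."""

    def __init__(self, delay: float = 2.0):
        self.delay = delay
        self._last_event: dict[str, float] = {}  # path -> time of latest change

    def record(self, path: str, now: float) -> None:
        """Note a change event; repeated events reset the quiet period."""
        self._last_event[path] = now

    def ready(self, now: float) -> list[str]:
        """Return (and forget) paths with no events in the last `delay` seconds."""
        due = sorted(
            path for path, seen in self._last_event.items()
            if now - seen >= self.delay
        )
        for path in due:
            del self._last_event[path]
        return due
```

In a real watcher the timestamps would come from a clock such as time.monotonic(), with record called from the filesystem event handler and ready polled periodically.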

🔧 Development

Project Structure

ragd/
├── ragd/
│   ├── __init__.py          # Package initialization
│   ├── config.py            # Configuration management
│   ├── watcher.py           # File system watcher
│   ├── parser.py            # Code parsing and chunking
│   ├── embedder.py          # Embedding generation
│   ├── indexer.py           # Vector database management
│   ├── codex_gen.py         # Codex generation
│   ├── server.py            # HTTP API server
│   ├── daemon.py            # Main daemon orchestrator
│   ├── cli.py               # Command-line interface
│   └── __main__.py          # Entry point
├── requirements.txt         # Python dependencies
├── .ragdconfig.yaml         # Default configuration
└── README.md                # This file

Running Tests

# Install test dependencies
pip install pytest pytest-asyncio

# Run tests
pytest tests/

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

🔒 Privacy & Security

  • Local Processing: Code parsing, chunking, and indexing happen on your machine
  • Optional Cloud: The Gemini API is used for embeddings and, with the default codex_model, for codex generation (configurable)
  • Limited Upload: Only the text submitted to the Gemini API leaves your environment; the index itself is never uploaded
  • Secure Storage: The vector database is stored locally under .ragdb/

🚀 Performance

  • Incremental Updates: Only processes changed files
  • Efficient Chunking: Smart code parsing respects language structure
  • Batch Processing: Optimized embedding generation
  • Fast Search: Vector similarity search with ChromaDB
  • Memory Efficient: Streaming processing for large codebases
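
One common way to implement "only processes changed files" is to compare content hashes between runs. The sketch below shows that idea under the assumption that a hash per path is kept from the previous run; the README does not say how ragd detects changes, and changed_files is a hypothetical helper.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a stable fingerprint of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def changed_files(current: dict[str, bytes], known_hashes: dict[str, str]) -> list[str]:
    """Return paths whose contents differ from the recorded hash.

    `known_hashes` is updated in place so the next call only reports
    files that changed since this one.
    """
    changed = []
    for path, data in current.items():
        digest = content_hash(data)
        if known_hashes.get(path) != digest:
            known_hashes[path] = digest
            changed.append(path)
    return sorted(changed)
```

Only the returned paths would need to be re-chunked and re-embedded, which keeps embedding API usage proportional to the size of the change rather than the size of the codebase.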

🛠️ Troubleshooting

Common Issues

"No Gemini API key found"

  • Set the GEMINI_API_KEY environment variable
  • Or add it to a .env file in your project root
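
To check both locations quickly, a small script along these lines can help. It assumes a plain KEY=value .env format and that the environment variable takes precedence over the file; ragd's own loading logic is not documented here, and both functions are hypothetical helpers.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_env_value(env_text: str, name: str) -> Optional[str]:
    """Return the value assigned to `name` in KEY=value text, if any."""
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            return value.strip().strip("\"'") or None
    return None


def find_gemini_api_key(
    environ: Mapping[str, str] = os.environ,
    env_file: Path = Path(".env"),
) -> Optional[str]:
    """Look in the environment first, then in the .env file."""
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key
    if env_file.is_file():
        return read_env_value(env_file.read_text(encoding="utf-8"), "GEMINI_API_KEY")
    return None
```

Run from the project root, a None result from find_gemini_api_key() means neither location provides a key.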

"No files found to index"

  • Check your include_extensions in .ragdconfig.yaml
  • Ensure you're in the correct project directory
  • Verify files aren't in ignore_directories
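
The two settings can be checked against a given path with a sketch like the following. It assumes that include_extensions matches the file suffix exactly and that ignore_directories matches a directory name at any depth; ragd's real matching rules may differ, and would_be_indexed is a hypothetical helper using the example configuration values.

```python
from pathlib import PurePosixPath

# Values copied from the example .ragdconfig.yaml
INCLUDE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs", ".java", ".cpp", ".h", ".md"}
IGNORE_DIRECTORIES = {"node_modules", ".git", "__pycache__", "target", "build", "dist"}


def would_be_indexed(relative_path: str) -> bool:
    """Apply the extension and directory filters to a project-relative path."""
    path = PurePosixPath(relative_path)
    if path.suffix not in INCLUDE_EXTENSIONS:
        return False  # extension is not listed in include_extensions
    # Reject the file if any parent directory name is ignored.
    return not any(part in IGNORE_DIRECTORIES for part in path.parts[:-1])
```

For example, would_be_indexed("build/main.cpp") is False under these assumptions even though .cpp is an included extension.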

"Server won't start"

  • Check if port 8000 is already in use
  • Modify server_port in configuration
  • Check logs in .ragdb/ragd.log
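
A quick way to test whether something is already listening on the port is to attempt a TCP connection. This is a generic standard-library sketch, not a ragd command, and port_in_use is a hypothetical helper.

```python
import socket


def port_in_use(port: int = 8000, host: str = "localhost") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If port_in_use(8000) returns True before the daemon starts, another process holds the port and server_port should be changed.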

"Search returns no results"

  • Ensure indexing completed successfully
  • Check if files were processed (use python -m ragd status)
  • Try broader search terms

Debug Mode

# Enable debug logging
export RAGD_LOG_LEVEL=DEBUG
python -m ragd start

Logs

Logs are stored in .ragdb/ragd.log by default.

📋 Roadmap

Phase 1: Core Infrastructure ✅

  • File watcher and indexing
  • Language-agnostic chunking
  • Embedding system
  • Vector storage
  • HTTP API
  • CLI interface
  • Codex generation

Phase 2: Advanced Features (Coming Soon)

  • Symbol graph analysis
  • Advanced query types
  • Pluggable vector backends
  • Codex templating
  • Performance optimizations

Phase 3: Developer Experience

  • VSCode extension
  • Web dashboard
  • Git integration
  • Tool-call adapters for AI agents
  • Auto-bootstrap scripts

📄 License

MIT License - see LICENSE file for details.

🀝 Support

πŸ™ Acknowledgments


Made with ❀️ for developers who love their code
