Skip to content

flamehaven01/Flamehaven-Filesearch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

78 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

FLAMEHAVEN FileSearch

FLAMEHAVEN FileSearch

Your documents. Searchable in minutes. No infrastructure needed.

Search your local documents with RAG instantly

CI/CD Latest Version Python 3.8+ MIT License

Quick Start (3 min) • API Docs • Roadmap • Contributing


[>] Why FLAMEHAVEN FileSearch?

[!] Fast

From zero to production in under 5 minutes. No complex infrastructure.

[#] Private

100% self-hosted. Your data never leaves your servers.

[$] Affordable

Leverages Gemini's generous free tier. Process thousands of docs free.


Comparison with Alternatives

Feature FLAMEHAVEN Pinecone Weaviate Custom RAG
Setup Time < 5 min ~20 min ~30 min Days
Self-Hosted [+] Yes [-] No [+] Yes [+] Yes
Free Tier Generous Limited Yes N/A
Code Complexity Low Medium High Very High
Maintenance Minimal None Medium High
Best For Quick POCs, SMBs Enterprise scale ML teams Full control

[&] Key Features

Feature Description
[*] Multi-Format Support PDF, DOCX, TXT, MD files up to 50MB
[*] Semantic Search Natural language queries with AI-powered answers
[*] Source Attribution Every answer links back to source documents
[*] Store Management Organize documents into separate collections
[*] Dual Interface Python SDK + REST API with Swagger UI
[*] Docker Ready One-command deployment with persistence
[*] Enterprise Auth v1.2.0: API keys, audit logging, rate limiting
[*] Batch Processing Process 1-100 queries per request

Admin Dashboard (v1.2.0)

FLAMEHAVEN FileSearch Admin Dashboard


Problem: Your Situation

âś— You have PDF, Word, and text documents scattered across your local drive
âś— You want to search them intelligently, not keyword-by-keyword
âś— You don't want to upload your data to external services (Pinecone, Cloudflare, etc.)
âś— You need production-ready security without complex setup
âś— You want zero infrastructure costs for prototype phase

Solution: FLAMEHAVEN FileSearch

âś“ Local RAG search engine in 5 minutes
âś“ 100% self-hosted (your data stays yours)
âś“ Single Docker command deployment
âś“ Free tier Google Gemini (up to 1500 queries/month)
âś“ v1.2.0: Enterprise-grade authentication & multi-user support
âś“ Batch search API for 1-100 queries per request
âś“ Optional Redis for distributed caching across workers

[>] 3-Minute Quick Start

1. Docker (No Setup)

# Start with one command
docker run -d \
  -e GEMINI_API_KEY="your_gemini_api_key" \
  -p 8000:8000 \
  -v $(pwd)/data:/app/data \
  flamehaven-filesearch:1.2.0

# Available at http://localhost:8000 in 3 seconds

2. Your First Search (cURL)

# Step 1: Generate API key (v1.2.0 requirement)
curl -X POST http://localhost:8000/api/admin/keys \
  -H "X-Admin-Key: your_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My First Key",
    "permissions": ["upload", "search"]
  }'

# Response: {"key_id": "...", "plain_key": "sk_live_abc123..."}
# SAVE this key - it won't be shown again!

# Step 2: Upload a document
curl -X POST http://localhost:8000/api/upload/single \
  -H "Authorization: Bearer sk_live_abc123..." \
  -F "[email protected]" \
  -F "store=documents"

# Step 3: Search
curl -X POST http://localhost:8000/api/search \
  -H "Authorization: Bearer sk_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the main points in this document?",
    "store": "documents"
  }'

# Response:
# {
#   "answer": "The main points are...",
#   "sources": [
#     {
#       "file": "sample.pdf",
#       "page": 3,
#       "excerpt": "..."
#     }
#   ]
# }

3. Python Code Example

from flamehaven_filesearch import FlamehavenFileSearch, FileSearchConfig

# Configuration
config = FileSearchConfig(
    google_api_key="your_gemini_api_key",
    environment="offline"  # or "remote" for online
)

# Initialize
searcher = FlamehavenFileSearch(config)

# Create document store
searcher.create_store("my_documents")

# Upload files
searcher.upload_file("path/to/document.pdf", "my_documents")
searcher.upload_file("path/to/document.docx", "my_documents")

# Search with RAG
result = searcher.search(
    "Summarize the key findings",
    store="my_documents"
)

print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")

Installation

Option 1: Pip (Recommended for Development)

# Core functionality
pip install flamehaven-filesearch

# With API server
pip install flamehaven-filesearch[api]

# With Redis support
pip install flamehaven-filesearch[api,redis]

# Everything (for development)
pip install flamehaven-filesearch[all]

Option 2: Docker (Recommended for Production)

# Build image
docker build -t flamehaven-filesearch:1.2.0 .

# Run standalone
docker run \
  -e GEMINI_API_KEY="your_key" \
  -e FLAMEHAVEN_ADMIN_KEY="your_admin_key" \
  -p 8000:8000 \
  flamehaven-filesearch:1.2.0

# With Docker Compose (includes Redis)
docker-compose up -d

Option 3: Kubernetes (For High Availability)

See the Kubernetes deployment example above for complete manifests with StatefulSets, ConfigMaps, and Secrets.


Configuration

Required Environment Variables

# Google Gemini API key (get free tier at ai.google.dev)
export GEMINI_API_KEY="your_api_key"

# Admin key for creating API keys in v1.2.0
export FLAMEHAVEN_ADMIN_KEY="your_secure_admin_password"

Optional Environment Variables

# Server settings
export HOST="0.0.0.0"              # Default: 127.0.0.1
export PORT="8000"                  # Default: 8000
export ENVIRONMENT="production"     # Default: development

# Database location
export FLAMEHAVEN_API_KEYS_DB="/path/to/api_keys.db"

# Redis (for multi-worker deployments)
export REDIS_HOST="localhost"       # Default: localhost
export REDIS_PORT="6379"            # Default: 6379
export REDIS_PASSWORD="password"    # Optional

# API limits
export MAX_FILE_SIZE_MB="50"        # Default: 50

API Key Management (v1.2.0)

Generate API Key

curl -X POST http://localhost:8000/api/admin/keys \
  -H "X-Admin-Key: $FLAMEHAVEN_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production API Key",
    "permissions": ["upload", "search", "stores", "delete"],
    "rate_limit_per_minute": 100
  }'

List Your Keys

curl http://localhost:8000/api/admin/keys \
  -H "Authorization: Bearer sk_live_your_key..."

Revoke a Key

curl -X DELETE http://localhost:8000/api/admin/keys/{key_id} \
  -H "Authorization: Bearer sk_live_your_key..."

View Usage Statistics

curl "http://localhost:8000/api/admin/usage?days=7" \
  -H "Authorization: Bearer sk_live_your_key..."

Performance

Operation Response Time Throughput
Health check <10ms -
Search (cache hit) <10ms >2 searches/sec
Search (cache miss) 500ms - 3s Depends on Gemini API
Batch search (10 queries) 2-5s >1 batch/sec
File upload 1-5s >1 file/sec

Resource Usage

  • Memory: ~200MB baseline + 50MB per 1000 cached queries
  • CPU: <5% idle, <50% under sustained load
  • Disk: 10MB per 100,000 cached queries

Security Features (v1.2.0)

  • [#] API Key Encryption: SHA256 hashing (plain keys never stored)
  • [#] Rate Limiting: Per-API-key customizable limits (default 100/min)
  • [#] Permission Control: Fine-grained access (upload, search, stores, delete)
  • [#] Audit Logging: Complete request trail with timestamps
  • [#] OWASP Headers: Security headers on all responses
  • [#] Request Tracking: Unique request IDs for debugging
  • [#] Key Expiration: Optional time-limited API keys

Roadmap

v1.2.1 (Q4 2025)

  • Improved admin authentication (IAM integration)
  • Redis UI configuration in dashboard
  • Encryption at rest for sensitive data
  • Fix deprecated FastAPI on_event decorators

v1.3.0 (Q1 2026)

  • OAuth2/OIDC integration
  • API key rotation
  • Billing/metering system
  • Advanced analytics dashboard

v2.0.0 (Q2 2026)

  • Multi-language support
  • Enhanced file type support (XLSX, PPTX, RTF)
  • Export search results (JSON, CSV, PDF)
  • WebSocket streaming support

Good First Issues (Contributions Welcome!)

Looking to contribute? Start with these issues:

  1. Add XLSX Support (Difficulty: Medium)

    • Extend file upload to handle Excel spreadsheets
    • Reference: flamehaven_filesearch/loaders.py
    • Estimated time: 2-3 hours
  2. Implement Search Result Caching UI (Difficulty: Easy)

    • Display cache hit rate in admin dashboard
    • Reference: flamehaven_filesearch/dashboard.py
    • Estimated time: 1-2 hours
  3. Add Batch Search Progress Tracking (Difficulty: Medium)

    • Implement WebSocket endpoint for real-time batch progress
    • Reference: flamehaven_filesearch/batch_routes.py
    • Estimated time: 3-4 hours
  4. Create Integration Tests (Difficulty: Easy)

    • Write end-to-end tests for common workflows
    • Reference: tests/test_api_integration.py
    • Estimated time: 2-3 hours
  5. Add Dark Mode to Admin Dashboard (Difficulty: Easy)

    • Extend dashboard.py with CSS dark theme toggle
    • Reference: flamehaven_filesearch/dashboard.py
    • Estimated time: 1-2 hours

See CONTRIBUTING.md for complete contribution guidelines.


Troubleshooting

Issue: 401 Unauthorized on API requests

Solution:

  • Verify FLAMEHAVEN_ADMIN_KEY environment variable is set
  • Check Authorization header format: Authorization: Bearer sk_live_your_key
  • Ensure API key hasn't expired

Issue: "Redis connection refused"

Solution:

  • Verify Redis is running: redis-cli ping
  • Check REDIS_HOST and REDIS_PORT environment variables
  • Remove Redis if not needed (system falls back to LRU cache)

Issue: High memory usage

Solution:

  • Configure Redis eviction policy: maxmemory-policy allkeys-lru
  • Reduce cache TTL in configuration
  • Monitor cache metrics: curl http://localhost:8000/prometheus | grep cache

Issue: Slow searches

Solution:

  • Check if caching is working (view cache_hits_total metric)
  • Verify Gemini API is responsive
  • Check network latency to Redis instance

Migration from v1.1.0 to v1.2.0

Breaking Changes

All protected endpoints now require API key authentication.

Migration Steps

  1. Update dependencies:

    pip install -U flamehaven-filesearch[api]
  2. Set admin key:

    export FLAMEHAVEN_ADMIN_KEY="your_secure_admin_key"
  3. Generate first API key:

    curl -X POST http://localhost:8000/api/admin/keys \
      -H "X-Admin-Key: your_admin_key" \
      -H "Content-Type: application/json" \
      -d '{"name":"Production","permissions":["upload","search","stores","delete"]}'
  4. Update application code:

    # Old (v1.1.0)
    requests.post("http://localhost:8000/api/search", json={"query": "test"})
    
    # New (v1.2.0)
    api_key = "sk_live_your_key"
    requests.post(
        "http://localhost:8000/api/search",
        json={"query": "test"},
        headers={"Authorization": f"Bearer {api_key}"}
    )
  5. Test before production deployment

Rollback Plan

If issues occur, v1.1.0 remains available. No data loss on downgrade.


Support & Community


Architecture Overview

[Your Documents]
       |
       v
[File Upload Endpoint] ---> [File Parser] ---> [Store Manager]
       |                         |                    |
       +---- (REST API) --------+---- (SQLite DB)----+
       |
[Search Endpoint] ---> [Semantic Search] ---> [Gemini API]
       |                   |
       +--- (Cache) -------+
       |
[Prometheus Metrics] <--- [Audit Log]

Performance Metrics

Recent v1.2.0 benchmark (Docker on M1 Mac):

Health Check:           8ms
Search (cache hit):     9ms
Search (cache miss):    1250ms
Batch Search (10 queries, parallel): 2500ms
Upload (50MB file):     3200ms

License

FLAMEHAVEN FileSearch is released under the MIT License. See LICENSE file for details.


Acknowledgments

Built with:

  • FastAPI - Modern Python web framework
  • Google Gemini API - Semantic understanding
  • SQLite - Lightweight database
  • Redis (optional) - Distributed caching

Questions? Open an issue or email [email protected]

Last Updated: November 16, 2025 (v1.2.0)

About

Open-source semantic document search (RAG) engine with FastAPI and instant self-hosted deployment

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •  

Languages