AI-Powered GitHub Issue Intelligence.
Automatically detect duplicate issues, find similar issues with semantic search, and intelligently route issues across repositories.
- Semantic Duplicate Detection — Find related issues using AI-powered embeddings, not just keyword matching.
- Cross-Repository Search — Search for similar issues across your organization.
- Intelligent Routing — Automatically transfer issues to the correct repository based on content.
- Smart Triage — AI-powered labeling and quality assessment.
- Modular Pipeline — Customize workflows with plug-and-play steps.
- Multi-Repo Support — Central configuration with per-repo overrides.
Simili uses a "Lego with Blueprints" architecture:
- Lego Blocks: Independent, reusable pipeline steps (Gatekeeper, Similarity, Triage, etc.).
- Blueprints: Pre-defined workflows for common use cases.
- State Branch: Git-based state management using an orphan branch (no comment scanning).
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Gatekeeper │───▶│ Similarity │───▶│ Triage │───▶│ Action │
│ Check │ │ Search │ │ Analysis │ │ Executor │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
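The step chain in the diagram can be pictured as a list of small functions that share one context object. The following is a minimal sketch of that idea, not Simili's actual implementation: the `Context` shape, the step bodies, and the rule that the gatekeeper halts on an empty body are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Context:
    """Shared state handed from one step to the next (hypothetical shape)."""
    issue: Dict[str, str]
    results: Dict[str, object] = field(default_factory=dict)
    halted: bool = False


Step = Callable[[Context], None]


def gatekeeper(ctx: Context) -> None:
    # Hypothetical rule: stop the pipeline for issues with an empty body.
    if not ctx.issue.get("body", "").strip():
        ctx.halted = True
        ctx.results["gatekeeper"] = "skipped: empty body"
    else:
        ctx.results["gatekeeper"] = "ok"


def similarity(ctx: Context) -> None:
    ctx.results["similarity"] = []  # placeholder for similar-issue matches


def triage(ctx: Context) -> None:
    ctx.results["triage"] = {"labels": []}  # placeholder for triage output


def action(ctx: Context) -> None:
    ctx.results["action"] = "none"  # placeholder for the executed action


def run_pipeline(steps: List[Step], ctx: Context) -> Context:
    """Run steps in order; stop as soon as a step marks the context halted."""
    for step in steps:
        if ctx.halted:
            break
        step(ctx)
    return ctx
```

Because each step only reads and writes the context, a blueprint in this sketch is just a different list passed to `run_pipeline`.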
Simili-Bot supports both Single-Repository and Organization-wide setups.
| Guide | Description |
|---|---|
| Single Repo Setup | Instructions for setting up Simili-Bot on a standalone repository. |
| Organization Setup | Best practices for deploying across an organization using Reusable Workflows. |
Simili supports both Gemini and OpenAI.
- Set at least one key: `GEMINI_API_KEY` or `OPENAI_API_KEY`.
- If both keys are set, Simili uses Gemini by default (Gemini takes precedence).
- If only one key is set, Simili uses that provider.
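The selection rules above can be sketched as a small function. This is an illustration of the documented precedence only; the function name `select_provider` and the error raised when no key is set are assumptions, not part of Simili.

```python
import os
from typing import Mapping, Optional


def select_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a provider from API keys, with Gemini taking precedence."""
    env = os.environ if env is None else env
    if env.get("GEMINI_API_KEY"):
        return "gemini"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Assumption: a missing key is treated as a configuration error.
    raise RuntimeError("Set GEMINI_API_KEY or OPENAI_API_KEY")
```

Passing a plain dictionary instead of the real environment makes the rule easy to check, for example `select_provider({"OPENAI_API_KEY": "sk-example"})`.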
Default models:
- LLM: `gemini-2.0-flash-lite` (Gemini), `gpt-5.2` (OpenAI)
- Embeddings: `gemini-embedding-001` (Gemini), `text-embedding-3-small` (OpenAI)
If you override embedding.model, keep embedding.dimensions aligned with the model:
- `gemini-embedding-001` -> 3072
- `text-embedding-3-small` -> 1536
- `text-embedding-3-large` -> 3072
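A quick pre-flight check can catch a model and dimension mismatch before indexing starts. The sketch below uses only the three pairs listed above. The helper name `dimension_mismatch` is hypothetical, and skipping unknown models is a design choice of this example.

```python
from typing import Optional

# Model-to-dimension pairs taken from the list above.
EXPECTED_DIMENSIONS = {
    "gemini-embedding-001": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def dimension_mismatch(model: str, dimensions: int) -> Optional[str]:
    """Return an error message if dimensions do not match the model, else None."""
    expected = EXPECTED_DIMENSIONS.get(model)
    if expected is None:
        return None  # Unknown model: nothing to compare against.
    if dimensions != expected:
        return f"{model} expects {expected} dimensions, got {dimensions}"
    return None
```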
We provide copy-pasteable examples to get you started quickly:
- Multi-Repo Examples: Includes shared workflow, caller workflow, and central config.
- Single-Repo Examples: Standard workflow and configuration.
You can specify a workflow in your simili.yaml or define custom steps.
| Preset | Description |
|---|---|
| issue-triage | Full pipeline: similarity search, duplicate check, triage analysis, and action execution. |
| similarity-only | Runs similarity search only. Useful for "Find Similar Issues" features without auto-triage. |
| index-only | Indexes issues to the vector database without providing feedback. |
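One way to think about presets is as named lists of steps, with custom steps replacing the preset when given. The sketch below follows the preset descriptions in the table. The step identifiers and the rule that custom steps win over a preset are assumptions made for illustration.

```python
from typing import List, Optional

# Step names are hypothetical labels derived from the preset descriptions.
PRESETS = {
    "issue-triage": [
        "similarity_search",
        "duplicate_check",
        "triage_analysis",
        "action_execution",
    ],
    "similarity-only": ["similarity_search"],
    "index-only": ["index"],
}


def resolve_steps(workflow: str = "issue-triage",
                  custom_steps: Optional[List[str]] = None) -> List[str]:
    """Return the step list for a preset, or the custom steps if provided."""
    if custom_steps:
        return list(custom_steps)  # Assumption: custom steps override presets.
    if workflow not in PRESETS:
        raise ValueError(f"unknown workflow preset: {workflow}")
    return list(PRESETS[workflow])
```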
Simili provides a powerful CLI for local development, testing, and batch operations.
Bulk index issues from a GitHub repository into the vector database.
simili index --repo owner/repo --workers 5 --limit 100
Flags:
- `--repo` (required): Target repository (owner/name)
- `--workers`: Number of concurrent workers (default: 5)
- `--since`: Start from issue number or timestamp
- `--limit`: Maximum issues to index
- `--dry-run`: Simulate without writing to database
Process a single issue through the pipeline.
simili process --issue issue.json --workflow issue-triage --dry-run
Flags:
- `--issue`: Path to issue JSON file
- `--workflow`: Workflow preset to run (default: "issue-triage")
- `--dry-run`: Run without side effects
- `--repo`, `--org`, `--number`: Override issue fields
Process multiple issues from a JSON file in batch mode. All operations run in dry-run mode to prevent GitHub writes.
simili batch --file issues.json --format csv --out-file results.csv --workers 5
Use Cases:
- Test bot logic on historical data without spamming repositories
- Generate reports showing similarity analysis and duplicate detection
- Analyze issues from repositories where you lack write access
- Bulk identify transfer recommendations and quality scores
Flags:
- `--file` (required): Path to JSON file with array of issues
- `--out-file`: Output file path (stdout if not specified)
- `--format`: Output format: `json` or `csv` (default: `json`)
- `--workers`: Number of concurrent workers (default: 1)
- `--workflow`: Workflow preset (default: "issue-triage")
- `--collection`: Override Qdrant collection name
- `--threshold`: Override similarity threshold
- `--duplicate-threshold`: Override duplicate confidence threshold
- `--top-k`: Override max similar issues to show
Input Format:
Create a JSON file with an array of issues:
[
  {
    "org": "owner",
    "repo": "repo-name",
    "number": 123,
    "title": "Issue title",
    "body": "Issue description...",
    "state": "open",
    "labels": ["bug", "high-priority"],
    "author": "username",
    "created_at": "2026-02-10T10:00:00Z"
  }
]
Output Formats:
- JSON: Full pipeline results with detailed analysis
- CSV: Flattened summary for spreadsheet analysis
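When the test issues come from a script rather than a hand-written file, it helps to validate them against the documented input format before running `simili batch`. The sketch below checks for the nine fields shown in the example input. Whether the CLI requires every one of them is not stated here, so this helper is deliberately stricter than it may need to be, and the name `build_batch_json` is hypothetical.

```python
import json
from typing import Dict, List

# Field names taken from the example input; treating all as required is an assumption.
DOCUMENTED_FIELDS = (
    "org", "repo", "number", "title", "body",
    "state", "labels", "author", "created_at",
)


def build_batch_json(issues: List[Dict[str, object]]) -> str:
    """Validate issues against the documented fields and return a JSON array."""
    for index, issue in enumerate(issues):
        missing = [name for name in DOCUMENTED_FIELDS if name not in issue]
        if missing:
            raise ValueError(f"issue at index {index} is missing fields: {missing}")
    return json.dumps(issues, indent=2)
```

The returned string can be written to a file such as `batch.json` and passed to `--file`.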
Example Workflow:
# 1. Index repository issues
simili index --repo ballerina-platform/ballerina-library --workers 10
# 2. Prepare test issues in batch.json
# 3. Run batch analysis
simili batch --file batch.json --format csv --out-file analysis.csv --workers 5
# 4. Review results
cat analysis.csv
Minimal .github/simili.yaml example:
qdrant:
  url: "${QDRANT_URL}"
  api_key: "${QDRANT_API_KEY}"
  collection: "my-issues"
embedding:
  provider: "gemini"
  api_key: "${GEMINI_API_KEY}"
  model: "gemini-embedding-001"
llm:
  provider: "gemini"
  api_key: "${GEMINI_API_KEY}"
  model: "gemini-2.5-flash"
  # temperature: 0.3
defaults:
  similarity_threshold: 0.65
  max_similar_to_show: 5
Notes:
- `llm.model` defaults to `gemini-2.5-flash` when omitted.
- `llm.api_key` can be omitted if `GEMINI_API_KEY` is set.
- You can override the model at runtime with `LLM_MODEL`.
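To make the two `defaults` values in the example config concrete, here is a rough sketch of how a similarity threshold and a display cap are commonly applied to search results. It assumes scores where higher means more similar and an inclusive threshold. Simili's actual scoring and filtering may differ, and `select_similar` is a hypothetical name.

```python
from typing import List, Tuple


def select_similar(candidates: List[Tuple[int, float]],
                   similarity_threshold: float = 0.65,
                   max_similar_to_show: int = 5) -> List[Tuple[int, float]]:
    """Keep (issue_number, score) pairs at or above the threshold, best first."""
    kept = [pair for pair in candidates if pair[1] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_similar_to_show]
```

Raising `similarity_threshold` trades recall for precision, while `max_similar_to_show` only limits how many of the surviving matches are listed.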
Scan all open issues labelled potential-duplicate and close those whose grace period has expired with no human activity. Closed issues are relabelled from potential-duplicate → duplicate.
simili auto-close --repo owner/repo --grace-period-minutes 60
Flags:
- `--repo` (required): Target repository (owner/name); falls back to the `GITHUB_REPOSITORY` env var
- `--grace-period-minutes`: Override the grace period in minutes for this run (see precedence below)
- `--dry-run`: Print what would be closed without making any changes
- `--config`: Path to `simili.yaml` (auto-discovered if omitted)
Grace period precedence (highest → lowest):
| Source | How to set |
|---|---|
| `--grace-period-minutes` CLI flag | Pass at runtime; overrides everything |
| `auto_close.grace_period_hours` in `simili.yaml` | Persistent per-repo config |
| Built-in default | 72 hours (3 days) |
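Note that the CLI flag is expressed in minutes while the config key is in hours. The precedence table can be sketched as a single resolver that normalizes both to a duration. The function and parameter names are hypothetical.

```python
from datetime import timedelta
from typing import Optional

DEFAULT_GRACE_HOURS = 72  # Built-in default from the table above.


def resolve_grace_period(cli_minutes: Optional[int] = None,
                         config_hours: Optional[int] = None) -> timedelta:
    """Apply the precedence: CLI flag, then simili.yaml, then the default."""
    if cli_minutes is not None:
        return timedelta(minutes=cli_minutes)
    if config_hours is not None:
        return timedelta(hours=config_hours)
    return timedelta(hours=DEFAULT_GRACE_HOURS)
```

For example, `resolve_grace_period(cli_minutes=60, config_hours=48)` yields one hour, because the flag wins over the config value.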
simili.yaml configuration:
auto_close:
  grace_period_hours: 48 # default: 72
  dry_run: false
Human activity signals (any of these prevents auto-close):
- A negative reaction (👎 or 😕) on the bot's triage comment by a non-bot user.
- The issue was reopened by a human after the `potential-duplicate` label was applied.
- A non-bot comment posted after the label was applied.
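Taken together, the close decision combines the grace period with the three signals above. The sketch below is one reading of those rules, under stated assumptions: the grace period is measured from the moment the label was applied, expiry is inclusive, and the `IssueActivity` record is a hypothetical stand-in for data fetched from GitHub.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IssueActivity:
    """Hypothetical summary of what happened after the label was applied."""
    labeled_at: datetime
    negative_reaction_by_human: bool = False
    reopened_by_human_after_label: bool = False
    human_comment_after_label: bool = False


def should_auto_close(activity: IssueActivity, now: datetime,
                      grace_period: timedelta = timedelta(hours=72)) -> bool:
    """Close only if the grace period has expired and no human signal exists."""
    # Assumption: the grace period starts when the label was applied.
    if now - activity.labeled_at < grace_period:
        return False
    has_human_activity = (
        activity.negative_reaction_by_human
        or activity.reopened_by_human_after_label
        or activity.human_comment_after_label
    )
    return not has_human_activity
```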
GitHub Actions usage — the auto-close.yml workflow runs daily at 10:00 UTC and can be triggered manually via workflow_dispatch with an optional grace_period_minutes input:
# Trigger from GitHub UI or gh CLI:
gh workflow run auto-close.yml -f grace_period_minutes=60 -f dry_run=false
Leaving grace_period_minutes empty uses the value from simili.yaml (or the 72 h default).
# Clone the repository
git clone https://github.com/similigh/simili-bot.git
cd simili-bot
# Build
go build ./...
# Run tests
go test ./...
# Lint
go vet ./...
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Made by the Simili Team
