Commit 7e58573

Merge pull request #48 from mohit-sheth/rag

add BYOK RAG for BugZooka

2 parents: 3d4e942 + 7ee4a1f
17 files changed: +344 additions, -18 deletions

.dockerignore (1 addition, 1 deletion)

```diff
@@ -72,4 +72,4 @@ logs/
 *.tmp
 *.temp
 .tmp/
-.temp/
+.temp/
```

Dockerfile (1 addition)

```diff
@@ -50,6 +50,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     curl \
     jq \
     ca-certificates \
+    vim \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
```

Makefile (14 additions, 2 deletions)

```diff
@@ -1,4 +1,4 @@
-.PHONY: help install dev-install test lint format check clean
+.PHONY: help install dev-install test lint format check clean deploy
 
 help: ## Show this help
 	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-20s\033[0m %s\n", $$1, $$2}'
@@ -31,7 +31,7 @@ clean: ## Clean cache and temporary files
 	rm -rf .ruff_cache/
 	rm -rf .mypy_cache/
 
-run: ## Run BugZooka (requires --product and --ci arguments)
+run: ## Run BugZooka (if RAG_IMAGE set in env/.env, apply sidecar overlay first)
 	PYTHONPATH=. python bugzooka/entrypoint.py $(ARGS)
 
 podman-build: ## Build podman image
@@ -46,3 +46,15 @@ podman-run: ## Run podman container
 		-e GEMINI_VERIFY_SSL=false \
 		-v ./.env:/app/.env:Z \
 		bugzooka:latest
+
+deploy: ## Deploy to OpenShift (uses overlays/rag if RAG_IMAGE is set in .env)
+	@set -a; \
+	if [ -f .env ]; then . ./.env; fi; \
+	set +a; \
+	if [ -n "$$RAG_IMAGE" ]; then \
+		echo "Deploying with RAG overlay (RAG_IMAGE=$$RAG_IMAGE)"; \
+		kustomize build --load-restrictor=LoadRestrictionsNone ./kustomize/overlays/rag | envsubst | oc apply -f -; \
+	else \
+		echo "Deploying base kustomize (no RAG_IMAGE)"; \
+		kustomize build --load-restrictor=LoadRestrictionsNone ./kustomize/base | envsubst | oc apply -f -; \
+	fi
```
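The branch in the new `deploy` target can be sketched in Python to make the selection rule explicit. `choose_kustomize_path` is a hypothetical helper for illustration only, not part of the repository:

```python
def choose_kustomize_path(env: dict) -> str:
    """Mirror the deploy target's branch: use the RAG overlay only when
    RAG_IMAGE is set to a non-empty value, otherwise the base kustomization."""
    if env.get("RAG_IMAGE"):
        return "./kustomize/overlays/rag"
    return "./kustomize/base"

# The Makefile sources .env with `set -a` so RAG_IMAGE lands in the process
# environment; a plain dict keeps this sketch self-contained.
print(choose_kustomize_path({"RAG_IMAGE": "quay.io/example/rag:latest"}))  # ./kustomize/overlays/rag
print(choose_kustomize_path({}))                                           # ./kustomize/base
```

Note that an empty `RAG_IMAGE=` in `.env` also falls through to the base path, matching the `-n "$$RAG_IMAGE"` test.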

README.md (48 additions)

````diff
@@ -144,6 +144,54 @@ Along with secrets, prompts are configurable using a `prompts.json` in the root
 }
 ```
+
+### **Historical Failure Summary (summarize)**
+
+- What it does:
+  - Scans channel history within the specified lookback window
+  - Counts total jobs and failures, groups failures by type
+  - Optionally breaks down by OpenShift version and includes representative messages
+
+- How to run:
+  - Ensure BugZooka is running
+  - In Slack:
+    - `summarize 20m`
+    - `summarize 7d verbose`
+
+- Behavior:
+  - All summary output is threaded under the triggering `summarize` message to avoid channel noise
+  - Large sections are chunked to fit Slack limits
+
+- Notes:
+  - Only CI job notifications that clearly indicate a failure are included
+  - No persistent state; summaries read from channel history at request time
+
+### **RAG-Augmented Analysis (Optional)**
+BugZooka can optionally enrich its "Implications to understand" output with Retrieval-Augmented Generation (RAG) context when a local vector store is available.
+
+- What it does:
+  - Detects RAG data under `RAG_DB_PATH` (default: `/rag`).
+  - Retrieves top-k relevant chunks via the local FAISS index.
+  - Uses `RAG_AWARE_PROMPT` to ask the inference API for context-aware insights.
+  - Appends a "RAG-Informed Insights" section beneath the standard implications.
+
+- Enable via deployment overlay:
+  - Build your BYOK RAG image following the BYOK tooling HOWTO and set it as `RAG_IMAGE` in your `.env`:
+    - [BYOK Tooling HOWTO](https://github.com/openshift/lightspeed-rag-content/tree/main/byok#byok-tooling-howto)
+  - Run `make deploy`. The Makefile will apply the RAG overlay and mount a shared volume at `/rag`.
+  - Note: the BYOK image is intended to run as an initContainer that prepares the vector store; the overlay in this repository runs it as a sidecar, and both patterns work for preparing/serving `/rag`.
+  - For local testing without a cluster, place your RAG content under `/rag`; BugZooka will auto-detect it.
+
+- Behavior and fallback:
+  - If no RAG artifacts are detected, analysis proceeds unchanged.
+
+- Files of interest:
+  - `bugzooka/integrations/rag_client_util.py`: retrieves top-k chunks from FAISS
+  - `bugzooka/analysis/prompts.py`: `RAG_AWARE_PROMPT`
+  - `bugzooka/integrations/slack_fetcher.py`: integrates RAG into implications when available
+  - `kustomize/overlays/rag/*`: RAG sidecar overlay and volume wiring
+
+
 ### **MCP Servers**
 MCP servers can be integrated by adding a simple configuration in `mcp_config.json` file in the root directory.
````
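The auto-detection the README describes checks for JSON artifacts under `RAG_DB_PATH`. A minimal standalone sketch of that check (mirroring `_is_rag_enabled` in the `slack_fetcher.py` diff), exercised against a throwaway directory instead of the real `/rag` mount:

```python
import os
import tempfile

def rag_data_present(rag_dir: str) -> bool:
    """RAG is considered enabled when the directory exists and contains
    at least one JSON index/store artifact."""
    if not os.path.isdir(rag_dir):
        return False
    return any(f.name.endswith(".json") for f in os.scandir(rag_dir))

with tempfile.TemporaryDirectory() as d:
    print(rag_data_present(d))  # False: directory exists but is empty
    open(os.path.join(d, "index_store.json"), "w").close()
    print(rag_data_present(d))  # True: a JSON artifact now exists
```

A missing directory and an empty directory both report `False`, which is why analysis proceeds unchanged when no RAG volume is mounted.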

bugzooka/analysis/prompts.py (14 additions)

```diff
@@ -37,3 +37,17 @@
     "- <Code fixes or configuration updates>\n"
     "- <Relevant logs, metrics, or monitoring tools>",
 }
+
+RAG_AWARE_PROMPT = {
+    "system": "You are an AI assistant specializing in analyzing logs to detect failures. "
+    "When provided with additional contextual knowledge (from RAG), use it to refine your analysis "
+    "and improve accuracy of diagnostics.",
+    "user": (
+        "You have access to external knowledge retrieved from a vector store (RAG). "
+        "Use this RAG context to better interpret the following log data.\n\n"
+        "RAG Context:\n{rag_context}\n\n"
+        "Log Data:\n{error_list}\n\n"
+        "Using both, detect anomalies, identify key failures, and summarize the most critical issues."
+    ),
+    "assistant": "Here is a context-aware analysis of the most relevant failures:",
+}
```
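`RAG_AWARE_PROMPT` is consumed by formatting the `user` template and assembling a three-message chat payload, as the `slack_fetcher.py` diff does inline. A self-contained sketch with an abridged copy of the prompt (the real dict lives in `bugzooka/analysis/prompts.py`):

```python
# Abridged copy for a self-contained demo; field names match the real dict.
RAG_AWARE_PROMPT = {
    "system": "You are an AI assistant specializing in analyzing logs to detect failures.",
    "user": (
        "RAG Context:\n{rag_context}\n\n"
        "Log Data:\n{error_list}\n\n"
        "Using both, detect anomalies, identify key failures, and summarize the most critical issues."
    ),
    "assistant": "Here is a context-aware analysis of the most relevant failures:",
}

def build_rag_messages(rag_context: str, error_list: str) -> list:
    """Fill the user template and return the chat-style message list."""
    return [
        {"role": "system", "content": RAG_AWARE_PROMPT["system"]},
        {"role": "user", "content": RAG_AWARE_PROMPT["user"].format(
            rag_context=rag_context, error_list=error_list)},
        {"role": "assistant", "content": RAG_AWARE_PROMPT["assistant"]},
    ]

msgs = build_rag_messages("chunk text here", "ERROR: pod CrashLoopBackOff")
print([m["role"] for m in msgs])  # ['system', 'user', 'assistant']
```

The trailing assistant message primes the model's reply style; the retrieved chunks and the raw error summary travel together in the single user turn.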
bugzooka/integrations/rag_client_util.py (new file, 58 additions; filename per the README and the import in slack_fetcher.py)

```diff
@@ -0,0 +1,58 @@
+import os
+from typing import Optional
+
+from dotenv import load_dotenv
+from llama_index.core import Settings, load_index_from_storage
+from llama_index.core.llms.utils import resolve_llm
+from llama_index.core.storage.storage_context import StorageContext
+from llama_index.embeddings.huggingface import HuggingFaceEmbedding
+from llama_index.vector_stores.faiss import FaissVectorStore
+
+# Fix cache permission issue for non-root containers
+os.environ.setdefault("HF_HOME", "/tmp/.cache")
+os.environ.setdefault("TRANSFORMERS_CACHE", "/tmp/.cache")
+os.environ.setdefault("LLAMA_INDEX_CACHE_DIR", "/tmp/.cache")
+os.makedirs("/tmp/.cache", exist_ok=True)
+
+
+def get_rag_context(query: str, top_k: Optional[int] = None) -> str:
+    """Return concatenated top-k chunks from the local FAISS store for a query.
+
+    Reads configuration from environment variables and optional .env files.
+    """
+    # Load env without overriding already-set variables
+    load_dotenv(dotenv_path=".env", override=False)
+    load_dotenv(dotenv_path="/app/.env", override=False)
+
+    db_path = os.getenv("RAG_DB_PATH", "/rag")
+    index_id = os.getenv("RAG_PRODUCT_INDEX", "vector_db_index")
+    embed_model_path = os.getenv(
+        "EMBEDDING_MODEL_PATH", "sentence-transformers/all-mpnet-base-v2"
+    )
+    k = int(os.getenv("RAG_TOP_K", str(top_k if top_k is not None else 5)))
+
+    os.environ.setdefault("TRANSFORMERS_CACHE", embed_model_path)
+    os.environ.setdefault("TRANSFORMERS_OFFLINE", "0")
+
+    Settings.embed_model = HuggingFaceEmbedding(model_name=embed_model_path)
+    Settings.llm = resolve_llm(None)
+
+    storage_context = StorageContext.from_defaults(
+        vector_store=FaissVectorStore.from_persist_dir(db_path), persist_dir=db_path
+    )
+    vector_index = load_index_from_storage(
+        storage_context=storage_context, index_id=index_id
+    )
+
+    retriever = vector_index.as_retriever(similarity_top_k=k)
+    nodes = retriever.retrieve(query)
+
+    seen_texts = set()
+    formatted_chunks = []
+    for i, node in enumerate(nodes, 1):
+        text = node.get_text().strip()
+        if text not in seen_texts:
+            seen_texts.add(text)
+            formatted_chunks.append(f"--- Chunk {i} ---\n{text}\n")
+
+    return "\n".join(formatted_chunks)
```
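The retrieval itself needs llama-index and a persisted FAISS store, but the final de-duplication/formatting loop is easy to exercise in isolation. One quirk worth noting: chunk labels come from the enumeration index, so skipping a duplicate leaves a gap in the numbering. A standalone copy of that loop (`format_chunks` is a name introduced here for illustration):

```python
def format_chunks(texts):
    """Same logic as the tail of get_rag_context: drop exact-duplicate
    chunks, label the kept ones by their original retrieval position."""
    seen, formatted = set(), []
    for i, text in enumerate(texts, 1):
        t = text.strip()
        if t not in seen:
            seen.add(t)
            formatted.append(f"--- Chunk {i} ---\n{t}\n")
    return "\n".join(formatted)

out = format_chunks(["alpha", "alpha", "beta"])
print(out)  # keeps "Chunk 1" and "Chunk 3"; the duplicate at position 2 is skipped
```

Preserving the original position in the label keeps the numbering traceable back to retrieval rank, at the cost of non-contiguous labels.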

bugzooka/integrations/slack_fetcher.py (53 additions, 3 deletions)

```diff
@@ -3,6 +3,7 @@
 import sys
 import time
 import re
+import os
 
 from slack_sdk import WebClient
 from slack_sdk.errors import SlackApiError
@@ -25,11 +26,13 @@
     classify_failure_type,
     build_summary_sections,
 )
-from bugzooka.core.utils import extract_job_details
+from bugzooka.analysis.prompts import RAG_AWARE_PROMPT
 from bugzooka.integrations.inference import (
     InferenceAPIUnavailableError,
     AgentAnalysisLimitExceededError,
+    ask_inference_api,
 )
+from bugzooka.integrations.rag_client_util import get_rag_context
 from bugzooka.core.utils import (
     to_job_history_url,
     fetch_job_history_stats,
@@ -164,6 +167,7 @@ def _chunk_text(self, text: str, limit: int = 11900) -> List[str]:
             start = split_at
 
         return chunks
+
     def _handle_job_history(
         self,
         thread_ts: str,
@@ -487,6 +491,14 @@ def _summarize_messages_in_range(
             version_type_messages,
         )
 
+    def _is_rag_enabled(self) -> bool:
+        """Check if RAG data exists under /rag."""
+        rag_dir = os.getenv("RAG_DB_PATH", "/rag")
+        if not os.path.isdir(rag_dir):
+            return False
+        # Check for expected RAG artifacts (JSON index/store files)
+        return any(f.name.endswith(".json") for f in os.scandir(rag_dir))
+
     def _process_message(
         self, msg, product, ci_system, product_config, enable_inference
     ):
@@ -556,8 +568,46 @@ def _process_message(
                 error_summary, product, product_config
             )
 
-            # Send final analysis
-            self._send_analysis_result(analysis_response, ts)
+            # Optionally augment with RAG-aware prompt when RAG_IMAGE is set
+            combined_response = analysis_response
+            try:
+                if self._is_rag_enabled():
+                    self.logger.info(
+                        "RAG data detected — augmenting analysis with RAG context."
+                    )
+                    rag_top_k = int(os.getenv("RAG_TOP_K", "3"))
+                    rag_query = f"Provide context relevant to the following errors:\n{error_summary}"
+                    rag_context = get_rag_context(rag_query, top_k=rag_top_k)
+                    if rag_context:
+                        rag_user = RAG_AWARE_PROMPT["user"].format(
+                            rag_context=rag_context,
+                            error_list=error_summary,
+                        )
+                        rag_messages = [
+                            {"role": "system", "content": RAG_AWARE_PROMPT["system"]},
+                            {"role": "user", "content": rag_user},
+                            {
+                                "role": "assistant",
+                                "content": RAG_AWARE_PROMPT["assistant"],
+                            },
+                        ]
+                        rag_resp = ask_inference_api(
+                            messages=rag_messages,
+                            url=product_config["endpoint"][product],
+                            api_token=product_config["token"][product],
+                            model=product_config["model"][product],
+                        )
+                        combined_response = (
+                            f"{analysis_response}\n\n"
+                            f"💡 **RAG-Informed Insights:**\n{rag_resp}"
+                        )
+                else:
+                    self.logger.info("No RAG data found — skipping RAG augmentation.")
+            except Exception as e:
+                self.logger.warning("RAG augmentation failed/skipped: %s", e)
+
+            # Send final analysis (possibly augmented)
+            self._send_analysis_result(combined_response, ts)
 
         except InferenceAPIUnavailableError as e:
             self.logger.warning(
```
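The augmentation path above always falls back to the base analysis: `combined_response` starts as `analysis_response` and is only replaced when the RAG call succeeds. That assembly can be sketched as a standalone helper (`combine_with_rag` is hypothetical; the diff does this inline):

```python
from typing import Optional

def combine_with_rag(analysis_response: str, rag_resp: Optional[str]) -> str:
    """Append the RAG section only when augmentation produced a response;
    any failure or empty result in the RAG path leaves the base analysis untouched."""
    if not rag_resp:
        return analysis_response
    return f"{analysis_response}\n\n💡 **RAG-Informed Insights:**\n{rag_resp}"

print(combine_with_rag("Base analysis.", None))
print(combine_with_rag("Base analysis.", "Context suggests a known flake."))
```

Wrapping the whole RAG branch in `try/except` plus this fallback means a broken or missing vector store degrades to the pre-RAG behavior rather than blocking the Slack reply.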
Kustomize deployment manifest (filename not captured in this view; 4 additions, 4 deletions)

```diff
@@ -51,11 +51,11 @@ spec:
             subPath: prompts.json
         resources:
           requests:
-            cpu: "100m"
-            memory: "128Mi"
+            cpu: "1"
+            memory: "1Gi"
          limits:
-            cpu: "500m"
-            memory: "512Mi"
+            cpu: "2"
+            memory: "2Gi"
      volumes:
      - name: prompts
        configMap:
```
