refactor: modularize agent implementation #5
base: main
Conversation
> **Note**: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
>
> **Warning**: Rate limit exceeded. @andrewm4894 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 24 minutes and 6 seconds before requesting another review.

📒 Files selected for processing (5)
**Walkthrough**

Refactors anomaly detection into a class-based API using a reusable LangGraph. Adds new models, nodes, tools, and graph modules. Updates package exports to source models from a new module. Expands the Makefile with lint/format targets. Refreshes documentation with module overviews and formatting fixes. Tests updated to import from the package root.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant AA as AnomalyAgent
    participant G as Graph (StateGraph)
    participant D as Detection Node
    participant V as Verification Node
    User->>AA: detect_anomalies(DataFrame)
    AA->>G: app.invoke({time_series, variable_name})
    G->>D: run detection chain
    D-->>G: detected_anomalies
    alt verify enabled and anomalies exist
        G->>V: run verification chain
        V-->>G: verified_anomalies
        G-->>AA: result (verified_anomalies)
    else
        G-->>AA: result (detected_anomalies)
    end
    AA-->>User: {column -> AnomalyList}
```
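The alt branch in the diagram reduces to a small routing decision; a minimal stand-alone sketch (the function and state keys here are illustrative, not the package's actual API):

```python
def route(state: dict) -> str:
    # Mirrors the diagram: verification runs only when it is enabled
    # and the detection step actually produced anomalies.
    if state.get("verify_enabled") and state.get("detected_anomalies"):
        return "verify"
    return "end"

print(route({"verify_enabled": True, "detected_anomalies": ["spike"]}))   # verify
print(route({"verify_enabled": False, "detected_anomalies": ["spike"]}))  # end
```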
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
**Codecov Report** ❌ Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main       #5      +/-   ##
==========================================
+ Coverage   62.39%   63.56%    +1.16%
==========================================
  Files           6       10        +4
  Lines         234      247       +13
  Branches       39       36        -3
==========================================
+ Hits          146      157       +11
- Misses         82       84        +2
  Partials        6        6
```

☔ View full report in Codecov by Sentry.
Pull Request Overview
This PR refactors the anomaly detection agent from a monolithic implementation into a modular architecture, splitting functionality across separate modules for better maintainability and organization.
Key Changes:
- Split the main `agent.py` into specialized modules (`models`, `tools`, `nodes`, `graph`)
- Updated imports in test files to use the new module structure
- Added documentation for the modular layout and new Makefile helpers
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `tests/test_prompts.py` | Updated import path from `anomaly_agent.agent` to `anomaly_agent` |
| `tests/test_agent.py` | Updated import path from `anomaly_agent.agent` to `anomaly_agent` |
| `anomaly_agent/tools.py` | New module containing LLM chain creation utilities |
| `anomaly_agent/nodes.py` | New module with graph nodes for detection and verification |
| `anomaly_agent/models.py` | New module with Pydantic models extracted from `agent.py` |
| `anomaly_agent/graph.py` | New module with graph construction utilities |
| `anomaly_agent/agent.py` | Refactored to use new modular components, significantly simplified |
| `anomaly_agent/__init__.py` | Updated exports to import from new module locations |
| `README.md` | Added module overview documentation |
| `Makefile` | Added lint and format helper targets |
| `ARCHITECTURE.md` | Added module structure documentation |
```python
    )
    anomaly_description: str = Field(description="A description of the anomaly")

    @validator("timestamp")  # type: ignore
```
**Copilot AI** · Aug 26, 2025
The `@validator` decorator is deprecated in Pydantic v2. Consider using `@field_validator` instead for forward compatibility.
```diff
-@validator("timestamp")  # type: ignore
+@field_validator("timestamp")
```
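For context, here is a minimal self-contained sketch of the Pydantic v2 replacement (field names follow the PR's `Anomaly` model; the exact model body is illustrative):

```python
from pydantic import BaseModel, Field, field_validator


class Anomaly(BaseModel):
    timestamp: str = Field(description="Timestamp of the anomaly")
    variable_value: float

    # Pydantic v2 style: field_validator replaces the deprecated validator;
    # mode="before" runs the check prior to type coercion.
    @field_validator("variable_value", mode="before")
    @classmethod
    def validate_variable_value(cls, v: float) -> float:
        if not isinstance(v, (int, float)):
            raise ValueError("variable_value must be a number")
        return float(v)


a = Anomaly(timestamp="2024-01-01", variable_value=3)
print(a.variable_value)  # 3.0
```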
```python
    @validator("variable_value")  # type: ignore
    def validate_variable_value(cls, v: float) -> float:
        if not isinstance(v, (int, float)):
            raise ValueError("variable_value must be a number")
        return float(v)

    @validator("anomaly_description")  # type: ignore
```
**Copilot AI** · Aug 26, 2025
The `@validator` decorator is deprecated in Pydantic v2. Consider using `@field_validator` instead for forward compatibility.
```diff
-@validator("variable_value")  # type: ignore
+@field_validator("variable_value")
 def validate_variable_value(cls, v: float) -> float:
     if not isinstance(v, (int, float)):
         raise ValueError("variable_value must be a number")
     return float(v)
-@validator("anomaly_description")  # type: ignore
+@field_validator("anomaly_description")
```
```python
    @validator("variable_value")  # type: ignore
    def validate_variable_value(cls, v: float) -> float:
        if not isinstance(v, (int, float)):
            raise ValueError("variable_value must be a number")
        return float(v)

    @validator("anomaly_description")  # type: ignore
```
**Copilot AI** · Aug 26, 2025
The `@validator` decorator is deprecated in Pydantic v2. Consider using `@field_validator` instead for forward compatibility.
```diff
-@validator("variable_value")  # type: ignore
+@field_validator("variable_value", mode="before")
 def validate_variable_value(cls, v: float) -> float:
     if not isinstance(v, (int, float)):
         raise ValueError("variable_value must be a number")
     return float(v)
-@validator("anomaly_description")  # type: ignore
+@field_validator("anomaly_description", mode="before")
```
```python
    anomalies: List[Anomaly] = Field(description="The list of anomalies")

    @validator("anomalies")  # type: ignore
```
**Copilot AI** · Aug 26, 2025
The `@validator` decorator is deprecated in Pydantic v2. Consider using `@field_validator` instead for forward compatibility.
```diff
-@validator("anomalies")  # type: ignore
+@field_validator("anomalies")  # type: ignore
```
Actionable comments posted: 10
> **Caution**: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
tests/test_agent.py (1)
**102-114: Tests currently hit live LLMs and fail without OPENAI_API_KEY; mock LLM/graph calls**

Running `make tests` fails with "The api_key client option must be set..." because `AnomalyAgent()` instantiates `ChatOpenAI` even if we don't invoke the model. Per repo guidelines, tests should mock LLM calls. Add an autouse pytest-mock fixture (preferably in `tests/conftest.py`) to stub `ChatOpenAI` so no network or API key is required. The stub should also return timestamps present in the provided time series to satisfy timestamp assertions in irregular/sub-second tests.

Add this new file:
```python
# tests/conftest.py
import re

import pytest

from anomaly_agent import Anomaly, AnomalyList


@pytest.fixture(autouse=True)
def _mock_llm(mocker):
    class _FakeRunnable:
        def invoke(self, inputs):
            ts_str = inputs.get("time_series", "")
            # Find first timestamp in the table string (supports sub-second)
            m = re.search(
                r"\d{4}-\d{2}-\d{2}(?:\s+\d{2}:\d{2}:\d{2}(?:\.\d{3,6})?)?",
                ts_str,
            )
            ts = m.group(0) if m else "2024-01-01 00:00:00"
            var = inputs.get("variable_name", "value")
            return AnomalyList(
                anomalies=[
                    Anomaly(
                        timestamp=ts,
                        variable_value=3.14,
                        anomaly_description=f"synthetic anomaly in {var}",
                    )
                ]
            )

    class _FakeLLM:
        def with_structured_output(self, _schema):
            return _FakeRunnable()

    # Prevent real client construction and network usage
    mocker.patch("anomaly_agent.agent.ChatOpenAI", return_value=_FakeLLM())
```

Optionally, to harden determinism of data generation in this module (random noise), seed numpy at module level:
```python
# at top of the module
np.random.seed(0)
```

If you want me to push a patch across both test files (and add conftest.py), say the word.
Also applies to: 116-127, 129-141, 143-174, 176-183, 185-195, 197-207, 209-216, 218-242, 244-258, 260-279, 281-297, 299-317, 319-340, 342-376, 383-397, 398-417, 419-438, 440-462, 464-487, 511-527
tests/test_prompts.py (1)
**122-136: Add an autouse fixture to mock ChatOpenAI in prompt-integration tests**

All of the prompt tests in tests/test_prompts.py (covering lines 122-136, 138-156, 158-176, 178-196, 198-220, 222-238, and 240-257) call `AnomalyAgent()`, which in turn constructs a real `ChatOpenAI` client in its constructor. Without a pytest fixture to stub out `ChatOpenAI`, these tests will attempt real API calls (and fail when `OPENAI_API_KEY` is unset).

Please add a `tests/conftest.py` with an automatic fixture that replaces the real LLM with a dummy stub. For example:

```python
# tests/conftest.py
import pytest

from anomaly_agent import agent


class DummyLLM:
    def __init__(self, *args, **kwargs):
        pass

    def __call__(self, *args, **kwargs):
        # return minimal valid response shape for your tests
        return {}


@pytest.fixture(autouse=True)
def mock_chat_llm(monkeypatch):
    """Prevent real ChatOpenAI instantiation in all tests."""
    monkeypatch.setattr(agent, "ChatOpenAI", DummyLLM)
```

- Ensure this fixture is applied globally (no need to import it in individual test files).
- After adding it, verify locally with no OPENAI_API_KEY:

```bash
unset OPENAI_API_KEY
pytest -q
```

This will guarantee that all prompt-integration tests use the dummy LLM and never hit the real OpenAI API.
anomaly_agent/models.py (1)
**56-66: Adopt Pydantic v2 field validator for list field and add docstring**

Ensures consistency with v2 and satisfies D102.

```diff
 class AnomalyList(BaseModel):
     """Represents a list of anomalies."""

     anomalies: List[Anomaly] = Field(description="The list of anomalies")

-    @validator("anomalies")  # type: ignore
-    def validate_anomalies(cls, v: List[Anomaly]) -> List[Anomaly]:
-        if not isinstance(v, list):
-            raise ValueError("anomalies must be a list")
-        return v
+    @field_validator("anomalies")
+    def validate_anomalies(cls, v: List[Anomaly]) -> List[Anomaly]:
+        """Ensure anomalies is a list of Anomaly instances."""
+        if not isinstance(v, list):
+            raise ValueError("anomalies must be a list")
+        return v
```

README.md (1)
**231-243: Enforce clear separation of unit vs. integration tests**

Our Makefile's default `make test` target (lines 30-33) currently runs all tests, including those that require an `OPENAI_API_KEY`, and none of the existing tests are marked with `@pytest.mark.integration`. As a result, `make test` will fail without an API key, and `pytest -m integration` won't actually pick up any tests.

Required changes:

- Update the Makefile so that `make test` excludes integration tests by default.
- Add a new `integration-tests` target (or update the existing `test` alias) to run only integration tests.
- Mark any tests relying on the OpenAI client or `OPENAI_API_KEY` with `@pytest.mark.integration`.
- Revise README.md to reflect:
  - `make test` runs unit tests only (no API key needed)
  - `make integration-tests` (or `uv run pytest -m integration`) requires `OPENAI_API_KEY`

Pinpointed locations:

- Makefile (around lines 30-33):

```diff
 tests:
-	uv run pytest tests/ -v --cov=anomaly_agent --cov-branch --cov-report=term-missing --cov-report=xml
+	uv run pytest tests/ -v --cov=anomaly_agent --cov-branch --cov-report=term-missing --cov-report=xml -m "not integration"

 test: tests

+integration-tests:
+	uv run pytest tests/ -v -m integration
```

- **tests/**: Prepend `@pytest.mark.integration` to any test functions or classes that require `OPENAI_API_KEY`.
- **README.md** (around lines 231-243):

```diff
-# Run all tests with coverage
-make test
+# Run all unit tests with coverage (no API key needed)
+make test

-# Run specific test categories
-uv run pytest tests/test_agent.py -v  # Core functionality
-uv run pytest tests/test_prompts.py -v  # Prompt system
-uv run pytest tests/test_graph_architecture.py -v  # Advanced architecture
+# Run specific unit-test categories
+uv run pytest tests/test_agent.py -v  # Core functionality
+uv run pytest tests/test_prompts.py -v  # Prompt system
+uv run pytest tests/test_graph_architecture.py -v  # Advanced architecture

-# Integration tests (requires OPENAI_API_KEY in .env)
-uv run pytest tests/ -m integration -v
+# Integration tests (requires OPENAI_API_KEY in .env)
+make integration-tests
```

This will ensure that `make test` never demands an API key, while preserving a clear path for running integration tests when a key is available.

anomaly_agent/agent.py (1)
**16-26: Expose Required Streaming and Parallel Anomaly Detection APIs**

We've confirmed that the following public methods are currently absent from `anomaly_agent/agent.py` and must be exposed to comply with our coding guidelines and preserve backward compatibility:

- `detect_anomalies_streaming`
- `detect_anomalies_parallel` (async)
- `detect_anomalies_streaming_async` (async)

Please add these methods immediately after the existing `detect_anomalies()` implementation in `anomaly_agent/agent.py`. Below is the minimal skeleton you can merge in; adjust imports and helper calls as needed:

```diff
 class AnomalyAgent:
     ...  # existing methods

     def detect_anomalies_streaming(
         self,
         df: pd.DataFrame,
         timestamp_col: Optional[str] = None,
         verify_anomalies: Optional[bool] = None,
     ):
         """Yield (column, step, AnomalyList) updates as they become available."""
         # ...implementation as per guidelines

+    async def detect_anomalies_parallel(
+        self,
+        df: pd.DataFrame,
+        timestamp_col: Optional[str] = None,
+        verify_anomalies: Optional[bool] = None,
+    ) -> Dict[str, AnomalyList]:
+        """Run per-column anomaly detection concurrently."""
+        # ...implementation as per guidelines
+
+    async def detect_anomalies_streaming_async(
+        self,
+        df: pd.DataFrame,
+        timestamp_col: Optional[str] = None,
+        verify_anomalies: Optional[bool] = None,
+    ):
+        """Async streaming variant yielding (column, step, AnomalyList)."""
+        # ...implementation as per guidelines
```

Bullet-point checklist for updates:

- Add `detect_anomalies_streaming` under the public API.
- Add `async def detect_anomalies_parallel`.
- Add `async def detect_anomalies_streaming_async`.
- Ensure imports (`asyncio`, `Dict`, `Any`, etc.) are present.
- Update `__all__` or other export lists to include the new methods if applicable.

These additions are critical to avoid breaking downstream code that depends on our streaming and parallel interfaces.
🧹 Nitpick comments (19)
Makefile (1)
**4-4: Add conventional phony targets and satisfy checkmake warnings**

Checkmake flags missing "all", "clean", and "test" phony targets. Declare them and provide minimal implementations. Also mark the already-present `test` alias as phony to avoid target/file name collisions.

Apply this diff:

```diff
-.PHONY: examples lint format
+.PHONY: examples lint format all clean test
+
+# Conventional aggregate/default targets
+all: lint tests
+
+# Clean build/test artifacts
+clean:
+	rm -rf build dist .pytest_cache .coverage coverage.xml *.egg-info
```

ARCHITECTURE.md (4)
**63-70: Module structure section is clear and matches the refactor**

The listed modules align with the package split (models, tools, nodes, graph, agent). Consider linking each bullet to the actual file path for quick navigation in GitHub.

**270-274: Quantified performance claim needs evidence or qualification**

"80% Performance Improvement" is a strong claim. Either add a brief note on methodology (dataset size, hardware, baseline) or soften the language (e.g., "significant improvement observed in internal benchmarks").

Proposed wording tweak:

```diff
-**⚡ 80% Performance Improvement**: Eliminates redundant graph creation
+**⚡ Significant Performance Improvement** (internal benchmarks): Eliminates redundant graph creation
```

**360-364: Phase status indicators should reflect current implementation**

Phase 2 is marked ✅ for Streaming & Parallel Processing. If async/streaming isn't in this PR, mark it as "Planned" or "In Progress" to avoid confusion.

```diff
-**Phase 2: Streaming & Parallel Processing** ✅
+**Phase 2: Streaming & Parallel Processing** (Planned)
```

**403-409: Minor grammar/flow polish in Summary section**

There's a small flow break between the bullet list and the following sentence. Join them with proper punctuation.

Apply this diff:

```diff
-This foundation enables robust anomaly detection while maintaining the flexibility to evolve with changing requirements and technological advances.
+This foundation enables robust anomaly detection while maintaining the flexibility to evolve with changing requirements and technological advances.
```

Note: If you intended a paragraph break, ensure there's an empty line above and remove any trailing spaces that might trigger linters.
README.md (5)
**70-77: Clarify module paths to the new public API modules**

List fully qualified paths to reduce ambiguity and align with the PR's "modular public API" intent.

```diff
-### 📦 Module Overview
-
-- `models.py` – Pydantic data models for anomalies
-- `tools.py` – LLM chains used by the graph nodes
-- `nodes.py` – Detection and verification node implementations
-- `graph.py` – Utilities for assembling the LangGraph state machine
-- `agent.py` – High-level interface for end users
+### 📦 Module Overview
+
+- `anomaly_agent/models.py` – Pydantic data models for anomalies
+- `anomaly_agent/tools.py` – LLM chains used by the graph nodes
+- `anomaly_agent/nodes.py` – Detection and verification node implementations
+- `anomaly_agent/graph.py` – Utilities for assembling the LangGraph state machine
+- `anomaly_agent/agent.py` – High-level interface for end users
```

**29-29: Minor punctuation/style consistency in bullets**

Maintain consistent punctuation in the feature list (all bullets end with a period or none do). Also resolves a LanguageTool warning.

```diff
-- 🔄 **Two-Stage Pipeline**: Detection and optional verification phases to reduce false positives
+- 🔄 **Two-Stage Pipeline**: Detection and optional verification phases to reduce false positives.
```

**141-146: Note that examples require an API key when they hit the model**

Users following examples that invoke the LLM may encounter "api_key must be set." Add a reminder and a non-LLM example command.

```diff
 # Try real-world sensor data scenario
 python examples/examples.py --example real-world --plot

 # Custom model and plotting
 python examples/examples.py --model gpt-4o-mini --example multiple --plot
+#
+# Note: Examples that call the LLM require OPENAI_API_KEY in your .env.
+# For offline plotting-only runs, pass --no-detect (if supported) or use datasets that skip LLM calls.
```

**204-209: Insert a blank line before the header for proper Markdown rendering**

Fixes a LanguageTool warning and ensures the header renders as a header, not inline text.

```diff
-### 🔬 Science & Engineering
+
+### 🔬 Science & Engineering
```

**295-299: Emphasize Pydantic v2 field validators to match coding guidelines**

The repo standard is Pydantic v2 with field validators. Update wording to avoid confusion with v1 "validator".

```diff
-- **✅ Pydantic Validation**: Type-safe data models throughout the pipeline
+- **✅ Pydantic v2 Validation**: Type-safe models with field validators throughout the pipeline
```

anomaly_agent/tools.py (1)
**1-11: Import/style layout looks good; keep Black happy by trimming extra blank lines**

Pre-commit flagged Black changes. One common cause is superfluous blank lines around imports. The following retains readability and satisfies Black.

```diff
 """Utility chains used by anomaly detection nodes."""
 from langchain_core.runnables import Runnable
 from langchain_openai import ChatOpenAI
-
 from .models import AnomalyList
 from .prompt import (
     DEFAULT_SYSTEM_PROMPT,
     DEFAULT_VERIFY_SYSTEM_PROMPT,
     get_detection_prompt,
     get_verification_prompt,
 )
```

anomaly_agent/nodes.py (3)
**20-35: Correct type hints for node factories to reflect partial state updates**

Node functions return partial updates (dict fragments), not full AgentState instances. Update the hints to reduce confusion and improve mypy accuracy.

```diff
 def create_detection_node(
     llm: ChatOpenAI, detection_prompt: str
-) -> Callable[[AgentState], AgentState]:
+) -> Callable[[AgentState], dict[str, Any]]:
@@
-    def detection_node(state: AgentState) -> AgentState:
+    def detection_node(state: AgentState) -> dict[str, Any]:
         result = chain.invoke(
             {
-                "time_series": state["time_series"],
-                "variable_name": state["variable_name"],
+                "time_series": state.time_series if hasattr(state, "time_series") else state["time_series"],
+                "variable_name": state.variable_name if hasattr(state, "variable_name") else state["variable_name"],
             }
         )
         return {"detected_anomalies": result, "current_step": "verify"}
```
**38-67: Mirror type hints and accessor pattern in verification node; guard for Pydantic or dict state**

Keeps compatibility whether LangGraph passes a dict or a Pydantic instance.

```diff
 def create_verification_node(
     llm: ChatOpenAI, verification_prompt: str
-) -> Callable[[AgentState], AgentState]:
+) -> Callable[[AgentState], dict[str, Any]]:
@@
-    def verification_node(state: AgentState) -> AgentState:
-        if state["detected_anomalies"] is None:
+    def verification_node(state: AgentState) -> dict[str, Any]:
+        anomalies = (
+            state.detected_anomalies
+            if hasattr(state, "detected_anomalies")
+            else state.get("detected_anomalies")  # type: ignore[union-attr]
+        )
+        if anomalies is None:
             return {"verified_anomalies": None, "current_step": "end"}
         detected_str = "\n".join(
             [
                 (
                     f"timestamp: {a.timestamp}, "
                     f"value: {a.variable_value}, "
                     f"Description: {a.anomaly_description}"
                 )
-                for a in state["detected_anomalies"].anomalies
+                for a in anomalies.anomalies
             ]
         )
-        result = chain.invoke(
+        ts = state.time_series if hasattr(state, "time_series") else state["time_series"]
+        var = state.variable_name if hasattr(state, "variable_name") else state["variable_name"]
+        result = chain.invoke(
             {
-                "time_series": state["time_series"],
-                "variable_name": state["variable_name"],
+                "time_series": ts,
+                "variable_name": var,
                 "detected_anomalies": detected_str,
             }
         )
         return {"verified_anomalies": result, "current_step": "end"}
```

**71-73: Conditional router remains correct; tighten type for clarity if switching to Pydantic**

Readable and concise. If migrating to Pydantic-only state, consider `state.current_step == "verify"`.

```diff
-def should_verify(state: AgentState) -> Literal["verify", "end"]:
-    """Determine if we should proceed to verification."""
-    return "verify" if state["current_step"] == "verify" else "end"
+def should_verify(state: AgentState) -> Literal["verify", "end"]:
+    """Determine if we should proceed to verification."""
+    step = state.current_step if hasattr(state, "current_step") else state["current_step"]
+    return "verify" if step == "verify" else "end"
```

anomaly_agent/agent.py (5)
**3-3: Type imports: prepare for stricter typing and literals.**

To support more precise typing in this module (e.g., Literal for format, Any for state), extend the typing imports.

Apply this diff:

```diff
-from typing import Dict, Optional
+from typing import Any, Dict, Optional, Literal
```
**34-36: Avoid compiling an unused app in `__init__`; cache compiled apps by verify flag.**

You compile self.graph/self.app here but later re-create and compile a new graph in detect_anomalies. That wastes time and memory. Prefer lazy caching keyed by verify_anomalies.

Apply this diff:

```diff
-        self.graph = create_graph(
-            self.llm, verify_anomalies, detection_prompt, verification_prompt
-        )
-        self.app = self.graph.compile()
+        # Lazily compile and cache apps keyed by verify flag
+        self._apps: Dict[bool, Any] = {}
+        self._graph_prompts = (detection_prompt, verification_prompt)
```

Add this helper method (outside changed ranges):

```python
def _get_app(self, verify: bool):
    app = self._apps.get(verify)
    if app is None:
        graph = create_graph(self.llm, verify, *self._graph_prompts)
        app = graph.compile()
        self._apps[verify] = app
    return app
```
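The caching behavior this buys can be shown in isolation (a sketch under stated assumptions: the string stands in for a compiled graph, and `compile_calls` is an invented counter to make the effect visible):

```python
class _Agent:
    def __init__(self):
        self._apps = {}
        self.compile_calls = 0

    def _get_app(self, verify: bool):
        app = self._apps.get(verify)
        if app is None:
            # The expensive "compile" happens at most once per flag value
            self.compile_calls += 1
            app = f"app(verify={verify})"
            self._apps[verify] = app
        return app


agent = _Agent()
agent._get_app(True)
agent._get_app(True)   # cache hit, no recompile
agent._get_app(False)  # second flag value, one more compile
print(agent.compile_calls)  # 2
```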
**45-45: Docstring is fine, but consider documenting token/row limits.**

Given LLM context limits, documenting recommended max rows/columns (and any sampling) helps users avoid large payloads.

**53-55: Recreating the graph on every call is unnecessary; reuse a cached compiled app.**

Use the cached app based on the effective verify setting.

Apply this diff:

```diff
-        graph = create_graph(
-            self.llm, verify_anomalies, self.detection_prompt, self.verification_prompt
-        )
-        app = graph.compile()
+        app = self._get_app(verify_anomalies)
```
**91-91: Use Literal for `format` and validate early.**

Narrowing the type improves mypy/IDE help and catches typos at call sites.

Apply this diff:

```diff
-    def get_anomalies_df(
-        self, anomalies: Dict[str, AnomalyList], format: str = "long"
-    ) -> pd.DataFrame:
+    def get_anomalies_df(
+        self,
+        anomalies: Dict[str, AnomalyList],
+        format: Literal["long", "wide"] = "long",
+    ) -> pd.DataFrame:
```
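A small stand-alone illustration of the narrowing plus an early runtime check (the function body is hypothetical; the real method builds a DataFrame):

```python
from typing import Literal


def resolve_format(format: Literal["long", "wide"] = "long") -> str:
    # The runtime guard complements the static Literal narrowing:
    # mypy catches typos at call sites, this catches them at runtime.
    if format not in ("long", "wide"):
        raise ValueError(f"format must be 'long' or 'wide', got {format!r}")
    return format


print(resolve_format("wide"))  # wide
```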
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (11)

- `ARCHITECTURE.md` (12 hunks)
- `Makefile` (2 hunks)
- `README.md` (8 hunks)
- `anomaly_agent/__init__.py` (1 hunks)
- `anomaly_agent/agent.py` (4 hunks)
- `anomaly_agent/graph.py` (1 hunks)
- `anomaly_agent/models.py` (1 hunks)
- `anomaly_agent/nodes.py` (1 hunks)
- `anomaly_agent/tools.py` (1 hunks)
- `tests/test_agent.py` (1 hunks)
- `tests/test_prompts.py` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
anomaly_agent/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
anomaly_agent/**/*.py: Format Python code with Black using a maximum line length of 79 characters for files under anomaly_agent/
Sort imports with isort for files under anomaly_agent/
Enforce flake8 linting for files under anomaly_agent/
Run mypy type checking for files under anomaly_agent/
Use Pydantic v2 models (with field validators) for state and configuration instead of TypedDict
Files:

- `anomaly_agent/tools.py`
- `anomaly_agent/graph.py`
- `anomaly_agent/nodes.py`
- `anomaly_agent/__init__.py`
- `anomaly_agent/models.py`
- `anomaly_agent/agent.py`
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use pytest-mock to mock LLM calls in tests when appropriate
Files:

- `tests/test_prompts.py`
- `tests/test_agent.py`
anomaly_agent/agent.py
📄 CodeRabbit inference engine (CLAUDE.md)
anomaly_agent/agent.py: AgentConfig must validate constraints: max_retries between 0–10 and timeout between 30–3600 seconds
Expose and maintain streaming/parallel APIs: detect_anomalies_streaming(), detect_anomalies_parallel() (async), and detect_anomalies_streaming_async()
Files:
anomaly_agent/agent.py
🧠 Learnings (8)
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/prompt.py : Implement a validated, customizable prompt system for detection and verification prompts
Applied to files:
tests/test_prompts.py
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/**/*.py : Sort imports with isort for files under anomaly_agent/
Applied to files:
`anomaly_agent/__init__.py`, `README.md`
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/constants.py : Set DEFAULT_MODEL_NAME to gpt-5-nano as the cost-optimized default model
Applied to files:
`ARCHITECTURE.md`, `README.md`
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/**/*.py : Use Pydantic v2 models (with field validators) for state and configuration instead of TypedDict
Applied to files:
`ARCHITECTURE.md`, `anomaly_agent/models.py`, `README.md`
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/**/*.py : Run mypy type checking for files under anomaly_agent/
Applied to files:
README.md
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/**/*.py : Format Python code with Black using a maximum line length of 79 characters for files under anomaly_agent/
Applied to files:
README.md
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/**/*.py : Enforce flake8 linting for files under anomaly_agent/
Applied to files:
README.md
📚 Learning: 2025-08-24T11:39:39.518Z
Learnt from: CR
PR: andrewm4894/anomaly-agent#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T11:39:39.518Z
Learning: Applies to anomaly_agent/agent.py : Expose and maintain streaming/parallel APIs: detect_anomalies_streaming(), detect_anomalies_parallel() (async), and detect_anomalies_streaming_async()
Applied to files:
anomaly_agent/agent.py
🧬 Code graph analysis (7)
anomaly_agent/tools.py (2)
- anomaly_agent/models.py (1): `AnomalyList` (56-65)
- anomaly_agent/prompt.py (2): `get_detection_prompt` (72-96), `get_verification_prompt` (99-125)

anomaly_agent/graph.py (1)
- anomaly_agent/nodes.py (4): `AgentState` (10-17), `create_detection_node` (20-35), `create_verification_node` (38-68), `should_verify` (71-73)

anomaly_agent/nodes.py (2)
- anomaly_agent/models.py (1): `AnomalyList` (56-65)
- anomaly_agent/tools.py (2): `create_detection_chain` (14-20), `create_verification_chain` (23-29)

tests/test_prompts.py (2)
- anomaly_agent/agent.py (1): `AnomalyAgent` (16-119)
- anomaly_agent/models.py (1): `AnomalyList` (56-65)

tests/test_agent.py (2)
- anomaly_agent/models.py (2): `Anomaly` (9-53), `AnomalyList` (56-65)
- anomaly_agent/agent.py (1): `AnomalyAgent` (16-119)

anomaly_agent/__init__.py (2)
- anomaly_agent/agent.py (1): `AnomalyAgent` (16-119)
- anomaly_agent/models.py (2): `Anomaly` (9-53), `AnomalyList` (56-65)

anomaly_agent/agent.py (3)
- anomaly_agent/graph.py (1): `create_graph` (8-26)
- anomaly_agent/models.py (1): `AnomalyList` (56-65)
- anomaly_agent/nodes.py (1): `AgentState` (10-17)
🪛 GitHub Actions: pre-commit
anomaly_agent/tools.py
[error] 1-1: Black formatting changed file anomaly_agent/tools.py.
anomaly_agent/graph.py
[error] 1-1: Black formatting changed file anomaly_agent/graph.py.
anomaly_agent/nodes.py
[error] 1-1: Black formatting changed file anomaly_agent/nodes.py.
[error] 3-3: F401 'typing.List' imported but unused.
anomaly_agent/models.py
[error] 1-1: D100 Missing docstring in public module.
[error] 44-44: D102 Missing docstring in public method.
[error] 50-50: D102 Missing docstring in public method.
[error] 62-62: D102 Missing docstring in public method.
🪛 checkmake (0.2.2)
Makefile
[warning] 4-4: Missing required phony target "all"
(minphony)
[warning] 4-4: Missing required phony target "clean"
(minphony)
[warning] 4-4: Missing required phony target "test"
(minphony)
🪛 Ruff (0.12.2)
anomaly_agent/nodes.py
2-2: typing.List imported but unused
Remove unused import: typing.List
(F401)
anomaly_agent/models.py
39-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🪛 LanguageTool
ARCHITECTURE.md
[grammar] ~360-~360: There might be a mistake here.
Context: ...e 2: Streaming & Parallel Processing** ✅ - Real-time anomaly detection streaming - ...
(QB_NEW_EN)
[grammar] ~403-~403: There might be a mistake here.
Context: ...gent caching and optimization strategies - 🛡️ Reliability: Comprehensive error h...
(QB_NEW_EN)
[grammar] ~404-~404: There might be a mistake here.
Context: ...prehensive error handling and validation - 📈 Scalability: Designed for growth an...
(QB_NEW_EN)
[grammar] ~405-~405: There might be a mistake here.
Context: ...ned for growth and enterprise deployment - 🔄 Extensibility: Plugin architecture ...
(QB_NEW_EN)
[grammar] ~406-~406: There might be a mistake here.
Context: ...gin architecture for custom requirements This foundation enables robust anomaly d...
(QB_NEW_EN)
README.md
[grammar] ~29-~29: There might be a mistake here.
Context: ...ication phases to reduce false positives - 📊 Multi-Variable Support: Analyze m...
(QB_NEW_EN)
[grammar] ~204-~204: There might be a mistake here.
Context: ... anomalies ### 🔬 Science & Engineering - 🌡️ IoT Sensors: Temperature, humidity...
(QB_NEW_EN)
[grammar] ~295-~295: There might be a mistake here.
Context: ...safe data models throughout the pipeline - 🎯 GraphManager Caching: Optimized per...
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test (3.10)
- GitHub Check: test (3.11)
🔇 Additional comments (15)
Makefile (1)
48-51: Nice: first-class lint/format shortcuts via pre-commit
Having `lint` and `format` forward to pre-commit is clean and consistent with the repo's workflow.

tests/test_agent.py (1)
11-11: Importing from package root is correct after re-exports in `__init__`
Good move to import `Anomaly`, `AnomalyAgent`, and `AnomalyList` from `anomaly_agent`. Keeps tests aligned with the public API surface.

tests/test_prompts.py (1)
12-12: Importing from package root is consistent with the new public API
Matches the re-exports in `anomaly_agent/__init__.py`. Good.

anomaly_agent/__init__.py (1)
7-8: Re-exporting models and agent from the package root is the right API shape
The new imports cleanly expose `AnomalyAgent`, `Anomaly`, and `AnomalyList` at the package root. This matches the tests and simplifies user imports. Ensure isort keeps this block grouped and stable.

README.md (3)
56-61: Mermaid formatting LGTM
The added spacing improves Mermaid readability and does not affect rendering.

189-191: Model table and defaults look good
Demonstrates a non-default model override while the documented default remains gpt-5-nano. No action needed.

256-259: Quality commands section LGTM
Explicit isort/flake8/mypy entries align with repo guidelines.

anomaly_agent/graph.py (2)
15-26: Graph compilation is already centralized in `agent.py`, no change needed
I've verified that every call to `create_graph` in `anomaly_agent/agent.py` is immediately followed by a `.compile()` call (lines 37 and 56), ensuring the graph is compiled before use. Returning the raw `StateGraph` from `create_graph` is intentional and safe in the current code paths.

1-27: Apply Black formatting to `anomaly_agent/graph.py`
Please format this file with Black to satisfy the project's style settings (line-length = 88 in `pyproject.toml`). After adjusting the import wrapping as discussed, run `black anomaly_agent/graph.py`. This will automatically apply the correct line breaks and spacing. Once formatted, commit the changes to clear the pre-commit failure.

anomaly_agent/tools.py (3)
14-21: Factory function LGTM
Clear single-responsibility chain builder with typed output via structured parsing to `AnomalyList`.

23-30: Verification chain LGTM
Mirrors the detection chain; consistent defaults and output type.

1-30: Run Black locally to fix formatting
Pre-commit indicates that Black would reformat this file, but Black isn't available in this environment. Please verify and apply the required formatting changes by running locally:
pip install black  # ensure Black is installed (if needed)
black anomaly_agent/tools.py  # format the file

anomaly_agent/models.py (1)
9-17: Class docstring and fields read well
Model fields are clear and documented.

anomaly_agent/nodes.py (1)
1-74: Run pre-commit hooks to apply formatting and linting
Pre-commit is configured with Black, isort, and flake8 in `.pre-commit-config.yaml`. After updating your code, please run the following to fix any formatting or lint issues:
pre-commit run black --all-files  # format with Black
pre-commit run isort --all-files  # sort imports with isort
pre-commit run flake8 --all-files  # lint with flake8
Or simply run all hooks at once: `pre-commit run --all-files`

anomaly_agent/agent.py (1)
1-1: Good concise module docstring
No issues here.
from .constants import DEFAULT_MODEL_NAME, DEFAULT_TIMESTAMP_COL
from .graph import create_graph
from .models import AnomalyList
from .nodes import AgentState
from .prompt import DEFAULT_SYSTEM_PROMPT, DEFAULT_VERIFY_SYSTEM_PROMPT
🛠️ Refactor suggestion
AgentState import violates guideline to use Pydantic models (v2) for state.
Per coding_guidelines, state should be a Pydantic v2 model, not a TypedDict. Here, importing AgentState (TypedDict) from nodes couples this module to a non-compliant type and makes mypy stricter configuration harder.
Short-term in this file: avoid the TypedDict dependency and type the state as Dict[str, Any]. Longer-term: migrate AgentState in anomaly_agent/nodes.py to a Pydantic v2 model with field validators, then update graph.py’s StateGraph state type accordingly.
Apply this diff to decouple agent.py now:
-from .nodes import AgentState
And adjust the state annotation in detect_anomalies (see the suggested diff on lines 75-79).
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In anomaly_agent/agent.py around lines 9-13 and in the detect_anomalies function
around lines 75-79, remove the import of AgentState from nodes and stop
depending on the TypedDict; instead import typing.Dict and typing.Any and change
any state type annotations to Dict[str, Any] (including the detect_anomalies
parameter/type hints). Ensure you also remove the unused AgentState import line
and update imports so mypy no longer sees the TypedDict dependency; keep the
longer-term migration of AgentState in anomaly_agent/nodes.py to a Pydantic v2
model for later.
        self.llm = ChatOpenAI(model=model_name)
        self.timestamp_col = timestamp_col
        self.verify_anomalies = verify_anomalies
        self.detection_prompt = detection_prompt
        self.verification_prompt = verification_prompt

        # Create the graph
🛠️ Refactor suggestion
Inject LLM and validate config (fixes tests without OPENAI_API_KEY; enforces max_retries/timeout constraints).
Tests currently fail with “api_key client option must be set...”. Allow injecting an LLM (for tests/mocks) and/or an explicit api_key. Also enforce required constraints: max_retries ∈ [0,10], timeout ∈ [30,3600] (coding_guidelines).
Apply this diff to the constructor signature and initialization:
- def __init__(
- self,
- model_name: str = DEFAULT_MODEL_NAME,
- timestamp_col: str = DEFAULT_TIMESTAMP_COL,
- verify_anomalies: bool = True,
- detection_prompt: str = DEFAULT_SYSTEM_PROMPT,
- verification_prompt: str = DEFAULT_VERIFY_SYSTEM_PROMPT,
- ):
- """Initialize the AnomalyAgent with a specific model."""
- self.llm = ChatOpenAI(model=model_name)
+ def __init__(
+ self,
+ model_name: str = DEFAULT_MODEL_NAME,
+ timestamp_col: str = DEFAULT_TIMESTAMP_COL,
+ verify_anomalies: bool = True,
+ detection_prompt: str = DEFAULT_SYSTEM_PROMPT,
+ verification_prompt: str = DEFAULT_VERIFY_SYSTEM_PROMPT,
+ *,
+ api_key: Optional[str] = None,
+ llm: Optional["BaseChatModel"] = None,
+ max_retries: int = 2,
+ timeout: int = 120,
+ ):
+ """Initialize the AnomalyAgent with a specific model and config."""
+ # Validate config per guidelines
+ if not (0 <= max_retries <= 10):
+ raise ValueError("max_retries must be between 0 and 10")
+ if not (30 <= timeout <= 3600):
+ raise ValueError("timeout must be between 30 and 3600 seconds")
+ # Allow injection for tests and custom LLMs
+ if llm is not None:
+ self.llm = llm
+ else:
+ self.llm = ChatOpenAI(
+ model=model_name,
+ api_key=api_key,
+ max_retries=max_retries,
+ timeout=timeout,
+ )
Add these imports at the top of the file (near other imports):
+from langchain_core.language_models.chat_models import BaseChatModel
Optionally, to satisfy "use Pydantic v2 models for configuration", introduce AgentConfig (add this outside the changed ranges):
from pydantic import BaseModel, field_validator

class AgentConfig(BaseModel):
    max_retries: int = 2
    timeout: int = 120

    @field_validator("max_retries")
    @classmethod
    def _vr_retries(cls, v: int) -> int:
        if not (0 <= v <= 10):
            raise ValueError("max_retries must be between 0 and 10")
        return v

    @field_validator("timeout")
    @classmethod
    def _vr_timeout(cls, v: int) -> int:
        if not (30 <= v <= 3600):
            raise ValueError("timeout must be between 30 and 3600 seconds")
        return v

And accept `config: Optional[AgentConfig] = None` in `__init__`, wiring its values to ChatOpenAI if provided. If you want, I can provide the exact patch.
🤖 Prompt for AI Agents
In anomaly_agent/agent.py around lines 27 to 33, the constructor currently
constructs ChatOpenAI directly and doesn't allow injecting a mock LLM or
explicit api_key and doesn't enforce max_retries/timeout constraints; change the
__init__ signature to accept llm: Optional[Any] = None, api_key: Optional[str] =
None, and config: Optional[AgentConfig] = None (or separate max_retries:int=2,
timeout:int=120) and if llm is provided use it otherwise instantiate ChatOpenAI
with api_key (if provided) and the config values; add imports for Optional and
AgentConfig (or add the AgentConfig pydantic BaseModel shown in the comment) and
validate max_retries ∈ [0,10] and timeout ∈ [30,3600] (either via AgentConfig
validators or explicit checks) before passing them to ChatOpenAI so tests can
inject mocks and the constraints are enforced.
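The injection-and-validation pattern suggested above can be sketched without LangChain at all; `AnomalyAgentSketch`, `FakeLLM`, and the placeholder client string are hypothetical stand-ins for illustration, not the project's real classes:

```python
from typing import Any, Optional

class AnomalyAgentSketch:
    """Illustrative agent constructor that accepts an injected LLM client."""

    def __init__(
        self,
        model_name: str = "gpt-5-nano",
        *,
        llm: Optional[Any] = None,
        max_retries: int = 2,
        timeout: int = 120,
    ):
        # Enforce the guideline ranges before building any client.
        if not (0 <= max_retries <= 10):
            raise ValueError("max_retries must be between 0 and 10")
        if not (30 <= timeout <= 3600):
            raise ValueError("timeout must be between 30 and 3600 seconds")
        # Prefer the injected client so tests never need OPENAI_API_KEY;
        # the string below stands in for a real ChatOpenAI instance.
        self.llm = llm if llm is not None else f"ChatOpenAI({model_name})"
        self.max_retries = max_retries
        self.timeout = timeout

class FakeLLM:
    """Stand-in chat model for tests."""

    def invoke(self, prompt: str) -> str:
        return "no anomalies"

agent = AnomalyAgentSketch(llm=FakeLLM())
print(type(agent.llm).__name__)  # → FakeLLM
```

Because the fake is injected, the constructor never touches a network client, which is exactly what lets the test suite pass without an API key.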
| "time_series": df_str, | ||
| "variable_name": col, | ||
| "current_step": "detect", | ||
| } | ||
|
|
🛠️ Refactor suggestion
State typing and payload size: decouple from TypedDict and shrink per-call input.
- Type as Dict[str, Any] to avoid the TypedDict import and ease migration to a Pydantic state.
- Instead of passing the whole DataFrame for every column, pass only the two relevant columns as CSV. This reduces tokens and improves performance.
Apply this diff:
- state: AgentState = {
- "time_series": df_str,
+ state: Dict[str, Any] = {
+ "time_series": df[[self.timestamp_col, col]].to_csv(index=False),
"variable_name": col,
"current_step": "detect",
}
🤖 Prompt for AI Agents
In anomaly_agent/agent.py around lines 75 to 79, the state is currently typed as
AgentState and embeds the full DataFrame for each column; change the type to
Dict[str, Any] (remove or stop using the TypedDict import) and replace the
per-column DataFrame payload with a CSV string containing only the two relevant
columns (time column and the current variable column) to shrink per-call input —
build the CSV via df[[time_col, col]].to_csv(index=False) (or equivalent) and
assign that string to "time_series" in the state object, keeping keys
"variable_name" and "current_step" unchanged.
from datetime import datetime
from typing import List

from pydantic import BaseModel, Field, validator

from .constants import TIMESTAMP_FORMAT
🛠️ Refactor suggestion
Adopt Pydantic v2 field validators and add module docstring
Repo standard is Pydantic v2. Switch from validator to field_validator and add a short module docstring to satisfy docstring checks.
+"""Pydantic models for anomaly detection results."""
from datetime import datetime
from typing import List
-from pydantic import BaseModel, Field, validator
+from pydantic import BaseModel, Field, field_validator
+# Optional: enforce strict schema (no extra keys)
+# from pydantic import ConfigDict
🧰 Tools
🪛 GitHub Actions: pre-commit
[error] 1-1: D100 Missing docstring in public module.
🤖 Prompt for AI Agents
In anomaly_agent/models.py lines 1-6, add a short module-level docstring at the
top and migrate any use of Pydantic v1's validator to Pydantic v2's
field_validator: update the import to include field_validator (from pydantic
import BaseModel, Field, field_validator), replace @validator decorators with
@field_validator and adjust method signatures to accept the new v2 style
(validate per-field, use mode="after" or "before" as needed), and update any
references to the validator API accordingly so the file conforms to Pydantic v2
and the repo docstring check.
| @validator("timestamp") # type: ignore | ||
| def validate_timestamp(cls, v: str) -> str: | ||
| """Validate that the timestamp is in a valid format.""" | ||
| try: | ||
| datetime.strptime(v, TIMESTAMP_FORMAT) | ||
| return v | ||
| except ValueError: | ||
| try: | ||
| dt = datetime.fromisoformat(v.replace("Z", "+00:00")) | ||
| if "." in v: | ||
| return dt.strftime(TIMESTAMP_FORMAT) | ||
| return dt.strftime("%Y-%m-%d %H:%M:%S") | ||
| except ValueError: | ||
| try: | ||
| dt = datetime.strptime(v, "%Y-%m-%d") | ||
| return dt.strftime("%Y-%m-%d %H:%M:%S") | ||
| except ValueError: | ||
| try: | ||
| datetime.strptime(v, "%Y-%m-%d %H:%M:%S") | ||
| return v | ||
| except ValueError: | ||
| raise ValueError( | ||
| f"timestamp must be in {TIMESTAMP_FORMAT} format, ISO format, or YYYY-MM-DD format" | ||
| ) | ||
|
|
🛠️ Refactor suggestion
Switch timestamp validator to Pydantic v2 and use exception chaining
Use field_validator in v2. Also prefer raise ... from None (or from err) in except blocks (Ruff B904) for clearer error origins.
- @validator("timestamp") # type: ignore
- def validate_timestamp(cls, v: str) -> str:
- """Validate that the timestamp is in a valid format."""
- try:
- datetime.strptime(v, TIMESTAMP_FORMAT)
- return v
- except ValueError:
- try:
- dt = datetime.fromisoformat(v.replace("Z", "+00:00"))
- if "." in v:
- return dt.strftime(TIMESTAMP_FORMAT)
- return dt.strftime("%Y-%m-%d %H:%M:%S")
- except ValueError:
- try:
- dt = datetime.strptime(v, "%Y-%m-%d")
- return dt.strftime("%Y-%m-%d %H:%M:%S")
- except ValueError:
- try:
- datetime.strptime(v, "%Y-%m-%d %H:%M:%S")
- return v
- except ValueError:
- raise ValueError(
- f"timestamp must be in {TIMESTAMP_FORMAT} format, ISO format, or YYYY-MM-DD format"
- )
+ @field_validator("timestamp")
+ def validate_timestamp(cls, v: str) -> str:
+ """Validate and normalize supported timestamp formats."""
+ try:
+ datetime.strptime(v, TIMESTAMP_FORMAT)
+ return v
+ except ValueError:
+ pass
+ # ISO-like: '...Z' or with offset
+ try:
+ dt = datetime.fromisoformat(v.replace("Z", "+00:00"))
+ return dt.strftime(TIMESTAMP_FORMAT if "." in v else "%Y-%m-%d %H:%M:%S")
+ except ValueError:
+ pass
+ # Date-only
+ try:
+ dt = datetime.strptime(v, "%Y-%m-%d")
+ return dt.strftime("%Y-%m-%d %H:%M:%S")
+ except ValueError:
+ pass
+ # Already normalized?
+ try:
+ datetime.strptime(v, "%Y-%m-%d %H:%M:%S")
+ return v
+ except ValueError:
+ raise ValueError(
+ f"timestamp must be in {TIMESTAMP_FORMAT} format, ISO 8601, or YYYY-MM-DD"
+ ) from None
🧰 Tools
🪛 Ruff (0.12.2)
39-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In anomaly_agent/models.py around lines 18 to 42, the timestamp validator is
written with Pydantic v1's @validator and uses bare raises inside nested
excepts; convert it to a Pydantic v2 field validator (use
@field_validator("timestamp", mode="before") or appropriate mode and update
imports) and refactor the nested try/except blocks to re-raise with explicit
exception chaining (e.g., raise ValueError(...) from err or from None) so each
caught exception preserves its origin; keep the same parsing logic/formatting
but update the decorator/imports and replace bare raises with chained raises.
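The fallback chain in the suggested validator can be exercised in isolation with plain `datetime`; the `TIMESTAMP_FORMAT` value below is an assumption (the real constant lives in `anomaly_agent/constants.py`), and `normalize_timestamp` is a framework-free stand-in for the Pydantic validator:

```python
from datetime import datetime

# Assumed here for illustration; the real constant lives in constants.py.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def normalize_timestamp(v: str) -> str:
    """Try the formats in order; suppress earlier context with `from None`."""
    try:
        datetime.strptime(v, TIMESTAMP_FORMAT)
        return v
    except ValueError:
        pass
    try:  # ISO 8601, including a trailing 'Z'
        dt = datetime.fromisoformat(v.replace("Z", "+00:00"))
        return dt.strftime(TIMESTAMP_FORMAT if "." in v else "%Y-%m-%d %H:%M:%S")
    except ValueError:
        pass
    try:  # date-only input
        return datetime.strptime(v, "%Y-%m-%d").strftime("%Y-%m-%d %H:%M:%S")
    except ValueError:
        pass
    try:  # already normalized
        datetime.strptime(v, "%Y-%m-%d %H:%M:%S")
        return v
    except ValueError:
        raise ValueError("unsupported timestamp format") from None

print(normalize_timestamp("2024-01-02"))            # → 2024-01-02 00:00:00
print(normalize_timestamp("2024-01-02T03:04:05Z"))  # → 2024-01-02 03:04:05
```

The flat try/pass sequence avoids the nested-except pyramid, and `from None` keeps the final ValueError from dragging the earlier parse failures into its traceback, which is what the B904 finding asks for.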
| @validator("variable_value") # type: ignore | ||
| def validate_variable_value(cls, v: float) -> float: | ||
| if not isinstance(v, (int, float)): | ||
| raise ValueError("variable_value must be a number") | ||
| return float(v) | ||
|
|
🛠️ Refactor suggestion
Switch numeric validator to Pydantic v2 and document purpose
Also minor docstring addition satisfies D102.
- @validator("variable_value") # type: ignore
- def validate_variable_value(cls, v: float) -> float:
- if not isinstance(v, (int, float)):
- raise ValueError("variable_value must be a number")
- return float(v)
+ @field_validator("variable_value")
+ def validate_variable_value(cls, v: float) -> float:
+ """Coerce numeric inputs to float; reject non-numerics."""
+ if not isinstance(v, (int, float)):
+ raise ValueError("variable_value must be a number")
+ return float(v)
🧰 Tools
🪛 GitHub Actions: pre-commit
[error] 44-44: D102 Missing docstring in public method.
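Outside Pydantic, the suggested coercion body behaves as below; note that `bool` passes the `isinstance` check because it subclasses `int`, which may or may not be desirable here:

```python
def validate_variable_value(v):
    """Coerce numeric inputs to float; reject everything else."""
    if not isinstance(v, (int, float)):
        raise ValueError("variable_value must be a number")
    return float(v)

print(validate_variable_value(3))     # → 3.0
print(validate_variable_value(True))  # → 1.0 (bool subclasses int)
```

If booleans should be rejected, an explicit `isinstance(v, bool)` guard would be needed before the numeric check.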
| @validator("anomaly_description") # type: ignore | ||
| def validate_anomaly_description(cls, v: str) -> str: | ||
| if not isinstance(v, str): | ||
| raise ValueError("anomaly_description must be a string") | ||
| return v | ||
|
|
🛠️ Refactor suggestion
Switch description validator to Pydantic v2 and add docstring
Adds the missing method docstring flagged by the docstring checker.
- @validator("anomaly_description") # type: ignore
- def validate_anomaly_description(cls, v: str) -> str:
- if not isinstance(v, str):
- raise ValueError("anomaly_description must be a string")
- return v
+ @field_validator("anomaly_description")
+ def validate_anomaly_description(cls, v: str) -> str:
+ """Ensure the anomaly description is a string."""
+ if not isinstance(v, str):
+ raise ValueError("anomaly_description must be a string")
+ return v
🧰 Tools
🪛 GitHub Actions: pre-commit
[error] 50-50: D102 Missing docstring in public method.
🤖 Prompt for AI Agents
In anomaly_agent/models.py around lines 49 to 54, the validator uses the
Pydantic v1 @validator decorator and lacks a docstring; replace
@validator("anomaly_description") with Pydantic v2's
@field_validator("anomaly_description") (import field_validator from pydantic),
adjust the function signature to the v2 convention (accepting the field value as
the first parameter) and add a concise docstring describing that the method
ensures anomaly_description is a string and raises ValueError otherwise.
anomaly_agent/nodes.py (Outdated)
from typing import Callable, List, Literal, Optional, TypedDict

from langchain_openai import ChatOpenAI

from .models import AnomalyList
from .tools import create_detection_chain, create_verification_chain
|
|
🛠️ Refactor suggestion
Remove unused import and prepare for Pydantic-based state
Ruff/flake8 flagged typing.List as unused. Also, repo standard prefers Pydantic v2 models for state over TypedDict; see follow-up diff below.
-from typing import Callable, List, Literal, Optional, TypedDict
+from typing import Any, Callable, Literal, Optional, TypedDict
🧰 Tools
🪛 Ruff (0.12.2)
2-2: typing.List imported but unused
Remove unused import: typing.List
(F401)
🪛 GitHub Actions: pre-commit
[error] 3-3: F401 'typing.List' imported but unused.
🤖 Prompt for AI Agents
In anomaly_agent/nodes.py around lines 2 to 8, remove the unused import
typing.List and replace the TypedDict-based state types with a Pydantic v2
model: delete List and TypedDict from the imports, add "from pydantic import
BaseModel", define a small BaseModel subclass for the state (with the same
keys/types previously declared in the TypedDict), and update any references to
use that Pydantic model type; keep the other imports (Callable, Literal,
Optional, ChatOpenAI, AnomalyList, create_detection_chain,
create_verification_chain) unchanged.
class AgentState(TypedDict, total=False):
    """State for the anomaly detection agent."""

    time_series: str
    variable_name: str
    detected_anomalies: Optional[AnomalyList]
    verified_anomalies: Optional[AnomalyList]
    current_step: str
|
|
🛠️ Refactor suggestion
Migrate AgentState from TypedDict to a Pydantic v2 model (repo standard)
Using Pydantic for state enables validation and better editor support, and aligns with the coding guidelines. If you’d prefer to stage this, at least introduce a Pydantic wrapper and plan a follow-up refactor.
Proposed inline refactor (minimizes behavioral change; functions still return partial dict updates as LangGraph expects):
-from typing import Callable, Any, Literal, Optional, TypedDict
+from typing import Any, Callable, Literal, Optional
+from pydantic import BaseModel, Field
@@
-class AgentState(TypedDict, total=False):
- """State for the anomaly detection agent."""
-
- time_series: str
- variable_name: str
- detected_anomalies: Optional[AnomalyList]
- verified_anomalies: Optional[AnomalyList]
- current_step: str
+class AgentState(BaseModel):
+ """State for the anomaly detection agent."""
+
+ time_series: str
+ variable_name: str
+ detected_anomalies: Optional[AnomalyList] = None
+ verified_anomalies: Optional[AnomalyList] = None
+ current_step: Literal["detect", "verify", "end"] = "detect"
Committable suggestion skipped: line range outside the PR's diff.
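The "partial dict updates" contract mentioned above can be sketched with a plain shallow merge; this is a simplification of LangGraph's actual channel/reducer machinery, and the sample state values are illustrative:

```python
from typing import Any, Dict

# Initial state as the agent might seed it for one column (sample values).
state: Dict[str, Any] = {
    "time_series": "timestamp,value\n2024-01-01,1.0\n",
    "variable_name": "value",
    "current_step": "detect",
}

def detection_node(s: Dict[str, Any]) -> Dict[str, Any]:
    # A node returns only the keys it changes; the runner merges them.
    return {"detected_anomalies": ["2024-01-01"], "current_step": "verify"}

def merge(s: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Shallow merge, mimicking how a graph runner folds node output into state."""
    return {**s, **update}

state = merge(state, detection_node(state))
print(state["current_step"])  # → verify
```

This is why the Pydantic version needs `Optional[...] = None` defaults: a node's partial update must be mergeable into a state where the other keys already hold values or sensible defaults.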
Summary
- Split the agent into modules (models, tools, nodes, graph) and slim the agent API

Testing
- make lint
- make tests (fails: The api_key client option must be set either by passing api_key to the client or setting the OPENAI_API_KEY environment variable)

https://chatgpt.com/codex/tasks/task_e_68ae24bd3e2c83289020eeff52c3f7e1
Summary by CodeRabbit