Skip to content

Conversation

@RobGeada
Copy link
Contributor

@RobGeada RobGeada commented Nov 3, 2025

Summary by Sourcery

Add support for user-defined Prometheus metrics and non-blocking custom guardrails, standardize metric naming with a prefix constant, refactor detector instrumentation and module loading, expose a multiprocess /metrics endpoint, and provide documentation and tests for the new capabilities

New Features:

  • Allow custom detectors to register Prometheus metrics using a @use_instruments decorator
  • Introduce a @non_blocking decorator to run custom guardrail functions asynchronously without blocking requests
  • Expose a /metrics endpoint that aggregates multiprocess Prometheus metrics

Enhancements:

  • Standardize all metric names behind a METRIC_PREFIX constant
  • Replace add_instruments with set_instruments and support dynamic instrument registration via add_instrument
  • Refactor custom detectors loader to use importlib, inject utility decorators, and perform security checks

Documentation:

  • Add docs/custom_detectors.md describing custom detector guidelines and available decorators

Tests:

  • Add fixtures for Prometheus multiprocess directory setup and cleanup
  • Add tests for user-defined metrics and non-blocking guardrails and update existing tests to use METRIC_PREFIX and clear metric registry before each test

Summary by CodeRabbit

  • New Features

    • Added /metrics endpoint for Prometheus metrics collection
    • Enabled custom detectors with built-in metric instrumentation and non-blocking background detectors
    • Added decorators for metrics tracking and background execution for custom detectors
  • Documentation

    • New guide for creating custom detectors with security rules and examples
  • Changes

    • Standardized metric names with a common prefix
    • Instrument configuration API updated from add_instruments → set_instruments
    • Docker/test setup improved for Prometheus multiprocess support
  • Tests

    • Added/updated tests covering metrics and non-blocking behavior

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Nov 3, 2025

Reviewer's Guide

This PR introduces a unified metric prefix and refactors all Prometheus instrumentation to use it, exposes a proper multiprocess‐aware /metrics endpoint, adds support for user‐defined metrics and non‐blocking guardrail execution via new decorators, updates the instrumentation API, and extends the test suite (with fixtures) to validate these features end-to-end.

Sequence diagram for non-blocking guardrail execution

sequenceDiagram
    participant Client
    participant API
    participant "CustomDetectorRegistry"
    participant "ThreadPoolExecutor"
    Client->>API: POST /api/v1/text/contents
    API->>"CustomDetectorRegistry": Call background_function (decorated with @non_blocking)
    "CustomDetectorRegistry"->>"ThreadPoolExecutor": Submit function to background thread
    "CustomDetectorRegistry"-->>API: Return immediately with return_value (e.g., False)
    Note over "ThreadPoolExecutor": Function logic runs asynchronously
Loading

Entity relationship diagram for Prometheus metrics registry

erDiagram
    DetectorBaseAPI {
        string state_instruments
    }
    InstrumentedDetector {
        string instruments
    }
    CustomDetectorRegistry {
        string registry
    }
    PrometheusInstrument {
    }
    DetectorBaseAPI ||--o| InstrumentedDetector : sets instruments
    InstrumentedDetector ||--o| CustomDetectorRegistry : inherited by
    CustomDetectorRegistry ||--o| PrometheusInstrument : adds instrument
Loading

Class diagram for new decorators and instrumentation API

classDiagram
    class CustomDetectorRegistry {
        +registry: dict
        +function_needs_headers: dict
        +__init__()
    }
    class use_instruments {
        +__call__(func)
    }
    class non_blocking {
        +__call__(func)
    }
    class InstrumentedDetector {
        +set_instruments(instruments)
        +add_instrument(instrument)
        +increment_detector_instruments(function_name, is_detection)
    }
    CustomDetectorRegistry --> use_instruments : uses
    CustomDetectorRegistry --> non_blocking : uses
    InstrumentedDetector <|.. CustomDetectorRegistry : inherits
Loading

File-Level Changes

Change Details Files
Metric prefix constant and instrumentation API refactor
  • Introduce METRIC_PREFIX constant in common/app
  • Rename add_instruments to set_instruments and add add_instrument in instrumented_detector
  • Update all metric definitions and references to use f"{METRIC_PREFIX}_..."
  • Adjust detector setup in built-in and huggingface apps to call set_instruments
detectors/common/app.py
detectors/common/instrumented_detector.py
detectors/built_in/app.py
detectors/huggingface/app.py
tests/detectors/builtIn/test_metrics.py
tests/detectors/huggingface/test_metrics.py
Expose Prometheus /metrics endpoint with multiprocess support
  • Remove Instrumentator-based instrumentation
  • Add GET /metrics handler using CollectorRegistry and MultiProcessCollector
  • Return generated_latest data with correct media type
detectors/built_in/app.py
Support custom user metrics and non-blocking guardrails
  • Implement use_instruments decorator to attach Prometheus instruments to custom functions
  • Implement non_blocking decorator to run guardrails asynchronously
  • Inject decorators into custom_detectors module via importlib in CustomDetectorRegistry
  • Auto-register user metrics instruments during registry initialization
detectors/built_in/custom_detectors_wrapper.py
docs/custom_detectors.md
Enhance test suite for metrics and non-blocking behavior
  • Add session fixtures for PROMETHEUS_MULTIPROC_DIR setup and cleanup
  • Introduce wait_for_metric helper and clear registry before tests
  • Update metric filtering in get_metric_dict to ignore HELP/TYPE lines
  • Add tests for user-defined metrics and non-blocking execution timing
tests/conftest.py
tests/detectors/builtIn/test_metrics.py
Update custom_detectors example to showcase new decorators
  • Simplify example file to include slow_func, has_metrics, and background_function
  • Demonstrate use_instruments and non_blocking usage in example custom detectors
detectors/built_in/custom_detectors/custom_detectors.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 3, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR adds Prometheus multiprocess support and a /metrics endpoint, introduces instrumentation-aware decorators (@use_instruments, @non_blocking) and dynamic loading/security checks for custom detectors, renames the instruments API from add_instrumentsset_instruments (plus add_instrument), adds a METRIC_PREFIX, and updates tests and docs accordingly.

Changes

Cohort / File(s) Summary
Prometheus multiprocess setup
detectors/Dockerfile.builtIn, tests/conftest.py
Set PROMETHEUS_MULTIPROC_DIR and create the directory in Dockerfile; add session-scoped pytest fixtures to create/cleanup a multiprocess temp dir during tests.
Metrics endpoint & registry
detectors/built_in/app.py, detectors/common/app.py
Add GET /metrics route using CollectorRegistry + MultiProcessCollector and generate_latest; introduce METRIC_PREFIX constant and switch metric names to use it.
Instrument API & callsites
detectors/common/instrumented_detector.py, detectors/huggingface/app.py, detectors/huggingface/detector.py
Replace add_instruments(...) with set_instruments(...), add add_instrument(instrument) helper, and update detector call sites to call set_instruments.
Custom detectors & decorators
detectors/built_in/custom_detectors/custom_detectors.py, detectors/built_in/custom_detectors_wrapper.py
Replace example detectors with Python detectors exposing Prometheus Counters; add @use_instruments and @non_blocking decorators, get_underlying_function, dynamic import of custom_detectors.py, security/static analysis checks, and registration of prometheus_instruments.
Documentation
docs/custom_detectors.md
New guide describing custom detector rules, security restrictions, and usage of @use_instruments / @non_blocking.
Tests
tests/detectors/builtIn/test_custom.py, tests/detectors/builtIn/test_metrics.py, tests/detectors/huggingface/test_metrics.py
Add/modify tests to manage prometheus multiproc dir, adopt METRIC_PREFIX, replace add_instrumentsset_instruments, add wait_for_metric helper, add tests for user-provided and non-blocking metrics; minor formatting tweaks.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant App as FastAPI App
    participant DetectorRegistry as Detector Registry
    participant CustomModule as custom_detectors.py
    participant Decorator as use_instruments / non_blocking
    participant PromCollector as Prometheus CollectorRegistry

    rect rgb(230,245,255)
    note over App,DetectorRegistry: Startup
    App->>DetectorRegistry: set_instruments(app.state.instruments)
    DetectorRegistry->>App: instruments stored
    end

    rect rgb(235,255,235)
    note over Client,App: Detection request
    Client->>App: POST /detections
    App->>CustomModule: call detector func (may be wrapped)
    CustomModule->>Decorator: decorator logic (attach instruments or spawn background)
    Decorator->>PromCollector: increment/update instrument(s)
    Decorator-->>App: detector result (immediate if non_blocking)
    App-->>Client: response
    end

    rect rgb(255,245,235)
    note over Client,App: Metrics scrape
    Client->>App: GET /metrics
    App->>PromCollector: CollectorRegistry + MultiProcessCollector -> generate_latest()
    PromCollector-->>App: metrics payload (CONTENT_TYPE_LATEST)
    App-->>Client: /metrics response
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay special attention to:
    • detectors/built_in/custom_detectors_wrapper.py (decorator wrapping, attribute propagation, dynamic import, static analysis/security checks).
    • detectors/built_in/app.py (CollectorRegistry + MultiProcessCollector usage and content-type handling).
    • Consistent migration of add_instrumentsset_instruments across modules and tests.

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • saichandrapandraju
  • m-misiura

Poem

🐰 I hopped through code to stitch a metrics seam,
Multiprocess carrots and counters in a stream,
Decorators whisper, instruments hum bright,
/metrics opens up to Prometheus' light,
Happy detectors dance — metrics take flight! ✨

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'Feat: Custom user metrics and non-blocking guardrails' directly aligns with the main objectives and changes in this changeset. The PR introduces two key features: user-defined Prometheus metrics via the @use_instruments decorator and non-blocking guardrail execution via the @non_blocking decorator, which are precisely the primary focus of the title. The title is concise, specific, and accurately represents the core functionality being added.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `detectors/built_in/custom_detectors_wrapper.py:46-51` </location>
<code_context>
+            if executor is None:
+                executor = ThreadPoolExecutor()
+                non_blocking._executor = executor
+            def runner():
+                try:
+                    func(*args, **kwargs)
+                except Exception as e:
+                    logging.error(f"Exception in non-blocking guardrail {func.__name__}: {e}")
+            executor.submit(runner)
+
</code_context>

<issue_to_address>
**suggestion:** Consider including traceback information in error logging for non_blocking runner.

Logging only the exception message limits debugging. Use logging.exception or traceback.format_exc to capture full traceback details for better issue diagnosis.

```suggestion
            def runner():
                try:
                    func(*args, **kwargs)
                except Exception as e:
                    logging.exception(f"Exception in non-blocking guardrail {func.__name__}: {e}")
            executor.submit(runner)
```
</issue_to_address>

### Comment 2
<location> `detectors/common/instrumented_detector.py:23-24` </location>
<code_context>
+    def set_instruments(self, instruments):
+        self.instruments = instruments
+
+    def add_instrument(self, instrument):
+        self.instruments[instrument._name] = instrument

     def increment_detector_instruments(self, function_name: str, is_detection: bool):
</code_context>

<issue_to_address>
**suggestion:** Directly accessing instrument._name may break if instrument implementation changes.

Consider replacing direct access to _name with a public property or method to ensure compatibility with future Prometheus client updates.

Suggested implementation:

```python
    def add_instrument(self, instrument):
        self.instruments[instrument.name] = instrument

```

If the instrument class does not have a public `name` property, you should add one to the instrument class definition, for example:

```python
@property
def name(self):
    return self._name
```

Then use `instrument.name` as shown above.
</issue_to_address>

### Comment 3
<location> `docs/custom_detectors.md:3-4` </location>
<code_context>
+# Custom Detectors
+You can overwrite [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) to create
+custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexible
+create your own detection logic!
+
</code_context>

<issue_to_address>
**issue (typo):** Typo: 'flexible' should be 'flexibly'.

It should say 'quickly and flexibly create your own detection logic!' to be grammatically correct.

```suggestion
custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexibly
create your own detection logic!
```
</issue_to_address>

### Comment 4
<location> `docs/custom_detectors.md:15` </location>
<code_context>
+   2) Boolean responses of `true` are considered a detection 
+      * see the `over_100_characters` example in [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) for usage
+   3) Dict response that are parseable as a `ContentAnalysisResponse` object are considered a detection
+      * see the `contains_word` example in [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) for ursage
+4) This code may not import `os`, `subprocess`, `sys`, or `shutil` for security reasons
+5) This code may not call `eval`, `exec`, `open`, `compile`, or `input` for security reasons
</code_context>

<issue_to_address>
**issue (typo):** Typo: 'ursage' should be 'usage'.

Change 'ursage' to 'usage' in the referenced line.

```suggestion
      * see the `contains_word` example in [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) for usage
```
</issue_to_address>

### Comment 5
<location> `docs/custom_detectors.md:23-25` </location>
<code_context>
+## Utility Decorators
+The following decorators are also available, and are automatically imported into the custom_detectors.py file:
+
+### `@use_instruments(instruments=[$INSTRUMENT_1, ..., $INSTRUMENT_N]`
+ Use this decorator to register your own Prometheus instruments with the server's main
+ `/metrics` registry. See the `function_that_has_prometheus_metrics` example 
</code_context>

<issue_to_address>
**issue (typo):** Missing closing parenthesis in decorator example.

Please add the missing closing parenthesis to the decorator example for accuracy.

```suggestion
### `@use_instruments(instruments=[$INSTRUMENT_1, ..., $INSTRUMENT_N])`
 Use this decorator to register your own Prometheus instruments with the server's main
 `/metrics` registry. See the `function_that_has_prometheus_metrics` example 
```
</issue_to_address>

### Comment 6
<location> `tests/detectors/builtIn/test_metrics.py:254-259` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 7
<location> `tests/detectors/builtIn/test_metrics.py:269-274` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 8
<location> `detectors/built_in/custom_detectors_wrapper.py:146` </location>
<code_context>
    def __init__(self):
        super().__init__("custom")

        # check the imported code for potential security issues
        issues = static_code_analysis(module_path = os.path.join(os.path.dirname(__file__), "custom_detectors", "custom_detectors.py"))
        if issues:
            logging.error(f"Detected {len(issues)} potential security issues inside the custom_detectors file: {issues}")
            raise ImportError(f"Unsafe code detected in custom_detectors:\n" + "\n".join(issues))

        # grab custom detectors module
        module_path = os.path.join(os.path.dirname(__file__), "custom_detectors", "custom_detectors.py")
        spec = importlib.util.spec_from_file_location("custom_detectors.custom_detectors", module_path)
        custom_detectors = importlib.util.module_from_spec(spec)

        # inject any user utility functions into the code automatically
        inject_imports = {
            "use_instruments": use_instruments,
            "non_blocking": non_blocking,
        }
        for name, mod in inject_imports.items():
            setattr(custom_detectors, name, mod)

        # load the module
        sys.modules["custom_detectors.custom_detectors"] = custom_detectors
        spec.loader.exec_module(custom_detectors)

        self.registry = {name: obj for name, obj
                         in inspect.getmembers(custom_detectors, inspect.isfunction)
                         if not name.startswith("_") and name not in forbidden_names}
        self.function_needs_headers = {name: "headers" in inspect.signature(obj).parameters for name, obj in self.registry.items() }

        # check if functions have requested user prometheus metrics
        for name, func in self.registry.items():
            target = get_underlying_function(func)
            if getattr(target, "prometheus_instruments", False):
                instruments = target.prometheus_instruments
                for instrument in instruments:
                    super().add_instrument(instrument)

        logger.info(f"Registered the following custom detectors: {self.registry.keys()}")

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Replace calls to `dict.items` with `dict.values` when the keys are not used ([`replace-dict-items-with-values`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/replace-dict-items-with-values/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +46 to +51
def runner():
try:
func(*args, **kwargs)
except Exception as e:
logging.error(f"Exception in non-blocking guardrail {func.__name__}: {e}")
executor.submit(runner)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider including traceback information in error logging for non_blocking runner.

Logging only the exception message limits debugging. Use logging.exception or traceback.format_exc to capture full traceback details for better issue diagnosis.

Suggested change
def runner():
try:
func(*args, **kwargs)
except Exception as e:
logging.error(f"Exception in non-blocking guardrail {func.__name__}: {e}")
executor.submit(runner)
def runner():
try:
func(*args, **kwargs)
except Exception as e:
logging.exception(f"Exception in non-blocking guardrail {func.__name__}: {e}")
executor.submit(runner)

Comment on lines +23 to +24
def add_instrument(self, instrument):
self.instruments[instrument._name] = instrument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Directly accessing instrument._name may break if instrument implementation changes.

Consider replacing direct access to _name with a public property or method to ensure compatibility with future Prometheus client updates.

Suggested implementation:

    def add_instrument(self, instrument):
        self.instruments[instrument.name] = instrument

If the instrument class does not have a public name property, you should add one to the instrument class definition, for example:

@property
def name(self):
    return self._name

Then use instrument.name as shown above.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
detectors/Dockerfile.builtIn (1)

24-25: Consider using more restrictive permissions.

The chmod 777 grants full read/write/execute permissions to all users. While this may work in single-user container environments, it's unnecessarily permissive. Consider using chmod 755 instead to restrict write access to the owner while maintaining read/execute for others.

Apply this diff:

-RUN mkdir -p $PROMETHEUS_MULTIPROC_DIR && chmod 777 $PROMETHEUS_MULTIPROC_DIR
+RUN mkdir -p $PROMETHEUS_MULTIPROC_DIR && chmod 755 $PROMETHEUS_MULTIPROC_DIR
tests/conftest.py (1)

40-47: Address the unused request parameter.

The request parameter in the cleanup_prometheus_multiproc_dir fixture is not used. If this parameter is not required for pytest's fixture dependency ordering, it should be removed.

If the parameter is unnecessary, apply this diff:

 @pytest.fixture(scope="session", autouse=True)
-def cleanup_prometheus_multiproc_dir(request, prometheus_multiproc_dir):
+def cleanup_prometheus_multiproc_dir(prometheus_multiproc_dir):
     """
     Cleanup the PROMETHEUS_MULTIPROC_DIR after the test session.
     """
detectors/common/app.py (1)

10-10: Remove unused import.

The CollectorRegistry import is not used anywhere in this file. Consider removing it to keep imports clean.

Apply this diff:

-from prometheus_client import Counter, CollectorRegistry
+from prometheus_client import Counter
docs/custom_detectors.md (2)

3-3: Fix wordy phrase.

Consider replacing "based off of" with "based on" for more concise writing.

Apply this diff:

-custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexible
+custom detectors in the built-in server based on arbitrary Python code. This lets you quickly and flexibly

Note: Also fixed "flexible" → "flexibly" for grammatical correctness.


11-11: Add missing period after abbreviation.

In American English, "etc" should be followed by a period.

Apply this diff:

-   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc) are treated as non-detections 
+   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc.) are treated as non-detections 
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3a34fd and 077a77d.

📒 Files selected for processing (12)
  • detectors/Dockerfile.builtIn (1 hunks)
  • detectors/built_in/app.py (2 hunks)
  • detectors/built_in/custom_detectors/custom_detectors.py (1 hunks)
  • detectors/built_in/custom_detectors_wrapper.py (2 hunks)
  • detectors/common/app.py (2 hunks)
  • detectors/common/instrumented_detector.py (1 hunks)
  • detectors/huggingface/app.py (1 hunks)
  • docs/custom_detectors.md (1 hunks)
  • tests/conftest.py (2 hunks)
  • tests/detectors/builtIn/test_custom.py (0 hunks)
  • tests/detectors/builtIn/test_metrics.py (8 hunks)
  • tests/detectors/huggingface/test_metrics.py (4 hunks)
💤 Files with no reviewable changes (1)
  • tests/detectors/builtIn/test_custom.py
🧰 Additional context used
🧬 Code graph analysis (6)
detectors/built_in/custom_detectors/custom_detectors.py (1)
detectors/built_in/custom_detectors_wrapper.py (2)
  • use_instruments (19-30)
  • non_blocking (32-59)
tests/detectors/huggingface/test_metrics.py (1)
detectors/common/instrumented_detector.py (1)
  • set_instruments (20-21)
tests/detectors/builtIn/test_metrics.py (7)
tests/detectors/huggingface/test_metrics.py (2)
  • client (46-76)
  • get_metric_dict (32-42)
tests/detectors/builtIn/test_custom.py (2)
  • client (36-40)
  • cleanup_custom_detectors (43-46)
detectors/built_in/custom_detectors_wrapper.py (1)
  • CustomDetectorRegistry (144-203)
detectors/built_in/regex_detectors.py (1)
  • RegexDetectorRegistry (123-157)
detectors/built_in/file_type_detectors.py (1)
  • FileTypeDetectorRegistry (171-218)
detectors/common/app.py (1)
  • set_detector (119-121)
detectors/common/instrumented_detector.py (1)
  • set_instruments (20-21)
detectors/huggingface/app.py (1)
detectors/common/instrumented_detector.py (1)
  • set_instruments (20-21)
detectors/built_in/custom_detectors_wrapper.py (3)
detectors/built_in/base_detector_registry.py (1)
  • BaseDetectorRegistry (9-37)
detectors/common/scheme.py (1)
  • ContentAnalysisResponse (126-137)
detectors/common/instrumented_detector.py (1)
  • add_instrument (23-24)
detectors/built_in/app.py (3)
detectors/common/instrumented_detector.py (1)
  • set_instruments (20-21)
detectors/common/app.py (1)
  • cleanup_detector (131-133)
detectors/huggingface/app.py (1)
  • lifespan (16-25)
🪛 LanguageTool
docs/custom_detectors.md

[style] ~3-~3: ‘off of’ might be wordy. Consider a shorter alternative.
Context: ... detectors in the built-in server based off of arbitrary Python code. This lets you qu...

(EN_WORDINESS_PREMIUM_OFF_OF)


[style] ~11-~11: In American English, abbreviations like “etc.” require a period.
Context: ...ate to false (e.g., {}, "", None, etc) are treated as non-detections 2) B...

(ETC_PERIOD)

🪛 Ruff (0.14.2)
tests/conftest.py

40-40: Unused function argument: request

(ARG001)

detectors/built_in/custom_detectors/custom_detectors.py

9-9: Unused function argument: text

(ARG001)


20-20: Undefined name use_instruments

(F821)


30-30: Undefined name use_instruments

(F821)


31-31: Undefined name non_blocking

(F821)

tests/detectors/builtIn/test_metrics.py

76-76: Avoid specifying long messages outside the exception class

(TRY003)

detectors/built_in/custom_detectors_wrapper.py

28-28: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


49-49: Do not catch blind exception: Exception

(BLE001)


50-50: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


56-56: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)


152-152: f-string without any placeholders

Remove extraneous f prefix

(F541)


177-177: Loop control variable name not used within loop body

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (7)
detectors/common/app.py (1)

31-31: LGTM: Centralized metric prefix.

The introduction of METRIC_PREFIX is a good practice for maintaining consistent metric naming across the application.

detectors/common/instrumented_detector.py (1)

20-24: LGTM: Clearer API semantics.

The rename from add_instruments to set_instruments better communicates that this operation replaces the entire instruments dictionary rather than adding to it. The new add_instrument method provides a way to add individual instruments.

detectors/huggingface/app.py (1)

19-19: LGTM: Consistent with API refactor.

The change from add_instruments to set_instruments aligns with the API refactor in detectors/common/instrumented_detector.py.

tests/detectors/huggingface/test_metrics.py (2)

12-12: LGTM: Dynamic metric naming.

Importing and using METRIC_PREFIX ensures tests stay in sync with the metric naming scheme defined in the application code.


75-75: LGTM: Consistent with API refactor.

The change to set_instruments aligns with the API refactor across the codebase.

detectors/built_in/app.py (2)

34-39: LGTM: Proper multiprocess metrics collection.

The new /metrics endpoint correctly uses MultiProcessCollector to aggregate metrics across multiple worker processes, which aligns with the Dockerfile setup of PROMETHEUS_MULTIPROC_DIR.


24-24: LGTM: Consistent with API refactor.

The change to set_instruments is consistent with the API refactor in detectors/common/instrumented_detector.py.

Comment on lines +20 to +32
@use_instruments(instruments=[prompt_rejection_counter])
def has_metrics(text: str) -> bool:
if "sorry" in text:
prompt_rejection_counter.inc()
return False

background_metric = Counter(
"trustyai_guardrails_background_metric",
"Runs some logic in the background without blocking the /detections call"
)
@use_instruments(instruments=[background_metric])
@non_blocking(return_value=False)
def background_function(text: str) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Module import fails without decorator injection

Importing detectors.built_in.custom_detectors.custom_detectors now raises NameError because use_instruments / non_blocking are undefined unless the wrapper has already injected them. Any direct import (e.g., docs or tooling) blows up before tests can run. Please restore explicit availability of these decorators when the module is loaded.

+from detectors.built_in.custom_detectors_wrapper import non_blocking, use_instruments
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@use_instruments(instruments=[prompt_rejection_counter])
def has_metrics(text: str) -> bool:
if "sorry" in text:
prompt_rejection_counter.inc()
return False
background_metric = Counter(
"trustyai_guardrails_background_metric",
"Runs some logic in the background without blocking the /detections call"
)
@use_instruments(instruments=[background_metric])
@non_blocking(return_value=False)
def background_function(text: str) -> bool:
from detectors.built_in.custom_detectors_wrapper import non_blocking, use_instruments
@use_instruments(instruments=[prompt_rejection_counter])
def has_metrics(text: str) -> bool:
if "sorry" in text:
prompt_rejection_counter.inc()
return False
background_metric = Counter(
"trustyai_guardrails_background_metric",
"Runs some logic in the background without blocking the /detections call"
)
@use_instruments(instruments=[background_metric])
@non_blocking(return_value=False)
def background_function(text: str) -> bool:
🧰 Tools
🪛 Ruff (0.14.2)

20-20: Undefined name use_instruments

(F821)


30-30: Undefined name use_instruments

(F821)


31-31: Undefined name non_blocking

(F821)

🤖 Prompt for AI Agents
In detectors/built_in/custom_detectors/custom_detectors.py around lines 20 to
32, the decorators use_instruments and non_blocking are referenced before they
are defined by any injection, causing a NameError on import; explicitly import
(or fallback-define) those decorators and any required symbols (e.g.,
use_instruments, non_blocking, prompt_rejection_counter, Counter) at the top of
the module so they exist at import time. Add the appropriate from ... import ...
statements for the real implementations (or a safe noop fallback) before the
decorated functions, and ensure background_metric is created after importing
Counter; this restores decorator availability and prevents import-time failures.

detector_registry.set_instruments(app.state.instruments)
yield
app.cleanup_detector()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add clean-up of old multiprocess metrics to prevent accumulation?

my concern is that old Prometheus multiprocess metric files might accumulate here, causing incorrect metrics after container restarts -- what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this only might be the case in instances when the server crashes, so it's no big deal


forbidden_names = [use_instruments.__name__, non_blocking.__name__]

def get_underlying_function(func):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is very nice

def inner_layer_1(func):
@functools.wraps(func)
def inner_layer_2(*args, **kwargs):
executor = getattr(non_blocking, "_executor", None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could perhpas add shutdown as currently this is never explicitly shut down?

RobGeada and others added 3 commits November 3, 2025 15:41
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/custom_detectors.md (2)

3-3: Minor: Consider "based on" instead of "based off of."

The phrase "based off of" is slightly wordy; "based on" is more concise while maintaining the same meaning.

-custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexibly
+custom detectors in the built-in server based on arbitrary Python code. This lets you quickly and flexibly

11-11: Minor: Add period to abbreviation "etc."

The abbreviation "etc." requires a period in American English.

-   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc) are treated as non-detections
+   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc.) are treated as non-detections
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 077a77d and dc2f360.

📒 Files selected for processing (1)
  • docs/custom_detectors.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/custom_detectors.md

[style] ~3-~3: ‘off of’ might be wordy. Consider a shorter alternative.
Context: ... detectors in the built-in server based off of arbitrary Python code. This lets you qu...

(EN_WORDINESS_PREMIUM_OFF_OF)


[style] ~11-~11: In American English, abbreviations like “etc.” require a period.
Context: ...ate to false (e.g., {}, "", None, etc) are treated as non-detections 2) B...

(ETC_PERIOD)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (1)
docs/custom_detectors.md (1)

1-41: Excellent documentation of custom detector framework.

This documentation clearly explains the rules for creating custom detectors, the purpose of the utility decorators (@use_instruments and @non_blocking), and provides helpful references to implementation examples. The structure is logical and the security restrictions are well-documented. Past typos have been corrected, and the content aligns well with the PR objectives for exposing custom detector capabilities.

@RobGeada RobGeada merged commit cbe5b63 into trustyai-explainability:main Nov 4, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants