Feat: Custom user metrics and non-blocking guardrails #57

RobGeada · 2025-11-03T12:37:34Z

Summary by Sourcery

Add support for user-defined Prometheus metrics and non-blocking custom guardrails, standardize metric naming with a prefix constant, refactor detector instrumentation and module loading, expose a multiprocess /metrics endpoint, and provide documentation and tests for the new capabilities

New Features:

Allow custom detectors to register Prometheus metrics using a @use_instruments decorator
Introduce a @non_blocking decorator to run custom guardrail functions asynchronously without blocking requests
Expose a /metrics endpoint that aggregates multiprocess Prometheus metrics

Enhancements:

Standardize all metric names behind a METRIC_PREFIX constant
Replace add_instruments with set_instruments and support dynamic instrument registration via add_instrument
Refactor custom detectors loader to use importlib, inject utility decorators, and perform security checks

Documentation:

Add docs/custom_detectors.md describing custom detector guidelines and available decorators

Tests:

Add fixtures for Prometheus multiprocess directory setup and cleanup
Add tests for user-defined metrics and non-blocking guardrails and update existing tests to use METRIC_PREFIX and clear metric registry before each test

Summary by CodeRabbit

New Features
- Added /metrics endpoint for Prometheus metrics collection
- Enabled custom detectors with built-in metric instrumentation and non-blocking background detectors
- Added decorators for metrics tracking and background execution for custom detectors
Documentation
- New guide for creating custom detectors with security rules and examples
Changes
- Standardized metric names with a common prefix
- Instrument configuration API updated from add_instruments → set_instruments
- Docker/test setup improved for Prometheus multiprocess support
Tests
- Added/updated tests covering metrics and non-blocking behavior

sourcery-ai · 2025-11-03T12:37:40Z

Reviewer's Guide

This PR introduces a unified metric prefix and refactors all Prometheus instrumentation to use it, exposes a proper multiprocess‐aware /metrics endpoint, adds support for user‐defined metrics and non‐blocking guardrail execution via new decorators, updates the instrumentation API, and extends the test suite (with fixtures) to validate these features end-to-end.

Sequence diagram for non-blocking guardrail execution

sequenceDiagram
    participant Client
    participant API
    participant "CustomDetectorRegistry"
    participant "ThreadPoolExecutor"
    Client->>API: POST /api/v1/text/contents
    API->>"CustomDetectorRegistry": Call background_function (decorated with @non_blocking)
    "CustomDetectorRegistry"->>"ThreadPoolExecutor": Submit function to background thread
    "CustomDetectorRegistry"-->>API: Return immediately with return_value (e.g., False)
    Note over "ThreadPoolExecutor": Function logic runs asynchronously

Entity relationship diagram for Prometheus metrics registry

erDiagram
    DetectorBaseAPI {
        string state_instruments
    }
    InstrumentedDetector {
        string instruments
    }
    CustomDetectorRegistry {
        string registry
    }
    PrometheusInstrument {
    }
    DetectorBaseAPI ||--o| InstrumentedDetector : sets instruments
    InstrumentedDetector ||--o| CustomDetectorRegistry : inherited by
    CustomDetectorRegistry ||--o| PrometheusInstrument : adds instrument

Class diagram for new decorators and instrumentation API

classDiagram
    class CustomDetectorRegistry {
        +registry: dict
        +function_needs_headers: dict
        +__init__()
    }
    class use_instruments {
        +__call__(func)
    }
    class non_blocking {
        +__call__(func)
    }
    class InstrumentedDetector {
        +set_instruments(instruments)
        +add_instrument(instrument)
        +increment_detector_instruments(function_name, is_detection)
    }
    CustomDetectorRegistry --> use_instruments : uses
    CustomDetectorRegistry --> non_blocking : uses
    InstrumentedDetector <|.. CustomDetectorRegistry : inherits

File-Level Changes

Change	Details	Files
Metric prefix constant and instrumentation API refactor	Introduce METRIC_PREFIX constant in common/app Rename add_instruments to set_instruments and add add_instrument in instrumented_detector Update all metric definitions and references to use f"{METRIC_PREFIX}_..." Adjust detector setup in built-in and huggingface apps to call set_instruments	`detectors/common/app.py` `detectors/common/instrumented_detector.py` `detectors/built_in/app.py` `detectors/huggingface/app.py` `tests/detectors/builtIn/test_metrics.py` `tests/detectors/huggingface/test_metrics.py`
Expose Prometheus /metrics endpoint with multiprocess support	Remove Instrumentator-based instrumentation Add GET /metrics handler using CollectorRegistry and MultiProcessCollector Return generated_latest data with correct media type	`detectors/built_in/app.py`
Support custom user metrics and non-blocking guardrails	Implement use_instruments decorator to attach Prometheus instruments to custom functions Implement non_blocking decorator to run guardrails asynchronously Inject decorators into custom_detectors module via importlib in CustomDetectorRegistry Auto-register user metrics instruments during registry initialization	`detectors/built_in/custom_detectors_wrapper.py` `docs/custom_detectors.md`
Enhance test suite for metrics and non-blocking behavior	Add session fixtures for PROMETHEUS_MULTIPROC_DIR setup and cleanup Introduce wait_for_metric helper and clear registry before tests Update metric filtering in get_metric_dict to ignore HELP/TYPE lines Add tests for user-defined metrics and non-blocking execution timing	`tests/conftest.py` `tests/detectors/builtIn/test_metrics.py`
Update custom_detectors example to showcase new decorators	Simplify example file to include slow_func, has_metrics, and background_function Demonstrate use_instruments and non_blocking usage in example custom detectors	`detectors/built_in/custom_detectors/custom_detectors.py`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

coderabbitai · 2025-11-03T12:37:43Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR adds Prometheus multiprocess support and a /metrics endpoint, introduces instrumentation-aware decorators (@use_instruments, @non_blocking) and dynamic loading/security checks for custom detectors, renames the instruments API from add_instruments → set_instruments (plus add_instrument), adds a METRIC_PREFIX, and updates tests and docs accordingly.

Changes

Cohort / File(s)	Summary
Prometheus multiprocess setup `detectors/Dockerfile.builtIn`, `tests/conftest.py`	Set `PROMETHEUS_MULTIPROC_DIR` and create the directory in Dockerfile; add session-scoped pytest fixtures to create/cleanup a multiprocess temp dir during tests.
Metrics endpoint & registry `detectors/built_in/app.py`, `detectors/common/app.py`	Add GET `/metrics` route using `CollectorRegistry` + `MultiProcessCollector` and `generate_latest`; introduce `METRIC_PREFIX` constant and switch metric names to use it.
Instrument API & callsites `detectors/common/instrumented_detector.py`, `detectors/huggingface/app.py`, `detectors/huggingface/detector.py`	Replace `add_instruments(...)` with `set_instruments(...)`, add `add_instrument(instrument)` helper, and update detector call sites to call `set_instruments`.
Custom detectors & decorators `detectors/built_in/custom_detectors/custom_detectors.py`, `detectors/built_in/custom_detectors_wrapper.py`	Replace example detectors with Python detectors exposing Prometheus Counters; add `@use_instruments` and `@non_blocking` decorators, `get_underlying_function`, dynamic import of `custom_detectors.py`, security/static analysis checks, and registration of prometheus_instruments.
Documentation `docs/custom_detectors.md`	New guide describing custom detector rules, security restrictions, and usage of `@use_instruments` / `@non_blocking`.
Tests `tests/detectors/builtIn/test_custom.py`, `tests/detectors/builtIn/test_metrics.py`, `tests/detectors/huggingface/test_metrics.py`	Add/modify tests to manage prometheus multiproc dir, adopt `METRIC_PREFIX`, replace `add_instruments` → `set_instruments`, add `wait_for_metric` helper, add tests for user-provided and non-blocking metrics; minor formatting tweaks.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant App as FastAPI App
    participant DetectorRegistry as Detector Registry
    participant CustomModule as custom_detectors.py
    participant Decorator as use_instruments / non_blocking
    participant PromCollector as Prometheus CollectorRegistry

    rect rgb(230,245,255)
    note over App,DetectorRegistry: Startup
    App->>DetectorRegistry: set_instruments(app.state.instruments)
    DetectorRegistry->>App: instruments stored
    end

    rect rgb(235,255,235)
    note over Client,App: Detection request
    Client->>App: POST /detections
    App->>CustomModule: call detector func (may be wrapped)
    CustomModule->>Decorator: decorator logic (attach instruments or spawn background)
    Decorator->>PromCollector: increment/update instrument(s)
    Decorator-->>App: detector result (immediate if non_blocking)
    App-->>Client: response
    end

    rect rgb(255,245,235)
    note over Client,App: Metrics scrape
    Client->>App: GET /metrics
    App->>PromCollector: CollectorRegistry + MultiProcessCollector -> generate_latest()
    PromCollector-->>App: metrics payload (CONTENT_TYPE_LATEST)
    App-->>Client: /metrics response
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pay special attention to:
- detectors/built_in/custom_detectors_wrapper.py (decorator wrapping, attribute propagation, dynamic import, static analysis/security checks).
- detectors/built_in/app.py (CollectorRegistry + MultiProcessCollector usage and content-type handling).
- Consistent migration of add_instruments → set_instruments across modules and tests.

Possibly related PRs

Feat: Modularize Prometheus metrics and add them to the HF runtime #54 — changes to Prometheus instrumentation and instrument registry API closely related to this PR.

Suggested labels

ok-to-test

Suggested reviewers

saichandrapandraju
m-misiura

Poem

🐰 I hopped through code to stitch a metrics seam,
Multiprocess carrots and counters in a stream,
Decorators whisper, instruments hum bright,
/metrics opens up to Prometheus' light,
Happy detectors dance — metrics take flight! ✨

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title 'Feat: Custom user metrics and non-blocking guardrails' directly aligns with the main objectives and changes in this changeset. The PR introduces two key features: user-defined Prometheus metrics via the @use_instruments decorator and non-blocking guardrail execution via the @non_blocking decorator, which are precisely the primary focus of the title. The title is concise, specific, and accurately represents the core functionality being added.

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `detectors/built_in/custom_detectors_wrapper.py:46-51` </location>
<code_context>
+            if executor is None:
+                executor = ThreadPoolExecutor()
+                non_blocking._executor = executor
+            def runner():
+                try:
+                    func(*args, **kwargs)
+                except Exception as e:
+                    logging.error(f"Exception in non-blocking guardrail {func.__name__}: {e}")
+            executor.submit(runner)
+
</code_context>

<issue_to_address>
**suggestion:** Consider including traceback information in error logging for non_blocking runner.

Logging only the exception message limits debugging. Use logging.exception or traceback.format_exc to capture full traceback details for better issue diagnosis.

```suggestion
            def runner():
                try:
                    func(*args, **kwargs)
                except Exception as e:
                    logging.exception(f"Exception in non-blocking guardrail {func.__name__}: {e}")
            executor.submit(runner)
```
</issue_to_address>

### Comment 2
<location> `detectors/common/instrumented_detector.py:23-24` </location>
<code_context>
+    def set_instruments(self, instruments):
+        self.instruments = instruments
+
+    def add_instrument(self, instrument):
+        self.instruments[instrument._name] = instrument

     def increment_detector_instruments(self, function_name: str, is_detection: bool):
</code_context>

<issue_to_address>
**suggestion:** Directly accessing instrument._name may break if instrument implementation changes.

Consider replacing direct access to _name with a public property or method to ensure compatibility with future Prometheus client updates.

Suggested implementation:

```python
    def add_instrument(self, instrument):
        self.instruments[instrument.name] = instrument

```

If the instrument class does not have a public `name` property, you should add one to the instrument class definition, for example:

```python
@property
def name(self):
    return self._name
```

Then use `instrument.name` as shown above.
</issue_to_address>

### Comment 3
<location> `docs/custom_detectors.md:3-4` </location>
<code_context>
+# Custom Detectors
+You can overwrite [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) to create
+custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexible
+create your own detection logic!
+
</code_context>

<issue_to_address>
**issue (typo):** Typo: 'flexible' should be 'flexibly'.

It should say 'quickly and flexibly create your own detection logic!' to be grammatically correct.

```suggestion
custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexibly
create your own detection logic!
```
</issue_to_address>

### Comment 4
<location> `docs/custom_detectors.md:15` </location>
<code_context>
+   2) Boolean responses of `true` are considered a detection 
+      * see the `over_100_characters` example in [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) for usage
+   3) Dict response that are parseable as a `ContentAnalysisResponse` object are considered a detection
+      * see the `contains_word` example in [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) for ursage
+4) This code may not import `os`, `subprocess`, `sys`, or `shutil` for security reasons
+5) This code may not call `eval`, `exec`, `open`, `compile`, or `input` for security reasons
</code_context>

<issue_to_address>
**issue (typo):** Typo: 'ursage' should be 'usage'.

Change 'ursage' to 'usage' in the referenced line.

```suggestion
      * see the `contains_word` example in [custom_detectors.py](detectors/built_in/custom_detectors/custom_detectors.py) for usage
```
</issue_to_address>

### Comment 5
<location> `docs/custom_detectors.md:23-25` </location>
<code_context>
+## Utility Decorators
+The following decorators are also available, and are automatically imported into the custom_detectors.py file:
+
+### `@use_instruments(instruments=[$INSTRUMENT_1, ..., $INSTRUMENT_N]`
+ Use this decorator to register your own Prometheus instruments with the server's main
+ `/metrics` registry. See the `function_that_has_prometheus_metrics` example 
</code_context>

<issue_to_address>
**issue (typo):** Missing closing parenthesis in decorator example.

Please add the missing closing parenthesis to the decorator example for accuracy.

```suggestion
### `@use_instruments(instruments=[$INSTRUMENT_1, ..., $INSTRUMENT_N])`
 Use this decorator to register your own Prometheus instruments with the server's main
 `/metrics` registry. See the `function_that_has_prometheus_metrics` example 
```
</issue_to_address>

### Comment 6
<location> `tests/detectors/builtIn/test_metrics.py:254-259` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 7
<location> `tests/detectors/builtIn/test_metrics.py:269-274` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 8
<location> `detectors/built_in/custom_detectors_wrapper.py:146` </location>
<code_context>
    def __init__(self):
        super().__init__("custom")

        # check the imported code for potential security issues
        issues = static_code_analysis(module_path = os.path.join(os.path.dirname(__file__), "custom_detectors", "custom_detectors.py"))
        if issues:
            logging.error(f"Detected {len(issues)} potential security issues inside the custom_detectors file: {issues}")
            raise ImportError(f"Unsafe code detected in custom_detectors:\n" + "\n".join(issues))

        # grab custom detectors module
        module_path = os.path.join(os.path.dirname(__file__), "custom_detectors", "custom_detectors.py")
        spec = importlib.util.spec_from_file_location("custom_detectors.custom_detectors", module_path)
        custom_detectors = importlib.util.module_from_spec(spec)

        # inject any user utility functions into the code automatically
        inject_imports = {
            "use_instruments": use_instruments,
            "non_blocking": non_blocking,
        }
        for name, mod in inject_imports.items():
            setattr(custom_detectors, name, mod)

        # load the module
        sys.modules["custom_detectors.custom_detectors"] = custom_detectors
        spec.loader.exec_module(custom_detectors)

        self.registry = {name: obj for name, obj
                         in inspect.getmembers(custom_detectors, inspect.isfunction)
                         if not name.startswith("_") and name not in forbidden_names}
        self.function_needs_headers = {name: "headers" in inspect.signature(obj).parameters for name, obj in self.registry.items() }

        # check if functions have requested user prometheus metrics
        for name, func in self.registry.items():
            target = get_underlying_function(func)
            if getattr(target, "prometheus_instruments", False):
                instruments = target.prometheus_instruments
                for instrument in instruments:
                    super().add_instrument(instrument)

        logger.info(f"Registered the following custom detectors: {self.registry.keys()}")

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Replace calls to `dict.items` with `dict.values` when the keys are not used ([`replace-dict-items-with-values`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/replace-dict-items-with-values/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-11-03T12:38:27Z

detectors/built_in/custom_detectors_wrapper.py

+            def runner():
+                try:
+                    func(*args, **kwargs)
+                except Exception as e:
+                    logging.error(f"Exception in non-blocking guardrail {func.__name__}: {e}")
+            executor.submit(runner)


suggestion: Consider including traceback information in error logging for non_blocking runner.

Logging only the exception message limits debugging. Use logging.exception or traceback.format_exc to capture full traceback details for better issue diagnosis.

Suggested change

def runner():

try:

func(*args, **kwargs)

except Exception as e:

logging.error(f"Exception in non-blocking guardrail {func.__name__}: {e}")

executor.submit(runner)

def runner():

try:

func(*args, **kwargs)

except Exception as e:

logging.exception(f"Exception in non-blocking guardrail {func.__name__}: {e}")

executor.submit(runner)

sourcery-ai · 2025-11-03T12:38:27Z

detectors/common/instrumented_detector.py

+    def add_instrument(self, instrument):
+        self.instruments[instrument._name] = instrument


suggestion: Directly accessing instrument._name may break if instrument implementation changes.

Consider replacing direct access to _name with a public property or method to ensure compatibility with future Prometheus client updates.

Suggested implementation:

def add_instrument(self, instrument): self.instruments[instrument.name] = instrument

If the instrument class does not have a public name property, you should add one to the instrument class definition, for example:

@property def name(self): return self._name

Then use instrument.name as shown above.

docs/custom_detectors.md

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

detectors/Dockerfile.builtIn (1)
24-25: Consider using more restrictive permissions.

The chmod 777 grants full read/write/execute permissions to all users. While this may work in single-user container environments, it's unnecessarily permissive. Consider using chmod 755 instead to restrict write access to the owner while maintaining read/execute for others.

Apply this diff:
-RUN mkdir -p $PROMETHEUS_MULTIPROC_DIR && chmod 777 $PROMETHEUS_MULTIPROC_DIR
+RUN mkdir -p $PROMETHEUS_MULTIPROC_DIR && chmod 755 $PROMETHEUS_MULTIPROC_DIR
tests/conftest.py (1)
40-47: Address the unused request parameter.

The request parameter in the cleanup_prometheus_multiproc_dir fixture is not used. If this parameter is not required for pytest's fixture dependency ordering, it should be removed.

If the parameter is unnecessary, apply this diff:
 @pytest.fixture(scope="session", autouse=True)
-def cleanup_prometheus_multiproc_dir(request, prometheus_multiproc_dir):
+def cleanup_prometheus_multiproc_dir(prometheus_multiproc_dir):
     """
     Cleanup the PROMETHEUS_MULTIPROC_DIR after the test session.
     """
detectors/common/app.py (1)
10-10: Remove unused import.

The CollectorRegistry import is not used anywhere in this file. Consider removing it to keep imports clean.

Apply this diff:
-from prometheus_client import Counter, CollectorRegistry
+from prometheus_client import Counter
docs/custom_detectors.md (2)
3-3: Fix wordy phrase.

Consider replacing "based off of" with "based on" for more concise writing.

Apply this diff:
-custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexible
+custom detectors in the built-in server based on arbitrary Python code. This lets you quickly and flexibly
Note: Also fixed "flexible" → "flexibly" for grammatical correctness.

11-11: Add missing period after abbreviation.

In American English, "etc" should be followed by a period.

Apply this diff:
-   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc) are treated as non-detections 
+   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc.) are treated as non-detections 

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3a34fd and 077a77d.

📒 Files selected for processing (12)

detectors/Dockerfile.builtIn (1 hunks)
detectors/built_in/app.py (2 hunks)
detectors/built_in/custom_detectors/custom_detectors.py (1 hunks)
detectors/built_in/custom_detectors_wrapper.py (2 hunks)
detectors/common/app.py (2 hunks)
detectors/common/instrumented_detector.py (1 hunks)
detectors/huggingface/app.py (1 hunks)
docs/custom_detectors.md (1 hunks)
tests/conftest.py (2 hunks)
tests/detectors/builtIn/test_custom.py (0 hunks)
tests/detectors/builtIn/test_metrics.py (8 hunks)
tests/detectors/huggingface/test_metrics.py (4 hunks)

💤 Files with no reviewable changes (1)

tests/detectors/builtIn/test_custom.py

🧰 Additional context used

🧬 Code graph analysis (6)

detectors/built_in/custom_detectors/custom_detectors.py (1)

detectors/built_in/custom_detectors_wrapper.py (2)

use_instruments (19-30)

non_blocking (32-59)

tests/detectors/huggingface/test_metrics.py (1)

detectors/common/instrumented_detector.py (1)

set_instruments (20-21)

tests/detectors/builtIn/test_metrics.py (7)

tests/detectors/huggingface/test_metrics.py (2)

client (46-76)

get_metric_dict (32-42)

tests/detectors/builtIn/test_custom.py (2)

client (36-40)

cleanup_custom_detectors (43-46)

detectors/built_in/custom_detectors_wrapper.py (1)

CustomDetectorRegistry (144-203)

detectors/built_in/regex_detectors.py (1)

RegexDetectorRegistry (123-157)

detectors/built_in/file_type_detectors.py (1)

FileTypeDetectorRegistry (171-218)

detectors/common/app.py (1)

set_detector (119-121)

detectors/common/instrumented_detector.py (1)

set_instruments (20-21)

detectors/huggingface/app.py (1)

detectors/common/instrumented_detector.py (1)

set_instruments (20-21)

detectors/built_in/custom_detectors_wrapper.py (3)

detectors/built_in/base_detector_registry.py (1)

BaseDetectorRegistry (9-37)

detectors/common/scheme.py (1)

ContentAnalysisResponse (126-137)

detectors/common/instrumented_detector.py (1)

add_instrument (23-24)

detectors/built_in/app.py (3)

detectors/common/instrumented_detector.py (1)

set_instruments (20-21)

detectors/common/app.py (1)

cleanup_detector (131-133)

detectors/huggingface/app.py (1)

lifespan (16-25)

🪛 LanguageTool

docs/custom_detectors.md

[style] ~3-~3: ‘off of’ might be wordy. Consider a shorter alternative.
Context: ... detectors in the built-in server based off of arbitrary Python code. This lets you qu...

(EN_WORDINESS_PREMIUM_OFF_OF)

[style] ~11-~11: In American English, abbreviations like “etc.” require a period.
Context: ...ate to false (e.g., {}, "", None, etc) are treated as non-detections 2) B...

(ETC_PERIOD)

🪛 Ruff (0.14.2)

tests/conftest.py

40-40: Unused function argument: request

(ARG001)

detectors/built_in/custom_detectors/custom_detectors.py

9-9: Unused function argument: text

(ARG001)

20-20: Undefined name use_instruments

(F821)

30-30: Undefined name use_instruments

(F821)

31-31: Undefined name non_blocking

(F821)

tests/detectors/builtIn/test_metrics.py

76-76: Avoid specifying long messages outside the exception class

(TRY003)

detectors/built_in/custom_detectors_wrapper.py

28-28: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)

49-49: Do not catch blind exception: Exception

(BLE001)

50-50: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

56-56: Do not call setattr with a constant attribute value. It is not any safer than normal property access.

Replace setattr with assignment

(B010)

152-152: f-string without any placeholders

Remove extraneous f prefix

(F541)

177-177: Loop control variable name not used within loop body

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

🔇 Additional comments (7)

detectors/common/app.py (1)

31-31: LGTM: Centralized metric prefix.

The introduction of METRIC_PREFIX is a good practice for maintaining consistent metric naming across the application.

detectors/common/instrumented_detector.py (1)

20-24: LGTM: Clearer API semantics.

The rename from add_instruments to set_instruments better communicates that this operation replaces the entire instruments dictionary rather than adding to it. The new add_instrument method provides a way to add individual instruments.

detectors/huggingface/app.py (1)

19-19: LGTM: Consistent with API refactor.

The change from add_instruments to set_instruments aligns with the API refactor in detectors/common/instrumented_detector.py.

tests/detectors/huggingface/test_metrics.py (2)

12-12: LGTM: Dynamic metric naming.

Importing and using METRIC_PREFIX ensures tests stay in sync with the metric naming scheme defined in the application code.

75-75: LGTM: Consistent with API refactor.

The change to set_instruments aligns with the API refactor across the codebase.

detectors/built_in/app.py (2)

34-39: LGTM: Proper multiprocess metrics collection.

The new /metrics endpoint correctly uses MultiProcessCollector to aggregate metrics across multiple worker processes, which aligns with the Dockerfile setup of PROMETHEUS_MULTIPROC_DIR.

24-24: LGTM: Consistent with API refactor.

The change to set_instruments is consistent with the API refactor in detectors/common/instrumented_detector.py.

coderabbitai · 2025-11-03T12:48:48Z

detectors/built_in/custom_detectors/custom_detectors.py

+@use_instruments(instruments=[prompt_rejection_counter])  
+def has_metrics(text: str) -> bool:
+    if "sorry" in text:
+        prompt_rejection_counter.inc()
+    return False
+
+background_metric = Counter(
+    "trustyai_guardrails_background_metric",
+    "Runs some logic in the background without blocking the /detections call"
+)
+@use_instruments(instruments=[background_metric])
+@non_blocking(return_value=False)
+def background_function(text: str) -> bool:


⚠️ Potential issue | 🟠 Major

Module import fails without decorator injection

Importing detectors.built_in.custom_detectors.custom_detectors now raises NameError because use_instruments / non_blocking are undefined unless the wrapper has already injected them. Any direct import (e.g., docs or tooling) blows up before tests can run. Please restore explicit availability of these decorators when the module is loaded.

+from detectors.built_in.custom_detectors_wrapper import non_blocking, use_instruments

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@use_instruments(instruments=[prompt_rejection_counter])

def has_metrics(text: str) -> bool:

if "sorry" in text:

prompt_rejection_counter.inc()

return False

background_metric = Counter(

"trustyai_guardrails_background_metric",

"Runs some logic in the background without blocking the /detections call"

)

@use_instruments(instruments=[background_metric])

@non_blocking(return_value=False)

def background_function(text: str) -> bool:

from detectors.built_in.custom_detectors_wrapper import non_blocking, use_instruments

@use_instruments(instruments=[prompt_rejection_counter])

def has_metrics(text: str) -> bool:

if "sorry" in text:

prompt_rejection_counter.inc()

return False

background_metric = Counter(

"trustyai_guardrails_background_metric",

"Runs some logic in the background without blocking the /detections call"

)

@use_instruments(instruments=[background_metric])

@non_blocking(return_value=False)

def background_function(text: str) -> bool:

🧰 Tools

🪛 Ruff (0.14.2)

20-20: Undefined name use_instruments

(F821)

30-30: Undefined name use_instruments

(F821)

31-31: Undefined name non_blocking

(F821)

🤖 Prompt for AI Agents

In detectors/built_in/custom_detectors/custom_detectors.py around lines 20 to 32, the decorators use_instruments and non_blocking are referenced before they are defined by any injection, causing a NameError on import; explicitly import (or fallback-define) those decorators and any required symbols (e.g., use_instruments, non_blocking, prompt_rejection_counter, Counter) at the top of the module so they exist at import time. Add the appropriate from ... import ... statements for the real implementations (or a safe noop fallback) before the decorated functions, and ensure background_metric is created after importing Counter; this restores decorator availability and prevents import-time failures.

docs/custom_detectors.md

m-misiura · 2025-11-03T14:22:01Z

detectors/built_in/app.py

+        detector_registry.set_instruments(app.state.instruments)
    yield
    app.cleanup_detector()



should we add clean-up of old multiprocess metrics to prevent accumulation?

my concern is that old Prometheus multiprocess metric files might accumulate here, causing incorrect metrics after container restarts -- what do you think?

I think this only might be the case in instances when the server crashes, so it's no big deal

m-misiura · 2025-11-03T14:49:29Z

detectors/built_in/custom_detectors_wrapper.py

+
+forbidden_names = [use_instruments.__name__, non_blocking.__name__]
+
+def get_underlying_function(func):


this is very nice

m-misiura · 2025-11-03T14:51:28Z

detectors/built_in/custom_detectors_wrapper.py

+    def inner_layer_1(func):
+        @functools.wraps(func)
+        def inner_layer_2(*args, **kwargs):
+            executor = getattr(non_blocking, "_executor", None)


could perhpas add shutdown as currently this is never explicitly shut down?

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

docs/custom_detectors.md (2)
3-3: Minor: Consider "based on" instead of "based off of."

The phrase "based off of" is slightly wordy; "based on" is more concise while maintaining the same meaning.
-custom detectors in the built-in server based off of arbitrary Python code. This lets you quickly and flexibly
+custom detectors in the built-in server based on arbitrary Python code. This lets you quickly and flexibly
11-11: Minor: Add period to abbreviation "etc."

The abbreviation "etc." requires a period in American English.
-   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc) are treated as non-detections
+   1) Return values that evaluate to false (e.g., `{}`, `""`, `None`, etc.) are treated as non-detections

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 077a77d and dc2f360.

📒 Files selected for processing (1)

docs/custom_detectors.md (1 hunks)

🧰 Additional context used

🪛 LanguageTool

docs/custom_detectors.md

[style] ~3-~3: ‘off of’ might be wordy. Consider a shorter alternative.
Context: ... detectors in the built-in server based off of arbitrary Python code. This lets you qu...

(EN_WORDINESS_PREMIUM_OFF_OF)

[style] ~11-~11: In American English, abbreviations like “etc.” require a period.
Context: ...ate to false (e.g., {}, "", None, etc) are treated as non-detections 2) B...

(ETC_PERIOD)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Sourcery review

🔇 Additional comments (1)

docs/custom_detectors.md (1)

1-41: Excellent documentation of custom detector framework.

This documentation clearly explains the rules for creating custom detectors, the purpose of the utility decorators (@use_instruments and @non_blocking), and provides helpful references to implementation examples. The structure is logical and the security restrictions are well-documented. Past typos have been corrected, and the content aligns well with the PR objectives for exposing custom detector capabilities.

Add custom metrics and non-blocking guardrails to custom detectors

077a77d

sourcery-ai bot reviewed Nov 3, 2025

View reviewed changes

coderabbitai bot reviewed Nov 3, 2025

View reviewed changes

m-misiura reviewed Nov 3, 2025

View reviewed changes

RobGeada and others added 3 commits November 3, 2025 15:41

Update docs/custom_detectors.md

6ee377a

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update docs/custom_detectors.md

c4b8d89

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Update docs/custom_detectors.md

dc2f360

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Nov 3, 2025

View reviewed changes

m-misiura approved these changes Nov 3, 2025

View reviewed changes

RobGeada merged commit cbe5b63 into trustyai-explainability:main Nov 4, 2025
2 of 3 checks passed

		def add_instrument(self, instrument):
		self.instruments[instrument._name] = instrument


		forbidden_names = [use_instruments.__name__, non_blocking.__name__]

		def get_underlying_function(func):

Feat: Custom user metrics and non-blocking guardrails #57

Feat: Custom user metrics and non-blocking guardrails #57

Uh oh!

Conversation

RobGeada commented Nov 3, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Summary by CodeRabbit

Uh oh!

sourcery-ai bot commented Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for non-blocking guardrail execution

Entity relationship diagram for Prometheus metrics registry

Class diagram for new decorators and instrumentation API

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

coderabbitai bot commented Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

m-misiura Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

m-misiura Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

m-misiura Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

m-misiura Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

RobGeada commented Nov 3, 2025 •

edited by coderabbitai bot

Loading

sourcery-ai bot commented Nov 3, 2025 •

edited

Loading

coderabbitai bot commented Nov 3, 2025 •

edited

Loading