Skip to content

feat(server): add ingest token guardrails MVP#1554

Open
yeyitech wants to merge 1 commit intovolcengine:mainfrom
yeyitech:feat/rfc1430-token-guardrails
Open

feat(server): add ingest token guardrails MVP#1554
yeyitech wants to merge 1 commit intovolcengine:mainfrom
yeyitech:feat/rfc1430-token-guardrails

Conversation

@yeyitech
Copy link
Copy Markdown
Contributor

Summary

  • add server config for ingest token guardrails on add_resource and add_skill
  • enforce preflight token estimates in ResourceService and return RESOURCE_EXHAUSTED when limits are exceeded
  • cover config loading, service-level blocking/allow paths, and structured HTTP 429 behavior

Scope

Ingest preflight token guardrails MVP for discussion #1430.

Testing

  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 PYTHONPATH="/tmp/ov-stubs:${PYTHONPATH}" pytest -o addopts='' tests/service/test_token_guardrails.py tests/metrics/config/test_server_config.py -q
  • python -m py_compile openviking/server/app.py openviking/server/config.py openviking/service/resource_service.py openviking_cli/client/http.py openviking_cli/exceptions.py tests/metrics/config/test_server_config.py tests/server/test_error_scenarios.py tests/service/test_token_guardrails.py

Notes

  • HTTP/server pytest coverage for the new 429 case is included in the diff, but full async pytest execution is blocked in this environment by the installed pytest-asyncio collector.

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Codex seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions
Copy link
Copy Markdown

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🏅 Score: 80
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add token guardrails server config

Relevant files:

  • openviking/server/config.py
  • tests/metrics/config/test_server_config.py

Sub-PR theme: Add ResourceExhaustedError and client mapping

Relevant files:

  • openviking_cli/exceptions.py
  • openviking_cli/client/http.py

Sub-PR theme: Implement token guardrails enforcement in ResourceService

Relevant files:

  • openviking/service/resource_service.py
  • openviking/server/app.py
  • tests/service/test_token_guardrails.py
  • tests/server/test_error_scenarios.py

⚡ Recommended focus areas for review

Missing Import

The function get_current_telemetry is used in _enforce_token_guardrail but not imported, which will cause a NameError at runtime when the guardrail is triggered.

telemetry = get_current_telemetry()
Performance Risk

The _estimate_path_payload_tokens function uses Path.rglob("*") to recursively traverse directories, which can lead to unbounded processing time and I/O for large or deeply nested directory trees, potentially causing timeouts or excessive resource usage.

total_bytes = 0
for item in candidate.rglob("*"):
    if item.is_file():
        try:
            total_bytes += item.stat().st_size
        except OSError:
            continue
return (total_bytes + 3) // 4

@github-actions
Copy link
Copy Markdown

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Offload sync file I/O to thread pool

Wrap the synchronous file I/O operations (exists, stat, rglob) in
asyncio.to_thread() to avoid blocking the async event loop. Since
_estimate_path_payload_tokens is called from async functions (add_resource,
add_skill), offloading sync I/O to a thread pool prevents event loop stalls.

openviking/service/resource_service.py [127-145]

-def _estimate_path_payload_tokens(self, path: str) -> int:
-    candidate = Path(path)
-    try:
-        exists = candidate.exists()
-    except (OSError, ValueError):
-        exists = False
-    if exists:
-        if candidate.is_file():
-            return (candidate.stat().st_size + 3) // 4
-        if candidate.is_dir():
-            total_bytes = 0
-            for item in candidate.rglob("*"):
-                if item.is_file():
-                    try:
-                        total_bytes += item.stat().st_size
-                    except OSError:
-                        continue
-            return (total_bytes + 3) // 4
-    return _estimate_tokens_from_text(path)
+async def _estimate_path_payload_tokens(self, path: str) -> int:
+    import asyncio
+    from pathlib import Path
 
+    def _sync_estimate(path: str) -> int:
+        candidate = Path(path)
+        try:
+            exists = candidate.exists()
+        except (OSError, ValueError):
+            exists = False
+        if exists:
+            if candidate.is_file():
+                return (candidate.stat().st_size + 3) // 4
+            if candidate.is_dir():
+                total_bytes = 0
+                for item in candidate.rglob("*"):
+                    if item.is_file():
+                        try:
+                            total_bytes += item.stat().st_size
+                        except OSError:
+                            continue
+                return (total_bytes + 3) // 4
+        return _estimate_tokens_from_text(path)
+
+    return await asyncio.to_thread(_sync_estimate, path)
+
Suggestion importance[1-10]: 7

__

Why: Wrapping synchronous file I/O operations (exists, stat, rglob) in asyncio.to_thread() prevents blocking the async event loop, which is a critical performance improvement for async functions like add_resource and add_skill.

Medium
Await async token estimation call

Await the now-async _estimate_path_payload_tokens call since it now uses
asyncio.to_thread(). Also update _estimate_skill_payload_tokens and its call site to
handle the async change.

openviking/service/resource_service.py [271-276]

 self._enforce_token_guardrail(
     operation="add_resource",
-    estimated_tokens=self._estimate_path_payload_tokens(path),
+    estimated_tokens=await self._estimate_path_payload_tokens(path),
     extra_tokens=_estimate_tokens_from_text(reason)
     + _estimate_tokens_from_text(instruction),
 )
Suggestion importance[1-10]: 6

__

Why: After making _estimate_path_payload_tokens async (as in suggestion 2), awaiting the call is necessary to avoid syntax errors and ensure proper async execution, making this a required follow-up correction.

Low
General
Remove redundant config application call

Remove the redundant call to _apply_server_service_config after creating the FastAPI
app. The same function is already invoked inside the lifespan handler, which runs
unconditionally for both created or injected service instances.

openviking/server/app.py [164-166]

 app.state.config = config
-if service is not None:
-    _apply_server_service_config(config, service)
Suggestion importance[1-10]: 5

__

Why: The call to _apply_server_service_config after app creation is redundant because the lifespan handler already invokes it unconditionally for both created and injected service instances, improving code clarity by removing duplication.

Low

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

2 participants