Expose builtin cache API #33
Conversation
Signed-off-by: Philipp Zagar <[email protected]>
shomron
left a comment
Thanks for submitting, @philippzagar!
I have a few comments / questions. Also if you can add some words in the PR description about the change it would be much appreciated! 🙏🏻
```swift
public func evaluate(
    input: AST.RegoValue = .undefined,
    tracer: OPA.Trace.QueryTracer? = nil,
    cache: BuiltinsCache = .init(),
```
shomron commented on these lines:
Given that this is a per-evaluation cache, and the scope of this function is a new evaluation, what is the use case for allowing the caller to pass in an existing cache, which would likely violate our constraints around non-concurrent access?
philippzagar replied:
Great question! I completely agree that in most cases, the caller shouldn’t pass in an existing cache.
By default, each evaluation should have its own isolated cache instance.
That said, we’ve encountered scenarios where it’s useful to reuse a warmed cache across multiple sequential evaluations (never concurrently). This can significantly reduce overhead for deterministic builtins that, for example, perform costly network calls whose results are stable for a short period of time. To support that optimization, I exposed the cache parameter on evaluate() so developers can opt in when it’s safe.
I’ll extend the documentation to make it clear that the cache must never be shared across concurrent evaluations, only reused sequentially under controlled conditions.
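A minimal sketch of that sequential-reuse pattern, assuming the evaluate(input:tracer:cache:) signature from the diff above lives on an async-throwing prepared query; `preparedQuery`, `batchedInputs`, and `process` are hypothetical names:

```swift
// Opt-in reuse of a warmed cache across strictly sequential evaluations.
// The cache is never shared between concurrent tasks.
let warmCache = BuiltinsCache()

for input in batchedInputs {  // hypothetical [AST.RegoValue]
    // Each evaluation completes before the next begins, so the
    // non-concurrent access constraint on the cache holds.
    let result = try await preparedQuery.evaluate(
        input: input,
        cache: warmCache
    )
    process(result)  // hypothetical result handler
}
```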
shomron replied:
I think this mixes two separate types of caches in a problematic way.
For the per-evaluation cache we make no accommodations for synchronization. We agree this does not need to be exposed externally.
There's a separate concept of an inter-query cache, which spans evaluations, where we need to give consideration to synchronization, TTL/expiration, etc. I propose we keep these two layers of cache separate and take on designing the inter-query cache in a separate PR.
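Purely for illustration, one shape that separate inter-query layer could take (none of these names exist in the codebase; the synchronization and expiration policy would be part of that future design):

```swift
// Hypothetical sketch only: an inter-query cache spans evaluations, so
// unlike the per-evaluation cache it must be Sendable and TTL-aware.
protocol InterQueryCache: Sendable {
    func value(forKey key: String, namespace: String?) -> AST.RegoValue?
    func insert(_ value: AST.RegoValue, forKey key: String,
                namespace: String?, ttl: Duration)
}
```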
philippzagar replied:
In general, agreed. In the long run, we should absolutely separate the two cache types with clear semantics (per-evaluation vs. inter-query, including TTL and synchronization) and a proper API.
That said, the current approach simply offers an opt-in escape hatch for advanced users to reuse a warmed cache sequentially, never concurrently. It’s clearly and extensively documented and leaves both the choice and responsibility to the developer.
A full inter-query cache will definitely need a proper design pass with synchronization and TTL management, but until then, this lightweight, documented option offers a practical middle ground, giving developers flexibility without impacting default behavior.
Also worth noting: the QueryTracer parameter follows a very similar model: it’s not enforced as Sendable, and developers are responsible for ensuring it isn’t shared across concurrent evaluations. So this pattern doesn’t really break precedent; it simply extends the same level of flexibility to caching.
shomron replied:
> Also worth noting: the QueryTracer parameter follows a very similar model: it’s not enforced as Sendable, and developers are responsible for ensuring it isn’t shared across concurrent evaluations. So this pattern doesn’t really break precedent;
That precedent seems like a bug to me - the QueryTracer should absolutely be Sendable to indicate its use across concurrency contexts.
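In code terms, the suggested fix would be to constrain the protocol itself; a sketch only, since the actual QueryTracer requirements aren’t shown in this excerpt:

```swift
// With the Sendable constraint, the compiler rejects tracer
// implementations that cannot safely cross concurrency domains.
public protocol QueryTracer: Sendable {
    // ...existing tracing requirements...
}
```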
Signed-off-by: Philipp Zagar <[email protected]>
anderseknert
left a comment
Can't say much about the code, but I have some questions around docs / communication. Great work, Philipp!
Signed-off-by: Philipp Zagar <[email protected]>
Signed-off-by: Philipp Zagar <[email protected]>
This PR exposes the builtin cache API, allowing builtins to access a shared cache within policy evaluation via `BuiltinContext/cache`. Each cache entry is identified by a composite key, consisting of a `String`-based key and an optional namespace (defaulting to a global namespace), and maps to a `RegoValue`. The cache provides Swift-native access through subscript syntax (`cache[key]`, `cache[key] = value`). Internally, the cache is backed by a simple Swift dictionary of type `[CompositeKey: RegoValue]`.

Example
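A minimal sketch of a builtin reading and writing the cache through `BuiltinContext/cache`, assuming the cache has reference semantics; the builtin signature and the `lookupAddress` helper are hypothetical, and only the subscript access comes from this PR:

```swift
// Hypothetical builtin that memoizes an expensive lookup for the
// duration of an evaluation. Only `ctx.cache` and its subscript
// access are part of this PR; everything else is illustrative.
func geocodeBuiltin(ctx: BuiltinContext, args: [AST.RegoValue]) async throws -> AST.RegoValue {
    let key = "geocode:\(args[0])"

    // Subscript read: returns the cached RegoValue if present.
    if let cached = ctx.cache[key] {
        return cached
    }

    let resolved = try await lookupAddress(args[0])  // hypothetical network call

    // Subscript write: visible to later builtin invocations in the same
    // evaluation (or in a sequentially reused cache).
    ctx.cache[key] = resolved
    return resolved
}
```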
Usage Notes

By default, each call to evaluate() creates a fresh, isolated cache instance. A warmed cache may be passed back into evaluate() to reuse it across multiple sequential evaluations, but it must never be shared across concurrent evaluations.