
Conversation


@philippzagar philippzagar commented Oct 9, 2025

This PR exposes the builtin cache API, allowing builtins to access a shared cache within policy evaluation via BuiltinContext/cache.

Each cache entry is identified by a composite key, consisting of a String-based key and an optional namespace (defaulting to a global namespace), and maps to a RegoValue.
The cache provides Swift-native access through subscript syntax (cache[key], cache[key] = value).

Internally, the cache is backed by a simple Swift dictionary of type [CompositeKey: RegoValue].
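
For illustration, the types can be pictured roughly like this (an approximate sketch, not the exact declarations from this PR; reference semantics are assumed so that a warmed cache can be handed to a later evaluation):

// Approximate sketch of the shapes described above; RegoValue refers to the
// SDK's AST.RegoValue. Names and details may differ from the actual PR.
typealias RegoValue = AST.RegoValue

struct CompositeKey: Hashable {
    let key: String
    let namespace: String

    // The namespace is optional at the call site and falls back to a
    // global namespace.
    init(key: String, namespace: String = "global") {
        self.key = key
        self.namespace = namespace
    }
}

final class BuiltinsCache {
    // Backed by a simple Swift dictionary, as described above.
    private var storage: [CompositeKey: RegoValue] = [:]

    // Swift-native access: cache[key] and cache[key] = value.
    subscript(key: String) -> RegoValue? {
        get { storage[CompositeKey(key: key)] }
        set { storage[CompositeKey(key: key)] = newValue }
    }
}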

Example

"custom_cache_builtin": { ctx, args in
    // Read / Write from / to builtin cache
    ctx.cache["test_key"] = .string("test_value")
    let value = ctx.cache["test_key"] ?? .string("fallback")

    return value
}

Usage Notes

  • A new cache is automatically created for each policy evaluation via evaluate().
  • Developers may optionally reuse an existing cache across sequential evaluations (for example, to avoid repeating costly deterministic network requests); see the sketch below.
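
Assuming evaluate(input:tracer:cache:) is an async, throwing method on a prepared query (preparedQuery below is a hypothetical, already-prepared query value), sequential reuse looks roughly like this:

// Sequential reuse of a warmed cache (never concurrent).
let sharedCache = BuiltinsCache()

// The first evaluation warms the cache through whatever caching
// builtins it runs.
let first = try await preparedQuery.evaluate(
    input: .string("alice"),
    cache: sharedCache
)

// A later, strictly sequential evaluation reuses the warmed entries,
// skipping the costly deterministic work done above.
let second = try await preparedQuery.evaluate(
    input: .string("bob"),
    cache: sharedCache
)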

Signed-off-by: Philipp Zagar <[email protected]>
@shomron shomron left a comment
Thanks for submitting, @philippzagar!

I have a few comments / questions. Also, if you could add a few words to the PR description about the change, it would be much appreciated! 🙏🏻

public func evaluate(
    input: AST.RegoValue = .undefined,
    tracer: OPA.Trace.QueryTracer? = nil,
    cache: BuiltinsCache = .init(),
Contributor

Given that this is a per-evaluation cache, and the scope of this function is a new evaluation, what is the use case for allowing the caller to pass in an existing cache, which would likely violate our constraints around non-concurrent access?

Contributor Author

Great question! I completely agree that in most cases, the caller shouldn’t pass in an existing cache.
By default, each evaluation should have its own isolated cache instance.

That said, we’ve encountered scenarios where it’s useful to reuse a warmed cache across multiple sequential evaluations (never concurrently). This can significantly reduce overhead for deterministic builtins, e.g. ones that perform costly network calls whose results are stable for a short period of time. To support that optimization, I exposed the cache parameter on evaluate() so developers can opt in when it’s safe.

I’ll extend the documentation to make it clear that the cache must never be shared across concurrent evaluations, only reused sequentially under controlled conditions.
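
For concreteness, a builtin following this pattern could look roughly like the following; env.region_label and fetchRegionLabel are made-up stand-ins for a deterministic builtin and its costly call:

"env.region_label": { ctx, args in
    // Cache hit: skip the costly call entirely.
    if let cached = ctx.cache["region_label"] {
        return cached
    }
    // Cache miss: do the expensive deterministic work once and memoize
    // the result for later evaluations that reuse this cache.
    let fresh: AST.RegoValue = .string(fetchRegionLabel())
    ctx.cache["region_label"] = fresh
    return fresh
}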

Contributor

I think this mixes two separate types of caches in a problematic way.
For the per-evaluation cache we make no accommodations for synchronization. We agree this does not need to be exposed externally.

There's a separate concept of an inter-query cache, which spans evaluations, where we need to give consideration to synchronization, TTL/expiration, etc. I propose we keep these two layers of cache separate and take on designing the inter-query cache in a separate PR.
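
Roughly, each inter-query entry would need bookkeeping along these lines (purely illustrative; none of this exists in this PR):

import Foundation

struct InterQueryCacheEntry {
    let value: AST.RegoValue
    let insertedAt: Date
    let ttl: TimeInterval

    // An expired entry should be treated as a miss and evicted, which is
    // exactly the lifecycle management the per-evaluation cache avoids.
    var isExpired: Bool {
        Date().timeIntervalSince(insertedAt) > ttl
    }
}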

@philippzagar philippzagar Oct 10, 2025

In general, agreed. In the long run, we should absolutely separate the two cache types with clear semantics (per-evaluation vs. inter-query, including TTL and synchronization) and a proper API.

That said, the current approach simply offers an opt-in escape hatch for advanced users to reuse a warmed cache sequentially, never concurrently. It’s clearly and extensively documented and leaves both the choice and responsibility to the developer.

A full inter-query cache will definitely need a proper design pass with synchronization and TTL management, but until then, this lightweight, documented option offers a practical middle ground, giving developers flexibility without impacting default behavior.

Also worth noting: the QueryTracer parameter follows a very similar model; it’s not enforced as Sendable, and developers are responsible for ensuring it’s not shared across concurrent evaluations. So this pattern doesn’t really break precedent; it simply extends the same level of flexibility to caching.

@shomron shomron Oct 13, 2025

Also worth noting: the QueryTracer parameter follows a very similar model; it’s not enforced as Sendable, and developers are responsible for ensuring it’s not shared across concurrent evaluations. So this pattern doesn’t really break precedent;

That precedent seems like a bug to me - the QueryTracer should absolutely be Sendable to indicate its use across concurrency contexts.

@philippzagar philippzagar requested a review from shomron October 10, 2025 13:16
@anderseknert anderseknert left a comment

Can't say much about the code, but I have some questions around docs / communication. Great work, Philipp!
