perf: memoize IR plan calls to reduce reevaluation cost #36
Conversation
Force-pushed from 4a15eb5 to 7d55262
anderseknert left a comment:
Very nice! Can't wait to try this out.
Sources/Rego/IREvaluator.swift (outdated diff):
```swift
    isDynamic: Bool = false
) async throws -> AST.RegoValue {
    // Check memo cache if applicable to save repeated evaluation time for rules
    let shouldMemoize = args.count == 2  // Currently support _rules_, not _functions_
```
// Currently support _rules_, not _functions_
Perhaps add a line or two here to explain why this is for anyone without that context. Which could very well be us a year or two from now :)
I'll leave it to @srenatus to provide additional context on this - I derived this from the WASM backend logic but do not understand the full nuance yet.
Would also be interested in the context here! :)
Taking a step back: in the IR, there are only functions, no (complete/partial or single-value/multi-value) rules as you know them in Rego. The first two arguments are always present; they are data and input. Additional arguments come in through two means:
1. it's a user-defined (Rego) function, so it has extra args
2. it's subject to a certain kind of with-replacement of (Rego, builtin or user-defined) functions with a local variable. Those variables are turned into extra args, as otherwise the information wouldn't be available down the call stack where the with-replaced function might be used.

It's fair to say that (2.) is rather uncommon. You might do that in tests, but it's hardly a concern for production policies, I would say.
Within a single eval of a policy (IR or Rego), data and input are static -- they won't change in between. So if an IR function only has two arguments, those will always be data and input, and it's going to be a (Rego) rule. Caching these requires no further tracking of the values of extra arguments. If we, for example, wanted to cache this user-defined function,

```rego
allow(a, b) if { ... }
```

we'd need to put an element into the cache for any values of a and b we encounter. This can be undesirable, as you might call the function often with varying arguments -- that's why it's a function in the first place.
Caching only rules (IR functions with two arguments) is thus a middle ground between memory usage and performance improvements.
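That arity-based middle ground can be sketched roughly as follows. This is a hedged, simplified illustration, not the PR's actual API: `RuleMemo` and the `Int` result type (standing in for a Rego value) are made up for the example.

```swift
// Hypothetical sketch: memoize an IR call only when it takes exactly the two
// implicit arguments (data, input), i.e. when it is a planned Rego rule.
final class RuleMemo {
    private var cache: [String: Int] = [:]  // IR name -> cached result
    private(set) var evaluations = 0        // counts actual (non-cached) evals

    func call(_ irName: String, argCount: Int, compute: () -> Int) -> Int {
        let shouldMemoize = argCount == 2   // rules, not user-defined functions
        if shouldMemoize, let hit = cache[irName] { return hit }
        evaluations += 1
        let result = compute()
        if shouldMemoize { cache[irName] = result }
        return result
    }
}
```

Calling a two-argument function twice evaluates it once; a three-argument (user-defined) function is recomputed every time, trading repeated work for bounded memory.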
What about `with`, you might ask -- that can change input and data in arbitrary ways. It can, but when the planner in OPA encounters a with replacement, it'll increase a generation counter and start planning new functions. E.g.

```rego
package t

allow if {                            # planned as `g0.data.t.allow`
    something                         # gets planned as `g0.data.t.something`
    something with input.foo as "bar" # this gets planned as `g1.data.t.something`
}

something if other_rule

other_rule if input.foo == "bar"      # planned as both `g0.data.t.other_rule` and
                                      # `g1.data.t.other_rule`, called accordingly
                                      # in the function `g0.data.t.allow`
```

This also ensures that a simple cache mechanism, caching the values of the IR-names of the functions -- cache keys like g0.data.t.something and g1.data.t.something, not data.t.something -- will yield the correct results.
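As a tiny illustration of why the generation-qualified names make name-keyed caching safe (the cached values here are made up):

```swift
// Each `with` generation gets its own planned IR name, so a cache keyed purely
// on that name never mixes results from different with-replacement contexts.
var ruleCache: [String: Bool] = [:]
ruleCache["g0.data.t.something"] = false  // evaluated against the original input
ruleCache["g1.data.t.something"] = true   // evaluated under `with input.foo as "bar"`
```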
Does that help for context? :)
Great write-up, thanks for making the effort!
Certainly! Linking to this comment would be great :)
This is the first time I've ever seen the little g0.data.x prefix explained, and that makes a lot of sense now that I know what it's for! 😄 I'd previously assumed it was an arbitrary compiler prefix, probably because I'd never plugged in a policy that would have generated a g1.data.x function variant.
philippzagar left a comment:
Awesome to see this finally coming together, and I’m excited to see the performance gains in action! 🚀
Sources/Rego/Memo.swift (outdated diff):
```swift
/// Note we capture the arguments as unresolved operands and not resolved values,
/// as hashing the values was proving extremely expensive. We instead rely on the
/// invariant that the plan / evaluator will not modify a local after it has been initially set.
internal struct InvocationKey: Hashable {
```
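The trade-off the doc comment describes can be sketched like this. The type and field names below are illustrative only (the real `InvocationKey` lives in Sources/Rego/Memo.swift and may track more state):

```swift
// Key the memo on the function name plus cheap operand identifiers (e.g. local
// slot indices) rather than hashing the resolved argument values. This is only
// sound because the evaluator never mutates a local after it is first set.
struct SketchInvocationKey: Hashable {
    let funcName: String
    let argLocals: [Int]  // write-once local slots referenced by the call
}

var memo: [SketchInvocationKey: Int] = [:]
memo[SketchInvocationKey(funcName: "g0.data.t.f", argLocals: [0, 1])] = 42
// A later call that references the same locals produces an equal key and
// therefore hits the cached entry without rehashing any values.
let hit = memo[SketchInvocationKey(funcName: "g0.data.t.f", argLocals: [0, 1])]
```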
There’s no need to explicitly mark symbols as internal in Swift, it’s the default access level. Adding it for the sake of being explicit isn’t common practice (rarely done in Swift Server projects) and is generally discouraged.
This applies to all similar changes in this PR.
Sources/Rego/IREvaluator.swift (outdated diff):
```swift
    isDynamic: Bool = false
) async throws -> AST.RegoValue {
    // Check memo cache if applicable to save repeated evaluation time for rules
    let shouldMemoize = args.count == 2  // Currently support _rules_, not _functions_
```
Would also be interested in the context here! :)
```swift
            block: stmt.block
        )

        let blockResult = try await ctx.callMemo.withPush {
```
Sweet, I like the with-style syntax in this use case; great idea that prevents forgetting to pop.
Sources/Rego/IREvaluator.swift (outdated diff):
```swift
init(ctx: EvaluationContext, policy: IndexedIRPolicy) {
    self.ctx = ctx
    self.policy = policy
    self.callMemo = MemoStack.init()
```
I’d suggest moving this initialization directly into the variable declaration for consistency with the callDepth and maxCallDepth initializations. That would also make the explicit initializer unnecessary, but I don’t feel strongly about it.
Force-pushed from 7d55262 to 4de8294
```swift
/// withPush returns the result of calling the provided closure with
/// a fresh memoCache pushed on the stack. The memoCache will only be
/// active during that call, and discarded when it completes.
func withPush<T>(_ body: () async throws -> T) async rethrows -> T {
```
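A simplified synchronous sketch of this with-style helper, to show why it cannot leak a pushed cache. The real implementation is async and the names here are illustrative:

```swift
// A fresh memo cache is pushed for the closure's duration and popped again
// even if the body throws, thanks to `defer`.
struct SketchMemoStack {
    private var stack: [[String: Int]] = []
    var depth: Int { stack.count }

    mutating func withPush<T>(_ body: () throws -> T) rethrows -> T {
        stack.append([:])             // push a fresh memo cache
        defer { stack.removeLast() }  // guaranteed pop, even on throw
        return try body()
    }
}
```

Pairing push and pop inside one scope with `defer` is what makes forgetting the pop impossible at call sites.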
You could use `sending T` here, but it doesn't make a big difference.
Also, probably a good idea to explicitly pass down the isolation domain via:

```swift
isolation: isolated (any Actor)? = #isolation,
```

so that we avoid an actor hop.
It's not clear to me that those are needed here? We're not doing an actor hop that I can see, and this stack is only used within a single evaluation, not across evaluations. Please let me know if I'm misunderstanding 🤔
Any method without isolation inheritance is going to hop defensively. The recommendation with Swift 6.2 is to adopt the upcoming language feature `NonIsolatedNonSendingByDefault`, which makes all async methods `nonisolated(nonsending)`.
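For reference, a hedged sketch of how that upcoming feature could be enabled per-target in Package.swift (the target name `Rego` is assumed from the file paths in this PR; `SwiftSetting.enableUpcomingFeature` is the standard SwiftPM API for this):

```swift
// In Package.swift: opt a target into the upcoming feature so its async
// functions default to nonisolated(nonsending).
.target(
    name: "Rego",
    swiftSettings: [
        .enableUpcomingFeature("NonIsolatedNonSendingByDefault")
    ]
)
```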
@shomron That shouldn't be a huge performance loss in this case, however this applies to the entirety of swift-opa and should be kept in mind
```swift
    }
}

func push() {
```
Suggestion: mark these funcs as `@inlinable`; that should enable the compiler to inline these calls.
Thanks! I suggest we revisit this optimization if we see any measurable impact - my hunch is that we will not see excessive with statements in the hot path to the extent that a function call will be an issue.
Adds a per-evaluation call memoization stack to the IREvaluationContext. This currently supports rules, not arbitrary functions, based on arity. For a policy with many repeated calls, we see evaluation time drop from ~241ms to ~2ms. No regression in compliance tests.

Signed-off-by: Oren Shomron <[email protected]>
Force-pushed from 4de8294 to 3811c03
Approval with limited Swift knowledge