Description
Is your feature request related to a problem? Please describe.
I'm always frustrated when we can't have policy-defined delayed placement: make some copies initially and let the system handle the rest in the background. CopiesNumber tried to do that, but:
- it's per-request, while data redundancy policy is not something to be decided per request (N=1 is not OK in most cases, yet CopiesNumber allows it)
- it's not expressive enough: people may want some locality for initial placement (put into one DC, then replicate to another) or the other way around (spread to at least N locations before spreading to another M)
Describe the solution you'd like
Use the main policy as the source of data. That's critical: the node set must remain the same. We have some vectors and REPs there (or just vectors for EC). Add the following settings to the container as initial placement constraints: a maximum number of replicas, a locality preference flag and a set of replica numbers per placement vector (+EC).
The way it works is:
- "max replicas", if set, limits the total number of replicas made initially; no more replicas are made (an EC vector is treated as a single replica). If it is not set, per-vector replica numbers are followed if present
- the locality preference tunes how placement vectors are processed: if set, we try to satisfy the replica count using vectors that contain the current node; otherwise we spread replicas across different vectors
- replica numbers limit specific vectors to some number of replicas
Practical constraints for a "REP 2 in MSK REP 2 in SPB REP 2 in NSK" policy:
- MaxReplicas=2, local --- stores 2 replicas in the same location as the current node initially
- MaxReplicas=2 --- pushes 1 replica into some location and 1 into another one
- [1, 1, 1] --- pushes 1 replica into every location
- MaxReplicas=3, local --- 2 replicas locally, 1 elsewhere
- MaxReplicas=2, local, [2, 2, 0] --- two local replicas, but never in NSK
The scheme seems to be expressive enough to cover all potential use cases. In each case, the remaining replicas (or EC placements) are made asynchronously. The node policer can be optimized to perform these replications with priority over regular checks/relocations.
Deprecate CopiesNumber.