Proposal: Plan Validation mechanism for TaskRunner API #1541
Replies: 4 comments 3 replies
As discussed offline, my suggestion would be to move |
@theakshaypant please also add more details regarding the need for the |
Thanks for putting together this proposal @theakshaypant. I get a feeling that plan validation is a disproportionately involved activity. I suggest limiting what we deliver first for the following reasons:
As of today, very few OpenFL features have (*) asterisks on them. It is OK in my opinion, to limit the scope of The current Overall, my recommendation is for us to be cautious of the scope of verification, and keep it as simple as possible (it is already a great start with one function that user need not worry about). We have to be cautious as a team, that |
Thanks for the detailed clarification @theakshaypant
Overall, we both do agree on having a thin verification check. As for implementation (yaml or not), I am OK with the consensus approach here |
Summary
In OpenFL TaskRunner API 1.8, secure aggregation was introduced with a limitation that it only worked with the WaitForAllPolicy straggler handling policy, and any misconfiguration wasn’t detected until runtime, leading to unnecessary reinitialization. To address this, a validation mechanism was introduced during the fx plan initialize step, and this proposal extends that mechanism using a structured validation manifest. This file defines compatibility rules and constraints for FL plans using triggers, requires, allowed, and forbidden keys, enabling early detection of invalid configurations. The system supports both single-file and modular multi-file formats for maintainability, ensuring that incompatible features are flagged upfront. While it supports only equality-based and AND-composite validations at initialization, it significantly improves plan reliability and user experience by enforcing constraints early in the federated learning workflow.
Motivation
In OpenFL TaskRunner API 1.8, secure aggregation was introduced with the constraint that it could only be used alongside the WaitForAllPolicy straggler handling policy. Without any validation of the plan, enabling secure aggregation with any other straggler handling policy raised an error only after the experiment had started; the model owner then had to re-initialize and redistribute the plan with the appropriate straggler handling policy. As a stopgap measure, a plan verify method was introduced: when the fx plan initialize command is run, an error is raised if an incompatible straggler handling policy is used with secure aggregation.
Extending the same idea to other parameters in the FL plan, a need was identified to validate the plan before its distribution, so that incompatible features are not enabled together and any similar constraints are checked early in the FL process.
High-Level Design
Technical Details
In the current implementation, when fx plan initialize (Ref.) is called, the plan is validated as it is parsed. The proposal retains the existing flow and targets changes solely in its implementation.
validation.yaml
Given that it is hard (and long) to define all the constraints within a Python script, we propose to add a validation.yaml file which is used to define and reference all the constraints for an FL plan.
This validation manifest serves two purposes:
Similar to the defaults directory, we introduce an openfl-workspace/workspace/plan/validations directory which would store all the constraints and compatibility checks for a plan.
File definition
The file is defined in YAML format and contains a map of all validations that need to be performed; each top-level key is an arbitrary string that only names the validation.
For each feature, we propose two ways to define the constraints.
- Reference file: the uses keyword points to a reference file that defines the constraints for that feature. Any definition for that feature in validation.yaml is ignored if the uses keyword is present.
- Inline: the constraints for the feature are defined directly in validation.yaml.
This proposal permits the use of both definitions.
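As a rough illustration of the precedence rule above, the following Python sketch resolves each named validation to a single definition. It assumes the manifest has already been parsed into a dictionary; the function name resolve_definitions, the load_file callback, and the file path used are hypothetical and not part of the proposal or of OpenFL.

```python
from typing import Any, Callable, Dict

Definition = Dict[str, Any]


def resolve_definitions(
    manifest: Dict[str, Definition],
    load_file: Callable[[str], Definition],
) -> Dict[str, Definition]:
    """Return one constraint definition per named validation.

    If an entry has a `uses` key, the referenced file supplies the whole
    definition and any inline keys in that manifest entry are ignored.
    """
    resolved: Dict[str, Definition] = {}
    for name, entry in manifest.items():
        if "uses" in entry:
            resolved[name] = load_file(entry["uses"])
        else:
            resolved[name] = dict(entry)
    return resolved


# Hypothetical manifest, shown as the dict a YAML loader would produce.
manifest = {
    "feature_x": {
        "uses": "validations/feature_x.yaml",
        "allowed": {"ignored.key": "ignored"},  # dropped because `uses` is set
    },
    "feature_y": {"allowed": {"super_key.key1": "value1"}},
}

# Stand-in for reading and parsing files from the validations directory.
reference_files = {
    "validations/feature_x.yaml": {"forbidden": {"super_key.key3": "value1"}},
}

resolved = resolve_definitions(manifest, reference_files.__getitem__)
```

Passing the file loader in as a callback keeps the precedence logic independent of how reference files are actually read.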
Constraint definition
Apart from the uses key (for a separate file), we introduce the following keys for constraint definition.
- triggers: Map of triggers; if they are met, the constraints are checked, otherwise no validation is required for that feature.
  - Example condition: super_key.nested_key1.nested_key2.key1 == value2 && key2 > value3 && key2 < value4 && key3 >= value5 && key3 <= value6.
  - A . in a key indicates a nested key in the plan, so super_key.nested_key1.nested_key2.key1 refers to key1 nested under super_key, nested_key1, and nested_key2 in the FL plan.
  - A . in a value has no special meaning; the value is matched as is against the value in plan.yaml.
  - trigger_value indicates the values for which the constraint check is done. Can be a string or a list.
  - range defines a range of values within which the constraint would be valid.
- requires: List of other constraints that need to be validated when the current definition is triggered.
  - For example, feature_x and feature_y should be set to their trigger values and their respective constraints must also be met.
  - If feature_a requires feature_x and feature_y to be enabled, then instead of duplicating the constraints under feature_a, the constraints defined for x and y can be referenced here.
- allowed: Defines the list of keys that need to be set to certain values in order for the plan to be valid.
  - Example condition: super_key.nested_key.key1 == value1 AND (key2 == value2 OR key2 == value3.value4).
- forbidden: Defines the list of keys that cannot be set to certain values in order for the plan to be valid.
  - The inverse of allowed in terms of validation.
  - Example condition: super_key.nested_key.key3 != value1 AND (key4 != value6 AND key4 != value7.value8).
Example range condition: super_key.key1 > value1 && super_key.key1 < value2 && super_key.key1 >= value3 && super_key.key1 <= value4.
Combining all the elements
For a single constraint, this is how the validation would look, with dependencies on the constraints feature_x and feature_y.
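To make the interplay of triggers, requires, allowed, and forbidden concrete, here is a minimal Python sketch of how such a definition might be evaluated against a parsed plan. It is an illustration only: the layout of each definition (a trigger_value entry per trigger key, plain value-or-list entries under allowed and forbidden), the helper names, and the plan keys are all assumptions, range checks are left out for brevity, and requires is assumed to contain no cycles.

```python
from typing import Any, Dict, List

MISSING = object()  # marks a dotted key that is absent from the plan


def get_nested(plan: Dict[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key such as 'a.b.c' through nested dicts."""
    node: Any = plan
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return MISSING
        node = node[part]
    return node


def matches(actual: Any, expected: Any) -> bool:
    """A list means 'any of these values'; values are compared as-is,
    so a '.' inside a value is never treated as a path separator."""
    options = expected if isinstance(expected, list) else [expected]
    return actual in options


def is_triggered(plan: Dict[str, Any], definition: Dict[str, Any]) -> bool:
    """All triggers must match (AND) for the definition to apply."""
    return all(
        matches(get_nested(plan, key), spec["trigger_value"])
        for key, spec in definition.get("triggers", {}).items()
    )


def validate(plan: Dict[str, Any], definitions: Dict[str, Any], name: str) -> List[str]:
    """Return error messages for one named validation (empty if valid)."""
    definition = definitions[name]
    if not is_triggered(plan, definition):
        return []  # feature not in use, nothing to check
    errors: List[str] = []
    for dep in definition.get("requires", []):
        if not is_triggered(plan, definitions[dep]):
            errors.append(f"{name}: requires {dep} to be enabled")
        else:
            errors.extend(validate(plan, definitions, dep))
    for key, expected in definition.get("allowed", {}).items():
        if not matches(get_nested(plan, key), expected):
            errors.append(f"{name}: {key} must match {expected!r}")
    for key, banned in definition.get("forbidden", {}).items():
        if matches(get_nested(plan, key), banned):
            errors.append(f"{name}: {key} must not be {banned!r}")
    return errors


# Hypothetical definitions, as a YAML loader might produce them.
definitions = {
    "feature_x": {"triggers": {"x.enabled": {"trigger_value": True}}},
    "feature_y": {
        "triggers": {"y.enabled": {"trigger_value": True}},
        "forbidden": {"y.mode": ["value6", "value7.value8"]},
    },
    "feature_a": {
        "triggers": {"super_key.nested_key.enabled": {"trigger_value": True}},
        "requires": ["feature_x", "feature_y"],
        "allowed": {"super_key.nested_key.key1": "value1"},
    },
}

# Hypothetical plan that enables all three features but uses a forbidden mode.
plan = {
    "super_key": {"nested_key": {"enabled": True, "key1": "value1"}},
    "x": {"enabled": True},
    "y": {"enabled": True, "mode": "value7.value8"},
}

errors = validate(plan, definitions, "feature_a")
```

In this sketch feature_a is triggered, so both of its dependencies are checked as well, and the forbidden value under feature_y is reported even though the constraint is not repeated under feature_a.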
Incompatible features
- Secure aggregation is only compatible with the WaitForAllPolicy straggler handling policy.
- db_store_rounds should be set to a value greater than 1.
Scope
Limitations
Open Questions
validation.yaml file.
Next Steps
verify method to make it pure and add other incompatible features there.