@powerslider
Contributor

@powerslider powerslider commented Dec 3, 2025

Why this should be merged

Static state sync blocks until completion, causing the node to fall further behind as new blocks arrive. Dynamic state sync addresses this by:

  • Queueing block operations during sync for deferred execution.
  • Updating sync targets as new blocks are accepted.
  • Processing queued blocks after sync completes.

Check #4582

How this works

Check the full reference documentation here.

How this was tested

New unit tests. E2E tests will be added later.

Need to be documented in RELEASES.md?

Not for now.

@powerslider powerslider requested a review from a team as a code owner December 3, 2025 13:10
@powerslider powerslider changed the title from "Powerslider/1259 dyn sync client support" to "feat(vmsync): dynamic state sync with coordinator, pivot cadence, and engine-driven target updates" Dec 3, 2025
@powerslider powerslider moved this to In Progress 🏗️ in avalanchego Dec 3, 2025
@powerslider powerslider removed this from avalanchego Dec 3, 2025
@powerslider powerslider linked an issue Dec 3, 2025 that may be closed by this pull request
@powerslider powerslider marked this pull request as draft December 3, 2025 13:15
… engine-driven target updates

- Add Coordinator to orchestrate dynamic state sync, enforce pivot cadence, and manage queue execution.
- Introduce engine hook OnEngineAccept to enqueue accepted blocks and advance the sync target.
- Implement pivot policy (every N blocks) and idempotence (skip behind/equal, allow same-height reorgs).
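
For reviewers, the idempotence rule in the last bullet can be read as a small comparison between the current and the candidate sync target. The sketch below is one possible reading, not the code in this PR: the `target` type, its fields, and `shouldReplace` are hypothetical names, and the pivot cadence itself is left out.

```go
package main

// target is a hypothetical stand-in for a sync target: a block height
// plus an identifier for the block at that height.
type target struct {
	height uint64
	hash   string
}

// shouldReplace reports whether candidate should replace current.
// Assumed rules, read from the commit message:
//   - lower height: skip (behind)
//   - same height, same hash: skip (equal)
//   - same height, different hash: allow (same-height reorg)
//   - higher height: allow
func shouldReplace(current, candidate target) bool {
	switch {
	case candidate.height < current.height:
		return false
	case candidate.height == current.height:
		return candidate.hash != current.hash
	default:
		return true
	}
}
```

Calling `shouldReplace` twice with the same candidate has the same effect as calling it once, which is the property the bullet describes.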

resolves #1259

Signed-off-by: Tsvetan Dimitrov ([email protected])
When UpdateSyncTarget is called, remove all queued blocks with height <=
new target height since they will never be executed. This prevents
processing blocks that the sync has already advanced past.

- Add RemoveBlocksBelowHeight method to blockQueue to filter stale blocks.
- Call RemoveBlocksBelowHeight in UpdateSyncTarget after pivot check.
- Support accept/reject/verify operations in block queue.
- Add OnEngineReject and OnEngineVerify handlers to sync client.
- Propagate context through ApplyQueuedBatch for proper cancellation.
- Remove unnecessary defer vm.versiondb.Abort() from Accept.
- Prevent recursion during batch execution via state check.
- Make dequeueBatch private to reduce API surface.
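
A minimal sketch of the queue behaviour described above, assuming a mutex-guarded slice. The type and field names are hypothetical. Despite the "below height" name, the commit message says blocks with height <= the new target are dropped, so the sketch filters on that condition.

```go
package main

import "sync"

// opKind distinguishes the three queued operation types.
type opKind int

const (
	opAccept opKind = iota
	opReject
	opVerify
)

// blockOp is a hypothetical queued operation on a block at a given height.
type blockOp struct {
	height uint64
	kind   opKind
}

// blockQueue is a plain FIFO guarded by a mutex.
type blockQueue struct {
	mu  sync.Mutex
	ops []blockOp
}

// enqueue appends an operation to the end of the queue.
func (q *blockQueue) enqueue(op blockOp) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.ops = append(q.ops, op)
}

// dequeueBatch returns a snapshot of everything queued so far and
// empties the queue. Operations enqueued afterwards land in the next batch.
func (q *blockQueue) dequeueBatch() []blockOp {
	q.mu.Lock()
	defer q.mu.Unlock()
	batch := q.ops
	q.ops = nil
	return batch
}

// removeBelowHeight drops operations with height <= h and returns how
// many were removed. Order of the remaining operations is preserved.
func (q *blockQueue) removeBelowHeight(h uint64) int {
	q.mu.Lock()
	defer q.mu.Unlock()
	kept := q.ops[:0]
	for _, op := range q.ops {
		if op.height > h {
			kept = append(kept, op)
		}
	}
	removed := len(q.ops) - len(kept)
	q.ops = kept
	return removed
}
```

Because `dequeueBatch` hands the old slice to the caller and starts a fresh one, later calls to `enqueue` or `removeBelowHeight` cannot modify a batch that is already being executed.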

resolves #1259
Signed-off-by: Tsvetan Dimitrov ([email protected])
- Add context parameter to finishSync() and propagate through stateSyncStatic/Dynamic
- Add context parameter to FinalizeVM callback in Coordinator
- Add context parameter to ProcessQueuedBlockOperations (renamed from ApplyQueuedBatch)
- Add context parameter to executeBlockOperationBatch (moved from blockQueue)
- Propagate context through ProcessQueue operations
- Add cancellation checks before expensive operations in finishSync() using declarative
  operation list pattern with runWithCancellationCheck helper.
- Add cancellation checks in ProcessQueuedBlockOperations before state transitions.
- Add cancellation checks in executeBlockOperationBatch loop using select pattern.
- Improve error messages to include operation index and type for better debugging.
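
The select-based cancellation check and the indexed error messages could look roughly like the sketch below. `blockOp`, `executeBatch`, and `demoCancelMidBatch` are hypothetical names chosen for illustration, and the error wording is an assumption rather than the text used in this PR.

```go
package main

import (
	"context"
	"fmt"
)

// blockOp is a hypothetical queued operation.
type blockOp struct {
	kind  string // "accept", "reject" or "verify"
	apply func(ctx context.Context) error
}

// executeBatch applies ops in order and stops early if ctx is cancelled.
// It returns the number of operations that were applied.
func executeBatch(ctx context.Context, ops []blockOp) (int, error) {
	for i, op := range ops {
		// Non-blocking cancellation check before each operation.
		select {
		case <-ctx.Done():
			return i, fmt.Errorf("batch cancelled before operation %d (%s): %w", i, op.kind, ctx.Err())
		default:
		}
		if err := op.apply(ctx); err != nil {
			return i, fmt.Errorf("operation %d (%s) failed: %w", i, op.kind, err)
		}
	}
	return len(ops), nil
}

// demoCancelMidBatch cancels the context from inside the second
// operation, so the third operation is never applied.
func demoCancelMidBatch() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	noop := func(context.Context) error { return nil }
	ops := []blockOp{
		{kind: "verify", apply: noop},
		{kind: "accept", apply: func(context.Context) error { cancel(); return nil }},
		{kind: "reject", apply: noop},
	}
	return executeBatch(ctx, ops)
}
```

Wrapping with `%w` keeps `errors.Is(err, context.Canceled)` usable by callers while still reporting which operation the batch stopped at.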

Refactoring:
- Move block operation processing logic from blockQueue to Coordinator
  (executeBlockOperationBatch) for better separation of concerns.
- Simplify blockQueue to be a pure data structure (enqueue, dequeueBatch, removeBelowHeight).
- Rename pivot.go to pivot_policy.go for clarity.
- Remove cancel function from Coordinator struct, pass as parameter to finish().

Pivot Policy:
- Add defaultPivotInterval constant (10000 blocks) in pivot_policy.go.
- Apply default pivot interval when WithPivotInterval is not explicitly called.
- Update newPivotPolicy to use default when interval is 0.
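
A sketch of the default-interval behaviour, assuming that "every N blocks" means "at least N blocks since the last pivot". Only `defaultPivotInterval` and `newPivotPolicy` are names taken from the commit message. The struct layout and `shouldPivot` are hypothetical.

```go
package main

// defaultPivotInterval matches the value stated in the commit message.
const defaultPivotInterval uint64 = 10000

// pivotPolicy is a hypothetical layout: the interval plus the height
// of the most recent pivot.
type pivotPolicy struct {
	interval  uint64
	lastPivot uint64
}

// newPivotPolicy falls back to the default when interval is 0.
func newPivotPolicy(interval uint64) *pivotPolicy {
	if interval == 0 {
		interval = defaultPivotInterval
	}
	return &pivotPolicy{interval: interval}
}

// shouldPivot reports whether height is at least one interval past the
// last pivot. When it returns true it records height as the new pivot.
// Overflow of lastPivot+interval is ignored in this sketch.
func (p *pivotPolicy) shouldPivot(height uint64) bool {
	if height < p.lastPivot+p.interval {
		return false
	}
	p.lastPivot = height
	return true
}
```

With an interval of 100, heights 100 and 200 trigger a pivot and 150 does not.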

This change enables graceful shutdown of state sync operations and ensures
that cancellation signals propagate correctly through all layers of the
dynamic state sync orchestration.

resolves #1259

Signed-off-by: Tsvetan Dimitrov ([email protected])
Refactor block operation handling and error management to improve
maintainability, reduce code duplication, and enhance type safety.
Enable blocks to be enqueued during StateExecutingBatch for processing
in the next batch, while preventing recursion by skipping sync target
updates during batch execution.

Block Enqueuing During Batch Execution:
- Update AddBlockOperation to allow enqueuing during both StateRunning
  and StateExecutingBatch states.
- Remove early return check in enqueueBlockOperation that prevented
  enqueuing during batch execution.
- Blocks enqueued during batch execution are automatically processed
  in the next batch (via dequeueBatch snapshot behavior).

Prevent Recursion:
- Skip UpdateSyncTarget in OnEngineAccept when state is StateExecutingBatch
- Blocks are still enqueued during batch execution, but sync target
  updates are deferred to prevent recursion.
- Add documentation explaining the behavior.

Code Simplification:
- Simplify finishSync cancellation checks from per-operation checks
  to single check at beginning.
- Operations are not cancellable mid-execution, so single check is
  sufficient and more efficient.

This change ensures blocks arriving during batch execution can be
queued for the next batch (solving dependency issues) while maintaining
fast consensus-critical paths and preventing recursion.
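
The gating described in this commit can be sketched as a small state check. This is an illustration under assumptions: the `coordinator` fields, the lower-case state names, and the simplified `onEngineAccept` signature are hypothetical, and the pivot cadence is omitted.

```go
package main

import "sync"

type state int

const (
	stateIdle state = iota
	stateRunning
	stateExecutingBatch
)

// coordinator is a hypothetical, stripped-down stand-in that tracks
// queued block heights and the current sync target height.
type coordinator struct {
	mu     sync.Mutex
	state  state
	queue  []uint64
	target uint64
}

// onEngineAccept enqueues the block while syncing or executing a batch.
// It advances the sync target only in stateRunning, so a block accepted
// while a batch is executing cannot trigger a nested target update.
// It reports whether the block was enqueued.
func (c *coordinator) onEngineAccept(height uint64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.state != stateRunning && c.state != stateExecutingBatch {
		return false
	}
	c.queue = append(c.queue, height)
	if c.state == stateExecutingBatch {
		return true // target update deferred
	}
	if height > c.target {
		c.target = height
	}
	return true
}
```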
…o sync flow

- Add Finalize method to the Syncer interface and implement it across all
syncer types. Integrate Finalize calls into both static and dynamic state
sync flows to allow syncers to clean up their state before VM finalization.
- Dynamic sync: Syncers complete -> Finalize syncers -> Finalize VM -> Execute batch
- Static sync: Syncers complete -> Finalize syncers -> Finalize VM
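
The two orderings above can be expressed as one function with an optional last step. The sketch is a simplification: the `syncer` interface here has only two methods without a context parameter, syncers run sequentially, and `funcSyncer` and `runSyncFlow` are hypothetical helpers.

```go
package main

// syncer is a reduced, hypothetical version of the Syncer interface.
type syncer interface {
	Sync() error
	Finalize() error
}

// funcSyncer adapts two functions to the syncer interface.
type funcSyncer struct {
	sync     func() error
	finalize func() error
}

func (s funcSyncer) Sync() error     { return s.sync() }
func (s funcSyncer) Finalize() error { return s.finalize() }

// runSyncFlow runs all syncers, finalizes them, then finalizes the VM.
// For dynamic sync, executeBatch runs last. Static sync passes nil.
func runSyncFlow(syncers []syncer, finalizeVM, executeBatch func() error) error {
	for _, s := range syncers {
		if err := s.Sync(); err != nil {
			return err
		}
	}
	for _, s := range syncers {
		if err := s.Finalize(); err != nil {
			return err
		}
	}
	if err := finalizeVM(); err != nil {
		return err
	}
	if executeBatch == nil {
		return nil
	}
	return executeBatch()
}
```

Finalizing every syncer before the VM means a syncer cleanup failure stops the flow before any VM state is finalized.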
…mic state sync

Fix critical issues in dynamic state sync that could cause blocks to be
processed twice and introduce race conditions. Improve error handling and
state consistency throughout the sync flow.

Double Execution Prevention:
- Change OnEngineAccept/Reject/Verify to return (bool, error), where the bool
  indicates whether the block was enqueued for deferred processing.
- Update wrapped_block.Accept/Reject/Verify to skip immediate execution
  when the block was enqueued during dynamic sync. This prevents a block
  from being processed both immediately and from the queue.
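
A sketch of how the (bool, error) return value prevents double execution, shown for Accept only. The `wrappedBlock` fields are hypothetical and the hook is modelled as a plain function value rather than the sync client.

```go
package main

// wrappedBlock is a hypothetical, reduced version of the wrapped block.
type wrappedBlock struct {
	height uint64
	// onEngineAccept reports whether the block was enqueued for deferred
	// processing. It is nil when dynamic sync is not active.
	onEngineAccept func(height uint64) (bool, error)
	// acceptNow performs the immediate accept.
	acceptNow func() error
}

// Accept defers to the queue when the hook enqueued the block and only
// executes immediately otherwise, so a block is never processed twice.
func (b *wrappedBlock) Accept() error {
	if b.onEngineAccept != nil {
		enqueued, err := b.onEngineAccept(b.height)
		if err != nil {
			return err
		}
		if enqueued {
			return nil // the queue will execute this block later
		}
	}
	return b.acceptNow()
}
```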

Race Condition Fixes:
- Add state re-check in UpdateSyncTarget before modifying queue to handle
  concurrent state transitions.
- Prevent UpdateSyncTarget from being called during batch execution to
  avoid race with removeBelowHeight.

State Consistency Improvements:
- Set StateAborted on all error paths in ProcessQueuedBlockOperations
  before returning to ensure consistent state.
- Add context checks at critical points (before FinalizeVM, before batch
  execution) to catch cancellations early.
- Ensure state transitions are atomic with error handling.

Error Handling Enhancements:
- Improve OnEngineAccept error handling when UpdateSyncTarget fails.
- Return a clear error message indicating that the block was enqueued but the
  sync target update failed.
- Reorder operations to check batch execution state before enqueuing.
Extract sync orchestration into a strategy pattern to improve code
organization and separation of concerns.

- Add SyncStrategy interface in client.go for sync orchestration.
- Extract VM finalization logic to finalizer.go with sentinel errors.
- Add staticStrategy for sequential sync without block queueing.
- Add dynamicStrategy wrapping Coordinator for concurrent sync.
- Simplify client.go by delegating to strategies.
- Simplify sync_target.go by removing redundant ID field.
- Move syncer creation to standalone newSyncerRegistry function.
- Replace embedded *ClientConfig with named config field for clearer
  field access. Convert newSyncerRegistry to client method.
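
The strategy split might be shaped roughly as below. Only the names `staticStrategy`, `dynamicStrategy`, and the idea of a strategy interface on the client come from the commit message. The method set, the `coordinator` fields, and `newClient` are hypothetical, and context plumbing is omitted for brevity.

```go
package main

// syncStrategy is a hypothetical, reduced strategy interface.
type syncStrategy interface {
	run(syncAll func() error) error
}

// staticStrategy syncs, then finalizes the VM. No block queueing.
type staticStrategy struct {
	finalizeVM func() error
}

func (s staticStrategy) run(syncAll func() error) error {
	if err := syncAll(); err != nil {
		return err
	}
	return s.finalizeVM()
}

// coordinator is a hypothetical stand-in for the dynamic sync coordinator.
type coordinator struct {
	finalizeVM   func() error
	processQueue func() error
}

// dynamicStrategy wraps the coordinator and drains the queue at the end.
type dynamicStrategy struct {
	coord *coordinator
}

func (d dynamicStrategy) run(syncAll func() error) error {
	if err := syncAll(); err != nil {
		return err
	}
	if err := d.coord.finalizeVM(); err != nil {
		return err
	}
	return d.coord.processQueue()
}

// client delegates orchestration to whichever strategy it was built with.
type client struct {
	strategy syncStrategy
}

func newClient(dynamic bool, finalizeVM, processQueue func() error) *client {
	if dynamic {
		return &client{strategy: dynamicStrategy{coord: &coordinator{finalizeVM: finalizeVM, processQueue: processQueue}}}
	}
	return &client{strategy: staticStrategy{finalizeVM: finalizeVM}}
}

func (c *client) sync(syncAll func() error) error {
	return c.strategy.run(syncAll)
}
```

The client code path is identical for both modes. Only the strategy chosen at construction time differs.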
@powerslider powerslider force-pushed the powerslider/1259-dyn-sync-client-support branch from dbb76a9 to 05f5763 December 3, 2025 21:11
@powerslider powerslider changed the base branch from master to powerslider/4651-sync-client-strategy-support December 3, 2025 21:12
Development

Successfully merging this pull request may close these issues.

Add support for dynamic state syncing to the sync client

3 participants