-
Notifications
You must be signed in to change notification settings - Fork 27
feat: query multiple beacon nodes in parallel #751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
|
Claude finished @petarjuki7's task —— View job Code Review CompleteI've completed a comprehensive review of the parallel beacon node querying implementation. While the approach is sound, there are several critical issues that need to be addressed before this can be merged. Critical Issues
Design Concerns
Minor Issues
Testing Gaps
RecommendationsBefore merging:
The parallel querying approach is a solid improvement over sequential fallback, but the implementation needs refinement before it's production-ready. |
|
|
||
| use crate::{AnchorValidatorStore, ContributionWaiter, SlotMetadata}; | ||
|
|
||
| const SOFT_TIMEOUT: Duration = Duration::from_millis(500); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic Numbers: These timeout values should be documented or made configurable. Consider:
- Why is soft timeout exactly 500ms?
- How do these values relate to slot timing (typically 12s)?
- What happens if beacon nodes are consistently slower than 500ms but faster than 1s?
Add documentation explaining the rationale, or make these configurable based on network conditions.
| .data; | ||
|
|
||
| // Calculate base score from checkpoint epochs (higher epochs = more recent) | ||
| let base_score = (attestation_data.source.epoch.as_u64() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scoring Algorithm Concerns:
-
Epoch Collision: Multiple attestation data with different roots but same source/target epochs get identical base scores. This could select suboptimal data.
-
Bonus Weighting: The proximity bonus (
1.0 / (1 + distance)) is tiny compared to base score. For example:- Base score: ~400,000 (epoch 200,000)
- Max bonus: 1.0
The bonus is essentially negligible and won't affect selection.
-
Missing Safety Checks: No validation that source < target, or that checkpoints are reasonable.
Consider:
- Normalizing scores or making bonus more significant
- Adding validation of attestation data consistency
- Implementing majority agreement checking (as mentioned in TODO)
| client: &BeaconNodeHttpClient, | ||
| block_root: Hash256, | ||
| ) -> Option<Slot> { | ||
| tokio::time::timeout(BLOCK_SLOT_LOOKUP_TIMEOUT, async { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unbounded Timeout: If the block lookup hangs or is very slow, this adds 250ms to every attestation fetch. Consider:
- Using a shared timeout for all block lookups
- Caching block slot lookups
- Making block lookup optional with a fast-fail path
The 250ms timeout per client could accumulate to significant delays when querying many nodes.
3b8ef15 to
de9082f
Compare
|
Claude finished @petarjuki7's task in 4m 21s —— View job Code Review CompleteI've completed a comprehensive review of the parallel beacon node querying implementation. The PR adds good functionality, but there are several critical issues that must be addressed before merging. Critical Issues1. Compilation Error - Async Closure Captures
|
|
|
||
| const SOFT_TIMEOUT: Duration = Duration::from_millis(500); | ||
| const HARD_TIMEOUT: Duration = Duration::from_secs(1); | ||
| const BLOCK_SLOT_LOOKUP_TIMEOUT: Duration = Duration::from_millis(125); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation Needed: These timeout constants need justification. Consider adding:
/// Soft timeout for beacon node responses. If we receive at least one valid
/// attestation data response by this timeout, we'll return early with the best
/// scored result. This is set to 500ms to balance responsiveness with giving
/// slower nodes a chance to respond.
const SOFT_TIMEOUT: Duration = Duration::from_millis(500);
/// Hard timeout for beacon node responses. After this timeout, we return
/// unconditionally with whatever responses we've received (or error if none).
/// Set to 1s to ensure we have time to fetch attestation data before the
/// attestation deadline (slot time is 12s).
const HARD_TIMEOUT: Duration = Duration::from_secs(1);
/// Timeout for looking up a block's slot number by root. This is an optional
/// enhancement to scoring, so we use a short timeout to avoid delaying the
/// overall attestation data fetch.
const BLOCK_SLOT_LOOKUP_TIMEOUT: Duration = Duration::from_millis(125);Questions to address:
- Why 500ms soft / 1s hard? How were these values determined?
- What happens if beacon nodes are consistently slower than 500ms but faster than 1s?
- How do these relate to attestation deadline constraints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timeouts taken inspiration from the same parts of code in the Go implementation of SSV.
They use:
Soft timeout = 2 seconds
Hard timeout = 5 seconds
Block slot lookup timeout = 1.25 seconds
I found that a bit too long so adjusted to 1 second, 3 seconds and 0.5 seconds respectively
|
Some required checks have failed. Could you please take a look @petarjuki7? 🙏 |
Issue Addressed
WAD - Weighted Attestation Data
Proposed Changes
The implementation uses
FuturesUnorderedto query all beacon nodes concurrently and selects the highest-scoring attestation data. Scoring works by summing source and target epochs (base score) plus a proximity bonus based on how close the head slot is to the attestation slot.Additional Info