45 changes: 32 additions & 13 deletions crates/blockchain/src/lib.rs
@@ -35,13 +35,18 @@ pub const MILLISECONDS_PER_INTERVAL: u64 = 800;
pub const INTERVALS_PER_SLOT: u64 = 5;
/// Milliseconds in a slot (derived from interval duration and count).
pub const MILLISECONDS_PER_SLOT: u64 = MILLISECONDS_PER_INTERVAL * INTERVALS_PER_SLOT;
/// Number of slots our head can lag behind the current slot before
/// validator duties are suppressed. During sync we lack a complete view
/// of the chain, so proposing or attesting would cast uninformed votes.
pub const SYNC_TOLERANCE_SLOTS: u64 = 2;
impl BlockChain {
pub fn spawn(
store: Store,
validator_keys: HashMap<u64, ValidatorSecretKey>,
is_aggregator: bool,
) -> BlockChain {
metrics::set_is_aggregator(is_aggregator);
metrics::set_is_syncing(true);
let genesis_time = store.config().genesis_time;
let key_manager = key_manager::KeyManager::new(validator_keys);
let handle = BlockChainServer {
@@ -51,6 +56,7 @@ impl BlockChain {
pending_blocks: HashMap::new(),
is_aggregator,
pending_block_parents: HashMap::new(),
is_syncing: true, // assume syncing until on_tick proves otherwise
P1 `lean_is_syncing` metric starts at 0 despite `is_syncing = true`

`is_syncing` is initialized to `true` here, but `metrics::set_is_syncing(true)` is never called at this point. Prometheus initialises `IntGauge` to `0` by default, and `metrics::set_is_syncing` is only called inside the `if now_syncing != self.is_syncing` branch, which fires only on transitions. Because the field starts at `true`, the very first tick where the node is still syncing finds `now_syncing == self.is_syncing` (both `true`) and skips the branch, so the metric stays at `0` (not syncing) for the entire initial sync.

The fix is to initialize the metric at startup, just as `set_is_aggregator` is called in `BlockChain::spawn`:

// In BlockChain::spawn(), alongside set_is_aggregator:
metrics::set_is_syncing(true);
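
To make the failure mode concrete, here is a minimal, self-contained sketch of the transition-only update. `Gauge` and `Node` are hypothetical stand-ins (not the Prometheus `IntGauge` or the real `BlockChainServer`); the only behaviour assumed is that the gauge starts at 0 and that the field starts at `true`, as described above.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a gauge metric; like the real gauge it starts at 0.
#[derive(Default)]
struct Gauge(Cell<i64>);

impl Gauge {
    fn set(&self, value: bool) {
        self.0.set(i64::from(value));
    }
    fn get(&self) -> i64 {
        self.0.get()
    }
}

/// Hypothetical stand-in for the server state that owns `is_syncing`.
struct Node<'a> {
    is_syncing: bool,
    gauge: &'a Gauge,
}

impl<'a> Node<'a> {
    /// `seed_metric` models calling `metrics::set_is_syncing(true)` at startup.
    fn new(gauge: &'a Gauge, seed_metric: bool) -> Self {
        if seed_metric {
            gauge.set(true);
        }
        // The field always starts as `true` (assume syncing).
        Node { is_syncing: true, gauge }
    }

    /// Transition-only update: the gauge is written only when the state flips.
    fn on_tick(&mut self, now_syncing: bool) {
        if now_syncing != self.is_syncing {
            self.is_syncing = now_syncing;
            self.gauge.set(now_syncing);
        }
    }
}

fn main() {
    // Without seeding, a tick that is still syncing never touches the gauge.
    let unseeded = Gauge::default();
    let mut node = Node::new(&unseeded, false);
    node.on_tick(true);
    println!("unseeded gauge while syncing: {}", unseeded.get());

    // With seeding at startup, the gauge agrees with the field from the start.
    let seeded = Gauge::default();
    let mut node = Node::new(&seeded, true);
    node.on_tick(true);
    println!("seeded gauge while syncing: {}", seeded.get());
}
```

In this sketch the unseeded gauge reads 0 while the node is syncing, and only becomes meaningful after the first transition; seeding it once at construction removes that window without changing the transition logic.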

}
.start();
let time_until_genesis = (SystemTime::UNIX_EPOCH + Duration::from_secs(genesis_time))
@@ -92,35 +98,49 @@ pub struct BlockChainServer {

/// Whether this node acts as a committee aggregator.
is_aggregator: bool,
/// Whether this node is still catching up to the chain head.
/// When true, block proposal and attestation duties are skipped.
is_syncing: bool,
}

impl BlockChainServer {
fn on_tick(&mut self, timestamp_ms: u64) {
let genesis_time_ms = self.store.config().genesis_time * 1000;

// Calculate current slot and interval from milliseconds
let time_since_genesis_ms = timestamp_ms.saturating_sub(genesis_time_ms);
let slot = time_since_genesis_ms / MILLISECONDS_PER_SLOT;
let interval = (time_since_genesis_ms % MILLISECONDS_PER_SLOT) / MILLISECONDS_PER_INTERVAL;

// Fail fast: a state with zero validators is invalid and would cause
// panics in proposer selection and attestation processing.
if self.store.head_state().validators.is_empty() {
error!("Head state has no validators, skipping tick");
return;
}

// Update current slot metric
metrics::update_current_slot(slot);

// Determine sync status: suppress validator duties while our head is
// more than SYNC_TOLERANCE_SLOTS behind the current slot.
// Log once per transition to avoid spam.
let head_slot = self.store.head_slot();
let behind_by = slot.saturating_sub(head_slot);
let now_syncing = behind_by > SYNC_TOLERANCE_SLOTS;
if now_syncing != self.is_syncing {
if now_syncing {
info!(%slot, %head_slot, %behind_by, "Node is syncing, pausing validator duties");
} else {
info!(%slot, %head_slot, "Sync complete, resuming validator duties");
}
self.is_syncing = now_syncing;
metrics::set_is_syncing(self.is_syncing);
}

// At interval 0, check if we will propose (but don't build the block yet).
// Tick forkchoice first to accept attestations, then build the block
// using the freshly-accepted attestations.
let proposer_validator_id = (interval == 0 && slot > 0)
// Skip entirely while syncing — no complete chain view.
let proposer_validator_id = (!self.is_syncing && interval == 0 && slot > 0)
.then(|| self.get_our_proposer(slot))
.flatten();

// Tick the store first - this accepts attestations at interval 0 if we have a proposal
// Tick the store first; this accepts attestations at interval 0 if we have a proposal
let new_aggregates = store::on_tick(
&mut self.store,
timestamp_ms,
@@ -136,19 +156,18 @@ impl BlockChainServer {
}
}

// Now build and publish the block (after attestations have been accepted)
// Propose block at interval 0 (after attestations have been accepted)
if let Some(validator_id) = proposer_validator_id {
self.propose_block(slot, validator_id);
}

// Produce attestations at interval 1 (proposer already attested in block)
if interval == 1 {
// Produce attestations at interval 1 (proposer already attested in block).
// Skip while syncing.
if !self.is_syncing && interval == 1 {
self.produce_attestations(slot);
}

// Update safe target slot metric (updated by store.on_tick at interval 3)
metrics::update_safe_target_slot(self.store.safe_target_slot());
// Update head slot metric (head may change when attestations are promoted at intervals 0/4)
metrics::update_head_slot(self.store.head_slot());
}
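
As a worked example of the arithmetic in `on_tick`, the sketch below reuses the constant values from this diff (800 ms intervals, 5 intervals per slot, a tolerance of 2 slots). The helper names `slot_and_interval` and `is_syncing` are hypothetical and exist only for illustration; the real code computes these inline.

```rust
const MILLISECONDS_PER_INTERVAL: u64 = 800;
const INTERVALS_PER_SLOT: u64 = 5;
const MILLISECONDS_PER_SLOT: u64 = MILLISECONDS_PER_INTERVAL * INTERVALS_PER_SLOT;
const SYNC_TOLERANCE_SLOTS: u64 = 2;

/// Hypothetical helper: (slot, interval) for a timestamp, both in milliseconds.
/// Timestamps before genesis saturate to slot 0, interval 0.
fn slot_and_interval(timestamp_ms: u64, genesis_time_ms: u64) -> (u64, u64) {
    let since_genesis_ms = timestamp_ms.saturating_sub(genesis_time_ms);
    let slot = since_genesis_ms / MILLISECONDS_PER_SLOT;
    let interval = (since_genesis_ms % MILLISECONDS_PER_SLOT) / MILLISECONDS_PER_INTERVAL;
    (slot, interval)
}

/// Hypothetical helper: the node counts as syncing when its head lags the
/// current slot by strictly more than the tolerance.
fn is_syncing(slot: u64, head_slot: u64) -> bool {
    slot.saturating_sub(head_slot) > SYNC_TOLERANCE_SLOTS
}

fn main() {
    let genesis_ms = 1_000_000;
    // 41_700 ms after genesis: 41_700 / 4_000 = 10, remainder 1_700 / 800 = 2.
    let (slot, interval) = slot_and_interval(genesis_ms + 41_700, genesis_ms);
    println!("slot {slot}, interval {interval}");

    // A head exactly 2 slots behind is still within tolerance; 3 behind is not.
    println!("head 8: syncing = {}", is_syncing(slot, 8));
    println!("head 7: syncing = {}", is_syncing(slot, 7));
}
```

Note that the comparison is strict, so a head exactly `SYNC_TOLERANCE_SLOTS` behind still performs duties, and `saturating_sub` means a head ahead of the wall-clock slot is never treated as syncing.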

14 changes: 14 additions & 0 deletions crates/blockchain/src/metrics.rs
@@ -84,6 +84,14 @@ static LEAN_IS_AGGREGATOR: std::sync::LazyLock<IntGauge> = std::sync::LazyLock::
.unwrap()
});

static LEAN_IS_SYNCING: std::sync::LazyLock<IntGauge> = std::sync::LazyLock::new(|| {
register_int_gauge!(
"lean_is_syncing",
"Whether the node is currently syncing. True=1, False=0"
)
.unwrap()
});

static LEAN_ATTESTATION_COMMITTEE_COUNT: std::sync::LazyLock<IntGauge> =
std::sync::LazyLock::new(|| {
register_int_gauge!(
@@ -293,6 +301,7 @@ pub fn init() {
std::sync::LazyLock::force(&LEAN_LATEST_NEW_AGGREGATED_PAYLOADS);
std::sync::LazyLock::force(&LEAN_LATEST_KNOWN_AGGREGATED_PAYLOADS);
std::sync::LazyLock::force(&LEAN_IS_AGGREGATOR);
std::sync::LazyLock::force(&LEAN_IS_SYNCING);
std::sync::LazyLock::force(&LEAN_ATTESTATION_COMMITTEE_COUNT);
std::sync::LazyLock::force(&LEAN_TABLE_BYTES);
// Counters
@@ -467,6 +476,11 @@ pub fn set_is_aggregator(is_aggregator: bool) {
LEAN_IS_AGGREGATOR.set(i64::from(is_aggregator));
}

/// Set the is_syncing gauge.
pub fn set_is_syncing(is_syncing: bool) {
LEAN_IS_SYNCING.set(i64::from(is_syncing));
}

/// Set the attestation committee count gauge.
pub fn set_attestation_committee_count(count: u64) {
LEAN_ATTESTATION_COMMITTEE_COUNT.set(count.try_into().unwrap_or_default());
1 change: 1 addition & 0 deletions docs/metrics.md
@@ -68,6 +68,7 @@ The exposed metrics follow [the leanMetrics specification](https://github.com/le
|--------|-------|-------|-------------------------|--------|-----------|
|`lean_validators_count`| Gauge | Number of validators managed by a node | On scrape | | ✅(*) |
|`lean_is_aggregator`| Gauge | Validator's `is_aggregator` status. True=1, False=0 | On node start | | ✅ |
|`lean_is_syncing`| Gauge | Whether the node is currently syncing. True=1, False=0 | On sync state change | | ✅ |

## Network Metrics
Comment on lines +71 to 73
P2 Missing blank line before section heading

The blank line that previously separated the Validator Metrics table from the ## Network Metrics heading was removed when the new row was inserted. Without it, some Markdown renderers merge the heading into the table or render it incorrectly.
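
One way to catch this class of regression is a small check over the docs. The following is only a rough sketch: `headings_touching_tables` is a hypothetical helper, it treats any line starting with `|` as a table row and any line starting with `#` as a heading, and it does not account for fenced code blocks.

```rust
/// Hypothetical lint: returns the 1-based line numbers of headings that
/// directly follow a table row with no blank line in between.
fn headings_touching_tables(markdown: &str) -> Vec<usize> {
    let lines: Vec<&str> = markdown.lines().collect();
    let mut hits = Vec::new();
    for (i, pair) in lines.windows(2).enumerate() {
        let prev_is_table_row = pair[0].trim_start().starts_with('|');
        let is_heading = pair[1].starts_with('#');
        if prev_is_table_row && is_heading {
            // `pair[1]` is at index i + 1, which is line i + 2 when counting from 1.
            hits.push(i + 2);
        }
    }
    hits
}

fn main() {
    let broken = "|`lean_is_syncing`| Gauge |\n## Network Metrics\n";
    let fixed = "|`lean_is_syncing`| Gauge |\n\n## Network Metrics\n";
    println!("broken: {:?}", headings_touching_tables(broken));
    println!("fixed: {:?}", headings_touching_tables(fixed));
}
```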

Suggested change
|`lean_is_syncing`| Gauge | Whether the node is currently syncing. True=1, False=0 | On sync state change | | ✅ |

## Network Metrics
