Skip to content

dash-spv: CancellationToken ignored during DashSpvClient::start() — DNS discovery blocks shutdown #577

@lklimek

Description

@lklimek

Summary

DashSpvClient::run(token) calls self.start().await before entering the tokio::select! loop that monitors token.cancelled(). If start() blocks (e.g. during DNS peer discovery), cancelling the token has no effect until start() completes — the client cannot be stopped during its startup phase.

Steps to reproduce

  1. Create a DashSpvClient for testnet with no explicit peers configured (no saved peers on disk either)
  2. Call client.run(token)
  3. After ~100ms, cancel the token
  4. Observe that run() does not return within 10 seconds — it is stuck in DNS peer discovery inside PeerNetworkManager::start()

This was discovered via a unit test in dash-evo-tool (spv::tests::test_start_stop_clean_shutdown) that reliably times out.

Root cause

In src/client/sync_coordinator.rs:

pub async fn run(&self, token: CancellationToken) -> Result<()> {
    self.start().await?;  // ← blocks here, token not checked

    // Token is only checked AFTER start() completes:
    let error = loop {
        tokio::select! {
            // ...
            _ = token.cancelled() => { break None }
        };
    };
}

start() calls PeerNetworkManager::start() (src/network/manager.rs:143), which performs DNS peer discovery when no peers are available:

if peer_addresses.is_empty() {
    let dns_peers = self.discovery.discover_peers(self.network).await;  // ← no cancellation, no timeout
}

Suggested fix

Wrap self.start() in tokio::select! with the cancellation token:

pub async fn run(&self, token: CancellationToken) -> Result<()> {
    tokio::select! {
        result = self.start() => result?,
        _ = token.cancelled() => return Ok(()),
    }
    // ... monitoring loop
}

Optionally, PeerNetworkManager::start() could also accept a CancellationToken to make DNS discovery cancellation-aware at a finer granularity.

Priority

Low — only affects shutdown responsiveness during the initial connection phase. The monitoring loop handles cancellation correctly once startup completes.

🤖 Co-authored by Claudius the Magnificent AI Agent

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions