
spam-stream subcommand for streaming tx specs (relayer use case) #589

Merged
zeroXbrock merged 10 commits into flashbots:main from jelias2:jelias/spam-stream-mode on Jun 5, 2026

@jelias2
Contributor

@jelias2 jelias2 commented May 29, 2026

Summary

Adds a new spam-stream subcommand that reads newline-delimited JSON tx specs from stdin (or a file) and spams them through the existing TestScenario pipeline. Each spec is a FunctionCallDefinition — the same schema as scenario TOML [[spam.tx]] — so access_list, signature/args, gas_limit, value, and from_pool all work without any new schema.

This is a Draft PR for design feedback. It compiles, runs end-to-end against a real devnet, and lands tx receipts via the regular tx_actor flush loop. It is deliberately small: no bundle support, no fuzzing, no recorded spam_runs entry, no integration with --rpc-batch-size/--send-raw-tx-sync. See docs/stream-mode.md for the architecture note and scope list.

Motivation: a generic streaming primitive

Today, contender spam is generator-driven: a static scenario describes what to send, contender cycles through it. That's the right shape for synthetic throughput tests but not for cases where what to send is computed at runtime by something other than contender.

Stream mode lets any upstream process pipe specs into contender and reuse the existing agent pools, rate limiting, signer/nonce management, gas-price caching, receipt tracking, and Prometheus latency metrics. The upstream owns what to send; contender owns how to send it efficiently.

Use cases

Cross-chain / bridge / relayer workflows — the motivating case. Watch chain A for an event, compute a tx (with access list) for chain B, emit it. Works for OP-stack interop, Hyperlane, LayerZero, native rollup bridges, any "receive-and-forward" pattern. Also fits MEV relayers replaying captured bundles and AA bundlers feeding pre-signed UserOperations.

Tx replay

  • Replay a captured mainnet tx range against a forked node to reproduce a bug or audit a migration.
  • Replay mempool snapshots for performance regression testing.
  • Pin a flaky failure as a JSON file and replay it deterministically.

External generators

  • Fuzzers emit "interesting" tx specs; contender executes them at a controlled rate.
  • Property-based testers (QuickCheck-style) and state-space explorers that emit "drive the contract into state X" sequences.
  • Custom team DSLs that compile down to tx specs.

Operations

  • Production traffic sampling: pipe a sample of real mainnet txs through staging at N× speed.
  • Incident-response drills: replay pre-built "incident traffic" patterns through a copy of prod.
  • CI gating: a build job computes "expected txs for this release" and uses contender to exercise them against a testnet.

Differential testing

  • Same stream → multiple chains in parallel, diff the receipts (useful for proving rollup or client equivalence).
  • Same stream → same chain with varied client configs.

Composition with existing contender features: keep [[create]] and [[setup]] static in a scenario, deploy/fund once, then stream the dynamic spam phase.

CLI

contender spam-stream \
  -r https://chain-b \
  -p $FUNDING_KEY \
  --from <stdin|FILE> \
  --from-pool executors --pool-size 10 \
  --tps 5

Flags: -r/--rpc-url, -p/--priv-key, --from, --from-pool (default executors), --pool-size (default 10), --tps (default 0 = drain as fast as the stream emits), --min-balance, --seed, --skip-funding.

Stream format

Newline-delimited JSON, one FunctionCallDefinition per line. Empty lines and #-prefixed lines are ignored; malformed JSON logs a warning and the loop continues.

{
  "to": "0x4200000000000000000000000000000000000022",
  "signature": "validateMessage(bytes32)",
  "args": ["0x0102030405060708091011121314151617181920212223242526272829303132"],
  "access_list": [
    {
      "address": "0x4200000000000000000000000000000000000022",
      "storageKeys": ["0x0100000000000000000000000000000000000000000000000000000000000000"]
    }
  ],
  "gas_limit": 200000
}
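
The example above is pretty-printed for readability, but the stream carries one spec per line. A minimal producer-side sketch of the line rules described in this section follows. The helper names `encode_spec` and `iter_specs` are hypothetical, and trimming whitespace before checking for `#` is an assumption, not confirmed contender behavior.

```python
import json


def encode_spec(spec: dict) -> str:
    """Serialize one tx spec as a single compact line suitable for the stream."""
    return json.dumps(spec, separators=(",", ":"))


def iter_specs(lines):
    """Yield (line_number, spec) for usable lines.

    Blank lines and #-prefixed lines are skipped; malformed JSON is skipped
    too (contender logs a warning at that point and continues).
    """
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()  # assumption: surrounding whitespace is ignored
        if not line or line.startswith("#"):
            continue
        try:
            spec = json.loads(line)
        except json.JSONDecodeError:
            continue  # stand-in for "log a warning and keep going"
        yield number, spec
```

A producer would write `encode_spec(spec)` plus a newline to stdout for each tx it wants sent, and could run `iter_specs` over a saved file to preview which lines would be picked up.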

Private keys never appear in the stream. Producers describe txs; contender signs with an agent from its own pool (derived from --seed, funded from --priv-key at startup). A compromised producer can spam txs but can't drain accounts beyond what the pre-funded agents hold.

Architecture

All new code lives in crates/cli/src/commands/spam_stream.rs. No contender_core changes. The flow:

stdin/file → reader task → mpsc<FunctionCallDefinition> → drive_stream loop
  for each spec:
    Generator::make_strict_call         (resolves from_pool + access_list)
    Templater::template_function_call   (encodes calldata, threads access_list)
    TestScenario::prepare_tx_request    (assigns nonce, gas, signs)
    txs_client.send_tx_envelope         (same path as the regular spammer)
    TxActorHandle::cache_run_tx         (queues for receipt polling)
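
The shape of this flow (a reader feeding a channel that a serial send loop drains) can be modeled in a few lines. This is a rough Python model rather than the Rust implementation: a thread and a bounded `queue.Queue` stand in for the reader task and the mpsc channel, `send` is a caller-supplied placeholder for the make_strict_call / template / prepare / send steps, and the queue capacity is an assumed value.

```python
import json
import queue
import threading

_DONE = object()  # sentinel marking end of input


def reader(lines, q):
    """Parse usable lines into specs and push them onto the bounded queue."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        try:
            spec = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed line: skip and keep reading
        q.put(spec)  # blocks while the queue is full
    q.put(_DONE)


def drive_stream(lines, send, capacity=16):
    """Drain the queue one spec at a time, calling send(spec) for each."""
    q = queue.Queue(maxsize=capacity)
    worker = threading.Thread(target=reader, args=(lines, q), daemon=True)
    worker.start()
    results = []
    while True:
        spec = q.get()
        if spec is _DONE:
            break
        results.append(send(spec))
    worker.join()
    return results
```

Because a single loop consumes the queue, specs are sent in input order, which matches the serial behavior described in this thread.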

A no-op one-step TestConfig is constructed so AgentPools::build_agent_store produces a pool with the requested name and size. The decoy spam step itself is never executed; we bypass load_txs entirely.

We don't reuse TimedSpammer/BlockwiseSpammer because their on_spam loops pull from a pre-loaded Vec<Vec<ExecutionRequest>> via get_spam_tx_chunks. Stream mode is fundamentally stream-shaped. Adding a generic SpamSource abstraction across the existing spammers would be a much larger change; see open questions below.

Dependency on #588

This PR depends on #588 (access_list field + placeholder resolution). The interop relayer use case needs access lists on executing-message calls, and one of the primary justifications for stream mode is that those access lists are computed per-message upstream.

Validation

cargo +1.94 test -p contender_cli spam_stream     # 4 passed
cargo +1.94 test -p contender_cli --lib           # 68 passed
cargo +1.94 fmt --check                           # clean
cargo +1.94 build --release --bin contender       # clean

Smoke test against interop-bench-2-0:

echo '{"to":"0xdeAD...","value":"1","gas_limit":21000}' | \
  contender spam-stream -r https://interop-bench-2-0.optimism.io -p $KEY --tps 1 --pool-size 2

Tx 0x8742f5d94cec761fd927ddcbe1cfcad7ba45e352a81cfeb87277780523ed3646 landed in block 131011, status 0x1.

Test plan

  • Unit tests pass (68 in contender_cli, 4 new for stream parsing)
  • cargo fmt --check clean
  • Manual smoke test: 1 tx via stdin lands on a real L2
  • CI on this branch
  • Manual: stream 100 lines from a file, all land
  • Manual: ctrl-c mid-stream — verify in-flight txs drain before exit
  • Manual: malformed JSON in stream — verify warning + continuation

What's deferred (follow-up work)

  • Bundle ([[spam.bundle]]) support
  • EIP-4844/7702 exercising
  • Fuzzing in stream mode
  • Gas-bump/nonce-shift retry logic
  • --rpc-batch-size / --send-raw-tx-sync integration
  • Recording the run in spam_runs
  • Refactoring TimedSpammer/BlockwiseSpammer to share a SpamSource trait

Open design questions

  1. Should stream mode live in contender_core? The prototype keeps everything in cli/. Moving it into core would let campaigns consume a stream too — but the existing Spammer trait wants a Vec<Vec<ExecutionRequest>> upfront. Natural refactor is a new SpamSource trait that TimedSpammer/BlockwiseSpammer could also adopt.
  2. JSON schema evolution. Today the stream is bare FunctionCallDefinition. A tagged envelope ({"v":1,"tx":{...}}) would give us room to add per-line metadata (e.g. correlation IDs back to the upstream event) without breaking compatibility.
  3. Backpressure feedback. The only feedback today is tx_actor's DB writes and stderr logs. A structured response stream (stdout JSON line per sent tx with hash + status) would unblock reactive callers.
  4. Concurrency for --tps 0. Drain-as-fast currently sends one tx at a time per loop iteration, bounded by the pool size only via nonce contention. Should it explicitly use pool_size parallel workers?
  5. Decoy TestConfig hack. Using a no-op spam entry to wire up the agent store. Cleaner would be to teach AgentPools::build_agent_store to accept an explicit pool list. Worth doing now or in a follow-up?
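
To make question 2 concrete, here is a sketch of how a reader could accept both the bare spec and the proposed tagged envelope on input. This illustrates the proposal only; nothing in this thread confirms that an input envelope was implemented. The function name `unwrap_spec` and the treatment of extra keys as metadata are assumptions.

```python
import json


def unwrap_spec(line: str):
    """Return (spec, metadata) for either a bare spec or a {"v":1,"tx":{...}} envelope."""
    obj = json.loads(line)
    if isinstance(obj, dict) and "v" in obj and "tx" in obj:
        if obj["v"] != 1:
            raise ValueError(f"unsupported envelope version: {obj['v']}")
        # Any other top-level keys are treated as per-line metadata,
        # e.g. a correlation ID pointing back to the upstream event.
        metadata = {k: v for k, v in obj.items() if k not in ("v", "tx")}
        return obj["tx"], metadata
    return obj, {}
```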

@jelias2 jelias2 force-pushed the jelias/spam-stream-mode branch from a171692 to 3c13137 Compare May 29, 2026 21:11
@zeroXbrock
Member

really cool idea, thanks for writing this up @jelias2!

re questions:

  1. Should stream mode live in contender_core?

We probably don't need to modify contender_core for now, but it's a good idea. Refactoring the Spammer trait could be a follow-up task.

  2. JSON schema evolution.

Yeah, I think a tagged/versioned envelope is the way to go. Low effort now to support more fields down the road.

  3. Backpressure feedback.

Makes sense, maybe we could add a flag to enable structured output, or that could be the default for spam-stream mode.

  4. Concurrency for --tps

Not sure what "drain-as-fast" means but I think I understand the question -- why do we only send one tx at a time when we could send txs in parallel (barring nonce contention)?

Probably wouldn't hurt to send transactions from different accounts in parallel. Not sure how much benefit it would provide, though. In very high-tps scenarios we might squeeze some benefit out of it, but the RPC provider that sends these is shared by every account, so with existing tools there'd only be one HTTP connection. We'd need an HTTP connection pool (maybe as a mod to the provider). Not sure it's worth the effort.

  5. Decoy TestConfig hack.

Probably worth doing now. Shouldn't be too much effort, and it's non-breaking.

jelias2 added a commit to jelias2/contender that referenced this pull request Jun 2, 2026
- Address review feedback on PR #589:
- #2/#3: emit structured, versioned/tagged JSON envelope on stdout
  (one tx_result event per spec) so the schema can evolve; default
  for spam-stream mode (logs stay on stderr).
- #5: replace the decoy zero-address TestConfig with a direct
  AgentStore pool, injecting signers into the scenario and syncing
  nonces from the RPC.
@jelias2
Contributor Author

jelias2 commented Jun 2, 2026

Thanks for the review! Addressed the actionable items (commit 37b9e83):

#2 — Versioned/tagged JSON envelope ✅
The stdout output is now a versioned, tagged envelope so the schema can evolve:

{"version":1,"type":"tx_result","idx":0,"tx_hash":"0x...","start_timestamp_ms":1733155200000,"kind":"validate","error":null}

version pins the schema (bump on breaking changes); type discriminates the event kind (tx_result today). error/kind are omitted when absent. Implemented as a StreamEvent { version, #[serde(flatten)] payload } wrapping an internally-tagged StreamPayload enum, with a serialization unit test.
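
For readers less familiar with serde, the layout this produces is a flat object: `version` and the `type` tag sit at the top level next to the payload fields. A small Python sketch of building such a line follows. The field names are taken from the example above; the function name is hypothetical, and the sketch follows the prose in leaving out `kind` and `error` when they are absent.

```python
import json

SCHEMA_VERSION = 1


def tx_result_event(idx, tx_hash, start_timestamp_ms, kind=None, error=None):
    """Build one flat, versioned tx_result line."""
    event = {
        "version": SCHEMA_VERSION,  # bumped on breaking schema changes
        "type": "tx_result",        # discriminates the event kind
        "idx": idx,
        "tx_hash": tx_hash,
        "start_timestamp_ms": start_timestamp_ms,
    }
    # Optional fields are only emitted when they carry a value.
    if kind is not None:
        event["kind"] = kind
    if error is not None:
        event["error"] = error
    return json.dumps(event, separators=(",", ":"))
```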

#3 — Structured output ✅ (default for spam-stream)
Per your suggestion, structured output is the default rather than a flag. One tx_result event is emitted per input spec on stdout after each send attempt (including send errors), mirroring the input stream so reactive callers can correlate. Human-readable logs stay on stderr via tracing, so the two streams don't collide.
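
A reactive caller could pair results with its own inputs along these lines. This is a sketch under assumptions: it treats `idx` as the 0-based position of the spec in the input (the example event shows `idx` 0, but the exact counting rule is not spelled out here), and it ignores events whose version or type it does not recognize so that new event kinds do not break it.

```python
import json


def correlate(specs, event_lines):
    """Pair each input spec with its tx_result event, matched on idx."""
    by_idx = {}
    for raw in event_lines:
        line = raw.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("version") != 1 or event.get("type") != "tx_result":
            continue  # unknown version or event kind: skip it
        by_idx[event["idx"]] = event
    # Specs with no matching event are paired with None.
    return [(spec, by_idx.get(i)) for i, spec in enumerate(specs)]
```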

#5 — Decoy TestConfig hack ✅
Removed the fabricated zero-address spam tx. The scenario now starts from an empty TestConfig::new(); the executor pool is provisioned directly via AgentStore::add_new_agent(...), its signers are registered into scenario.signer_map, and nonces are synced from the RPC. No contender_core changes — all in the CLI crate, non-breaking.

Deferred (per your guidance):

  • #1 (stream mode in contender_core / Spammer trait refactor): left as a follow-up; out of scope for the prototype.
  • #4 (parallel --tps sends): skipped; judged not worth the effort for the relayer case.

Docs (docs/stream-mode.md) updated to document the structured output format and the direct pool provisioning, and the open-questions list reflects what's resolved vs. deferred.

cargo build, cargo clippy --all-targets, cargo fmt, and cargo test -p contender_cli spam_stream (5/5 passing, incl. the new envelope test) are all green.

@jelias2 jelias2 marked this pull request as ready for review June 2, 2026 13:16
@jelias2 jelias2 requested a review from zeroXbrock as a code owner June 2, 2026 13:16
@jelias2 jelias2 changed the title Draft: spam-stream subcommand for streaming tx specs (relayer use case) spam-stream subcommand for streaming tx specs (relayer use case) Jun 2, 2026
@jelias2
Contributor Author

jelias2 commented Jun 2, 2026

Pushed 2d5bc83 addressing the review gaps:

  • Run recording — receipts were being dumped under run_id = 0; run_txs has a FK into runs, so they were orphaned/unqueryable. Now registers a real run via insert_run.
  • Gas price — cached and refreshed every 6s instead of an RPC round-trip per tx.
  • Backpressure — the reader now emits a backpressure event when the input buffer saturates, then blocks to apply real backpressure (the feedback question from the review).
  • Summary — a terminal summary event reports sent/failed.
  • Blob/7702 — specs carrying blob_data/authorization_address are rejected up front rather than silently building an invalid EIP-1559 tx.
  • Tidy-ups: unified the stdin/file reader, dropped dead run_id plumbing.
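
The gas-price change is a time-based cache. A minimal model of that idea is sketched below; it refreshes lazily on read once the value is older than 6 seconds, which is an assumption (the actual code may refresh on a timer instead), and `fetch` stands in for the RPC call.

```python
import time


class GasPriceCache:
    """Return a cached gas price, refetching once it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=6.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if stale:
            self._value = self._fetch()  # one RPC round-trip per refresh
            self._fetched_at = now
        return self._value
```

With this shape, a burst of sends inside one 6-second window costs a single gas-price lookup instead of one per tx.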

Build + clippy clean; spam_stream tests pass (added one for the new envelope variants). Note: the uniV2/uniV3 generated-scenario tests fail locally on main too (anvil deploy timeout) — unrelated to this change.

@jelias2 jelias2 force-pushed the jelias/spam-stream-mode branch from 2d5bc83 to 4c4bcef Compare June 2, 2026 14:35
jelias2 added 3 commits June 2, 2026 11:03
Reads newline-delimited JSON FunctionCallDefinitions from stdin or a
file and spams them via the existing TestScenario pipeline. Reuses
agent pools, rate limiting, nonce management, and receipt tracking.

See docs/stream-mode.md for the design note and scope.
- Address review feedback on PR #589:
- #2/#3: emit structured, versioned/tagged JSON envelope on stdout
  (one tx_result event per spec) so the schema can evolve; default
  for spam-stream mode (logs stay on stderr).
- #5: replace the decoy zero-address TestConfig with a direct
  AgentStore pool, injecting signers into the scenario and syncing
  nonces from the RPC.
…ummary

Address review gaps on the prototype:
- record a real run via insert_run so dumped receipts aren't orphaned
  under run_id 0 (run_txs has a foreign key into runs)
- cache the gas price, refreshing every 6s instead of once per tx
- emit `backpressure` and a terminal `summary` event; track sent vs failed
- reject blob (4844) / setCode (7702) specs up front
- unify the stdin/file reader and drop dead run_id plumbing
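
The up-front rejection of blob and setCode specs amounts to a field check before any tx is built. A sketch of that check, using the field names mentioned in this thread (`blob_data`, `authorization_address`); the function name and error wording are hypothetical:

```python
UNSUPPORTED_FIELDS = ("blob_data", "authorization_address")


def reject_unsupported(spec: dict) -> dict:
    """Raise if the spec asks for a tx type stream mode does not build."""
    present = [name for name in UNSUPPORTED_FIELDS if spec.get(name) is not None]
    if present:
        raise ValueError("unsupported in stream mode: " + ", ".join(present))
    return spec
```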
@jelias2 jelias2 force-pushed the jelias/spam-stream-mode branch from 4c4bcef to 3e53066 Compare June 2, 2026 17:15
jelias2 added 2 commits June 3, 2026 22:42
…sing

TestScenario::sync_nonces() is gated on should_sync_nonces (= the
sync_nonces_after_batch param). spam-stream set it false, making its two
explicit sync_nonces() calls silent no-ops, so pool accounts' nonces were
never loaded and prepare_tx_request failed every send with NonceMissing
(surfaced as 'core error'). Stream mode sends one tx at a time and never hits
the post-batch sync path, so enabling this only makes the initial pool-nonce
sync run.
prepare_tx_request advances an account's local nonce before the send. When the
send is rejected (e.g. an interop access-list filter rejecting a not-yet-valid
or forged executing message), the tx never enters the mempool but the local
nonce stays advanced, leaving a gap that stalls every later tx from that
account behind it. Roll the nonce back by one on a failed send. The stream
sends serially, so no concurrent send touched the account in between.
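
The two nonce fixes above can be illustrated with a toy tracker. This is a simplified model, not contender's code: `NonceTracker` and `send_with_rollback` are hypothetical names, the `KeyError` stands in for the NonceMissing failure, and the rollback is only safe because sends are serial.

```python
class NonceTracker:
    """Track the next local nonce per account."""

    def __init__(self):
        self._next = {}

    def sync(self, address, onchain_nonce):
        """Load the account's nonce, e.g. from the RPC at startup."""
        self._next[address] = onchain_nonce

    def reserve(self, address):
        """Hand out the next nonce and advance the local counter."""
        if address not in self._next:
            raise KeyError(f"nonce missing for {address}")  # never synced
        nonce = self._next[address]
        self._next[address] = nonce + 1
        return nonce

    def rollback(self, address):
        """Undo the most recent reserve after a rejected send."""
        self._next[address] -= 1


def send_with_rollback(tracker, address, send):
    nonce = tracker.reserve(address)
    try:
        return send(nonce)
    except Exception:
        # The tx never reached the mempool, so reuse this nonce next time
        # instead of leaving a gap that stalls later txs from this account.
        tracker.rollback(address)
        raise
```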
Member

@zeroXbrock zeroXbrock left a comment


I'm seeing some odd behavior with the --tps flag, not sure if I'm misinterpreting the results or if there's a bug.

When I run the following:

echo '{"to":"0xdeAD000000000000000000000000000000000000","value":"1 wei","gas_limit":21000}' | cargo run -- spam-stream -p 0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 --tps 10

if I provide --tps 0 or --tps 1000 it doesn't make a difference. I believe this is to be expected, because I only have one transaction to send, but the help menu doesn't explain this. Could you add some more detail in the long_help designation for the --tps flag?

I'm also seeing weird behavior with --skip-funding. I try to fund the accounts with --min-balance 10000000000000000000 in a prior step, then run again with --skip-funding instead of --min-balance, and I get an "insufficient funds" error. The accounts should be funded. It looks like we're not using the same RandSeed every time -- why not? We do this in every other command.

On top of that, --min-balance should support unit-value strings (e.g. "10 eth"). Use the parse_value function from utils -- it's used in several other places in the cli crate's command interfaces, via serde's deserialize_with attribute.

jelias2 added 3 commits June 5, 2026 15:25
--tps paces how fast specs are pulled off the input stream; each spec is
sent exactly once, so a one-line input sends a single tx regardless of the
value. Add long_help explaining this, plus tests asserting the help text is
present and the clap arg config is valid (debug_assert).
…hbots#589)

--min-balance now parses unit-value strings ("10 eth", "0.5 ether",
"100 gwei") via util::parse_value, matching the other CLI balance/value
flags; a plain number is still wei. Default expressed as "0.01 ether"
(unchanged 1e16 wei). Tests cover unit parsing, the wei fallback, and the
default round-trip through the value_parser.
…eview flashbots#589)

When --seed was unset, spam-stream generated a fresh random RandSeed each
invocation, so the executor pool's addresses differed every run. Funding the
pool with --min-balance in one run then re-running with --skip-funding hit
"insufficient funds" because the second run derived a different (unfunded)
pool.

Fall back to the persisted seedfile (data_dir/seed) when --seed is unset,
matching spam/setup/campaign. Threads data_dir into spam_stream(). Pool
addresses are now stable across invocations for a given data-dir.

Test: build_pool_agent_store is deterministic per seed (same seed -> same
addresses, different seed -> different). Manually verified against anvil:
run 1 funds the pool + writes the seedfile; run 2 with --skip-funding and no
funder key sends successfully (previously failed with insufficient funds).
@jelias2
Contributor Author

jelias2 commented Jun 5, 2026

Thanks @zeroXbrock! Addressed all three in separate commits on top of the branch:

--tps behavior + docs (c8657ad): You read it right. With a single input spec, --tps 0 vs 1000 makes no difference because the rate is bounded by the stream, not by tx duplication. Each spec is sent exactly once; --tps only paces how fast specs are pulled off the input. Added a long_help spelling this out:

--tps <TPS>
    Target transactions per second. This paces how fast specs are pulled off the
    input stream, NOT how many txs are duplicated: each input spec is sent exactly
    once. With `0` (the default) specs are sent as fast as they arrive. If the stream
    supplies fewer specs per second than `--tps`, the rate is bounded by the stream,
    so a one-line input sends a single tx regardless of the value.
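
The pacing rule in this help text can be modeled as a generator that yields every spec exactly once and only waits between pulls. This is a Python illustration of the semantics, not the actual scheduler; the `clock` and `sleep` parameters are injection points added here so the behavior can be checked without real delays.

```python
import time


def paced(specs, tps, clock=time.monotonic, sleep=time.sleep):
    """Yield each spec once, no faster than tps per second (0 disables pacing)."""
    interval = 1.0 / tps if tps > 0 else 0.0
    next_at = clock()
    for spec in specs:
        if interval > 0:
            wait = next_at - clock()
            if wait > 0:
                sleep(wait)
            # If the stream was slower than the target rate, pace from now
            # rather than trying to catch up with a burst.
            next_at = max(next_at, clock()) + interval
        yield spec
```

With a single input spec the loop body runs once and never sleeps, so `--tps 0`, `--tps 10`, and `--tps 1000` all behave the same, which is the case observed in the review.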

--skip-funding across runs (c2b3d02) — Good catch, this was a real bug. Stream mode was calling RandSeed::new() (fresh random seed) whenever --seed was unset, so the executor pool's addresses differed every invocation — funding in run 1 then --skip-funding in run 2 derived a different, unfunded pool. Now it falls back to the persisted seedfile (<data_dir>/seed) when --seed is unset, exactly like spam/setup/campaign. Verified against anvil: fund the pool in run 1, then run 2 with --skip-funding and no funder key sends successfully (previously failed with "insufficient funds").
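
The fix boils down to a seed resolution order: explicit `--seed` first, then the persisted seedfile, creating it on first use. A sketch of that order follows. The seed format and the `pool_ids` derivation are hypothetical stand-ins (contender derives real signers from its RandSeed); only the `<data_dir>/seed` location comes from this thread.

```python
import hashlib
import secrets
from pathlib import Path


def resolve_seed(cli_seed, data_dir):
    """Prefer an explicit seed; otherwise reuse (or create) <data_dir>/seed."""
    if cli_seed is not None:
        return cli_seed
    seed_path = Path(data_dir) / "seed"
    if seed_path.exists():
        return seed_path.read_text().strip()
    seed = "0x" + secrets.token_hex(32)  # assumed format, for illustration
    seed_path.parent.mkdir(parents=True, exist_ok=True)
    seed_path.write_text(seed)
    return seed


def pool_ids(seed, pool_name, size):
    """Hypothetical stand-in for deriving pool accounts from a seed."""
    return [
        hashlib.sha256(f"{seed}:{pool_name}:{i}".encode()).hexdigest()[:40]
        for i in range(size)
    ]
```

Because the second run reads back the same seed, it derives the same pool, so accounts funded in run 1 are the ones used with --skip-funding in run 2.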

--min-balance unit strings (584057b) — Done; switched it to value_parser = parse_value with default_value = "0.01 ether", matching the other balance/value flags. It now accepts "10 eth", "0.5 ether", "100 gwei", etc., with a plain number still parsed as wei.
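
For reference, the accepted inputs convert to wei as sketched below. This is a Python model of the conversions named above, not the parse_value implementation: the function name `parse_value_wei`, the set of unit aliases, and the requirement of a space between number and unit are assumptions.

```python
from fractions import Fraction

# Decimal exponents for the units mentioned in this thread.
UNIT_EXPONENTS = {"wei": 0, "gwei": 9, "eth": 18, "ether": 18}


def parse_value_wei(text: str) -> int:
    """Convert strings like "10 eth" or "100 gwei" to wei; bare numbers are wei."""
    parts = text.strip().split()
    if len(parts) == 1:
        number, unit = parts[0], "wei"
    elif len(parts) == 2:
        number, unit = parts[0], parts[1].lower()
    else:
        raise ValueError(f"cannot parse value: {text!r}")
    if unit not in UNIT_EXPONENTS:
        raise ValueError(f"unknown unit: {unit!r}")
    wei = Fraction(number) * 10 ** UNIT_EXPONENTS[unit]  # exact arithmetic
    if wei < 0 or wei.denominator != 1:
        raise ValueError(f"not a whole, non-negative number of wei: {text!r}")
    return int(wei)
```

Under this model the default "0.01 ether" works out to 10**16 wei, matching the unchanged 1e16 wei default noted in the commit message.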

Added unit tests for each (help text present, unit parsing + wei fallback + default round-trip, and pool-address determinism per seed); all spam_stream tests pass.

Member

@zeroXbrock zeroXbrock left a comment


nicely done, thank you!

@zeroXbrock zeroXbrock merged commit c91598a into flashbots:main Jun 5, 2026
6 of 7 checks passed