bench(fullhistory): unify ingest into hot-ingest + cold-ingest commands by tamirms · Pull Request #752 · stellar/stellar-rpc

tamirms · 2026-05-23T19:07:31Z

Replace six per-type ingest commands (hot-{ledgers,txhash,events}-ingest, cold-{ledgers,events}-ingest, ingest-raw-txhash) with two unified commands:

hot-ingest --types=<subset of ledgers,txhash,events> --source=pack|bsb
cold-ingest --types=<subset of ledgers,events,txhash> --source=pack|bsb

Both accept --xdr-views (view vs parsed extract strategy), --parallel (fan ingesters out across goroutines per ledger via errgroup), --cpuprofile, --memprofile. Cold-ingest takes per-type packfile tuning flags (--ledgers-packfile-concurrency, --events-packfile-concurrency, etc). Source is treated uniformly as a ledgerbackend.LedgerBackend so the per-chunk loop is identical for local cold packfiles (packBackend adapter) and BSB-from-GCS (BufferedStorageBackend). build-txhash-index stays a separate command (phase 2 of the cold txhash MPHF build).

Design highlights:

One Ingester interface { Ingest(ctx, Ledger); io.Closer }; cold ingesters additionally implement ColdIngester (adds Finalize). Hot driver holds []Ingester, cold driver holds []ColdIngester — no type assertion in either per-ledger loop.
Per-data-type files (ingest_ledgers.go / ingest_txhash.go / ingest_events.go) each contain a Hot + Cold pair plus a typed Collector. View vs parsed selected by a constructor bool, switched inside the ingester via a method (no closure-in-field).
Ledger value type carries (Seq, Raw, *xdr.LedgerCloseMeta); driver unmarshals once when at least one ingester needs the parsed struct (needsLCM optimization avoids the decode for ledgers-only runs).
Aggregation-only CSV output: one row per stage with n, n_items, total_ns, p50/p90/p99/max. No per-ledger CSV (only aggregates were ever consumed in the old benches).
Driver records read_blocked, lcm_decode (parsed only), fan_out_per_ledger, total_per_ledger (hot only); prepareRange threaded through from openSourceBackend so wall throughput rates include the real prepare cost.
Crash-handling: Close always deferred via errors.Join; Finalize called explicitly on the success path with error check; main dispatcher returns int exit code from cmdHotIngest / cmdColdIngest so deferred cleanup always runs before os.Exit.

Replace six per-type ingest commands (hot-{ledgers,txhash,events}-ingest, cold-{ledgers,events}-ingest, ingest-raw-txhash) with two unified commands: hot-ingest --types=<subset of ledgers,txhash,events> --source=pack|bsb cold-ingest --types=<subset of ledgers,events,txhash> --source=pack|bsb Both accept --xdr-views (view vs parsed extract strategy), --parallel (fan ingesters out across goroutines per ledger via errgroup), --cpuprofile, --memprofile. Cold-ingest takes per-type packfile tuning flags (--ledgers-packfile-concurrency, --events-packfile-concurrency, etc). Source is treated uniformly as a ledgerbackend.LedgerBackend so the per-chunk loop is identical for local cold packfiles (packBackend adapter) and BSB-from-GCS (BufferedStorageBackend). build-txhash-index stays a separate command (phase 2 of the cold txhash MPHF build). Design highlights: - One Ingester interface { Ingest(ctx, Ledger); io.Closer }; cold ingesters additionally implement ColdIngester (adds Finalize). Hot driver holds []Ingester, cold driver holds []ColdIngester — no type assertion in either per-ledger loop. - Per-data-type files (ingest_ledgers.go / ingest_txhash.go / ingest_events.go) each contain a Hot + Cold pair plus a typed Collector. View vs parsed selected by a constructor bool, switched inside the ingester via a method (no closure-in-field). - Ledger value type carries (Seq, Raw, *xdr.LedgerCloseMeta); driver unmarshals once when at least one ingester needs the parsed struct (needsLCM optimization avoids the decode for ledgers-only runs). - Aggregation-only CSV output: one row per stage with n, n_items, total_ns, p50/p90/p99/max. No per-ledger CSV (only aggregates were ever consumed in the old benches). - Driver records read_blocked, lcm_decode (parsed only), fan_out_per_ledger, total_per_ledger (hot only); prepareRange threaded through from openSourceBackend so wall throughput rates include the real prepare cost. - Crash-handling: Close always deferred via errors.Join; Finalize called explicitly on the success path with error check; main dispatcher returns int exit code from cmdHotIngest / cmdColdIngest so deferred cleanup always runs before os.Exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pull ctx out of hot/coldDeps to satisfy containedctx, drop the loopvar copy and unused gocognit nolints, demote duplicate package-level godocs to file comments, swap fmt.Errorf for errors.New on non-formatted literals, and add a !linux stub for evictFile so the bench builds off Linux. PR 752's diff is now lint-clean; the remaining repo findings predate rpc-hack and live in files this branch doesn't touch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Remove four single-purpose scripts that no longer carry their weight: verify-pack (self-labeled throwaway, fully subsumed by spot-check+cold-read before those went too), cold-read (demo with no production caller), migrate-cold (one-shot format converter whose migration completed long ago), and spot-check (random-sample verifier never invoked from CI or operator workflows). In bench-fullhistory, delete tier_adapters.go (a one-callsite shim from the previous tier-dispatch design — inline its IterateLedgers usage at the cold-ledgers callsite) and sweep doc comments / one user-facing fatal message that still pointed at the per-type ingest commands PR 752 removed (hot-ledgers-ingest, ingest-raw-txhash, cold-events-ingest, etc). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…chunk cold side Replace the (n × workers) 2D grid on cold/hot-ledgers and the single-threaded loops on cold/hot-{txpage,txhash,events} with a unified 1D --workers comma-list sweep. All 8 read benches now share runConcurrentSweep (cold) / runConcurrentSweepWithWarmup (hot) plus the printSweepHeader/Row/reportSaturation helpers in bench_grid.go. The 4 cold non-ledger benches (cold-txpage, cold-txhash, cold-events) were fixed-chunk single-threaded by design — under workers>1 they'd race on the same packfile's eviction. They now span multiple chunks: cold-txpage builds a per-chunk preflight set and the cursor stays within the picked chunk; cold-txhash samples hashes across discovered chunks and uses a startup MPHF coverage probe to drop chunks the MPHF doesn't index; cold-events builds per-chunk corpora and the iter picks a random chunk + evicts its three event files. The MPHF probe uses --mphf-probe-hashes (default 32) actual hashes sliced from a 3-ledger sample, giving ~1.3% per-uncovered-chunk false-coverage on a 10-chunk MPHF. Each cold bench gains optional --chunk-lo / --chunk-hi flags so operators can constrain runs for reproducibility (defaults to discoverChunks under --cold-dir). iterOp closures take a `measured bool` so CSV writes and per-cell accumulator appends only fire on the measured pass; warmup actually exercises the code paths but doesn't contaminate output. RNG seeds mix `workers` into the PCG state so cells with different worker counts produce genuinely independent streams instead of sharing prefixes. kLabels in corpus.go switched from a lazy-intern map (a data race under concurrent Next()) to an immutable [16]string precomputed via IIFE. latencyStats.line omits ops/s; the cell-level wall-clock ops/s is printed by printSweepRow, and per-class / hit-miss sub-summary lines no longer print the misleading sum-of-latencies/total form. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ncurrency refactor Update the package-level Go doc and mains usage() text to describe the post-refactor command shape: 1D --workers comma-list sweep on all 8 read benches, --n single-valued on ledger benches, multi-chunk methodology on cold-txpage/txhash/events, and optional --chunk-lo/--chunk-hi to constrain the chunk range. Both surfaces were left referencing the deleted n-by-workers 2D grid sweep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reuse buffers instead of allocating per ledger/event/record on the cold ingest path: - ledger.ColdReader.GetLedgerRawInto: read a ledger into a caller-reused buffer (the per-ledger source clone was the single dominant allocator). - events.Payload.MarshalInto + eventstore.ColdWriter scratch: marshal each event into a reused buffer instead of a fresh slice. - events.LCMToPayloadsFromRawInto: thread a reusable []Payload accumulator through V3/V4 extraction so the per-ledger and per-tx payload slices are not reallocated. - packfile.Writer record-buffer pool: borrow the per-record payload buffer in buildRecord, return it in writeRecord (the single terminal consumer for both the async and passthrough paths), recycling it across records. The original Marshal / GetLedgerRaw / LCMToPayloadsFromRaw remain as owned-buffer wrappers; the packfile pooling is transparent to callers. Verified byte-identical cold artifacts across ledgers/events/txhash and packfile tests pass under -race. Cuts cold-ingest allocation ~91% and the saturated multi-chunk wall ~38% (4m35 -> 2m50 at chunk-workers=10). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Fix --source=bsb with --chunk-workers>1: open one BufferedStorageBackend session per chunk inside each worker rather than sharing a single monotonic-cursor instance (sharing raced the cursor under RLock and rejected each worker's out-of-order forward range). Latent — every multi-chunk sweep had used --source=pack. bsb-buffer-size / -num-workers are now documented as per-chunk-worker. - Collector refactor: one perLedgerRows() filtering pass per collector, shared by PrintSummary/WriteCSV/InPipelineTime (removes the duplicated per-stage filtering); drop dead chunkResult.chunkIdx + its parameter; add LedgerCollector.Writes() instead of reaching into samples. - Wire the new buffer-reuse APIs (GetLedgerRawInto / MarshalInto / LCMToPayloadsFromRawInto) into the pack backend + events ingesters. - Delete superseded tools: full-history-backfill (its BSB->ColdWriter per-chunk path is covered by cold-ingest --source=bsb) and the verify-ingest scratch verifier. - Inline the single-chunk openSourceBackend shim into the hot driver. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Document that the reusable-buffer arg to GetLedgerRawInto / MarshalInto / LCMToPayloadsFromRawInto is single-owner and must not be shared across goroutines, and note that packBackend.GetLedger intentionally uses the cloning (owned) read rather than the borrowing GetLedgerRaw. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Address golangci-lint (CI v2.11.3) findings from the cold-ingest allocation + per-chunk-BSB work: - goconst: extract sourcePack/sourceBSB constants for --source values. - forcetypeassert: comma-ok assertion in packfile getRecordBuf. - copyloopvar/intrange: drop redundant loop-var copy; use integer range. - nolintlint: drop now-unused cyclop/gocognit suppressions. - funcorder: annotate the perLedgerRows / getRecordBuf helpers. - misspell/embeddedstructfieldcheck/gci: minor formatting. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Describe all sub-commands — the per-tier read benches, the unified hot/cold ingest, and build-txhash-index — with their methodology, key flags, output format, and usage examples. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…oncurrency The read benches model concurrent *query load* — N goroutines issuing queries back-to-back, i.e. N requests in flight at once — not internal worker threads. --query-concurrency names that directly and disambiguates from ingest's --chunk-workers / --parallel. Renamed across all 8 read benches, with the per-iter/summary CSV column matched (workers -> query_concurrency) and usage/help/error messages updated. build-txhash-index's separate --workers (streamhash block-builder count) is intentionally left as-is. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ency Remove the build/CGO instructions from the README and rewrite the read-bench concurrency section around the renamed --query-concurrency flag (the closed-loop concurrent-query load model). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Rewire the cold/hot ingest drivers to consume a ledgerbackend.LedgerStream (RawLedgers) instead of constructing a LedgerBackend: a packStream over the local cold packfile and NewBufferedStorageStream for GCS, each yielding per-ledger borrows the driver consumes in-scope (extract + write) before the next yield. ColdReader.IterateLedgers and HotStore.IterateLedgers now yield packfile / decode-scratch borrows instead of cloning each ledger; callers that retain Entry.Bytes copy it (the read benches and tests do). This removes the per-ledger clone that dominated ingest allocation. Point go.mod at the go-stellar-sdk ledger-stream branch (drops the dev-only local-path replace). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Re-point at the SDK ledger-stream branch after factoring the three LedgerStream implementations onto a shared streamRaw skeleton + rawReader, so all RawLedgers methods share one build->prepare->loop-> yield->close shape. No public API change. Also document HotStore.GetLedgerRaw's owned-copy contract to match ColdReader.GetLedgerRaw, stating the copy-vs-borrow policy at both point-read call sites: iterators borrow, point getters own. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

golangci-lint (gci) failed on bench_hot_ingest.go: the hotDeps struct fields were aligned to a wider column than gofmt's, since the later-added cpuProfile/memProfile fields set a narrower one. gofmt -w normalizes the whole block; gci reports gofmt deviations as its own failure. Also re-point go-stellar-sdk at the ledger-stream tip (30fa0a16), which carries the gofmt cleanup of the GetLedgerRaw removal (blank-line collapse in metrics.go/rpc_backend.go/loadtest) that fixed the SDK's gogenerate check. No behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… tip Picks up the Copilot-review fixes (lock-contract docs, fetchSequence rename, decompress hint clamp). No behavior change here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

socket-security · 2026-05-25T18:41:31Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	golang/github.com/stellar/go-stellar-sdk@v0.5.1-0.20260512162508-8d3416569d39 ⏵ v0.5.1-0.20260525183917-db1dd340f455	⁺¹

View full report

The ingest driver reads through ledgerbackend.LedgerStream (RawLedgers), not the old LedgerBackend interface. Update the README description. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Tamir Sen and others added 12 commits May 23, 2026 09:34

docs(bench-fullhistory): add README

de7e05f

Describe all sub-commands — the per-tier read benches, the unified hot/cold ingest, and build-txhash-index — with their methodology, key flags, output format, and usage examples. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

tamirms force-pushed the events-auto-corpus-bench branch from c8daf0e to 0f645d3 Compare May 24, 2026 21:57

Tamir Sen and others added 4 commits May 25, 2026 14:22

chore(bench-fullhistory): bump go-stellar-sdk to ledger-stream review…

a35fd43

… tip Picks up the Copilot-review fixes (lock-contract docs, fetchSequence rename, decompress hint clamp). No behavior change here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

docs(bench-fullhistory): ingest driver streams via LedgerStream

91df571

The ingest driver reads through ledgerbackend.LedgerStream (RawLedgers), not the old LedgerBackend interface. Update the README description. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

tamirms merged commit 4e2ed21 into rpc-hack May 25, 2026
15 checks passed

tamirms deleted the events-auto-corpus-bench branch May 25, 2026 19:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bench(fullhistory): unify ingest into hot-ingest + cold-ingest commands#752

bench(fullhistory): unify ingest into hot-ingest + cold-ingest commands#752
tamirms merged 17 commits into
rpc-hackfrom
events-auto-corpus-bench

tamirms commented May 23, 2026

Uh oh!

socket-security Bot commented May 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tamirms commented May 23, 2026

Uh oh!

socket-security Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

socket-security Bot commented May 25, 2026 •

edited

Loading