LLM-based optimizations for `BatchCompleter` by brandur · Pull Request #1293 · riverqueue/river

brandur · 2026-06-20T15:56:48Z

This one's just throwing Codex at BatchCompleter to see what
optimizations it can find in there as this seems to be by far our
slowest spot in River with respect to benchmarking at least.

It didn't do anything huge, but ended up putting in a variety of minor
optimizations:

Removed the separate setStateStartTimes map by storing StartTime
in batchCompleterSetState.
Changed queued state storage from pointer values to map values,
removing a wrapper allocation.
Collapsed backlog checking + enqueue into one lock acquisition on the
common path.
Replaced sliceutil.Map with a direct event-building loop.
Use one batch completion timestamp instead of calling Now() once per
completed row.
Avoid repeatedly assigning params.Schema inside the batch mapping
loop.

Benchmarks seem to indicate a previous 12-13 us/op coming down to 10-11
us/op, and with one fewer allocation per op.

I'm seeing a definite speedup when I run the full benchmark. Our bench
has some consistency problems (can show considerably different results
per run), but whereas before ~46k jobs/sec was about the best I ever saw
on my commodity MBP here, I'm now seeing up to ~56k jobs/sec, so at best
a 10k jobs/sec improvement:

$ go run ./cmd/river bench --database-url $DATABASE_URL --num-total-jobs 1_000_000
bench: jobs worked [          0 ], inserted [    1000000 ], job/sec [        0.0 ] [0s]
bench: jobs worked [     106472 ], inserted [          0 ], job/sec [    53236.0 ] [2s]
bench: jobs worked [     108440 ], inserted [          0 ], job/sec [    54220.0 ] [2s]
bench: jobs worked [     114035 ], inserted [          0 ], job/sec [    57017.5 ] [2s]
bench: jobs worked [     107402 ], inserted [          0 ], job/sec [    53701.0 ] [2s]
bench: jobs worked [     114433 ], inserted [          0 ], job/sec [    57216.5 ] [2s]
bench: jobs worked [     105701 ], inserted [          0 ], job/sec [    52850.5 ] [2s]
bench: jobs worked [     116051 ], inserted [          0 ], job/sec [    58025.5 ] [2s]
bench: jobs worked [     108054 ], inserted [          0 ], job/sec [    54027.0 ] [2s]
bench: jobs worked [     119412 ], inserted [          0 ], job/sec [    59706.0 ] [2s]
bench: total jobs worked [    1000000 ], total jobs inserted [    1000000 ], overall job/sec [    55710.8 ], running 17.949838958s

The number should be even better on faster computers.

Nothing in the code gets any worse (and I think some of it is actually
an improvement?) so I think it's probably worthwhile to bring these in.

bgentry

Nice! Just one minor code quality concern from me but overall a nice batch of improvements, good stuff 🙌

bgentry · 2026-06-29T00:54:21Z

+	for {
+		// Keep the common enqueue path to one lock acquisition. If the
+		// completer is behind, wait for the current backlog gate to open and
+		// retry so the threshold is checked against fresh state.
+		c.setStateParamsMu.Lock()
+		if c.waitOnBacklogWaiting {
+			waitChan := c.waitOnBacklogChan
+			c.setStateParamsMu.Unlock()
+			<-waitChan
+			continue
+		}

-	statsSnapshot := *stats
+		waitAt := c.backlogWaitThresholdEffective()
+		backlogSize = len(c.setStateParams)
+		if backlogSize >= waitAt {
+			c.initBacklogWaitLocked(ctx, backlogSize, waitAt)
+		}

-	c.setStateParams[params.ID] = &batchCompleterSetState{params, &statsSnapshot}
-	c.setStateStartTimes[params.ID] = now
-	backlogSize := len(c.setStateParams)
-	c.setStateParamsMu.Unlock()
+		statsSnapshot := *stats
+		c.setStateParams[params.ID] = batchCompleterSetState{Params: params, StartTime: now, Stats: &statsSnapshot}
+		backlogSize = len(c.setStateParams)
+		c.setStateParamsMu.Unlock()
+		break
+	}


This section feels a little sketchier to me in terms of mutex handling fwiw, might be worth asking if a refactor could clean it up and make it easier to read.

Good call. I had Codex refactor it into JobSetStateIfRunning plus a helper function and it's much cleaner now. All mutex handling happens with a single lock and a deferred unlock.

This one's just throwing Codex at `BatchCompleter` to see what optimizations it can find in there as this seems to be by far our slowest spot in River with respect to benchmarking at least. It didn't do anything huge, but ended up putting in a variety of minor optimizations: - Removed the separate `setStateStartTimes` map by storing `StartTime` in `batchCompleterSetState`. - Changed queued state storage from pointer values to map values, removing a wrapper allocation. - Collapsed backlog checking + enqueue into one lock acquisition on the common path. - Replaced `sliceutil.Map` with a direct event-building loop. - Use one batch completion timestamp instead of calling `Now()` once per completed row. - Avoid repeatedly assigning `params.Schema` inside the batch mapping loop. Benchmarks seem to indicate a previous 12-13 us/op coming down to 10-11 us/op, and with one fewer allocation per op. I'm seeing a definite speedup when I run the full benchmark. Our bench has some consistency problems (can show considerably different results per run), but whereas before ~46k jobs/sec was about the best I ever saw on my commodity MBP here, I'm now seeing up to ~56k jobs/sec, so at best a 10k jobs/sec improvement: $ go run ./cmd/river bench --database-url $DATABASE_URL --num-total-jobs 1_000_000 bench: jobs worked [ 0 ], inserted [ 1000000 ], job/sec [ 0.0 ] [0s] bench: jobs worked [ 106472 ], inserted [ 0 ], job/sec [ 53236.0 ] [2s] bench: jobs worked [ 108440 ], inserted [ 0 ], job/sec [ 54220.0 ] [2s] bench: jobs worked [ 114035 ], inserted [ 0 ], job/sec [ 57017.5 ] [2s] bench: jobs worked [ 107402 ], inserted [ 0 ], job/sec [ 53701.0 ] [2s] bench: jobs worked [ 114433 ], inserted [ 0 ], job/sec [ 57216.5 ] [2s] bench: jobs worked [ 105701 ], inserted [ 0 ], job/sec [ 52850.5 ] [2s] bench: jobs worked [ 116051 ], inserted [ 0 ], job/sec [ 58025.5 ] [2s] bench: jobs worked [ 108054 ], inserted [ 0 ], job/sec [ 54027.0 ] [2s] bench: jobs worked [ 119412 ], inserted [ 0 ], job/sec [ 59706.0 ] [2s] bench: total jobs worked [ 1000000 ], total jobs inserted [ 1000000 ], overall job/sec [ 55710.8 ], running 17.949838958s The number should be even better on faster computers. Nothing in the code gets any worse (and I think some of it is actually an improvement?) so I think it's probably worthwhile to bring these in.

brandur force-pushed the brandur-smoother-completion branch from 168f657 to d177425 Compare June 20, 2026 16:07

brandur force-pushed the brandur-completion-optimizations branch 2 times, most recently from 1e5df34 to 4a982be Compare June 20, 2026 16:19

brandur force-pushed the brandur-smoother-completion branch 2 times, most recently from fb7ef2c to 4acc752 Compare June 20, 2026 19:17

brandur force-pushed the brandur-completion-optimizations branch from 4a982be to d165299 Compare June 20, 2026 19:48

brandur requested a review from bgentry June 20, 2026 19:58

brandur force-pushed the brandur-smoother-completion branch from 4acc752 to 2bb559c Compare June 20, 2026 20:01

brandur force-pushed the brandur-completion-optimizations branch from d165299 to 9b268fa Compare June 20, 2026 20:03

bgentry approved these changes Jun 29, 2026

View reviewed changes

Base automatically changed from brandur-smoother-completion to master June 29, 2026 06:04

brandur force-pushed the brandur-completion-optimizations branch from 9b268fa to 5dbdca4 Compare June 29, 2026 06:04

brandur force-pushed the brandur-completion-optimizations branch from 5dbdca4 to 6bb88ad Compare June 29, 2026 19:30

brandur merged commit 944f077 into master Jun 29, 2026
15 checks passed

brandur deleted the brandur-completion-optimizations branch June 29, 2026 19:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LLM-based optimizations for `BatchCompleter`#1293

LLM-based optimizations for `BatchCompleter`#1293
brandur merged 1 commit into
masterfrom
brandur-completion-optimizations

brandur commented Jun 20, 2026

Uh oh!

bgentry left a comment

Uh oh!

bgentry Jun 29, 2026

Uh oh!

brandur Jun 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

brandur commented Jun 20, 2026

Uh oh!

bgentry left a comment

Choose a reason for hiding this comment

Uh oh!

bgentry Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

brandur Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants