chore(zql): penalize flip in the planner's cost model #5992
Draft
tantaman wants to merge 3 commits into
| Branch | mlaw/penalize-flip-cost |
| Testbed | Linux |

| Benchmark | File Size, KB (Result Δ%) | Baseline, KB | Upper Boundary, KB (Limit %) |
|---|---|---|---|
| zero-package.tgz | 2,113.78 (+0.08%) | 2,112.06 | 2,154.30 (98.12%) |
| zero.js | 280.12 (+0.02%) | 280.07 | 285.67 (98.06%) |
| zero.js.br | 74.40 (+0.01%) | 74.39 | 75.88 (98.05%) |

| Branch | mlaw/penalize-flip-cost |
| Testbed | self-hosted-metal |

| Benchmark | Throughput, ops/s x 1e3 (Result Δ%) | Baseline, ops/s x 1e3 | Lower Boundary, ops/s x 1e3 (Limit %) |
|---|---|---|---|
| src/client/custom.bench.ts > big schema | 37.17 (+5.48%) | 35.24 | 33.19 (89.30%) |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 1.11 (+1.45%) | 1.09 | 1.02 (91.44%) |
| src/client/zero.bench.ts > pk compare > pk = N | 19.97 (+2.22%) | 19.54 | 17.57 (87.98%) |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 1.53 (+2.12%) | 1.50 | 1.25 (81.83%) |

| Branch | mlaw/penalize-flip-cost |
| Testbed | self-hosted-metal |

| Benchmark | Throughput, ops/s (Result Δ%) | Baseline, ops/s | Lower Boundary, ops/s (Limit %) |
|---|---|---|---|
| src/db/pg-copy.bench.ts > pg-copy benchmark > copy | 17.75 (-1.38%) | 18.00 | 17.73 (99.86%) |

Summary
The planner's cost model under-counts the runtime cost of `FlippedJoin` because it doesn't model two IVM-level costs that are invisible to SQLite's `scanstatus`:
- `FlippedJoin.fetch` reads ALL children into an array before any parent work, so even with a downstream `LIMIT` every child pays IVM cost (generator yields, debug accounting, btree-set inserts).
- `mergeSortedStreams` opens every chunk to seed its heap before yielding the first row; each open runs an IN-list SQL to its first match.
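The first cost can be illustrated with a minimal sketch. `CountingSource`, `makeSource`, `eagerTake`, and `lazyTake` are hypothetical stand-ins, not the real `FlippedJoin` or semi-join operators; the sketch only shows how buffering every child before applying a limit touches all rows, while a lazy consumer stops after `limit` rows.

```typescript
// Hypothetical stand-in for a child source: each call to pull() hands out one
// row and advances a counter, so we can see how many rows the consumer touched.
interface CountingSource {
  pull(): number | undefined;
  pulled(): number;
}

function makeSource(rowCount: number): CountingSource {
  let next = 0;
  return {
    pull: () => (next < rowCount ? next++ : undefined),
    pulled: () => next,
  };
}

// Eager strategy: buffer every child first, then apply the limit.
function eagerTake(source: CountingSource, limit: number): number[] {
  const all: number[] = [];
  for (let row = source.pull(); row !== undefined; row = source.pull()) {
    all.push(row);
  }
  return all.slice(0, limit);
}

// Lazy strategy: stop pulling once `limit` rows have been produced.
function lazyTake(source: CountingSource, limit: number): number[] {
  const out: number[] = [];
  while (out.length < limit) {
    const row = source.pull();
    if (row === undefined) break;
    out.push(row);
  }
  return out;
}

const eagerSource = makeSource(416000);
const lazySource = makeSource(416000);
const eagerRows = eagerTake(eagerSource, 50);
const lazyRows = lazyTake(lazySource, 50);
console.log(eagerSource.pulled(), lazySource.pulled()); // 416000 50
```

Both strategies return the same 50 rows; only the number of rows pulled through the pipeline differs, which is the cost SQLite's scan statistics cannot see.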
Concretely this means the cost model treats SQLite row scans
and IVM row processing as if they cost the same — but in
practice IVM is ~100× slower per row. For limited queries
with large child cardinality this leads the planner to pick
a flipped plan when semi would short-circuit far earlier.
This change adds `child.scanEst * FLIP_IVM_PER_CHILD_OVERHEAD` (constant `3`) to the flipped-join cost, gated on `parent.limit !== undefined && child.scanEst > getMultiConstraintChunkSize()`. The gate matters:
- No parent limit (`parent.limit === undefined`) → every row is consumed anyway, so flipped's eager-load isn't wasted. Don't penalize.
- Small child set (`child.scanEst ≤ 256`) → no `mergeSortedStreams` priming happens, so no IVM tax. Don't penalize.
This preserves flipped wins on full-scan queries and on small child sets, while penalizing the case it's actually wrong for: limited `TAKE` queries with huge child cardinality.
Measured impact
Benchmarked on a 211 GB zbugs replica via `zql-benchmarks/src/zbugs-profile.ts`. The pathological query is `project=gatewaycore AND open=true AND whereExists(label=api-gateway) AND whereExists(label=async-processing) ORDER BY modified DESC LIMIT 50`.
Other zbugs benchmark variants (with assignee, single-label, no-label) are unchanged in plan choice and row counts; their costs don't trip the gate.
Cost model details
For a flipped join with parent `P`, child `C`, and chunk size `K`:

    cost = C.cost
         + ceil(C.scanEst / K) * P.startupCost     // per-chunk prepare
         + C.scanEst * (P.cost + P.scanEst)        // per-child seek
         + (P.limit !== undefined && C.scanEst > K
              ? C.scanEst * 3                      // ← new IVM-overhead term
              : 0)
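
As a rough sketch, the formula above could be written as a standalone function. The `CostEstimate` interface and the function name `flippedJoinCost` are assumptions made for illustration; only the field names (`cost`, `scanEst`, `startupCost`, `limit`) and `FLIP_IVM_PER_CHILD_OVERHEAD` come from this description, and the planner's real types may differ. The sample inputs are made-up numbers, not measurements.

```typescript
// Hypothetical shape; field names follow the formula, not the planner's real types.
interface CostEstimate {
  cost: number;
  scanEst: number;
  startupCost: number;
  limit?: number;
}

const FLIP_IVM_PER_CHILD_OVERHEAD = 3;

function flippedJoinCost(
  parent: CostEstimate,
  child: CostEstimate,
  chunkSize: number,
): number {
  // One prepared statement per chunk of child rows.
  const prepare = Math.ceil(child.scanEst / chunkSize) * parent.startupCost;
  // One parent seek per child row.
  const seek = child.scanEst * (parent.cost + parent.scanEst);
  // New term: only charged for limited queries whose child set spans more than one chunk.
  const gated = parent.limit !== undefined && child.scanEst > chunkSize;
  const ivmOverhead = gated ? child.scanEst * FLIP_IVM_PER_CHILD_OVERHEAD : 0;
  return child.cost + prepare + seek + ivmOverhead;
}

// Illustrative inputs only.
const parentLimited: CostEstimate = { cost: 1, scanEst: 1, startupCost: 10, limit: 50 };
const parentUnlimited: CostEstimate = { cost: 1, scanEst: 1, startupCost: 10 };
const bigChild: CostEstimate = { cost: 1000, scanEst: 1000, startupCost: 0 };
const smallChild: CostEstimate = { cost: 256, scanEst: 256, startupCost: 0 };

console.log(flippedJoinCost(parentLimited, bigChild, 256));   // 6040 (penalized)
console.log(flippedJoinCost(parentUnlimited, bigChild, 256)); // 3040 (no limit, no penalty)
console.log(flippedJoinCost(parentLimited, smallChild, 256)); // 778 (fits one chunk, no penalty)
```

The three calls cover the three gate outcomes: limited with a large child set, unlimited, and limited with a child set that fits in a single chunk.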
The constant `3` is calibrated against the observed gap: pre-change flipped at 401k vs semi-semi at 1.24M for the v6 query; multiplying the 416k-row child scan by 3 (≈ 1.25M added) tips the choice. Other calibrations tried:
- `0`: baseline, v6 picks the bad plan (65 s).
- `1`, `2`: v6 still picks flipped (semi cost not reached).
- `3`: v6 picks semi-semi. Used.
- [...] legitimate flipped wins.
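The calibration arithmetic can be checked with a short sketch. It assumes the planner simply picks the lower of the two costs and uses the rounded figures quoted above (401k, 1.24M, 416k), so the margin at `2` is only approximate; `pickedPlan` is a hypothetical helper, not planner code.

```typescript
// Rounded figures from the description of the v6 query.
const flippedCostBefore = 401000;
const semiSemiCost = 1240000;
const childScanEst = 416000;

// Hypothetical helper: which plan wins for a given per-child overhead constant,
// assuming the planner picks the plan with the lower total cost.
function pickedPlan(overhead: number): "flipped" | "semi-semi" {
  const flippedCost = flippedCostBefore + childScanEst * overhead;
  return flippedCost < semiSemiCost ? "flipped" : "semi-semi";
}

for (let overhead = 0; overhead <= 3; overhead++) {
  console.log(overhead, flippedCostBefore + childScanEst * overhead, pickedPlan(overhead));
}
// 0 401000 flipped
// 1 817000 flipped
// 2 1233000 flipped
// 3 1649000 semi-semi
```

With these rounded inputs, `2` leaves flipped just under the semi-semi cost, while `3` is the first integer value that moves it above.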
Test impact
Existing `planner-join.test.ts` chunk-boundary tests now exercise a connection with `limit=50` so the gate fires and the IVM term is exercised. The helper `ovhFor(n, chunkSize)` makes assertions read symmetrically across the `n ≤ chunkSize` / `n > chunkSize` cases.
Integration tests
All `chinook/planner.pg.test.ts` (5/5) and `pagila/planner-exec.pg.test.ts` (18/18) tests pass unchanged. `chinook/planner-exec.pg.test.ts` is 25/26. The one regression is correlation-only, not a plan-choice regression:
`'extreme selectivity - artist to album to long tracks'` (indexed variant)
- [...] `within-optimal: 1.00x`, passes)
- [...] headroom)
The cost model's ranking of all-flipped plans against semi
alternatives drifts below this test's correlation
threshold, but the planner still picks the actually-optimal
plan. The base-DB threshold for this same test is already
`0.15`, so loosening the indexed threshold to `0.20` would be consistent with the existing tolerance. Per the project's
CI margin convention (~30% headroom for cost-model
thresholds), the threshold was tighter than the model's
underlying noise floor; this PR just exposes it.
Depends on
This PR can be reviewed independently, but its impact is best measured on top of the `Debug.rowVended` O(N²) fix (separate PR, branch `mlaw/fix-debug-rowvended-quadratic`). Without that fix the v6 query takes 260 s instead of 65 s and the 7.4 s post-fix number can't be reproduced.
Test plan
- `npm --workspace=zql run test -- planner` (90 tests, all pass)
- `npm --workspace=zql-integration-tests run test -- --project='*pg-18*' src/chinook/planner-exec.pg.test.ts` (25/26; known soft fail on `extreme selectivity` correlation)
- `npm --workspace=zql-integration-tests run test -- --project='*pg-18*' src/pagila/planner-exec.pg.test.ts` (18/18)
- `npm --workspace=zql-integration-tests run test -- --project='*pg-18*' src/chinook/planner.pg.test.ts` (5/5)
- `--project='*pg-15*'`, `'*pg-16*'`, `'*pg-17*'` before merge
- `extreme selectivity` indexed correlation threshold: leave the test failing as a known follow-up, or relax `0.35 → 0.20` in this PR
Follow-ups
- `extreme selectivity` indexed correlation threshold above.
- `3` is fitted to one workload. If other workloads surface where flipped is the right choice but gets penalized, consider scaling the overhead by `chunks` instead of `child.scanEst`, or making it proportional to the existing SQL cost rather than a flat constant.
- [...] `lbl_proj_000_15` and `lbl_proj_000_45` both at 416k rows, planner gets ~393k estimates), so the cost model is correctly informed; this PR is about cost formula, not stats.