fix(kv-cache): drop superseded continued snapshots on store by unsaltedbutter-ai · Pull Request #175 · antirez/ds4

unsaltedbutter-ai · 2026-05-17T02:33:04Z

Fixes #174

Implements the supersession proposed in #174.
Adds kv_cache_prune_supersedes in the store path, called before kv_cache_evict for cold and continued stores (skipped for evict/shutdown stores where the saved live state may diverge from upcoming requests). Includes four unit tests.
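
The gating described above can be sketched as a small predicate. This is only an illustration, not the code from the patch: the PR text names KV_REASON_COLD, while the other enum values and the helper name kv_store_should_prune are assumptions made up for this example.

```c
#include <stdbool.h>

/* Hypothetical reason codes. Only KV_REASON_COLD appears in the PR text;
 * the other three names are assumed for illustration. */
typedef enum {
    KV_REASON_COLD,
    KV_REASON_CONTINUED,
    KV_REASON_EVICT,
    KV_REASON_SHUTDOWN
} kv_reason;

/* Hypothetical helper: returns true when a store with this reason should
 * prune superseded continued snapshots before eviction runs. */
bool kv_store_should_prune(kv_reason reason) {
    switch (reason) {
    case KV_REASON_COLD:
    case KV_REASON_CONTINUED:
        return true;   /* saved text is a prefix future prompts can match */
    case KV_REASON_EVICT:
    case KV_REASON_SHUTDOWN:
        return false;  /* saved live state may contain post-divergence tokens */
    }
    return false;
}
```

In a store path shaped like the one described, the caller would check this predicate, run the prune if it returns true, and only then run eviction.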

A chat request whose prompt exceeds the continued-interval threshold writes nested snapshots at every 10240-token boundary (10k, 20k, 30k, ...) plus a final cold/evict snapshot at the trim length. The intermediate continued snapshots strictly dominate each other on prefix lookup, but they all compete for the disk-cache budget. In production logs the older snapshots are picked as eviction victims within seconds of being written, ending the request with no usable cache entry.

Add kv_cache_prune_supersedes, called from the store path before kv_cache_evict for cold and continued stores. It walks the index and unlinks any CONTINUED entry whose text_bytes are a strict prefix of the new snapshot (verified by recomputing SHA-1 over the new text at the older entry's byte length). Cold/evict/shutdown entries on disk are intentional checkpoints (for example the anchor cold at the chat-task boundary) and are left alone so workloads that diverge past their length can still hit them.

The prune is skipped for evict and shutdown stores because those save a live state at the moment it has just diverged from an incoming request; the saved content can include post-divergence tokens that no future prompt will match, so shorter same-session continued snapshots remain strictly pre-divergence prefixes that still serve correctly. Deleting them on an evict store would replace a near-match disk hit with a much shorter one, wasting all the prefill work from earlier in the same session.

Eviction's refresh then sees the pruned directory and runs against a smaller candidate set.

Result: one snapshot per workload prefix instead of N nested copies. A 49k-token prefill that previously wrote 2.1 GiB of snapshots, all evicted within the request, ends with a single 666 MiB entry on disk that the next same-prompt request can hit. Cross-conversation hits on explicit cold checkpoints are preserved, and shorter pre-divergence snapshots survive evict stores so a live-miss followed by a near-match request still loads from the longest available prefix.

Adds four unit tests: strict-prefix continued unlink, unrelated entry kept, self-skip, cold-checkpoint kept. test_kv_text_stub_file grows a reason parameter; both existing callers pass KV_REASON_COLD.
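
The supersession rule can be illustrated with a minimal in-memory sketch. It is not the implementation from this PR. The types snap_kind and snap_entry, the function prune_superseded, and the live flag (standing in for unlinking a file) are all hypothetical. FNV-1a is swapped in for SHA-1 purely to keep the example self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified index entry; the real on-disk format is not shown in the PR text. */
typedef enum { SNAP_COLD, SNAP_CONTINUED } snap_kind;

typedef struct {
    snap_kind kind;
    size_t    text_bytes; /* byte length of the text the snapshot covers */
    uint64_t  digest;     /* hash of those bytes (FNV-1a here, SHA-1 in the PR) */
    bool      live;       /* false once the entry has been "unlinked" */
} snap_entry;

/* 64-bit FNV-1a, used only as a stand-in for SHA-1 in this sketch. */
static uint64_t fnv1a(const unsigned char *p, size_t n) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Marks as dead every live CONTINUED entry whose text is a strict prefix
 * of new_text. Returns the number of entries pruned. */
size_t prune_superseded(snap_entry *idx, size_t n,
                        const unsigned char *new_text, size_t new_len) {
    size_t pruned = 0;
    for (size_t i = 0; i < n; i++) {
        snap_entry *e = &idx[i];
        if (!e->live || e->kind != SNAP_CONTINUED)
            continue;                 /* cold checkpoints are left alone */
        if (e->text_bytes >= new_len)
            continue;                 /* strict prefix only; also skips the new entry itself */
        /* Re-hash the new text at the older entry's length and compare. */
        if (fnv1a(new_text, e->text_bytes) != e->digest)
            continue;                 /* unrelated text, keep it */
        e->live = false;
        pruned++;
    }
    return pruned;
}
```

The four cases this sketch distinguishes mirror the four unit tests listed above: a strict-prefix continued entry is dropped, while an unrelated entry, the new entry itself, and a cold checkpoint are kept.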

unsaltedbutter-ai mentioned this pull request May 18, 2026

Disk KV cache: cold/evict checkpoints get no eviction-score advantage over continued #176

Open

unsaltedbutter-ai wants to merge 1 commit into antirez:main from unsaltedbutter-ai:fix/kv-cache-prune-superseded
