feat(memory): bundle #533 + #549 — facet guard, length limits, concise extraction#1530

Open
ZaynJarvis wants to merge 3 commits into main from memory-prompt-bundle

Conversation

@ZaynJarvis
Collaborator

Cherry-picked bundle of two in-flight prompt-quality PRs from @lishixiang0705. Original commits preserved — authorship is unchanged; my only new commit is a fixup addressing the inline review feedback.

Commits

| SHA | Author | Source | Change |
| --- | --- | --- | --- |
| 1c98e28 | @lishixiang0705 | #533 | facet guard + length limits in memory_merge_bundle.yaml |
| 2566355 | @lishixiang0705 | #549 | length targets + vectorize-on-abstract |
| 508576a | @ZaynJarvis | review fixup | drop BAD/GOOD blocks, restore English examples |

What each piece does

#533 — memory_merge_bundle.yaml v1 → v2

  • Facet coherence check: even within the same category, don't merge memories covering different facets (Python code style and food preferences stay separate). A new decision: skip outcome covers this case.
  • Hard length limits: abstract ≤ 80, overview ≤ 200, content ≤ 300.
  • Condensed snapshot strategy: on conflict keep newer value, do not accumulate superseded details.
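The hard limits above are enforced by prompt instruction only; a caller-side check can catch cases where the model overruns them. A minimal sketch (hypothetical helper, not part of this PR — field names mirror the template's output schema):

```python
# Hard character limits from memory_merge_bundle.yaml v2.
LIMITS = {"abstract": 80, "overview": 200, "content": 300}

def over_limit_fields(merged: dict) -> list[str]:
    """Return the fields of a merged memory that exceed the hard limits."""
    return [
        field
        for field, cap in LIMITS.items()
        if len(merged.get(field, "")) > cap
    ]
```

A caller could re-prompt or truncate when this list is non-empty.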

#549 — memory_extraction.yaml + memory_extractor.py

  • Explicit length targets on the Three-Level Structure section (~50-80 / 3-5 bullets / 2-4 sentences).
  • set_vectorize(abstract or content) — shorter, keyword-dense text produces more discriminative embeddings.
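The vectorize-on-abstract switch reduces to a one-line fallback: embed the short abstract when present, the full content otherwise. A sketch of the selection logic (hypothetical function name; the actual change lives in memory_extractor.py):

```python
def text_for_embedding(memory: dict) -> str:
    """Pick the text to vectorize for a memory item.

    Prefer the short, keyword-dense abstract; fall back to content
    when no abstract was extracted.
    """
    return memory.get("abstract") or memory.get("content", "")
```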

Fixup — addresses inline review on #549

  • Drop the BAD/GOOD content example blocks. They duplicate the Few-shot Examples section below (my inline comment on #549).
  • Restore English values in the L0 bullet examples; the Chinese values would bias output_language: auto for non-Chinese users (@yangxinxin-7's point).
  • Did not update the existing Few-shot Examples to match the new length targets (yangxinxin-7's first point) — separate scope, follow-up.

Memory-quality effect

Pending eval. Will run retrieval precision / embedding score-spread numbers before merge.

Not in this PR

The queue stall fix from #951 is kept standalone — see its own companion PR (unrelated code path).

Test plan

  • Both YAML templates parse with yaml.safe_load
  • Retrieval precision / embedding score-spread eval on a real memory corpus
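The first test-plan item can be scripted in a few lines. A minimal check along these lines (assumes PyYAML is installed; the template paths here are illustrative):

```python
import yaml  # PyYAML

def yaml_parses(path: str) -> bool:
    """True if the template file is valid YAML under safe_load."""
    try:
        with open(path) as f:
            yaml.safe_load(f)
        return True
    except yaml.YAMLError:
        return False

# e.g. for both touched templates:
# for p in ("openviking/prompts/templates/compression/memory_merge_bundle.yaml",
#           "openviking/prompts/templates/compression/memory_extraction.yaml"):
#     assert yaml_parses(p), p
```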

lishixiang and others added 3 commits April 17, 2026 13:57
- Add 'skip' decision: reject merging memories with different facets
  even if they share the same category, preventing semantic dilution
- Add hard character limits: abstract ≤80, overview ≤200, content ≤300
- Change merge strategy from accumulate-all to condensed snapshot:
  conflicts resolved by keeping newer version only
- Bump template version from 1.0.0 to 2.0.0

Motivation: without facet checking, the merge prompt would combine
unrelated facts (e.g. 'Python code style' + 'food preferences') into
a single bloated memory just because both were categorized as
'preferences'. Without length limits, merged memories grew unbounded
(some exceeding 1000+ chars), causing embedding dilution and low
retrieval precision in downstream vector search.
…eval

- Prompt (memory_extraction.yaml):
  - Add explicit length targets for abstract (~50-80 chars) and content (2-4 sentences)
  - Add good/bad examples showing concise vs verbose memory patterns
  - Guide LLM to split multi-topic memories into separate atomic items
  - Emphasize fact-dense 'sticky note' style over narrative expansion

- Vectorization (memory_extractor.py):
  - Use abstract instead of content for embedding generation
  - Shorter text produces more discriminative vectors, improving retrieval precision
  - Reduces score clustering (e.g., 0.18-0.21 all similar) by focusing embeddings

Background:
  In production, extracted memories averaged 500-2000 chars per item, causing:
  1. Embedding vector dilution — any query fuzzy-matches long content
  2. Poor score discrimination — relevant and irrelevant items score similarly
  3. Context bloat — 5 injected memories could exceed 5000 chars per turn

  After this change, new memories will be shorter and more atomic, and
  vector search will match on focused abstract text rather than diluted content.
…lish examples

Addresses review feedback on top of the cherry-picked #549 commit:

- Remove the BAD/GOOD content example blocks — they duplicate the
  Few-shot Examples section immediately below them (ZaynJarvis's inline
  comment on #549).
- Restore English values in the L0 bullet examples; the Chinese values
  introduced by #549 would bias `output_language: auto` for non-Chinese
  users (yangxinxin-7's inline comment on #549).

Keeps the substantive contribution from #549: the length targets
(~50-80 chars / 3-5 bullets / 2-4 sentences) and the vectorize-on-
abstract switch in memory_extractor.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis ✅

533 - Fully compliant

Compliant requirements:

  • Added facet coherence check with "skip" decision examples
  • Added hard character limits with summarization instructions
  • Updated to condensed snapshot strategy

549 - Fully compliant

Compliant requirements:

  • Added explicit length targets (abstract ~50-80 chars, content 2-4 sentences)
  • Updated vectorization to use abstract or content
⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🏅 Score: 85
🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: feat: optimize memory extraction with concise prompts and abstract vectorization

Relevant files:

  • openviking/session/memory_extractor.py
  • openviking/prompts/templates/compression/memory_extraction.yaml

Sub-PR theme: feat: add facet guard and length limits to memory_merge_bundle

Relevant files:

  • openviking/prompts/templates/compression/memory_merge_bundle.yaml

⚡ Recommended focus areas for review

Breaking Change Risk

The template now returns decision: "skip", but the PR does not include updates to code that consumes this template. Callers expecting only "merge" will fail to handle the new decision value.

If the two memories cover DIFFERENT facets, you MUST output decision "skip".
Do NOT merge unrelated information just because they share the same category.

## Step 2: Merge (only if same facet)

If merging:
- When facts conflict, keep the NEWER version only.
- Condense to essential facts. Do NOT accumulate every historical detail.
- The merged memory should read as a clean, up-to-date snapshot — not a changelog.

## Hard length limits

- `abstract`: ≤ 80 characters
- `overview`: ≤ 200 characters
- `content`: ≤ 300 characters

If the merged result would exceed these limits, aggressively summarize.
Drop older, less important details first. Preserve specific values (names, numbers, versions) over narrative.

## Output

Return JSON only. Two possible decisions:

When merging:
{
  "decision": "merge",
  "abstract": "one-line L0 (≤80 chars)",
  "overview": "structured L1 (≤200 chars)",
  "content": "condensed L2 (≤300 chars)",
  "reason": "short reason"
}

When skipping (different facets):
{
  "decision": "skip",
  "reason": "short reason why these are different facets"
}
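Given the breaking-change risk above, a consumer written against the v1 (merge-only) template needs a branch for the new decision value. A hedged sketch of caller-side handling (hypothetical helper, not code from this PR):

```python
import json

def apply_merge_result(raw: str, existing: dict) -> dict:
    """Handle both decision values from the v2 merge template.

    "merge"  -> adopt the condensed merged memory.
    "skip"   -> keep the existing memory untouched; the two items
                cover different facets and must stay separate.
    """
    result = json.loads(raw)
    decision = result.get("decision")
    if decision == "merge":
        return {k: result[k] for k in ("abstract", "overview", "content")}
    if decision == "skip":
        return existing
    raise ValueError(f"unexpected decision: {decision!r}")
```

A v1-era caller that unconditionally read the merged fields would raise a KeyError on a skip response; this branch is the minimal fix.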

@github-actions

PR Code Suggestions ✨

No code suggestions found for the PR.


Projects

Status: Backlog
