From 1c98e28de9e7162496bba987d3ff1b707a2b25b7 Mon Sep 17 00:00:00 2001
From: lishixiang
Date: Thu, 12 Mar 2026 01:27:07 +0800
Subject: [PATCH 1/3] feat(prompts): add facet guard and length limits to memory_merge_bundle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add 'skip' decision: reject merging memories with different facets even if they share the same category, preventing semantic dilution
- Add hard character limits: abstract ≤80, overview ≤200, content ≤300
- Change merge strategy from accumulate-all to condensed snapshot: conflicts resolved by keeping newer version only
- Bump template version from 1.0.0 to 2.0.0

Motivation: without facet checking, the merge prompt would combine unrelated facts (e.g. 'Python code style' + 'food preferences') into a single bloated memory just because both were categorized as 'preferences'. Without length limits, merged memories grew unbounded (some exceeding 1000+ chars), causing embedding dilution and low retrieval precision in downstream vector search.
---
 .../compression/memory_merge_bundle.yaml      | 64 +++++++++++++++----
 1 file changed, 50 insertions(+), 14 deletions(-)

diff --git a/openviking/prompts/templates/compression/memory_merge_bundle.yaml b/openviking/prompts/templates/compression/memory_merge_bundle.yaml
index 14bc77698..d1ab862b1 100644
--- a/openviking/prompts/templates/compression/memory_merge_bundle.yaml
+++ b/openviking/prompts/templates/compression/memory_merge_bundle.yaml
@@ -1,8 +1,8 @@
 metadata:
   id: "compression.memory_merge_bundle"
   name: "Memory Merge Bundle"
-  description: "Merge memory and return L0/L1/L2 in one structured response"
-  version: "1.0.0"
+  description: "Merge memory and return L0/L1/L2 in one structured response, with facet guard and length limits"
+  version: "2.0.0"
   language: "en"
   category: "compression"

@@ -46,7 +46,7 @@ variables:
     default: "auto"

 template: |
-  You are merging one existing memory with one new memory update.
+  You are deciding whether to merge two memories of the same category, or skip (keep them separate).

   Category: {{ category }}
   Target Output Language: {{ output_language }}
@@ -61,26 +61,62 @@ template: |
   - Overview (L1): {{ new_overview }}
   - Content (L2): {{ new_content }}

-  Requirements:
-  - Merge into a single coherent memory.
-  - Keep non-conflicting details from existing memory.
-  - Update conflicting details to reflect the newer fact.
-  - Output language must be {{ output_language }}.
-  - Return JSON only.
+  ## Step 1: Facet coherence check
+
+  Before merging, determine whether the two memories describe the SAME specific facet/topic.
+
+  Same facet examples (should merge):
+  - "Python code style: no type hints" + "Python code style: concise comments" → same facet (Python code style)
+  - "OpenViking project: memory extraction" + "OpenViking project: added dedup" → same facet (OpenViking project)
+
+  Different facet examples (should skip):
+  - "Python code style: no type hints" + "Food preference: likes apples" → different facets
+  - "Server 192.168.2.75: runs agent-helper" + "OpenViking project: memory extraction" → different facets
+  - "Git commit style: zh-CN verbs" + "Exit code semantics: 0/1/2" → different facets
+
+  If the two memories cover DIFFERENT facets, you MUST output decision "skip".
+  Do NOT merge unrelated information just because they share the same category.
+
+  ## Step 2: Merge (only if same facet)
+
+  If merging:
+  - When facts conflict, keep the NEWER version only.
+  - Condense to essential facts. Do NOT accumulate every historical detail.
+  - The merged memory should read as a clean, up-to-date snapshot — not a changelog.
+
+  ## Hard length limits

-  Output JSON schema:
+  - `abstract`: ≤ 80 characters
+  - `overview`: ≤ 200 characters
+  - `content`: ≤ 300 characters
+
+  If the merged result would exceed these limits, aggressively summarize.
+  Drop older, less important details first. Preserve specific values (names, numbers, versions) over narrative.
+
+  ## Output
+
+  Return JSON only. Two possible decisions:
+
+  When merging:
   {
     "decision": "merge",
-    "abstract": "one-line L0 summary",
-    "overview": "structured markdown L1 summary",
-    "content": "full merged L2 content",
+    "abstract": "one-line L0 (≤80 chars)",
+    "overview": "structured L1 (≤200 chars)",
+    "content": "condensed L2 (≤300 chars)",
     "reason": "short reason"
   }

+  When skipping (different facets):
+  {
+    "decision": "skip",
+    "reason": "short reason why these are different facets"
+  }
+
   Constraints:
   - `abstract` must be concise and specific.
-  - `overview` and `content` must be non-empty.
+  - `overview` and `content` must be non-empty when decision is "merge".
   - Do not output any text outside JSON.
+  - Output language must be {{ output_language }}.

 llm_config:
   temperature: 0.0

From 256635511e90cd36b1ff4f33b99fd606aaff9fa0 Mon Sep 17 00:00:00 2001
From: lishixiang
Date: Thu, 12 Mar 2026 16:48:20 +0800
Subject: [PATCH 2/3] feat: optimize memory extraction for concise output and precise retrieval
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Prompt (memory_extraction.yaml):
  - Add explicit length targets for abstract (~50-80 chars) and content (2-4 sentences)
  - Add good/bad examples showing concise vs verbose memory patterns
  - Guide LLM to split multi-topic memories into separate atomic items
  - Emphasize fact-dense 'sticky note' style over narrative expansion

- Vectorization (memory_extractor.py):
  - Use abstract instead of content for embedding generation
  - Shorter text produces more discriminative vectors, improving retrieval precision
  - Reduces score clustering (e.g., 0.18-0.21 all similar) by focusing embeddings

Background: In production, extracted memories averaged 500-2000 chars per item, causing:
1. Embedding vector dilution — any query fuzzy-matches long content
2. Poor score discrimination — relevant and irrelevant items score similarly
3. Context bloat — 5 injected memories could exceed 5000 chars per turn

After this change, new memories will be shorter and more atomic, and vector search will match on focused abstract text rather than diluted content.
---
 .../compression/memory_extraction.yaml        | 46 ++++++++++++++++---
 openviking/session/memory_extractor.py        |  6 ++-
 2 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/openviking/prompts/templates/compression/memory_extraction.yaml b/openviking/prompts/templates/compression/memory_extraction.yaml
index 704726b89..ebc6ec260 100644
--- a/openviking/prompts/templates/compression/memory_extraction.yaml
+++ b/openviking/prompts/templates/compression/memory_extraction.yaml
@@ -204,25 +204,57 @@ template: |

   # Three-Level Structure

-  Each memory contains three levels, each serving a purpose:
+  Each memory contains three levels. **Keep all levels concise — memories are sticky notes, not essays.**

-  **abstract (L0)**: Index layer, plain text one-liner
+  **abstract (L0)**: Index layer, plain text one-liner. **Target: 1 sentence, ~50-80 characters.**
+  - This is the PRIMARY retrieval key — it MUST be specific enough to distinguish this memory from others.
   - Merge types (preferences/entities/profile/patterns): `[Merge key]: [Description]`
     - preferences: `Python code style: No type hints, concise and direct`
-    - entities: `OpenViking project: AI Agent long-term memory management system`
-    - profile: `User basic info: AI development engineer, 3 years experience`
+    - entities: `OpenViking: AI Agent 长期记忆系统,Python+AGFS,本地 oMLX embedding`
+    - profile: `AI 开发工程师,3年 LLM 应用经验,专注 Agent 架构`
     - patterns: `Teaching topic handling: Outline→Plan→Generate PPT`
   - Independent types (events/cases): Specific description
-    - events: `Decided to refactor memory system: Simplify to 5 categories`
+    - events: `2026-03-10 禁用 Lossless-Claw:CJK token 低估 3x + 预算失控`
     - cases: `Band not recognized → Request member/album/style details`

-  **overview (L1)**: Structured summary layer, organized with Markdown headings
+  **overview (L1)**: Structured summary layer, organized with Markdown headings. **Target: 3-5 bullet points.**
   - preferences: `## Preference Domain` / `## Specific Preferences`
   - entities: `## Basic Info` / `## Core Attributes`
   - events: `## Decision Content` / `## Reason` / `## Result`
   - cases: `## Problem` / `## Solution`

-  **content (L2)**: Detailed expansion layer, free Markdown, includes background, timeline, complete narrative
+  **content (L2)**: Core facts layer. **Target: 2-4 sentences with all essential specifics.**
+  - Capture ONLY the facts that would be lost if this memory were deleted.
+  - Include: names, versions, numbers, configurations, error messages, solutions.
+  - Exclude: background narratives, general explanations, elaboration of obvious points.
+  - Think: "What would I write on a sticky note to remind myself?"
+
+  **❌ BAD content** (too long, narrative-heavy):
+  ```
+  OpenViking is a long-term memory management system for AI Agents, originally open-sourced by
+  Volcengine, with the user currently maintaining a local instance and developing it as a
+  memory-openviking plugin compatible with the OpenClaw environment. The system employs a
+  front-end/back-end separated architecture built around an AGFS foundation... [1200+ chars]
+  ```
+
+  **✅ GOOD content** (concise, fact-dense):
+  ```
+  OpenViking (OV): 火山引擎开源 AI Agent 长期记忆系统,用户本地维护。Python+AGFS,viking:// URI,
+  L0-L4 分层上下文。本地 oMLX 4bit-DWQ embedding,dashscope/qwen3.5-plus VLM。
+  296 记忆文件,2872 向量,34MB vectordb。
+  ```
+
+  **❌ BAD content** (single memory with too many topics):
+  ```
+  Lossless-Claw (LCM v0.2.8) was a third-party LLM context auto-compression plugin...
+  The disablement resulted from fatal defects... The Meridian project now serves as
+  the successor... [1400+ chars mixing entity + cause + successor]
+  ```
+
+  **✅ GOOD** (split into separate memories):
+  Memory 1 [entities]: `Lossless-Claw (LCM v0.2.8): OpenClaw LLM 压缩插件,Martian-Engineering 开发,2026-03-10 已禁用。`
+  Memory 2 [cases]: `LCM 禁用原因:estimateTokens 对 CJK 低估 3x;assemble() 无预算控制致注入膨胀。`
+  Memory 3 [entities]: `Meridian: LCM 后继,复用~1200行。SQLite+FTS5、tiktoken、三区预算硬分配。`

   # Few-shot Examples

diff --git a/openviking/session/memory_extractor.py b/openviking/session/memory_extractor.py
index fefef0ba7..cff41e88f 100644
--- a/openviking/session/memory_extractor.py
+++ b/openviking/session/memory_extractor.py
@@ -474,7 +474,8 @@ async def create_memory(
                 owner_space=owner_space,
             )
             logger.info(f"uri {memory_uri} abstract: {payload.abstract} content: {payload.content}")
-            memory.set_vectorize(Vectorize(text=payload.content))
+            # Use abstract for vectorization — shorter text produces more discriminative embeddings
+            memory.set_vectorize(Vectorize(text=payload.abstract or payload.content))
             return memory

         # Determine parent URI based on category
@@ -514,7 +515,8 @@ async def create_memory(
             owner_space=owner_space,
         )
         logger.info(f"uri {memory_uri} abstract: {candidate.abstract} content: {candidate.content}")
-        memory.set_vectorize(Vectorize(text=candidate.content))
+        # Use abstract for vectorization — shorter text produces more discriminative embeddings
+        memory.set_vectorize(Vectorize(text=candidate.abstract or candidate.content))
         return memory

     async def _append_to_profile(

From 508576a83c33ae989b343b14a79136ec9f3b9fae Mon Sep 17 00:00:00 2001
From: "zhiheng.liu"
Date: Fri, 17 Apr 2026 13:57:46 +0800
Subject: [PATCH 3/3] fixup(memory_extraction): drop redundant BAD/GOOD blocks, restore English examples
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Addresses review feedback on top of the cherry-picked #549 commit:

- Remove the BAD/GOOD content example blocks — they duplicate the Few-shot Examples section immediately below them (ZaynJarvis's inline comment on #549).
- Restore English values in the L0 bullet examples; the Chinese values introduced by #549 would bias `output_language: auto` for non-Chinese users (yangxinxin-7's inline comment on #549).

Keeps the substantive contribution from #549: the length targets (~50-80 chars / 3-5 bullets / 2-4 sentences) and the vectorize-on-abstract switch in memory_extractor.py.

Co-Authored-By: Claude Opus 4.7
---
 .../compression/memory_extraction.yaml        | 34 ++-----------------
 1 file changed, 3 insertions(+), 31 deletions(-)

diff --git a/openviking/prompts/templates/compression/memory_extraction.yaml b/openviking/prompts/templates/compression/memory_extraction.yaml
index ebc6ec260..ee097145c 100644
--- a/openviking/prompts/templates/compression/memory_extraction.yaml
+++ b/openviking/prompts/templates/compression/memory_extraction.yaml
@@ -210,11 +210,11 @@ template: |
   - This is the PRIMARY retrieval key — it MUST be specific enough to distinguish this memory from others.
   - Merge types (preferences/entities/profile/patterns): `[Merge key]: [Description]`
     - preferences: `Python code style: No type hints, concise and direct`
-    - entities: `OpenViking: AI Agent 长期记忆系统,Python+AGFS,本地 oMLX embedding`
-    - profile: `AI 开发工程师,3年 LLM 应用经验,专注 Agent 架构`
+    - entities: `OpenViking project: AI Agent long-term memory management system`
+    - profile: `User basic info: AI development engineer, 3 years experience`
     - patterns: `Teaching topic handling: Outline→Plan→Generate PPT`
   - Independent types (events/cases): Specific description
-    - events: `2026-03-10 禁用 Lossless-Claw:CJK token 低估 3x + 预算失控`
+    - events: `Decided to refactor memory system: Simplify to 5 categories`
     - cases: `Band not recognized → Request member/album/style details`

   **overview (L1)**: Structured summary layer, organized with Markdown headings. **Target: 3-5 bullet points.**
@@ -227,34 +227,6 @@ template: |
   - Capture ONLY the facts that would be lost if this memory were deleted.
   - Include: names, versions, numbers, configurations, error messages, solutions.
   - Exclude: background narratives, general explanations, elaboration of obvious points.
-  - Think: "What would I write on a sticky note to remind myself?"
-
-  **❌ BAD content** (too long, narrative-heavy):
-  ```
-  OpenViking is a long-term memory management system for AI Agents, originally open-sourced by
-  Volcengine, with the user currently maintaining a local instance and developing it as a
-  memory-openviking plugin compatible with the OpenClaw environment. The system employs a
-  front-end/back-end separated architecture built around an AGFS foundation... [1200+ chars]
-  ```
-
-  **✅ GOOD content** (concise, fact-dense):
-  ```
-  OpenViking (OV): 火山引擎开源 AI Agent 长期记忆系统,用户本地维护。Python+AGFS,viking:// URI,
-  L0-L4 分层上下文。本地 oMLX 4bit-DWQ embedding,dashscope/qwen3.5-plus VLM。
-  296 记忆文件,2872 向量,34MB vectordb。
-  ```
-
-  **❌ BAD content** (single memory with too many topics):
-  ```
-  Lossless-Claw (LCM v0.2.8) was a third-party LLM context auto-compression plugin...
-  The disablement resulted from fatal defects... The Meridian project now serves as
-  the successor... [1400+ chars mixing entity + cause + successor]
-  ```
-
-  **✅ GOOD** (split into separate memories):
-  Memory 1 [entities]: `Lossless-Claw (LCM v0.2.8): OpenClaw LLM 压缩插件,Martian-Engineering 开发,2026-03-10 已禁用。`
-  Memory 2 [cases]: `LCM 禁用原因:estimateTokens 对 CJK 低估 3x;assemble() 无预算控制致注入膨胀。`
-  Memory 3 [entities]: `Meridian: LCM 后继,复用~1200行。SQLite+FTS5、tiktoken、三区预算硬分配。`

   # Few-shot Examples
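
Reviewer note (not part of the series): the prompt in PATCH 1/3 only asks the model to respect the merge/skip contract and the length limits, so a caller may still want to check the reply. Below is a minimal Python sketch of such a check, plus the `abstract or content` fallback from PATCH 2/3. The names `LIMITS`, `validate_merge_response`, and `vectorize_text` are hypothetical and do not appear in the patches, and counting characters with `len()` (Unicode code points) is an assumption about what the prompt means by "characters".

```python
import json

# Limits copied from the "Hard length limits" section of the merge prompt.
LIMITS = {"abstract": 80, "overview": 200, "content": 300}


def validate_merge_response(raw: str) -> dict:
    """Hypothetical helper: parse the model's JSON reply and check it
    against the two decisions and the length limits the prompt describes."""
    data = json.loads(raw)
    decision = data.get("decision")

    # A "skip" reply carries only a reason; no memory fields are expected.
    if decision == "skip":
        return {"decision": "skip", "reason": data.get("reason", "")}
    if decision != "merge":
        raise ValueError(f"unexpected decision: {decision!r}")

    for field, limit in LIMITS.items():
        value = data.get(field) or ""
        # The prompt requires overview and content to be non-empty on merge.
        if field in ("overview", "content") and not value.strip():
            raise ValueError(f"{field} must be non-empty when merging")
        # Assumption: one character == one Unicode code point.
        if len(value) > limit:
            raise ValueError(f"{field} is {len(value)} chars, limit is {limit}")
    return data


def vectorize_text(abstract: str, content: str) -> str:
    """Same fallback as the patched create_memory(): prefer the abstract,
    use the content only when the abstract is empty."""
    return abstract or content
```

Rejecting an over-length reply rather than truncating it is one possible design choice; truncation would silently drop the specific values the prompt asks the model to preserve.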