[bug] Preserve Claude subagent token totals during condensation #1394
Open
stale2000 wants to merge 1 commit into
39234cb to a0b732a
Condensation was recalculating Claude Code token usage without the session subagent transcript directory, so metadata written on user commits dropped Task-spawned token totals even though live session state could see them.

The fix reuses the transcript-dir/session-id convention already used by lifecycle hooks and locks the behavior with a focused condensation regression.

Constraint: Condensation must keep full-session raw transcripts while scoping token usage to CheckpointTranscriptStart.
Rejected: Rework token parsing or checkpoint storage | the loss was caused by a missing subagent transcript directory argument, not the parser or metadata schema.
Confidence: high
Scope-risk: narrow
Directive: Keep condensation subagent path derivation aligned with lifecycle hook path derivation when changing transcript layouts.
Tested: go test ./cmd/entire/cli/strategy -run 'TestCondenseSession' -count=1
Tested: go test ./cmd/entire/cli/agent/claudecode -run 'TestCalculateTotalTokenUsage|TestExtractAllModifiedFiles' -count=1
Tested: mise run lint
Entire-Checkpoint: 6656450350ce
a0b732a to 3a07223
stale2000 commented Jun 9, 2026
// extract them from offset 0; consumers can filter by checkpoint_transcript_start
// if they only render the checkpoint-scoped slice.
if len(data.Transcript) > 0 {
	data.TokenUsage = agent.CalculateTokenUsage(ctx, ag, data.Transcript, checkpointTranscriptStart, "") //TODO: why do we not use here subagents dir?
Author
Answer:
"So the nuanced answer is:
For live transcript condensation: bug.
For shadow-only fallback with no live path: empty dir may be unavoidable.
The TODO was probably someone noticing the bug-prone gap during cleanup but not resolving the live-vs-shadow distinction.
Our fix handles exactly that: derive the dir when a live transcript path exists, otherwise helper returns "".
"
Entire Logs: https://entire.io/gh/stale2000/cli/session/019eaa64-5270-74c0-9f8a-e4df88c7014e
SUMMARY:
After making the fix, I noticed the log suggesting that the "bug" might be intentional, or at least that it was unknown why the code was written this way.
I analyzed why it might have been written this way and got the following response, which implies that it is indeed a bug and that the previous implementation did not know how to solve it. This PR does fix the issue.
Explanation/response to the TODO statement:
"
So the nuanced answer is:
For live transcript condensation: bug.
For shadow-only fallback with no live path: empty dir may be unavoidable.
The TODO was probably someone noticing the bug-prone gap during cleanup but not resolving the live-vs-shadow distinction.
Our fix handles exactly that: derive the dir when a live transcript path exists, otherwise helper returns "".
"
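The live-vs-shadow rule in the explanation above can be sketched as a small helper. This is only an illustration of the contract (derive a directory when a live transcript path exists, otherwise return ""). The function name `sketchSubagentsDir` is hypothetical, and the layout it assumes, `<transcript dir>/<sessionID>/subagents`, is an assumption made for this example rather than the layout confirmed by this PR.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sketchSubagentsDir is a hypothetical illustration of the helper's contract.
// ASSUMPTION: subagent transcripts live in <transcript dir>/<sessionID>/subagents.
// The real layout is whatever the lifecycle hooks already use.
func sketchSubagentsDir(transcriptPath, sessionID string) string {
	if transcriptPath == "" || sessionID == "" {
		// Shadow-only fallback: there is no live path to derive from,
		// so subagent reads stay disabled.
		return ""
	}
	return filepath.Join(filepath.Dir(transcriptPath), sessionID, "subagents")
}

func main() {
	// Live transcript path available: a directory is derived.
	fmt.Println(sketchSubagentsDir("/tmp/proj/abc.jsonl", "abc"))
	// No live transcript path: the helper returns the empty string.
	fmt.Printf("%q\n", sketchSubagentsDir("", "abc"))
}
```

Returning "" in the fallback case keeps the existing behavior for shadow-only condensation, so only the live path changes.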
Summary
This fixes a metadata loss bug in checkpoint condensation for Claude Code sessions that spawn Task subagents.
Before this change, live session handling could see subagent token usage, but the later condensation step recalculated token usage without passing the session's subagent transcript directory. As a result, SubagentTokens were silently dropped from the committed checkpoint metadata written to entire/checkpoints/v1.
Issue
Claude Code stores Task subagent transcripts separately from the main transcript:
The lifecycle path already knows that convention and passes the subagent directory into token/file extraction. Condensation did not. Both condensation paths called:
For subagent-aware agents, the empty directory intentionally disables subagent transcript reads. That means the main transcript token totals survived, but spawned subagent token totals disappeared when checkpoint metadata was committed.
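A minimal sketch of that effect follows. It is not the real `agent.CalculateTokenUsage`; the `tokenUsage` type, the `condensedUsage` function, and the in-memory `fakeSubagentFiles` map (standing in for transcript files on disk) are all hypothetical and exist only to model the rule that an empty directory skips subagent reads.

```go
package main

import "fmt"

// tokenUsage is a hypothetical, simplified usage record for this sketch.
type tokenUsage struct {
	Input, Output int
	Subagent      *tokenUsage // nil when no subagent usage was read
}

// fakeSubagentFiles stands in for subagent transcripts on disk:
// directory -> agent ID -> usage. The directory name is illustrative only.
var fakeSubagentFiles = map[string]map[string]tokenUsage{
	"session-dir/subagents": {"sub1": {Input: 40, Output: 10}},
}

// condensedUsage models the described rule: an empty subagentsDir disables
// subagent reads, so only the main transcript totals survive.
func condensedUsage(mainUsage tokenUsage, agentIDs []string, subagentsDir string) tokenUsage {
	result := mainUsage
	if subagentsDir == "" {
		return result
	}
	var sub tokenUsage
	found := false
	for _, id := range agentIDs {
		u, ok := fakeSubagentFiles[subagentsDir][id]
		if !ok {
			continue // no transcript for this agent: nothing to add
		}
		sub.Input += u.Input
		sub.Output += u.Output
		found = true
	}
	if found {
		result.Subagent = &sub
	}
	return result
}

func main() {
	mainUsage := tokenUsage{Input: 100, Output: 20}
	dropped := condensedUsage(mainUsage, []string{"sub1"}, "")
	kept := condensedUsage(mainUsage, []string{"sub1"}, "session-dir/subagents")
	fmt.Println("empty dir has subagent usage:", dropped.Subagent != nil)
	fmt.Println("derived dir has subagent usage:", kept.Subagent != nil)
}
```

In both calls the main totals are identical; only the directory argument decides whether the subagent totals are present.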
How to reproduce
One reproducible shape is:
1. Enable Entire in a repo and start a Claude Code session.
2. Have Claude invoke the Task tool so the main transcript contains a Task tool_use and a corresponding tool_result with an agentId, for example agentId: sub1.
3. Ensure the subagent transcript (agent-sub1.jsonl in this example) exists in the session's subagent transcript directory and contains assistant usage metadata.
4. Let the normal stop hook run. At this point live session state can include subagent token usage because the lifecycle path passes the subagent directory.
5. Make a user commit so the post-commit hook condenses the session into committed checkpoint metadata.
6. Inspect the committed checkpoint metadata from entire/checkpoints/v1.
Before this fix, the condensed checkpoint metadata had main token usage but no subagent_tokens field, even though the subagent transcript was present and readable.
The added regression test builds that shape directly: a main Claude transcript with a Task result referencing sub1, a matching agent-sub1.jsonl subagent transcript, and a condensation call that verifies SubagentTokens survive in committed metadata.
Impact
This affects checkpoint metadata for Claude Code sessions that use Task subagents.
Observed effects:
entire/checkpoints/v1 underreports total work done by spawned subagents.
The code path is shared through the SubagentAwareExtractor token interface, so passing the directory also keeps condensation aligned with other subagent-aware implementations that use the same transcript layout convention.
Fix
- Adds subagentsDirForTranscript(transcriptPath, sessionID) in the strategy package.
- Passes the derived directory to agent.CalculateTokenUsage from both condensation paths.
- Adds TestCondenseSession_ClaudeSubagentTokenUsage to lock the end-to-end condensation behavior.
Verification