Conversation

@franklevasseur
Member
I ran a small benchmark on a 5.09 MB file (1.02 M tokens), and this is what I got:

┌───────────────────┬──────┬──────┬─────────┐
│ (index)           │ slow │ fast │ speedup │
├───────────────────┼──────┼──────┼─────────┤
│ first 100 tokens  │ 1349 │ 814  │ '1.66'  │
│ first 1k tokens   │ 1547 │ 892  │ '1.73'  │
│ last 100 tokens   │ 1432 │ 819  │ '1.75'  │
│ last 1k tokens    │ 1417 │ 824  │ '1.72'  │
│ middle 100 tokens │ 1385 │ 820  │ '1.69'  │
│ middle 1k tokens  │ 1399 │ 810  │ '1.73'  │
│ edge 100 tokens   │ 1389 │ 826  │ '1.68'  │
│ edge 1k tokens    │ 1343 │ 825  │ '1.63'  │
│ random 100 tokens │ 1496 │ 847  │ '1.77'  │
│ random 1k tokens  │ 1392 │ 874  │ '1.59'  │
└───────────────────┴──────┴──────┴─────────┘
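The speedup column is the slow/fast ratio formatted to two decimals (e.g. 1349 / 814 ≈ 1.66), which is why console.table prints it quoted as a string. A minimal sketch of how a table like this could be assembled (the timings below are copied from two rows above; the actual harness is not shown in this PR):

```typescript
// Two rows of the benchmark above; the real harness producing them is not shown here.
type Timing = { slow: number; fast: number };

const results: Record<string, Timing> = {
  "first 100 tokens": { slow: 1349, fast: 814 },
  "random 1k tokens": { slow: 1392, fast: 874 },
};

// speedup = slow / fast, formatted to two decimals; toFixed returns a string,
// which console.table renders in quotes.
const withSpeedup = Object.fromEntries(
  Object.entries(results).map(([name, { slow, fast }]) => [
    name,
    { slow, fast, speedup: (slow / fast).toFixed(2) },
  ])
);

console.table(withSpeedup);
```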

I could have offered a simpler API, like:

type TokenSlice = {
  preserve: "start" | "end" | "both",
  tokenCount: number
}

But I think I prefer letting LLMz decide how to slice. If the slicing algorithm changes in LLMz, the thicktoken lib can remain unchanged.
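For comparison, here is a sketch of what that simpler API could have looked like over a plain token array. The function name, generic token representation, and "both" budget-splitting rule are all assumptions for illustration, not the actual thicktoken interface:

```typescript
type TokenSlice = {
  preserve: "start" | "end" | "both";
  tokenCount: number;
};

// Hypothetical helper: keep `tokenCount` tokens from the requested side(s).
// For "both", the budget is split between the head and the tail
// (head gets the extra token when tokenCount is odd).
function sliceTokens<T>(tokens: T[], slice: TokenSlice): T[] {
  const n = Math.min(slice.tokenCount, tokens.length);
  switch (slice.preserve) {
    case "start":
      return tokens.slice(0, n);
    case "end":
      return tokens.slice(tokens.length - n);
    case "both": {
      const head = Math.ceil(n / 2);
      const tail = n - head;
      return [...tokens.slice(0, head), ...tokens.slice(tokens.length - tail)];
    }
  }
}
```

The drawback of baking this in is that every new truncation strategy would mean a lib change, whereas exposing lower-level slicing keeps the policy in LLMz.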

@franklevasseur franklevasseur requested a review from a team as a code owner January 27, 2026 16:11