Commit de1b0ea

Update tokenizer for llama3 (#144)
Co-authored-by: Sergei <[email protected]>
1 parent 1ce9a0f commit de1b0ea

2 files changed: 2 additions & 2 deletions

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ Working notes for future agents hacking on `tinker-cookbook`. Additional docs ca
 - Launch scripts define a CLI-facing `CLIConfig` (parsed by `chz`) that instantiates the richer training `Config`. This gives every recipe a consistent `python -m ... key=value` interface.
 - Env builders compose like `RLDatasetBuilder → EnvGroupBuilder → Env`. Groups let us share metadata (tags, pairwise comparisons) and center rewards across related rollouts.
 - **Completers:** algorithms interact with the `TokenCompleter` interface. `TinkerTokenCompleter` (wrapping a `SamplingClient`) is the default implementation, but evaluators may accept any `TokenCompleter` or `MessageCompleter`.
-- **Renderers & tokenizer utils:** pick the renderer that matches your tokenizer/model pair (e.g., `role_colon`, `llama3`, `qwen3`). `TrainOnWhat` controls which tokens get weight=1 in SFT. Tokenizers are cached via `tokenizer_utils.get_tokenizer`, with Llama-3 names remapped to `baseten/Meta-Llama-3-tokenizer` to bypass HF gating.
+- **Renderers & tokenizer utils:** pick the renderer that matches your tokenizer/model pair (e.g., `role_colon`, `llama3`, `qwen3`). `TrainOnWhat` controls which tokens get weight=1 in SFT. Tokenizers are cached via `tokenizer_utils.get_tokenizer`, with Llama-3 names remapped to `thinkingmachineslabinc/meta-llama-3-tokenizer` to bypass HF gating.
 - **Loss plumbing:** every `tinker.Datum` bundles a `model_input` plus `loss_fn_inputs` (`TensorData`). Use helpers such as `conversation_to_datum`, `datum_from_tokens_weights`, and `_remove_mask` instead of constructing dicts manually. Built-in losses: `cross_entropy`, `importance_sampling`, `ppo`; `forward_backward_custom` covers bespoke differentiable objectives.
 
 ## Conventions & Notation (from CONTRIBUTING)
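
As background for the `TrainOnWhat` / weight=1 note in the bullet above, here is a minimal, self-contained sketch of per-token loss weighting in SFT. The helper name and role labels are illustrative assumptions, not the cookbook's actual API:

def build_weights(token_roles: list[str], train_on: str = "assistant") -> list[float]:
    # Weight 1.0 for tokens produced by the role we train on, 0.0 for everything else.
    return [1.0 if role == train_on else 0.0 for role in token_roles]

# Example: a short conversation flattened to tokens tagged with their originating role.
roles = ["user", "user", "user", "assistant", "assistant"]
print(build_weights(roles))  # [0.0, 0.0, 0.0, 1.0, 1.0]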

tinker_cookbook/tokenizer_utils.py

Lines changed: 1 addition & 1 deletion
@@ -26,6 +26,6 @@ def get_tokenizer(model_name: str) -> Tokenizer:
 
     # Avoid gating of Llama 3 models:
     if model_name.startswith("meta-llama/Llama-3"):
-        model_name = "baseten/Meta-Llama-3-tokenizer"
+        model_name = "thinkingmachineslabinc/meta-llama-3-tokenizer"
 
     return AutoTokenizer.from_pretrained(model_name, use_fast=True)
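
For context, a plausible reconstruction of the full helper after this change. Only the lines in the hunk above come from the actual file; the caching decorator, import layout, and `Tokenizer` alias are assumptions based on the "tokenizers are cached via `tokenizer_utils.get_tokenizer`" note in AGENTS.md:

from functools import lru_cache

from transformers import AutoTokenizer
from transformers import PreTrainedTokenizerBase as Tokenizer


@lru_cache(maxsize=None)  # assumed caching; AGENTS.md only says tokenizers are cached
def get_tokenizer(model_name: str) -> Tokenizer:
    # Avoid gating of Llama 3 models:
    if model_name.startswith("meta-llama/Llama-3"):
        model_name = "thinkingmachineslabinc/meta-llama-3-tokenizer"

    return AutoTokenizer.from_pretrained(model_name, use_fast=True)

With this in place, a call such as get_tokenizer("meta-llama/Llama-3.1-8B-Instruct") is transparently redirected to the ungated mirror, so no Hugging Face access token is needed.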
