NVFP4 e2e training with Wikitext #125

Draft
matthiasdiener wants to merge 1 commit into mdiener/nvfp4-llama from mdiener/nvfp4-e2etrain

@matthiasdiener matthiasdiener commented Apr 16, 2026

To run it:

$ pip install datasets regex transformers

$ python -c "
from datasets import load_dataset
import json
ds = load_dataset('wikitext', 'wikitext-103-raw-v1', split='train')
with open('wikitext103_train.jsonl', 'w') as f:
    for row in ds:
        text = row['text'].strip()
        if text:
            json.dump({'text': text}, f)
            f.write('\n')
"
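Before tokenizing, it can help to confirm that the export produced well-formed JSONL. The following is a minimal sketch, not part of this PR: `check_jsonl` is a hypothetical helper name, and it assumes only what the export step above writes, namely one JSON object per line with a non-empty `text` field.

```python
import json


def check_jsonl(path, text_key="text"):
    """Count usable and unusable lines in a JSONL file.

    Hypothetical helper (not part of the PR). A line is usable when it
    parses as a JSON object whose `text_key` value is a non-empty string.
    Returns a tuple (usable, unusable).
    """
    usable = 0
    unusable = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                unusable += 1  # blank line, nothing to parse
                continue
            try:
                row = json.loads(line)
            except json.JSONDecodeError:
                unusable += 1  # line is not valid JSON
                continue
            text = row.get(text_key) if isinstance(row, dict) else None
            if isinstance(text, str) and text.strip():
                usable += 1
            else:
                unusable += 1  # missing, non-string, or whitespace-only text
    return usable, unusable
```

Calling `check_jsonl('wikitext103_train.jsonl')` after the export should report zero unusable lines, since the export skips empty rows.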

$ python tools/preprocess_data.py \
    --input wikitext103_train.jsonl \
    --output-prefix wikitext103_train_text_document \
    --tokenizer-type HuggingFaceTokenizer \
    --tokenizer-model NousResearch/Meta-Llama-3-8B \
    --workers 8 \
    --append-eod

$ ./run_loss_wikitext.sh

$ ./plot_losses.py

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

@matthiasdiener matthiasdiener self-assigned this Apr 16, 2026
@matthiasdiener matthiasdiener force-pushed the mdiener/nvfp4-e2etrain branch from 61b4cb3 to 89806bf on April 16, 2026 15:16
@matthiasdiener matthiasdiener force-pushed the mdiener/nvfp4-e2etrain branch from 89806bf to 3d841a0 on April 16, 2026 15:17