feat: add MTP to ds-r1 ref. impl #2403
base: master
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
@mrmhodak @hanyunfan @tanvi-mlcommons had to create a new MR due to some artifacts.
tanvi-mlcommons left a comment:
Thanks, LGTM
viraatc left a comment:
ready for merge
| "api_key": None, | ||
| "tensor_parallel_size": 8, | ||
| # NOTE(vir): sg-lang crash without +2 additional | ||
| "context_length": MAX_ISL + MAX_OSL + MAX_TEMPLATE_TOKS + 2, |
I see a warning regarding max-sequence-length when MTP is enabled.
Possible SGLang bug:
Skipping prompt 2738 due to error: Error code: 400 - {'object': 'error', 'message': "Requested token count exceeds the model's maximum context length of 23142 tokens. You requested a total of 23144 tokens: 3144 tokens from the input messages and 20000 tokens for the completion. Please reduce the number of tokens in the input messages or the completion to fit within the limit.", 'type': 'BadRequestError', 'param': None, 'code': 400}
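A quick sanity check of the numbers in that 400 response (using only the figures already in the log): the request needs two more tokens than the server's configured context length.

```python
# Figures copied from the HTTP 400 error above.
server_context_length = 23142  # maximum context length reported by the sglang server
input_tokens = 3144            # prompt tokens as counted by the server (with MTP enabled)
completion_budget = 20000      # max_tokens requested for the completion

requested_total = input_tokens + completion_budget   # 23144
overflow = requested_total - server_context_length   # 2 tokens over the limit
print(requested_total, overflow)                     # prints: 23144 2
```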
Could it be the last MTP step emitting the +2 extra tokens 🤔? But good to know.
- Actual token length (after applying the chat template) of prompt #2738 is 3140, which is also the max ISL in the ds-r1 dataset.
- The sglang server returns HTTP 400 (invalid request), calculating an input length of 3144 (with MTP) and 3142 (without MTP); I suspect the application of the chat template / related arguments (sglang auto-applies the chat template on `v1/chat/completions`).
- For now, the WAR is to update the `+2` to `+5` so it also accounts for MTP (`+4` was not sufficient); max output length is anyway bounded by the sampling parameter `max_tokens`, so outputs do not change.
- I've started a thread in the SGLang Slack channel asking for clarifications.
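For reference, a minimal sketch of what the workaround amounts to in the server config shown in the diff above. The `MAX_*` values below are illustrative assumptions only (3140 is the max ISL quoted in this thread, 20000 matches the completion budget in the error log, and `MAX_TEMPLATE_TOKS` is a placeholder), not the actual reference-implementation values.

```python
# Sketch of the WAR, not the actual reference-implementation constants.
MAX_ISL = 3140           # assumed: max input length in the ds-r1 dataset (per the comment above)
MAX_OSL = 20000          # assumed: completion budget (max_tokens) used by the harness
MAX_TEMPLATE_TOKS = 0    # assumed: placeholder allowance for chat-template tokens

# Padding bumped from +2 to +5 so sglang's extra per-request accounting
# (chat template + MTP speculative tokens) still fits in the context window.
MTP_CONTEXT_PAD = 5

server_args = {
    "api_key": None,
    "tensor_parallel_size": 8,
    "context_length": MAX_ISL + MAX_OSL + MAX_TEMPLATE_TOKS + MTP_CONTEXT_PAD,
}
print(server_args["context_length"])  # 23145 with these illustrative numbers
```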
viraatc left a comment:
submission-checker changes:
@tanvi-mlcommons @mrmhodak we need an approval for this PR.
@nvzhihanj: There is a failed check.
Seems transient?
@viraatc please fix the merge conflicts.
No description provided.