Conversation

Member

@hmellor hmellor commented Nov 12, 2025

In Transformers v5:

  • rope_scaling is now called rope_parameters
  • rope_theta now lives inside rope_parameters
  • rope_parameters may be nested for models that have different RoPE parameters for each layer type (e.g. Gemma and ModernBERT)
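For illustration, the shape change might look like the following (a hedged sketch: the field names follow the bullets above, but the values and the layer-type keys are made-up examples, not taken from any real checkpoint):

```python
# Transformers v4 style (legacy): theta at the top level, scaling in rope_scaling.
v4_config = {
    "rope_theta": 500000.0,
    "rope_scaling": {"rope_type": "yarn", "factor": 4.0},
}

# Transformers v5 style: everything lives under rope_parameters.
v5_config = {
    "rope_parameters": {"rope_type": "yarn", "factor": 4.0, "rope_theta": 500000.0},
}

# Transformers v5 style, nested per layer type (e.g. Gemma-like models):
v5_nested_config = {
    "rope_parameters": {
        "full_attention": {"rope_type": "default", "rope_theta": 1000000.0},
        "sliding_attention": {"rope_type": "default", "rope_theta": 10000.0},
    },
}
```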

This PR adds forward compatibility for the Transformers v5 RoPE config by:

  • Moving any found config.rope_scaling to config.rope_parameters
  • Moving any found config.rope_theta to config.rope_parameters.rope_theta
  • Running patch_rope_parameters_dict on all nested RoPE parameters, if present
  • Globally renaming rope_scaling to rope_parameters
  • Updating the logic for retrieving base and rope_parameters for the get_rope method


mergify bot commented Nov 12, 2025

Documentation preview: https://vllm--28542.org.readthedocs.build/en/28542/

@mergify mergify bot added documentation Improvements or additions to documentation llama Related to Llama models performance Performance-related issues qwen Related to Qwen models gpt-oss Related to GPT-OSS models speculative-decoding labels Nov 12, 2025
@mergify mergify bot added the ci/build label Nov 13, 2025
Signed-off-by: Harry Mellor <[email protected]>
@hmellor hmellor added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 13, 2025
@mergify mergify bot added the deepseek Related to DeepSeek models label Nov 13, 2025
Member Author

hmellor commented Nov 13, 2025

(marking as ready so that I can run CI)

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

