[megatron] support qwen3.5 models for megatron, bump mbridge + megatron-core to latest by erictang000 · Pull Request #1425 · NovaSky-AI/SkyRL

erictang000 · 2026-04-01T20:25:55Z

GPU CI: https://github.com/NovaSky-AI/SkyRL/actions/runs/23869520430
Megatron GPU CI: https://github.com/NovaSky-AI/SkyRL/actions/runs/23869278330
Megatron GPU CI #2: https://github.com/NovaSky-AI/SkyRL/actions/runs/24045414612
Megatron GPU CI #3: https://github.com/NovaSky-AI/SkyRL/actions/runs/24054807024

WandB run for Qwen3.5-0.8B:
https://wandb.ai/sky-posttraining-uc-berkeley/gsm8k_megatron/runs/5cm9tg0j

gemini-code-assist

Code Review

This pull request adds a Megatron training script for Qwen 3.5, updates dependencies, and introduces a monkey-patch for transformers v5 compatibility. Review feedback identifies a likely version typo in pyproject.toml, an undefined variable and inconsistent model naming in the shell script, and suggests more specific exception handling for the vLLM engine workaround.

gemini-code-assist · 2026-04-01T20:27:04Z

examples/train/megatron/run_megatron_qwen3.5.sh

+  trainer.policy.megatron_config.expert_model_parallel_size=$MEGATRON_EP \
+  trainer.policy.megatron_config.expert_tensor_parallel_size=$MEGATRON_ETP \
+  trainer.use_sample_packing=false \
+  trainer.flash_attn=$FLASH_ATTN \


The variable $FLASH_ATTN is used here, but its definition on line 30 is commented out. This will result in an empty value being passed to the trainer, which may cause a parsing error in the entrypoint.

Suggested change

-  trainer.flash_attn=$FLASH_ATTN \
+  trainer.flash_attn=false \
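To make the failure mode concrete: when `$FLASH_ATTN` is unset, the shell expands the argument to `trainer.flash_attn=` with nothing after the equals sign. The sketch below is a hypothetical pre-flight check, not part of SkyRL; the helper name `find_empty_overrides` and the assumption that overrides arrive as plain `key=value` strings are illustrative only.

```python
def find_empty_overrides(args):
    """Return the keys of `key=value` overrides whose value is empty."""
    empty = []
    for arg in args:
        key, sep, value = arg.partition("=")
        # Only flag real overrides (those containing "=") with a blank value.
        if sep and not value.strip():
            empty.append(key)
    return empty


# What the shell would hand the entrypoint if $FLASH_ATTN were unset.
overrides = ["trainer.use_sample_packing=false", "trainer.flash_attn="]
print(find_empty_overrides(overrides))
```

A check like this could run before launching training so that an unset variable surfaces as a clear message rather than as a downstream parsing error.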

pyproject.toml

gemini-code-assist · 2026-04-01T20:27:04Z

examples/train/megatron/run_megatron_qwen3.5.sh

@@ -0,0 +1,77 @@
+set -x
+
+# Colocated GRPO training+generation for Moonlight-16B-A3B-Instruct on GSM8K with Megatron.


The comment mentions Moonlight-16B-A3B-Instruct, but the script is configured for Qwen/Qwen3.5-0.8B (line 12). This should be updated to reflect the correct model.

gemini-code-assist · 2026-04-01T20:27:04Z

examples/train/megatron/run_megatron_qwen3.5.sh

+  generator.inference_engine.gpu_memory_utilization=0.6 \
+  trainer.logger="$LOGGER" \
+  trainer.project_name="gsm8k_megatron" \
+  trainer.run_name="gsm8k_megatron_tp${MEGATRON_TP}_pp${MEGATRON_PP}_cp${MEGATRON_CP}_ep${MEGATRON_EP}_etp${MEGATRON_ETP}_moonlight16b-a3b" \


The run_name suffix refers to moonlight16b-a3b. It should be updated to match the Qwen 3.5 model being used.

Suggested change

-  trainer.run_name="gsm8k_megatron_tp${MEGATRON_TP}_pp${MEGATRON_PP}_cp${MEGATRON_CP}_ep${MEGATRON_EP}_etp${MEGATRON_ETP}_moonlight16b-a3b" \
+  trainer.run_name="gsm8k_megatron_tp${MEGATRON_TP}_pp${MEGATRON_PP}_cp${MEGATRON_CP}_ep${MEGATRON_EP}_etp${MEGATRON_ETP}_qwen3.5-0.8b" \

gemini-code-assist · 2026-04-01T20:27:04Z

skyrl/backends/skyrl_train/inference_engines/vllm/vllm_engine.py

+except Exception:
+    pass


Catching a broad Exception and passing silently is generally discouraged. While this is a monkey-patch workaround, it would be safer to catch specific errors (like ImportError or AttributeError) or at least log a warning if the patch fails, to aid in debugging if the library structure changes unexpectedly.
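One way to act on this suggestion is sketched below. It assumes the workaround can be wrapped in a callable; the helper name `apply_compat_patch` and the log message are hypothetical and do not come from `vllm_engine.py`. Only `ImportError` and `AttributeError` are treated as expected, and any other exception propagates.

```python
import logging

logger = logging.getLogger(__name__)


def apply_compat_patch(patch_fn):
    """Run patch_fn and report whether it applied.

    Expected failure modes (a missing module or a renamed attribute) are
    logged as warnings instead of being swallowed silently.
    """
    try:
        patch_fn()
    except (ImportError, AttributeError) as exc:
        logger.warning("Compatibility patch skipped: %s", exc)
        return False
    return True
```

Returning a boolean lets the caller decide whether a skipped patch is acceptable, and the warning leaves a trace in the logs if the patched library changes its layout.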


The uv.lock regeneration in #1425 bumped griffe2md from 1.3.4 to 1.5.0, which renders docstring content slightly differently. A `<=` operator at the start of a line in the FullyAsyncConfig docstring was left unescaped in the MDX output, causing the Next.js MDX parser to interpret `<` as the start of a JSX tag and fail with:

Unexpected character `=` (U+003D) before name

Add a second regex pass in `sanitize_for_mdx()` that escapes any `<` not followed by a valid tag-start character (letter, `/`, `!`), converting patterns like `<=` to `&lt;=`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

## Summary

- The `uv.lock` regeneration in #1425 bumped `griffe2md` from 1.3.4 to 1.5.0, which renders docstring content slightly differently
- A `<=` operator at the start of a line in the `FullyAsyncConfig` docstring (`skyrl/train/config/config.py:394`) was left unescaped in the MDX output, causing the Next.js MDX parser to fail with: `Unexpected character '=' (U+003D) before name`
- Adds a second regex pass in `sanitize_for_mdx()` that escapes any bare `<` not followed by a valid HTML tag-start character (letter, `/`, `!`)

Co-authored-by: Tyler <tyler@oci.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
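The escaping rule described above can be sketched with a negative lookahead. This is an illustration under stated assumptions, not the actual `sanitize_for_mdx()` code: the function name `escape_bare_lt` is hypothetical, and the choice of `&lt;` as the replacement is an assumption about how the real pass escapes the character.

```python
import re

# A "<" that is NOT followed by a letter, "/", or "!" cannot open a tag.
_BARE_LT = re.compile(r"<(?![A-Za-z/!])")


def escape_bare_lt(text: str) -> str:
    """Escape bare "<" characters while leaving tag-like sequences untouched."""
    return _BARE_LT.sub("&lt;", text)


print(escape_bare_lt("<= 1 means at most one step"))
print(escape_bare_lt("<div>kept as-is</div>"))
```

The lookahead consumes nothing, so the character after `<` is preserved. Sequences such as `<div>`, `</a>`, and `<!-- ... -->` are left alone, while comparison operators like `<=` are escaped.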

x

470ad3c

gemini-code-assist bot reviewed Apr 1, 2026


erictang000 mentioned this pull request Apr 1, 2026

Support Qwen3.5 with FSDP + Megatron Backends #1254

Open

erictang000 added 3 commits April 1, 2026 21:10

fix tx v5 issues

830eae2

x

9b861a9

x

77ab6d9

erictang000 mentioned this pull request Apr 3, 2026

[dependencies] Upgrade transformers to >=5.0.0,<=5.3.0 #1426

Merged

erictang000 added 3 commits April 6, 2026 17:41

Merge branch 'main' of https://github.com/erictang000/SkyRL into qwen3_5_megatron

ae8e858

x

9614cec

x'

e9150f8


erictang000 added 5 commits April 6, 2026 22:07

Merge branch 'main' of https://github.com/erictang000/SkyRL into qwen3_5_megatron

957e934

x

1474665

x

8aa2051

x

8e59819

x

a552d4b

erictang000 merged commit 29c11ba into main Apr 7, 2026
3 of 4 checks passed

erictang000 deleted the qwen3_5_megatron branch April 7, 2026 00:00

tyler-griggs mentioned this pull request Apr 8, 2026

fix: escape bare < in generated MDX to fix Vercel doc builds #1478

Merged

erictang000 merged 12 commits into main from qwen3_5_megatron
