feat: DTensorPolicyV2 GPT-OSS support #1470
base: main
Conversation
❌ Submodule Fast-Forward Check Failed. Check based on commit e936ebf (PR #1470). Submodules that need attention: Automodel commits have DIVERGED from a common ancestor. Please ensure all submodule commits are fast-forwards of the main branch before merging.
@adil-a what's the current status of this PR?
```diff
 # when FSDP reduces the gradients over the DP dim, they're automatically averaged
 # but we want to sum them so we cancel out the average here
-loss *= self.dp_size * self.cp_size
+# loss *= self.dp_size * self.cp_size
```
Let's remove this line and ensure that grad norm + loss matches for HF models with different TP sizes
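For context, a minimal sketch of what the scaling in the diff does (the helper name is hypothetical; `dp_size`/`cp_size` are the data- and context-parallel world sizes, as in the surrounding code):

```python
import torch

def cancel_fsdp_grad_average(loss: torch.Tensor, dp_size: int, cp_size: int) -> torch.Tensor:
    # FSDP all-reduces gradients with a mean reduction over the DP group.
    # Scaling the loss by the world size before backward() cancels that
    # mean, so gradients behave as if they were summed across ranks.
    return loss * dp_size * cp_size
```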
```diff
 with get_train_context(False, False, context_parallel_ctx)():
-    with torch.autocast(device_type="cuda", dtype=self.dtype):
+    with nullcontext():
```
Make this configurable, defaulting to autocast to maintain backwards compatibility.
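One way the configurability could look, as a sketch (the `use_autocast` flag and helper name are assumptions, not the PR's actual code):

```python
from contextlib import nullcontext

import torch

def precision_context(use_autocast: bool, dtype: torch.dtype):
    # Default (use_autocast=True) preserves the existing torch.autocast
    # behavior; models that manage precision themselves (e.g. pre-quantized
    # GPT-OSS weights) can opt out and run under a no-op context instead.
    if use_autocast:
        return torch.autocast(device_type="cuda", dtype=dtype)
    return nullcontext()
```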
yuki-97 left a comment:
@adil-a @hemildesai thanks for the great effort! left some comments.
```diff
@@ -0,0 +1,29 @@
+defaults: ../../sft.yaml
```
Can you add the nightly test for this? You can refer to tests/test_suites/llm/grpo-deepscaler-1.5b-8K.sh.
```diff
     else OffloadPolicy(),
-    sequence_parallel=sequence_parallel_enabled,
     else None,
+    backend="nccl",
```
Just curious, don't we need to set backend=backend here?
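A sketch of the suggestion (hypothetical wrapper; the point is simply forwarding the caller's `backend` instead of the literal "nccl"):

```python
import torch.distributed as dist

def init_process_group_with_backend(backend: str = "nccl") -> None:
    # Forward the configured backend rather than hard-coding "nccl",
    # so e.g. "gloo" remains usable for CPU-only or debug runs.
    if not dist.is_initialized():
        dist.init_process_group(backend=backend)
```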
```python
# Manually broadcast buffers
for _, buf in self.model.named_buffers():
    torch.distributed.broadcast(to_local_if_dtensor(buf), src=0)
```
Do you know whether this will affect other models? @ffrujeri
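For reference, a self-contained sketch of the broadcast pattern under discussion (the DTensor guard mirrors what `to_local_if_dtensor` presumably does; treat the details as assumptions):

```python
import torch
import torch.distributed as dist
from torch.distributed.tensor import DTensor  # PyTorch 2.4+ import path

def to_local_if_dtensor(t: torch.Tensor) -> torch.Tensor:
    # DTensors can't be broadcast directly; operate on the rank-local shard.
    return t.to_local() if isinstance(t, DTensor) else t

def broadcast_buffers(model: torch.nn.Module) -> None:
    # Module buffers (e.g. RoPE inv_freq tables) are not synchronized by
    # FSDP's parameter sharding, so replicate rank 0's values everywhere.
    for _, buf in model.named_buffers():
        dist.broadcast(to_local_if_dtensor(buf), src=0)
```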
```python
# Load base model weights across all ranks using Automodel Checkpointer
# This mirrors build_model_and_optimizer's is_meta_device + load_weights path
print(self.model)
self._ensure_checkpointer(
```
Do you mind moving all the checkpoint-related code to nemo_rl/utils/automodel_checkpoint.py to make the code clearer? I think you can add a class in automodel_checkpoint.py and only call its functions in dtensor_policy_worker_v2.py.
Also, we should have unit tests for the new automodel checkpoint.
cc @hemildesai @ffrujeri @joyang-nv
e.g.,

```python
class AutoModelCheckpointer:
    def __init__(self):
        ...

    def save_checkpoint(self):
        ...

    def load_checkpoint(self):
        ...
```
```diff
@@ -0,0 +1,29 @@
+defaults: ../../sft.yaml
+policy:
+  model_name: openai/gpt-oss-20b
```
I believe you have some plots for the convergence of gpt-oss; can you paste them into the PR so that others can see this recipe's results?
Also, have you tested other models (e.g., Llama, Qwen) with this PR to make sure it won't affect them? There are a lot of changes in the dtensor v2 worker.
What does this PR do?
Adds GPT-OSS SFT using AutoModel custom models + DeepEP.
To run, launch the nightly container and run