docs: clarify PAIBench-C reproduction seed and prompt format by Muneerali199 · Pull Request #211 · NVIDIA/cosmos

Muneerali199 · 2026-06-13T15:50:30Z

Adds a clarifying note to the transfer cookbook README addressing the two remaining questions from bhack on the PAIBench-C reproducibility issue:

- Seed: all clips use --seed 2026 as the canonical reference seed.
- Prompt format: prompts follow the structured prompt.json format shown in assets/*/.

The evaluation non-determinism concern is tracked separately at SHI-Labs/physical-ai-bench#7.

Signed-off-by: Muneerali199 <alimuneerali245@gmail.com>

bhack · 2026-06-13T16:21:23Z

But how are the structured prompts generated for the dataset?

Also, the problem is not only reproducibility in the strict sense: when compared against the official PAIBench-C precomputed segmentation GT from the dataset, the results cannot be reproduced. Have you recomputed the source segmentation for your paper/model card?
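One way to make this comparison concrete is a per-mask intersection-over-union between a recomputed segmentation and the precomputed one. The sketch below is only an illustration: it assumes both masks have already been decoded into same-shaped nested lists of 0/1, and `mask_iou` is a hypothetical helper name. How the official sam2_pkls files are decoded is not covered here.

```python
def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two same-shaped binary masks.

    Masks are nested sequences (rows of 0/1 or False/True). Two empty
    masks are treated as a perfect match (IoU 1.0).
    """
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    return 1.0 if union == 0 else intersection / union
```

A low IoU between recomputed and precomputed masks on the same clip would point to the segmentation source, rather than the generation seed, as the cause of a mismatch.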

lfengad · 2026-06-15T02:58:20Z

@trungtpham for review? THX!

Muneerali199 · 2026-06-15T18:19:39Z

Thanks for looking at this. The structured prompts follow the format in assets/*/prompt.json — basically load that template and fill in the scene params per clip. The generation code is in cookbooks/cosmos3/generator/transfer/.
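As a rough illustration of the load-and-fill step described above (this is not the code in cookbooks/cosmos3/generator/transfer/), a minimal sketch might look like the following. The function names and the idea that scene parameters map onto top-level template keys are assumptions; the real prompt.json schema may be nested differently.

```python
import copy
import json
from pathlib import Path


def fill_prompt_template(template: dict, scene_params: dict) -> dict:
    """Return a copy of `template` with top-level keys overridden by `scene_params`.

    Keys that do not exist in the template are rejected, so a typo cannot
    silently add a field that the template never defined.
    """
    unknown = set(scene_params) - set(template)
    if unknown:
        raise KeyError(f"parameters not in template: {sorted(unknown)}")
    prompt = copy.deepcopy(template)  # leave the caller's template untouched
    prompt.update(scene_params)
    return prompt


def write_clip_prompt(template_path: Path, scene_params: dict, out_path: Path) -> dict:
    """Load a template prompt.json, fill it for one clip, and write the result."""
    template = json.loads(template_path.read_text(encoding="utf-8"))
    prompt = fill_prompt_template(template, scene_params)
    out_path.write_text(json.dumps(prompt, indent=2), encoding="utf-8")
    return prompt
```

Rejecting unknown keys is a deliberate choice in this sketch: it keeps every per-clip file structurally identical to the template, which makes the resulting prompts easier to diff across clips.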

About the source segmentation — I haven't compared against the official PAIBench-C precomputed GT yet. I'll add a note in the cookbook saying that's still pending and link to the non-determinism tracker (#7) for now. Will follow up once I've done the validation.

bhack · 2026-06-15T20:09:43Z

I think we are quite far from reproducibility of the model card.

The remaining blocker for PAIBench-C reproduction is the prompt artifact.

PAIBench-C public prompts are natural-language captions in metadata.csv / captions/*.json, while the Cosmos3 cookbook uses a structured prompt.json schema.

Could you clarify exactly how the PAIBench-C captions were converted into Cosmos3 structured prompt.json files for Table 16?

In particular:

- Were the public PAIBench-C captions used directly, or converted into structured Cosmos3 prompt.json?
- If converted, was the input metadata.csv caption_text, captions/{task_id}.json, the source video, or some combination?
- Is the conversion script / system prompt / model available?
- Are the per-clip structured prompt.json files used for the 600 PAIBench-C examples available?
- Did the reported Table 16 segmentation result use those structured prompts plus official HF sam2_vids/sam2_pkls, or were source segmentations recomputed?

Without those prompt files or the conversion recipe, the released specs/seed/control settings define the inference shape, but not an exact reproduction of the PAIBench-C table, because the prompt conditioning differs from the public PAIBench-C dataset.
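If per-clip prompt files were released, a quick coverage check against the public metadata could look like the sketch below. It assumes metadata.csv has a `task_id` column (inferred from the captions/{task_id}.json naming) and a hypothetical layout of `<prompts_root>/<task_id>/prompt.json`; neither is confirmed by the released code.

```python
import csv
from pathlib import Path


def read_task_ids(metadata_csv: Path, id_column: str = "task_id") -> list[str]:
    """Read the clip identifiers from the public metadata CSV."""
    with metadata_csv.open(newline="", encoding="utf-8") as f:
        return [row[id_column] for row in csv.DictReader(f)]


def missing_structured_prompts(task_ids, prompts_root: Path) -> list[str]:
    """Return the task ids that have no <prompts_root>/<task_id>/prompt.json file."""
    return sorted(
        task_id
        for task_id in task_ids
        if not (prompts_root / task_id / "prompt.json").is_file()
    )
```

An empty result for all 600 examples would at least confirm that every public clip has a structured prompt to condition on, which is the precondition for an exact reproduction attempt.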

docs: clarify PAIBench-C reproduction seed and prompt format

19708dc

Signed-off-by: Muneerali199 <alimuneerali245@gmail.com>

Muneerali199 mentioned this pull request Jun 13, 2026

Need released-code recipe to reproduce Cosmos3 PAIBench-C transfer results NVIDIA/cosmos-framework#14

Open

Muneerali199 wants to merge 1 commit into NVIDIA:main from Muneerali199:patch-transfer-readme
