[skyrl][tinker] Multi-modal Tinker Sampling#1484

Open
nithinvc wants to merge 6 commits into NovaSky-AI:main from nithinvc:nithinc/mm-sample

Conversation

@nithinvc
Contributor

@nithinvc nithinvc commented Apr 9, 2026

Summary

Adds VLM sampling support to the RemoteInferenceClient sample endpoint. Finalizes the inference-side changes for #1200.

  • Extract _render_for_sample to handle both text-only and image-containing prompts. For text-only prompts, it flattens the chunk tokens directly. When images are present, it calls /v1/chat/completions/render to process the images, then splices the resulting placeholder tokens into the pre-tokenized text stream at adjusted offsets.
  • Update sample() to pass multi-modal features through to the generate payload when present.
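The splicing step described above can be sketched in isolation. This is a hypothetical illustration, not the actual SkyRL/Tinker code: the function name, the offset representation, and the placeholder-run format are all assumptions made for the example.

```python
def splice_placeholders(text_tokens, image_offsets, placeholder_runs):
    """Splice per-image placeholder-token runs into a pre-tokenized text stream.

    text_tokens: flat list of token ids from the text chunks.
    image_offsets: position in text_tokens where each image occurs
        (illustrative; the real client derives offsets from the render call).
    placeholder_runs: one list of placeholder token ids per image.
    """
    out = []
    prev = 0
    for offset, run in sorted(zip(image_offsets, placeholder_runs)):
        out.extend(text_tokens[prev:offset])  # text up to this image
        out.extend(run)                       # spliced placeholder tokens
        prev = offset
    out.extend(text_tokens[prev:])            # trailing text after the last image
    return out
```

With no images this degenerates to the text-only path: the input tokens pass through unchanged, which matches the description of flattening chunk tokens directly.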

Test plan

  • Verify text-only sampling still works (no render call made, features is None)
  • Verify image sampling works end-to-end (render call is made, placeholder tokens are spliced correctly, features are included in the generate payload)
  • New multi-modal sampling tests in test_vlm_inference_generation.py
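The payload behavior the test plan checks (features omitted for text-only prompts, included for image prompts) reduces to a small conditional. A minimal sketch, assuming hypothetical names: build_generate_payload and the "multi_modal_features" key are illustrative, not the actual generate-endpoint schema.

```python
def build_generate_payload(prompt_tokens, sampling_params, features=None):
    """Assemble a generate-request payload, attaching features only when present."""
    payload = {
        "prompt_token_ids": prompt_tokens,
        "sampling_params": sampling_params,
    }
    if features is not None:
        # Multi-modal features (e.g. processed image tensors) ride along
        # only for image-containing prompts; text-only requests omit the key.
        payload["multi_modal_features"] = features
    return payload
```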

@nithinvc nithinvc marked this pull request as ready for review April 9, 2026 16:46

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

gemini-code-assist[bot]: this comment was marked as resolved.
