feat: enable hybrid FP8 dtypes on Triton grouped GEMM backends #288

Draft
sarthak-amd wants to merge 1 commit into main from sararora/triton-fp8-hybrid-dtypes

sarthak-amd commented Apr 15, 2026

Summary

  • Enable hybrid (mixed e4m3/e5m2) FP8 dtype support on the two Triton grouped GEMM backends (GroupedGEMMFP8TritonBackend and GroupedGEMMFP8VariableKTritonBackend)
# GroupedGEMMFP8TritonBackend (line ~325)
# GroupedGEMMFP8VariableKTritonBackend (line ~442)
# Before:
SUPPORTED_DTYPES = set(_COMMON_SUPPORTED_DTYPES)
# After:
SUPPORTED_DTYPES = set(_COMMON_SUPPORTED_DTYPES + _HYBRID_SUPPORTED_DTYPES)
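As a rough illustration of what the one-line change does to the dispatcher gate, here is a minimal sketch. The contents of the two lists, the string dtype names, the class names, and the `can_handle` method are all assumptions made for illustration; only `SUPPORTED_DTYPES`, `_COMMON_SUPPORTED_DTYPES`, and `_HYBRID_SUPPORTED_DTYPES` come from the diff above.

```python
# Dtype names are plain strings so the sketch runs without PyTorch;
# the real lists presumably hold torch dtypes (assumption).
E4M3 = "float8_e4m3"
E5M2 = "float8_e5m2"

# Hypothetical contents: (a_dtype, b_dtype) operand pairs.
_COMMON_SUPPORTED_DTYPES = [(E4M3, E4M3), (E5M2, E5M2)]
_HYBRID_SUPPORTED_DTYPES = [(E4M3, E5M2), (E5M2, E4M3)]


class TritonBackendBefore:
    """Stand-in for a backend whose gate only admits same-dtype pairs."""

    SUPPORTED_DTYPES = set(_COMMON_SUPPORTED_DTYPES)

    @classmethod
    def can_handle(cls, a_dtype, b_dtype):
        # Hypothetical gate: dispatch only if the operand pair is listed.
        return (a_dtype, b_dtype) in cls.SUPPORTED_DTYPES


class TritonBackendAfter(TritonBackendBefore):
    """Same gate, with the hybrid pairs added as in this PR."""

    SUPPORTED_DTYPES = set(_COMMON_SUPPORTED_DTYPES + _HYBRID_SUPPORTED_DTYPES)


if __name__ == "__main__":
    print(TritonBackendBefore.can_handle(E4M3, E5M2))
    print(TritonBackendAfter.can_handle(E4M3, E5M2))
```

Under these assumptions the change is purely a gate widening: no kernel code is touched, and same-dtype pairs remain accepted exactly as before.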

Made with Cursor

The Triton JIT kernels are dtype-agnostic -- tl.dot handles mixed FP8
natively on gfx950 via v_mfma_f32_32x32x64_f8f6f4 with per-operand
cbsz/blgp modifiers. This aligns the Triton dispatcher dtype gates with
hipBLASLt (which already supports hybrid pairs) by including
_HYBRID_SUPPORTED_DTYPES in GroupedGEMMFP8TritonBackend and
GroupedGEMMFP8VariableKTritonBackend.

Made-with: Cursor
Copilot AI review requested due to automatic review settings April 15, 2026 15:36
@sarthak-amd sarthak-amd requested a review from wenxie-amd as a code owner April 15, 2026 15:36

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@sarthak-amd sarthak-amd marked this pull request as draft April 15, 2026 15:37
@kyle-256
Contributor

Thank you for your PR! We already have a related PR, #278. Hybrid computation still has issues with Triton 3.6.0 on MI300/MI355, so we are waiting for the 3.7.0 release. Triton built from the latest source should not have these issues.
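One way the version concern in this comment could be expressed in code is a gate that only admits hybrid pairs on a new enough Triton. This is a hypothetical sketch, not something either PR is known to do; `MIN_TRITON_FOR_HYBRID`, `parse_version`, and `hybrid_fp8_allowed` are illustrative names, and the 3.7.0 threshold is taken only from the comment.

```python
import re

# Hypothetical threshold from the comment: hybrid FP8 is reported to have
# issues on Triton 3.6.0, and 3.7.0 is the awaited release.
MIN_TRITON_FOR_HYBRID = (3, 7, 0)


def parse_version(version: str) -> tuple:
    """Return the leading numeric release fields as a 3-tuple of ints.

    A missing patch field is treated as 0, and any suffix after the
    numeric fields (for example a local build tag) is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def hybrid_fp8_allowed(triton_version: str) -> bool:
    """True if the given Triton version meets the assumed threshold."""
    return parse_version(triton_version) >= MIN_TRITON_FOR_HYBRID


if __name__ == "__main__":
    for candidate in ("3.6.0", "3.7.0"):
        print(candidate, hybrid_fp8_allowed(candidate))
```

In practice the argument would be `triton.__version__`. Note that a source build reporting a 3.6.x version string would be rejected by this check even if it contains the fix, so a purely version-based gate is conservative.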
