Recommendation for enabling MXLinear on XPU backends #2120
Hi team,

Currently MXLinear is gated by `assert has_cuda_capability(10, 0)`, which makes it inaccessible on XPUs or other non-SM100 accelerator paths. What would be the recommended strategy for enabling MXLinear usage on XPU-based backends? This would help ensure that frameworks relying on MXLinear can onboard non-CUDA backends without diverging from intended usage patterns.

Related question raised in TorchAO: pytorch/ao#3457

Thanks!
Is MXLinear already supported on XPU today? If so, we could start by adding another utils function, similar to https://github.com/pytorch/torchtitan/blob/main/torchtitan/tools/utils.py#L20, to unblock.
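Illustratively, such a helper could mirror the shape of `has_cuda_capability` while leaving room for a non-CUDA branch. This is only a sketch; the names `has_mx_capability` and `_xpu_supports_mx` are placeholders, not existing APIs:

```python
import torch


def has_mx_capability() -> bool:
    """Hypothetical helper: does the current accelerator support MX
    (microscaling) dtypes such as MXFP8/MXFP4?"""
    # CUDA path: MX dtypes currently require SM100 (Blackwell) or newer.
    if torch.cuda.is_available():
        return torch.cuda.get_device_capability() >= (10, 0)
    # XPU (or other backend) path: delegate to a backend-specific probe.
    # There is no upstream torch.xpu API for this today, so a real
    # implementation would query the vendor runtime or a config flag.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return _xpu_supports_mx()
    return False


def _xpu_supports_mx() -> bool:
    # Placeholder: replace with an actual device-property query
    # (e.g. driver/runtime version or a device allowlist).
    return False
```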
@tianyu-l: Trainium 3 has support for MXFP8 and MXFP4 (see the recent announcement: https://aws.amazon.com/ai/machine-learning/trainium/). Edit: technically not an Intel XPU, but backend extensions could be generic.

For internal prototypes we have already implemented a similar function. We were never blocked and are not concerned about short-term progress; in fact, the question here came out of our internal review of that prototype.

IMHO, we would generally want to avoid a growing list of "has_X_capability" functions for every XPU, each checked in multiple places. That seems error-prone and not scalable. The question is really: what is the long-term design direction here? Is this really it, or is something else already planned?

One possible solution would be a registry where different backends can register ops, customizations, and so on (a sketch of what that could look like is below). Does the TorchAO team have something else in mind?
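As a rough illustration of the registry idea (purely a sketch; none of these names exist in torchtitan or TorchAO today), call sites would query a named capability instead of hard-coding `has_cuda_capability(10, 0)`:

```python
from typing import Callable, Dict

import torch

# Hypothetical per-backend capability registry.
_CAPABILITY_PROBES: Dict[str, Dict[str, Callable[[], bool]]] = {}


def register_capability(backend: str, name: str, probe: Callable[[], bool]) -> None:
    """Backends register a probe once, e.g. at import time."""
    _CAPABILITY_PROBES.setdefault(backend, {})[name] = probe


def device_supports(backend: str, name: str) -> bool:
    """Call sites ask for a named capability instead of checking SM versions."""
    probe = _CAPABILITY_PROBES.get(backend, {}).get(name)
    return probe() if probe is not None else False


# Illustrative registrations:
register_capability(
    "cuda", "mx",
    lambda: torch.cuda.is_available() and torch.cuda.get_device_capability() >= (10, 0),
)
register_capability(
    "xpu", "mx",
    lambda: hasattr(torch, "xpu") and torch.xpu.is_available(),  # stand-in probe
)

# The MXLinear gate then becomes backend-agnostic, e.g.:
# assert device_supports(current_backend_name, "mx"), "MX dtypes not supported"
```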
Having an option to check for MX capability sounds fair, yes. We already have similar features in the accelerator API, and we can continue adding to it for capabilities that are broadly used. We can follow up at pytorch/pytorch#143887 for this.

That being said, we obviously don't want flags that are too narrow there, as each new flag needs logic across all backends. The criterion will be where the capability check is needed and how many of our accelerators support the feature.
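For reference, a caller-side check built on the accelerator API might look roughly like the sketch below. The capability query itself is not finalized upstream, so the MX-specific branch is only an assumption of where such a flag could eventually live:

```python
import torch


def accelerator_supports_mx() -> bool:
    """Hedged sketch: answer 'does the current accelerator support MX dtypes?'
    without the caller hard-coding a CUDA-only check."""
    if not torch.accelerator.is_available():
        return False
    acc = torch.accelerator.current_accelerator()
    if acc is not None and acc.type == "cuda":
        # Today CUDA is the only backend with a concrete answer (SM100+).
        return torch.cuda.get_device_capability() >= (10, 0)
    # Other backends would answer via a shared capability flag once one is
    # agreed upon in pytorch/pytorch#143887; return False until then.
    return False
```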