Recommendation for enabling MXLinear on XPU backends #2120
Hi team,

Currently MXLinear is gated by `assert has_cuda_capability(10, 0)`, which makes it inaccessible on XPUs or other non-SM100 accelerator paths. What would be the recommended strategy for enabling MXLinear usage on XPU-based backends? This would help ensure that frameworks relying on MXLinear can onboard non-CUDA backends without diverging from intended usage patterns.

Related question raised in TorchAO: pytorch/ao#3457

Thanks!
Is MXLinear already supported on XPU today? If so, we could start by adding another utils function, similar to https://github.com/pytorch/torchtitan/blob/main/torchtitan/tools/utils.py#L20, to unblock.
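Illustratively, such a helper could mirror the shape of `has_cuda_capability` while leaving room for a non-CUDA branch. This is only a sketch; the names `has_mx_capability` and `_xpu_supports_mx` are placeholders, not existing APIs:

```python
import torch


def has_mx_capability() -> bool:
    """Hypothetical helper: does the current accelerator support MX
    (microscaling) dtypes such as MXFP8/MXFP4?"""
    # CUDA path: MX dtypes currently require SM100 (Blackwell) or newer.
    if torch.cuda.is_available():
        return torch.cuda.get_device_capability() >= (10, 0)
    # XPU (or other backend) path: delegate to a backend-specific probe.
    # There is no upstream torch.xpu API for this today, so a real
    # implementation would query the vendor runtime or a config flag.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return _xpu_supports_mx()
    return False


def _xpu_supports_mx() -> bool:
    # Placeholder: replace with an actual device-property query
    # (e.g. driver/runtime version or a device allowlist).
    return False
```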
@tianyu-l: Trainium 3 has support for MXFP8 and MXFP4 (see the recent announcement: https://aws.amazon.com/ai/machine-learning/trainium/). Edit: technically not an Intel XPU, but backend extensions could be generic.

For internal prototypes we have already implemented a similar function. We were never blocked and are not concerned about short-term progress; in fact, the question here came out of our internal review of that prototype.

IMHO, we would generally want to avoid a growing list of "has_X_capability" functions for every XPU, each checked in multiple places. That seems error-prone and not scalable. The question is really: what is the long-term design direction here? Is this really it, or is something else already planned?

One possible solution would be a registry where different backends can register ops, customizations, and so on (a sketch of what that could look like is below). Does the TorchAO team have something else in mind?
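As a rough illustration of the registry idea (purely a sketch; none of these names exist in torchtitan or TorchAO today), call sites would query a named capability instead of hard-coding `has_cuda_capability(10, 0)`:

```python
from typing import Callable, Dict

import torch

# Hypothetical per-backend capability registry.
_CAPABILITY_PROBES: Dict[str, Dict[str, Callable[[], bool]]] = {}


def register_capability(backend: str, name: str, probe: Callable[[], bool]) -> None:
    """Backends register a probe once, e.g. at import time."""
    _CAPABILITY_PROBES.setdefault(backend, {})[name] = probe


def device_supports(backend: str, name: str) -> bool:
    """Call sites ask for a named capability instead of checking SM versions."""
    probe = _CAPABILITY_PROBES.get(backend, {}).get(name)
    return probe() if probe is not None else False


# Illustrative registrations:
register_capability(
    "cuda", "mx",
    lambda: torch.cuda.is_available() and torch.cuda.get_device_capability() >= (10, 0),
)
register_capability(
    "xpu", "mx",
    lambda: hasattr(torch, "xpu") and torch.xpu.is_available(),  # stand-in probe
)

# The MXLinear gate then becomes backend-agnostic, e.g.:
# assert device_supports(current_backend_name, "mx"), "MX dtypes not supported"
```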
Having an option to check for MX capability sounds fair, yes. We already have similar features in the accelerator API, and we can continue adding to it for capabilities that are broadly used. We can follow up at pytorch/pytorch#143887 for this.

That being said, we obviously don't want flags that are too narrow there, as each new flag needs logic across all backends. The criterion will be where the capability check is needed and how many of our accelerators support the feature.
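For reference, a caller-side check built on the accelerator API might look roughly like the sketch below. The capability query itself is not finalized upstream, so the MX-specific branch is only an assumption of where such a flag could eventually live:

```python
import torch


def accelerator_supports_mx() -> bool:
    """Hedged sketch: answer 'does the current accelerator support MX dtypes?'
    without the caller hard-coding a CUDA-only check."""
    if not torch.accelerator.is_available():
        return False
    acc = torch.accelerator.current_accelerator()
    if acc is not None and acc.type == "cuda":
        # Today CUDA is the only backend with a concrete answer (SM100+).
        return torch.cuda.get_device_capability() >= (10, 0)
    # Other backends would answer via a shared capability flag once one is
    # agreed upon in pytorch/pytorch#143887; return False until then.
    return False
```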