Support for loading in FP8 mode by Yahweasel · Pull Request #138 · ByteDance-Seed/Bagel

Yahweasel · 2025-06-10T01:08:24Z

Unfortunately, the 7900XTX and other consumer-grade AMD GPUs don't really support bitsandbytes or dfloat11, but FP8 works fine. With this patch, I can load and run BAGEL on a 7900XTX. It still peaks at about 17GB, so it's too much for a 16GB card, but it does work, and it's a heck of a lot faster than CPU.

Use --mode 4 to load in FP8 mode. This loads the FP16 model and quantizes it to FP8 on the fly. Because most math isn't supported in FP8, the weights are upconverted dynamically as needed, so a few checks are added to perform the upconversion.


Support for loading in FP8 mode

4d8a488


Yahweasel wants to merge 1 commit into ByteDance-Seed:main from Yahweasel:main
