[Bug]: Does llmcompressor support MXFP4 and MXFP8 quantization? #2065

@artificialzjy

Description

⚙️ Your current environment

  1. Model being quantized: Llama3-8B
  2. Device: 4 x A100
  3. Code:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier
    from llmcompressor.utils import dispatch_for_generation

    MODEL_ID = "root/model/Meta-Llama-3-8B"

    # Load model.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    # Configure the quantization algorithm and scheme.
    # In this case, we:
    #   * quantize the weights to fp4 with per group 32 via ptq
    recipe = QuantizationModifier(targets="Linear", scheme="MXFP4", ignore=["lm_head"])

    # Apply quantization.
    oneshot(model=model, recipe=recipe)

    print("\n\n")
    print("========== SAMPLE GENERATION ==============")
    dispatch_for_generation(model)
    input_ids = tokenizer("Hello my name is", return_tensors="pt").input_ids.to(
        model.device
    )
    output = model.generate(input_ids, max_new_tokens=100)
    print(tokenizer.decode(output[0]))
    print("==========================================\n\n")

    # Save to disk in compressed-tensors format.
    SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-MXFP4"
    model.save_pretrained(SAVE_DIR, save_compressed=True)
    tokenizer.save_pretrained(SAVE_DIR)

🐛 Describe the bug

I saw the PR adding MXFP4 support (#2042), but running the script above fails with: Value error, scheme must either be a preset scheme name or a dictionary of preset scheme names [type=value_error, input_value='MXFP4', input_type=str]

Full traceback:
Traceback (most recent call last):
  File "/root/model/compressor/llm-compressor/examples/quantization_w4a16_mxfp4/llama3_example.py", line 52, in <module>
    recipe = QuantizationModifier(targets="Linear", scheme="MXFP4", ignore=["lm_head"])
  File "/root/anaconda3/envs/llmcompress/lib/python3.10/site-packages/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for AutoRoundModifier
scheme
  Value error, scheme must either be a preset scheme name or a dictionary of preset scheme names [type=value_error, input_value='MXFP4', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error
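
The validation error suggests that the installed packages do not register "MXFP4" as a preset scheme name, which could happen if the environment predates the PR. A small diagnostic like the sketch below may help confirm that. It assumes the distributions are named `llmcompressor` and `compressed-tensors`, and that presets are exposed as a dict called `PRESET_SCHEMES` in `compressed_tensors.quantization.quant_scheme`. Both are assumptions, so the lookup falls back to an empty list if they do not hold. The helper names are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def known_presets():
    """Best-effort list of preset scheme names known to compressed-tensors.

    Assumption: presets live in a dict named PRESET_SCHEMES inside
    compressed_tensors.quantization.quant_scheme. Returns [] otherwise.
    """
    try:
        from compressed_tensors.quantization import quant_scheme
    except Exception:  # missing package or any import-time failure
        return []
    return sorted(getattr(quant_scheme, "PRESET_SCHEMES", {}))


if __name__ == "__main__":
    # Distribution names are assumptions; adjust for nightly or source installs.
    for dist in ("llmcompressor", "compressed-tensors"):
        print(dist, installed_version(dist))
    presets = known_presets()
    print("MXFP4 registered:", "MXFP4" in presets)
    print("presets:", presets)
```

If "MXFP4" is missing from the printed list, the environment is probably running a build without the new preset, and including the printed versions in the report would make the issue easier to triage.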

🛠️ Steps to reproduce

No response

Labels: bug
