Description
⚙️ Your current environment
- Model being quantized: Llama3-8B
- Device: 4 × A100
- Code:
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.utils import dispatch_for_generation
MODEL_ID = "root/model/Meta-Llama-3-8B"
# Load model.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Configure the quantization algorithm and scheme.
# In this case, we:
#   * quantize the weights to FP4 with a group size of 32 via PTQ
recipe = QuantizationModifier(targets="Linear", scheme="MXFP4", ignore=["lm_head"])
# Apply quantization.
oneshot(model=model, recipe=recipe)
print("\n\n")
print("========== SAMPLE GENERATION ==============")
dispatch_for_generation(model)
input_ids = tokenizer("Hello my name is", return_tensors="pt").input_ids.to(
    model.device
)
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0]))
print("==========================================\n\n")
# Save to disk in compressed-tensors format.
SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-MXFP4"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
🐛 Describe the bug
I saw the PR for MXFP4 support (#2042), but running the code above fails with: Value error, scheme must either be a preset scheme name or a dictionary of preset scheme names [type=value_error, input_value='MXFP4', input_type=str]
Full traceback:
Traceback (most recent call last):
  File "/root/model/compressor/llm-compressor/examples/quantization_w4a16_mxfp4/llama3_example.py", line 52, in <module>
    recipe = QuantizationModifier(targets="Linear", scheme="MXFP4", ignore=["lm_head"])
  File "/root/anaconda3/envs/llmcompress/lib/python3.10/site-packages/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for AutoRoundModifier
scheme
  Value error, scheme must either be a preset scheme name or a dictionary of preset scheme names [type=value_error, input_value='MXFP4', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error
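Since the validator rejects 'MXFP4' as a preset name, one thing that may be worth checking is whether the installed packages actually know that preset. The snippet below is only a rough diagnostic sketch, not a fix. It assumes that compressed-tensors exposes its preset table as `PRESET_SCHEMES` in `compressed_tensors.quantization.quant_scheme`; that location may differ between versions, so the import is guarded. The helper names (`installed_version`, `has_preset`, `load_presets`) are made up for illustration, and the case-insensitive comparison is a choice made here, not necessarily what the library does.

```python
from importlib import metadata
from typing import Iterable, List, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_preset(name: str, presets: Iterable[str]) -> bool:
    """Case-insensitive check of `name` against a collection of preset names."""
    return name.upper() in {key.upper() for key in presets}


def load_presets() -> Optional[List[str]]:
    """Try to read the preset names from compressed-tensors.

    ASSUMPTION: the preset table is importable as PRESET_SCHEMES from
    compressed_tensors.quantization.quant_scheme. Returns None if the
    package or the name is not available in this environment.
    """
    try:
        from compressed_tensors.quantization.quant_scheme import PRESET_SCHEMES
    except ImportError:
        return None
    return list(PRESET_SCHEMES)


if __name__ == "__main__":
    # Print the versions that matter for this error.
    for pkg in ("llmcompressor", "compressed-tensors", "pydantic"):
        print(f"{pkg}: {installed_version(pkg)}")

    presets = load_presets()
    if presets is None:
        print("Could not import the preset table (see the assumption above).")
    else:
        print("MXFP4 is a known preset:", has_preset("MXFP4", presets))
        print("Available presets:", sorted(presets))
```

If MXFP4 does not show up in the list, the installed compressed-tensors build may simply predate the preset that the example script expects, which would be consistent with the error above.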
🛠️ Steps to reproduce
No response