Description
Reminder
- I have read the above rules and searched the existing issues.
System Info
kt-kernel 0.4.1, Ubuntu 24.04.
This server previously ran ktransformers 0.3.2 successfully.
I created a new conda env to run kt-kernel 0.4.1.
In the new env, converting CPU weights for Qwen-30B and running it with kt-kernel 0.4.1 worked.
But converting DeepSeek R1 0528 to CPU weights fails.
Reproduction
python scripts/convert_cpu_weights.py \
--input-path /path/to/model \
--input-type bf16 \
--output /path/to/output \
--quant-method int4
When it reaches layer 55, the script crashes and prints:
Processing layer 55 (53/59)...
Converting layer 55 with 256 experts via online quantization...
Loaded weights shapes:
gate_proj: torch.Size([256, 2048, 7168])
up_proj: torch.Size([256, 2048, 7168])
down_proj: torch.Size([256, 7168, 2048])
TP MOE layer 55, pool: 0x4019aca0, expert num: 256, num_experts_per_tok: 8
Creating AMX_MOE_TP 1 at numa 0
Creating AMX_MOE_TP 0 at numa 0
Creating "/opt/ai-models/r1/DeepSeek-R1-0528-CPU/_layer_55/_numa_1"Creating
"/opt/ai-models/r1/DeepSeek-R1-0528-CPU/_layer_55/_numa_0"
alloc 1 from other numa for 7160d0052660
From BF16
段错误 (核心已转储)  [Segmentation fault (core dumped)]
The error message is very sparse. Does this script have any logging/verbosity settings?
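I don't know of a verbosity flag in convert_cpu_weights.py itself, so as a generic diagnostic I tried to at least capture the Python-level stack at the moment of the native segfault using the standard faulthandler module. A hypothetical wrapper (arguments copied from the reproduction above):

import faulthandler
import runpy
import sys

# Print the Python traceback if the native (C++/AMX) code raises SIGSEGV.
faulthandler.enable()

# Re-run the conversion script with the same arguments as in the reproduction.
sys.argv = [
    "convert_cpu_weights.py",
    "--input-path", "/path/to/model",
    "--input-type", "bf16",
    "--output", "/path/to/output",
    "--quant-method", "int4",
]
runpy.run_path("scripts/convert_cpu_weights.py", run_name="__main__")

This only shows which Python call was active when the crash happened; a full native backtrace would still need a core dump or gdb.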
Others
I watched the memory usage keep growing during conversion, so the crash looks like an OOM. This server has 768 GB of RAM; is that enough?
How much memory is needed to convert DeepSeek R1 671B?
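Some back-of-envelope arithmetic from the shapes printed in the log above, assuming BF16 (2 bytes per element) and roughly 58 MoE layers; it ignores the converter's quantization buffers and any per-NUMA copies, so it is only a lower bound:

# Per-layer expert tensors, shapes taken from the log above (BF16 = 2 bytes).
experts = 256
gate = experts * 2048 * 7168      # gate_proj elements
up   = experts * 2048 * 7168      # up_proj elements
down = experts * 7168 * 2048      # down_proj elements

bytes_per_layer = (gate + up + down) * 2
print(f"BF16 expert weights per MoE layer: {bytes_per_layer / 1e9:.1f} GB")   # ~22.5 GB

moe_layers = 58  # assumed number of MoE layers in DeepSeek R1 671B
print(f"All expert weights in BF16: {bytes_per_layer * moe_layers / 1e12:.2f} TB")  # ~1.3 TB

If that arithmetic is right, one layer's BF16 experts are only about 22 GB, so 768 GB looks sufficient for per-layer processing; the steadily growing usage would then point at buffers accumulating across layers rather than a single layer being too big for the machine.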