
convert_cpu_weights DeepSeek R1 0528 crashed #1627

@mrgaolei

Description


Reminder

  • I have read the above rules and searched the existing issues.

System Info

kt-kernel 0.4.1, Ubuntu 24.04.
This server ran ktransformers 0.3.2 successfully.
I created a new conda env to run kt-kernel 0.4.1.
In that new env, kt-kernel 0.4.1 converted CPU weights for Qwen-30b and ran it successfully.
But converting DeepSeek R1 0528 to CPU weights failed.

Reproduction

python scripts/convert_cpu_weights.py \
  --input-path /path/to/model \
  --input-type bf16 \
  --output /path/to/output \
  --quant-method int4

When it reached layer 55, it crashed and printed:

Processing layer 55 (53/59)...
Converting layer 55 with 256 experts via online quantization...
  Loaded weights shapes:
    gate_proj: torch.Size([256, 2048, 7168])
    up_proj: torch.Size([256, 2048, 7168])
    down_proj: torch.Size([256, 7168, 2048])
TP MOE layer 55, pool: 0x4019aca0, expert num: 256, num_experts_per_tok: 8
Creating AMX_MOE_TP 1 at numa 0
Creating AMX_MOE_TP 0 at numa 0
Creating "/opt/ai-models/r1/DeepSeek-R1-0528-CPU/_layer_55/_numa_1"
Creating "/opt/ai-models/r1/DeepSeek-R1-0528-CPU/_layer_55/_numa_0"
alloc 1 from other numa for 7160d0052660
From BF16
Segmentation fault (core dumped)
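Since the crash message is so terse, a first step is to confirm whether available memory actually collapses just before the segfault. Below is a minimal Linux-only probe (a hypothetical helper, not part of kt-kernel) that can be run in a loop in a second terminal while the conversion is running:

```python
# Minimal Linux-only probe: report MemAvailable from /proc/meminfo.
# If this value trends toward zero right before the segfault, the crash
# is very likely memory pressure (e.g. a failed allocation).
def mem_available_gib():
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / 2**20  # value is in kB
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

print(f"MemAvailable: {mem_available_gib():.1f} GiB")
```

Checking `dmesg` for OOM-killer or allocation-failure messages right after the crash is also worthwhile.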

The error message gives very little detail. Does this script have any logging settings to get more verbose output?

Others

I watched memory usage climb steadily during conversion, so the error looks like an OOM. This server has 768 GB of RAM; is that enough?
How much memory is needed to convert DeepSeek R1 671B?
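The tensor shapes in the log above allow a back-of-the-envelope estimate. The sketch below assumes the converter must hold at least one full MoE layer of BF16 source weights in RAM, and uses the 59-layer count shown in the progress output `(53/59)`; the actual peak depends on how many layers, output buffers, and per-NUMA copies the script keeps resident at once:

```python
def bf16_bytes_per_moe_layer(n_experts=256, inter=2048, hidden=7168):
    # gate_proj, up_proj, down_proj: each [n_experts, inter, hidden],
    # 2 bytes per parameter in BF16
    params = n_experts * 3 * inter * hidden
    return params * 2

per_layer_gib = bf16_bytes_per_moe_layer() / 2**30
print(f"BF16 source weights per MoE layer: ~{per_layer_gib:.0f} GiB")

# If buffers are not freed between layers (the "alloc 1 from other numa"
# line hints at per-NUMA copies), usage grows with the layer count:
print(f"All 59 layers resident at once: ~{59 * per_layer_gib / 1024:.2f} TiB")
```

So a single layer (~21 GiB of BF16 source) fits easily in 768 GB, but anything that accumulates per-layer or per-NUMA-node buffers across all 59 MoE layers would exceed it, which would match the steadily climbing memory you observed.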

Metadata

Labels: bug (Something isn't working)