nvfp4 gemm example. #75

vickiw973 · 2025-11-11T13:43:38Z

Test results on a B200 chip with Python 3.13.8 and CuTe DSL 4.3.0.dev0:

(env13_8) nvfp4_gemm$ python3 eval.py test task.yml
compile: start
compile: pass
test-count: 10
test.0.spec: m: 128; n: 256; k: 256; l: 1; seed: 1111
test.0.status: pass
test.1.spec: m: 128; n: 1536; k: 7168; l: 1; seed: 1111
test.1.status: pass
test.2.spec: m: 128; n: 3072; k: 1536; l: 1; seed: 1111
test.2.status: pass
test.3.spec: m: 256; n: 7168; k: 256; l: 1; seed: 1111
test.3.status: pass
test.4.spec: m: 256; n: 7168; k: 2048; l: 1; seed: 1111
test.4.status: pass
test.5.spec: m: 2304; n: 4608; k: 7168; l: 1; seed: 1111
test.5.status: pass
test.6.spec: m: 384; n: 7168; k: 2304; l: 1; seed: 1111
test.6.status: pass
test.7.spec: m: 512; n: 512; k: 7168; l: 1; seed: 1111
test.7.status: pass
test.8.spec: m: 512; n: 4096; k: 512; l: 1; seed: 1111
test.8.status: pass
test.9.spec: m: 512; n: 1536; k: 7168; l: 1; seed: 1111
test.9.status: pass
check: pass
(env13_8) nvfp4_gemm$ python3 eval.py benchmark task.yml
compile: start
compile: pass
benchmark-count: 3
benchmark.0.spec: m: 7168; n: 128; k: 16384; l: 1; seed: 1111
benchmark.0.runs: 200
benchmark.0.mean: 122583.36134254932
benchmark.0.std: 24482.462646219574
benchmark.0.err: 1731.1715357288206
benchmark.0.best: 115712.0019197464
benchmark.0.worst: 455711.9905948639
benchmark.1.spec: m: 4096; n: 128; k: 7168; l: 1; seed: 1111
benchmark.1.runs: 200
benchmark.1.mean: 80429.76021766663
benchmark.1.std: 27085.550673724414
benchmark.1.err: 1915.2376553562394
benchmark.1.best: 72704.0022611618
benchmark.1.worst: 353311.9857311249
benchmark.2.spec: m: 7168; n: 128; k: 2048; l: 1; seed: 1111
benchmark.2.runs: 200
benchmark.2.mean: 59335.5206400156
benchmark.2.std: 18873.720483479116
benchmark.2.err: 1334.5735740087525
benchmark.2.best: 54271.99974656105
benchmark.2.worst: 320576.012134552
check: pass
(env13_8) nvfp4_gemm$ python3 eval.py leaderboard task.yml
compile: start
compile: pass
benchmark-count: 3
benchmark.0.spec: m: 7168; n: 128; k: 16384; l: 1; seed: 1111
benchmark.0.runs: 200
benchmark.0.mean: 196519.99942958355
benchmark.0.std: 10528.200054341341
benchmark.0.err: 744.456165211334
benchmark.0.best: 174079.9993276596
benchmark.0.worst: 243744.00079250336
benchmark.1.spec: m: 4096; n: 128; k: 7168; l: 1; seed: 1111
benchmark.1.runs: 200
benchmark.1.mean: 118216.48139506578
benchmark.1.std: 26523.21540881495
benchmark.1.err: 1875.4745474444578
benchmark.1.best: 109567.99983978271
benchmark.1.worst: 486431.9860935211
benchmark.2.spec: m: 7168; n: 128; k: 2048; l: 1; seed: 1111
benchmark.2.runs: 200
benchmark.2.mean: 94226.08111053705
benchmark.2.std: 4438.020500021594
benchmark.2.err: 313.81543906101814
benchmark.2.best: 89088.00035715103
benchmark.2.worst: 129023.99897575378
check: pass
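
For readers interpreting the `std` and `err` fields: the log does not say how `err` is computed, but the reported numbers are consistent with the standard error of the mean, `err = std / sqrt(runs)`. The sketch below is only a sanity check on the values pasted above, not the harness's actual code; `parse_stats` and `standard_error` are hypothetical helper names introduced here, and the log excerpt is copied from the `benchmark` run.

```python
import math

# Excerpt copied from the `benchmark` run above (spec/best/worst lines omitted).
LOG = """\
benchmark.0.runs: 200
benchmark.0.mean: 122583.36134254932
benchmark.0.std: 24482.462646219574
benchmark.0.err: 1731.1715357288206
benchmark.1.runs: 200
benchmark.1.mean: 80429.76021766663
benchmark.1.std: 27085.550673724414
benchmark.1.err: 1915.2376553562394
benchmark.2.runs: 200
benchmark.2.mean: 59335.5206400156
benchmark.2.std: 18873.720483479116
benchmark.2.err: 1334.5735740087525
"""


def parse_stats(log):
    """Hypothetical helper: group `benchmark.<i>.<field>: <value>` lines by index."""
    stats = {}
    for line in log.splitlines():
        key, _, value = line.partition(": ")
        parts = key.split(".")
        # Skip anything that is not a numeric benchmark field (e.g. `spec`).
        if len(parts) != 3 or parts[0] != "benchmark" or parts[2] == "spec":
            continue
        stats.setdefault(int(parts[1]), {})[parts[2]] = float(value)
    return stats


def standard_error(std, runs):
    """Standard error of the mean, assuming `err` is defined as std / sqrt(runs)."""
    return std / math.sqrt(runs)


stats = parse_stats(LOG)
for idx, s in sorted(stats.items()):
    se = standard_error(s["std"], s["runs"])
    matches = math.isclose(se, s["err"], rel_tol=1e-6)
    print(f"benchmark.{idx}: computed err={se:.4f} reported err={s['err']:.4f} match={matches}")
```

If the assumption holds, each line prints `match=True`. The `err` values in the `leaderboard` run follow the same relation. Note that `std` is large relative to `err` in several cases because of a few slow outliers (compare `worst` against `mean`).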

add nvfp4 gemm example.

2ed078a
