
Conversation


@pralay-das pralay-das commented Nov 10, 2025

In this PR:

  1. Change the interface of the chunk_prefill execution engine; it is now more compatible with the CUDA CUTLASS implementation.
  2. Add support for the cutlass_mla_get_workspace_size op.
  3. Verify the changes with the test file.

cmd: python -m pytest tests/test_flash_attention.py
result: 96 passed, 182 skipped, 1 warning in 3.43s

// TODO: these arguments still need to be populated to match the actual use case.

return MlaFlashAttnType::Kernel::get_workspace_size(arguments);
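For context, the usual CUTLASS 3.x pattern is: populate the kernel's Arguments struct, ask the static get_workspace_size for a byte count, and allocate that much scratch before launch. Below is a minimal sketch of that flow; the FakeKernel stand-in and all of its field names are assumptions for illustration, not the real MlaFlashAttnType instantiation (whose argument fields are exactly the part still marked TODO above).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real CUTLASS MLA kernel traits. The actual
// MlaFlashAttnType::Kernel comes from the CUTLASS instantiation in this PR.
struct FakeKernel {
  struct Arguments {
    int batches = 1;
    int seq_len_kv = 0;
  };
  // CUTLASS 3.x kernels expose a static workspace query of this shape.
  static std::size_t get_workspace_size(Arguments const& /*args*/) {
    return 0;  // the stub under review currently also returns zero
  }
};

struct MlaFlashAttnType { using Kernel = FakeKernel; };

std::size_t cutlass_mla_get_workspace_size_sketch(int batches, int seq_len_kv) {
  MlaFlashAttnType::Kernel::Arguments arguments{};
  // TODO (as in the PR): fill these fields to match the actual use case.
  arguments.batches = batches;
  arguments.seq_len_kv = seq_len_kv;
  return MlaFlashAttnType::Kernel::get_workspace_size(arguments);
}

int main() {
  std::size_t bytes =
      cutlass_mla_get_workspace_size_sketch(/*batches=*/4, /*seq_len_kv=*/4096);
  // Callers would allocate this many bytes of device scratch before launch;
  // a host vector stands in for the device allocation here.
  std::vector<std::uint8_t> workspace(bytes);
  (void)workspace;
  return 0;
}
```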


Do you have specific workspace (WS) sizes for MLA? This is returning zero size. Or do we really need this API for our implementation?

@pralay-das (Author) replied:

Hi, I don't have any specific WS sizes right now that would work for both MLA and chunked_prefill.
I have seen that the test case calls this function before calling the MLA function, and simply returning zero works fine with the test case.


Right, let's come up with the right sizes for the different configurations.
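One possible shape for those sizes, purely as a sketch: split-KV style MLA kernels typically need scratch for per-split partial outputs plus a per-split log-sum-exp value that the final reduction rescales with. Everything below (the helper name, its parameters, and the formula) is a hypothetical starting point, not a decided design:

```cpp
#include <cstddef>

// Hypothetical sizing helper; the parameters and the formula are assumptions,
// not the kernel's actual requirements.
std::size_t mla_workspace_size_hint(int batches, int kv_splits,
                                    int num_heads, int head_dim_latent) {
  if (kv_splits <= 1) {
    return 0;  // single split: no cross-split reduction buffer is needed
  }
  // fp32 partial-output accumulators, one row per (batch, head, split).
  std::size_t accum = std::size_t(batches) * num_heads * kv_splits *
                      std::size_t(head_dim_latent) * sizeof(float);
  // One log-sum-exp scalar per (batch, head, split) for the final rescale.
  std::size_t lse =
      std::size_t(batches) * num_heads * std::size_t(kv_splits) * sizeof(float);
  return accum + lse;
}
```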

