Fix: Enhanced mha #5

JanCSEM · 2025-12-02T10:02:14Z

Deepquant MHA currently makes assumptions about the MHA configuration instead of reading it from the Brevitas object.
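As a rough illustration of reading the configuration from the module rather than assuming it, here is a minimal sketch. `MHAConfig` and `read_mha_config` are hypothetical names, not part of Deepquant or Brevitas, and the attribute names (`embed_dim`, `num_heads`, `batch_first`) mirror `torch.nn.MultiheadAttention`; that the Brevitas object exposes them under the same names is an assumption.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class MHAConfig:
    """Hypothetical container for the settings an MHA unroller needs."""
    embed_dim: int
    num_heads: int
    batch_first: bool

    @property
    def head_dim(self) -> int:
        return self.embed_dim // self.num_heads


def read_mha_config(module) -> MHAConfig:
    """Read the configuration from the module and fail loudly if a field is missing."""
    required = ("embed_dim", "num_heads", "batch_first")
    missing = [name for name in required if not hasattr(module, name)]
    if missing:
        raise AttributeError(f"MHA module is missing attributes: {missing}")
    if module.embed_dim % module.num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    return MHAConfig(module.embed_dim, module.num_heads, bool(module.batch_first))


# Stand-in object for illustration only; a real MHA module would be passed instead.
fake_mha = SimpleNamespace(embed_dim=64, num_heads=8, batch_first=True)
config = read_mha_config(fake_mha)
print(config.head_dim, config.batch_first)
```

Raising on a missing attribute, rather than falling back to a default, keeps a silent assumption from creeping back in.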

This PR adds support for batch_first inputs.
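One way to handle batch_first, sketched here with plain `torch.nn.MultiheadAttention` rather than the Deepquant or Brevitas code, is to transpose the batch and sequence axes before and after a sequence-first attention block. `run_batch_first` is a hypothetical helper, and the sketch assumes unmasked attention with no dropout.

```python
import torch
from torch import nn

torch.manual_seed(0)
embed_dim, num_heads = 8, 2

# Two modules with identical weights that differ only in the expected layout.
mha_seq_first = nn.MultiheadAttention(embed_dim, num_heads, batch_first=False)
mha_batch_first = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
mha_batch_first.load_state_dict(mha_seq_first.state_dict())
mha_seq_first.eval()
mha_batch_first.eval()


def run_batch_first(mha, query, key, value):
    """Feed (batch, seq, embed) inputs to a sequence-first MHA module."""
    # (batch, seq, embed) -> (seq, batch, embed)
    q, k, v = (t.transpose(0, 1) for t in (query, key, value))
    out, weights = mha(q, k, v)
    # Restore the batch-first layout; the averaged attention weights are
    # already (batch, target_len, source_len) in both layouts.
    return out.transpose(0, 1), weights


x = torch.randn(3, 5, embed_dim)  # (batch, seq, embed)
with torch.no_grad():
    expected, expected_weights = mha_batch_first(x, x, x)
    actual, actual_weights = run_batch_first(mha_seq_first, x, x, x)

print(torch.allclose(expected, actual, atol=1e-5))
```

Both paths return an `(output, attention_weights)` tuple, as the PyTorch module does.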

I was unable to implement grouped QKV because it requires a splitting operation, which is not natively supported on the Brevitas QuantTensor. Operating on QuantTensor.value instead results in a mismatch between the tracing and execution stages.
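For context, the splitting step can be sketched in plain PyTorch with no Brevitas involved. The sketch assumes that "grouped QKV" means a single packed input projection laid out as in `torch.nn.MultiheadAttention` (rows ordered Q, K, V). It also shows that, in floating point, splitting the weights ahead of time gives the same result without splitting any activation. Whether that carries over to a quantized model depends on how the quantizers are configured and is not addressed by this PR.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
embed_dim = 8
x = torch.randn(5, 3, embed_dim)  # (seq, batch, embed)

# Packed projection parameters: rows [0:E] are Q, [E:2E] are K, [2E:3E] are V.
w_packed = torch.randn(3 * embed_dim, embed_dim)
b_packed = torch.randn(3 * embed_dim)

# Packed path: one matmul, then split the activation into Q, K and V.
q, k, v = F.linear(x, w_packed, b_packed).chunk(3, dim=-1)

# Alternative path: split the parameters once, then run three projections.
w_q, w_k, w_v = w_packed.chunk(3, dim=0)
b_q, b_k, b_v = b_packed.chunk(3, dim=0)
q_alt = F.linear(x, w_q, b_q)
k_alt = F.linear(x, w_k, b_k)
v_alt = F.linear(x, w_v, b_v)

print(all(torch.allclose(a, b, atol=1e-5)
          for a, b in ((q, q_alt), (k, k_alt), (v, v_alt))))
```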

JanCSEM added 6 commits December 1, 2025 11:54

Fix scaling factor Tensor -> Float conversion if Tensor has more than 1 element

8890e21

Allow for node arguments to be a tuple or list

1d4f293

Fix MHA to return a tuple for consistency with pytorch and brevitas implementations

9402356

Add batch first handling in unrolled MHA

eeaeb26

Add tests for channel wise weight quantization

648155a

Merge branch 'channel-wise-scales' into enhanced-mha

2fb3708
