Conversation

@JanCSEM JanCSEM commented Dec 2, 2025

This PR fixes multiple bugs:

  1. Scaling factors were previously converted to a floating-point scalar regardless of their shape, which caused the exportBrevitas function to error.

This adds a basic check on the shape of the scaling-factor tensor and only converts to a scalar if the tensor has exactly one element; a sketch of the check is shown below.
Added a single-Conv-layer model and a simple CNN with channel-wise weight quantization to the tests for validation.
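
A minimal sketch of the shape check, assuming a hypothetical helper name (`scale_to_export_value` is illustrative, not the PR's actual code):

```python
import torch

def scale_to_export_value(scale: torch.Tensor):
    """Hypothetical helper illustrating the fix (name is not from the PR).

    Only collapse the scale to a Python float when it holds a single
    element; channel-wise scales (e.g. shape [out_channels, 1, 1, 1] for a
    Conv weight) are kept as tensors instead of erroring on .item().
    """
    if scale.numel() == 1:
        # per-tensor scale: safe to export as a scalar
        return float(scale.item())
    # per-channel scale: keep the full tensor
    return scale
```

For a Conv layer with per-channel weight quantization, `scale` would have shape `(out_channels, 1, 1, 1)` and is now passed through unchanged.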

  2. In QuantDivider, each node argument is assumed to be a Tensor. For residuals and multi-branch concatenations, an argument may instead be a Tuple or a List.

Added support for one-level nested node arguments, as sketched below.
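
A minimal sketch of the one-level unpacking, assuming QuantDivider walks a torch.fx graph (the helper name below is illustrative, not the PR's code):

```python
from torch.fx import Node

def iter_input_nodes(node: Node):
    """Hypothetical helper illustrating the fix (name is not from the PR).

    Yields the Node inputs of `node`, unpacking one level of nesting:
    residual adds take plain Node arguments, while multi-branch
    concatenations (e.g. torch.cat([a, b], dim=1)) pass a list/tuple of
    Nodes as a single argument.
    """
    for arg in node.args:
        if isinstance(arg, (tuple, list)):
            # one level of nesting, e.g. the list handed to torch.cat
            for item in arg:
                if isinstance(item, Node):
                    yield item
        elif isinstance(arg, Node):
            yield arg
```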

  3. UnrolledMHA previously returned a single tensor (the attention outputs), which is inconsistent with MHA as implemented in PyTorch and Brevitas, where a tuple of (attention outputs, attention weights) is returned.

Fixed the implementation to always return a tuple, with the attention weights or None as the second element depending on the provided option; see the sketch below.
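
A minimal single-head sketch of the new return convention (the class below is illustrative, not the actual UnrolledMHA implementation):

```python
import torch

class UnrolledMHASketch(torch.nn.Module):
    """Hypothetical single-head sketch of the return-convention fix."""

    def forward(self, q, k, v, need_weights: bool = True):
        # plain scaled dot-product attention, for illustration only
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        attn_weights = torch.softmax(scores, dim=-1)
        attn_output = attn_weights @ v
        # Always return a tuple, matching nn.MultiheadAttention:
        # (attn_output, attn_output_weights or None)
        return attn_output, (attn_weights if need_weights else None)
```

This mirrors torch.nn.MultiheadAttention, whose forward returns (attn_output, attn_output_weights) with the weights set to None when need_weights=False.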

@JanCSEM JanCSEM changed the title from "Fix: Add support for channel wise scales" to "Fix: Multiple bug fixes" on Dec 2, 2025
@JanCSEM JanCSEM (Author) commented Dec 2, 2025

@Victor-Jung tagging you for visibility
