Official support for mgpu vectorization hints #405
Merged
Pull request overview
Adds official mgpu vectorization hint support across Iris distributed-memory helpers, enabling callers to provide alignment/contiguity assumptions to improve generated memory code where valid.
Changes:
- Added an optional `hint: tl.constexpr` parameter to pointer translation and all distributed memory ops (load/store/copy/get/put/atomics).
- Applied `tl.multiple_of` + `tl.max_contiguous` to translated pointers when `hint` is provided.
- Expanded docstrings to document the new `hint` parameter on the public APIs.
mawad-amd (Collaborator) reviewed on Feb 28, 2026 and left a comment:
Let's introduce one new argument per hint and add tests.
Author (Member) replied:
See the comment above. The local tests I have run so far consisted of generating the assembly and inspecting it.
mawad-amd approved these changes on Feb 28, 2026.
Motivation
This pull request adds support for vectorization hints to various distributed memory operations in the `iris.py` module. By introducing an optional `hint` parameter, the code now allows callers to specify alignment and vectorization preferences for pointer translation and memory access, which can lead to more efficient memory operations on supported hardware.

The most important changes are:

Vectorization hint support:
- Added a `hint` parameter (of type `tl.constexpr`) to the `__translate` function and all distributed memory operation methods (e.g., `load`, `store`, `copy`, `get`, and all atomic operations) in `iris.py`. This parameter allows callers to specify alignment and vectorization preferences for pointer translation.
- Updated `__translate`, `copy`, and related functions to apply the vectorization hint using `tl.multiple_of` and `tl.max_contiguous` when the hint is provided.

Documentation updates:
- Expanded docstrings to document the `hint` parameter, including usage examples and explanations of how to specify the hint for 1-D and N-D cases.

These changes provide more control over memory access patterns, which can help optimize performance for distributed and vectorized workloads.
Submission Checklist