Adding LZ extension zexDriverImport/Export of ExternalPointer to hipHostRegister#1284
Merged
pvelesko merged 10 commits on Jun 8, 2026
… has the fix for Unit_hipFreeNegativeHost and cherrypicking out 'fix: only export HIP_CLANG_PATH env when llvm-config exists there (#1212)'
/run-aurora-ci
pvelesko approved these changes on Jun 8, 2026
This continues to address #1260.
This PR adds importHostMemory and exportHostMemory to the hipHostRegister and hipHostUnregister calls. On Level Zero (LZ), these then call an LZ extension (https://github.com/intel/compute-runtime/blob/master/programmers-guide/SYSTEM_MEMORY_ALLOCATIONS.md) that speeds up copies. The benchmark tests/benchmarks/hostRegisterMemcpyOverhead.hip shows a speedup, at least on PVC on Aurora, when host memory is registered with hipHostRegister before being copied many times, compared with not registering it. The first call is slower with hipHostRegister, but subsequent calls are ~2x faster than without it.
If the extension is not found, registration silently continues without it.
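The silent fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: FakeDriver, ImportFn, fakeImport, and registerHostMemory are hypothetical names, and the function-pointer signature is a simplified stand-in rather than the real Level Zero prototype. Only the pattern is the point: look the entry point up by name, and treat a missing extension as "nothing to do" rather than as an error.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified signature for an "import host pointer" entry point.
// The real Level Zero extension prototype differs; returns 0 on success here.
using ImportFn = int (*)(void *ptr, std::size_t size);

// Stand-in for a driver that exposes optional extension functions by name.
struct FakeDriver {
  std::map<std::string, ImportFn> extensions;

  // Returns nullptr when the driver does not provide the named function.
  ImportFn lookup(const std::string &name) const {
    auto it = extensions.find(name);
    return it == extensions.end() ? nullptr : it->second;
  }
};

// Example import implementation that always succeeds (for illustration only).
static int fakeImport(void *, std::size_t) { return 0; }

// Tries to import the host allocation into the driver.
// Returns true if the import happened, false if it was skipped or failed.
// A missing extension is not an error: the caller simply continues on the
// slower, unregistered path.
bool registerHostMemory(const FakeDriver &drv, void *ptr, std::size_t size) {
  assert(ptr != nullptr && size > 0);
  ImportFn importFn = drv.lookup("zexDriverImportExternalPointer");
  if (importFn == nullptr) {
    return false;  // extension not found: silently continue without it
  }
  return importFn(ptr, size) == 0;
}
```

With a driver whose extension table contains the import entry point, registerHostMemory returns true; with an empty table it returns false and the caller proceeds as if the memory had never been imported.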
Once we get a newer Level Zero release that contains version 1.14.33, we can switch to using ze_external_memmap_sysmem_ext_desc_t, which will let us map malloc'd memory directly to USM host memory. For now, this should help a bit.