Adding LZ extension zexDriverImport/Export of ExternalPointer to hipHostRegister #1284

Merged
pvelesko merged 10 commits into main from zexDriverImportExternalPointer_lazy_hip_host_register
Jun 8, 2026

Conversation

@colleeneb

Contributor

This continues to address #1260.

This PR adds importHostMemory and exportHostMemory to the hipHostRegister and hipHostUnregister calls. On the Level Zero (LZ) backend, these then call an LZ extension (https://github.com/intel/compute-runtime/blob/master/programmers-guide/SYSTEM_MEMORY_ALLOCATIONS.md) that speeds up copies. The benchmark tests/benchmarks/hostRegisterMemcpyOverhead.hip shows that calling hipHostRegister on a host buffer before copying it many times is faster than copying it without hipHostRegister, at least on PVC on Aurora:

1000 hipMemcpy H2D (4 MB) WITH    hipHostRegister: avg=0.117369 ms  min=0.11494 ms  max=0.964309 ms
1000 hipMemcpy H2D (4 MB) WITHOUT hipHostRegister: avg=0.254521 ms  min=0.242622 ms  max=0.345852 ms
overhead factor (avg): 0.461138x

(The first call is slower with hipHostRegister, but subsequent calls are ~2x faster than without.)
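For reference, the reported overhead factor is the ratio of the two average copy times, and its inverse gives the ~2x figure. A minimal sketch of that arithmetic, using only the averages printed above (the function and constant names are illustrative and not taken from the benchmark source):

```cpp
#include <cassert>

// Average per-copy times in milliseconds, copied from the benchmark output above.
constexpr double kAvgWithRegisterMs = 0.117369;
constexpr double kAvgWithoutRegisterMs = 0.254521;

// Ratio of registered to unregistered average copy time.
// A value below 1 means the registered path is faster on average.
double overheadFactor(double withMs, double withoutMs) {
    assert(withoutMs > 0.0);
    return withMs / withoutMs;
}

// Inverse of the overhead factor: how many times faster the registered path is.
double speedup(double withMs, double withoutMs) {
    assert(withMs > 0.0);
    return withoutMs / withMs;
}
```

With the averages above, overheadFactor comes out at about 0.4611 and speedup at about 2.17.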

If the extension is not found, registration silently continues without it.
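The silent fallback can be pictured roughly as follows. This is a sketch only, not the code in this PR: FakeDriver, ImportFn, RegisterPath, and registerHostMemory are hypothetical names, the function-pointer signature is simplified rather than the real Level Zero prototype, and the lookup is a mock standing in for the driver's extension query (in Level Zero that would normally go through zeDriverGetExtensionFunctionAddress).

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical, simplified signature for an import entry point.
// This is NOT the real Level Zero prototype.
using ImportFn = bool (*)(void *ptr, std::size_t size);

// Hypothetical stand-in for a driver that may or may not expose an extension.
struct FakeDriver {
    std::unordered_map<std::string, ImportFn> extensions;

    // Returns nullptr when the named extension is not available.
    ImportFn lookup(const std::string &name) const {
        auto it = extensions.find(name);
        return it == extensions.end() ? nullptr : it->second;
    }
};

enum class RegisterPath { Imported, PlainFallback };

// Try the import extension; if it is missing, continue without it and
// report nothing to the caller beyond which path was taken.
RegisterPath registerHostMemory(const FakeDriver &driver, void *ptr,
                                std::size_t size) {
    ImportFn importFn = driver.lookup("zexDriverImportExternalPointer");
    if (importFn == nullptr) {
        return RegisterPath::PlainFallback;  // extension not found: no error
    }
    // Assumption for this sketch: a failed import also falls back quietly.
    return importFn(ptr, size) ? RegisterPath::Imported
                               : RegisterPath::PlainFallback;
}

// Stub import function used to exercise the "extension present" path.
inline bool alwaysSucceeds(void *, std::size_t) { return true; }
```

A driver with an empty extension table takes the PlainFallback path, while one that maps "zexDriverImportExternalPointer" to alwaysSucceeds takes the Imported path.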

Once we get a newer Level Zero release that contains version 1.14.33, we can switch to using ze_external_memmap_sysmem_ext_desc_t where we will be able to directly map malloc'd memory to USM host memory, but for now this should help a bit.

@pvelesko

pvelesko commented Jun 8, 2026

Collaborator

/run-aurora-ci

@pvelesko pvelesko merged commit 2d5f60f into main Jun 8, 2026
28 checks passed
@pvelesko pvelesko deleted the zexDriverImportExternalPointer_lazy_hip_host_register branch June 8, 2026 13:04