`DLPack` to `mdspan` #7047

fbusato · 2025-12-23T01:43:27Z

Description

This PR implements conversion utilities that take a `DLTensor` view and produce a host, device, or managed `mdspan` over the same underlying memory.

The opposite conversion is implemented in mdspan to DLPack #7027, which is also a prerequisite of this PR.
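To illustrate the idea, here is a minimal host-only sketch of such a conversion. It is not the code from this PR: `SketchDLTensor`, `SketchDLDataType`, `StridedView2D`, and `to_host_view_2d_float` are hypothetical names, the structs only mirror the field layout of `dlpack.h`, and the hand-rolled view stands in for a rank-2 `mdspan` with a strided layout. It assumes a CPU tensor of 32-bit floats and treats a null `strides` pointer as compact row-major.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical mirrors of the DLPack C structs; a real build would include
// <dlpack/dlpack.h> and use DLTensor / DLDataType directly.
struct SketchDLDataType {
  std::uint8_t code;   // type class, e.g. float
  std::uint8_t bits;   // bits per lane
  std::uint16_t lanes; // vector lanes, 1 for scalars
};

struct SketchDLTensor {
  void* data;
  int device_type;           // assumption: 1 corresponds to kDLCPU
  std::int32_t ndim;
  SketchDLDataType dtype;
  std::int64_t* shape;
  std::int64_t* strides;     // in elements; null means compact row-major
  std::uint64_t byte_offset; // offset of the first element from data
};

constexpr int sketch_kDLCPU = 1;
constexpr std::uint8_t sketch_kDLFloat = 2;

// Stand-in for a rank-2 mdspan with a strided layout: a non-owning pointer
// plus extents and strides.
template <class T>
struct StridedView2D {
  T* ptr;
  std::int64_t extent[2];
  std::int64_t stride[2];

  T& operator()(std::int64_t i, std::int64_t j) const {
    return ptr[i * stride[0] + j * stride[1]];
  }
};

// Hypothetical conversion: validate the tensor metadata, then wrap the same
// memory in a view without copying.
inline StridedView2D<float> to_host_view_2d_float(const SketchDLTensor& t) {
  if (t.device_type != sketch_kDLCPU) {
    throw std::invalid_argument("expected a host tensor");
  }
  if (t.ndim != 2) {
    throw std::invalid_argument("rank mismatch");
  }
  if (t.dtype.code != sketch_kDLFloat || t.dtype.bits != 32 || t.dtype.lanes != 1) {
    throw std::invalid_argument("element type mismatch");
  }
  if (t.data == nullptr || t.shape == nullptr) {
    throw std::invalid_argument("null data or shape");
  }

  auto* base = reinterpret_cast<float*>(static_cast<char*>(t.data) + t.byte_offset);

  StridedView2D<float> view{};
  view.ptr = base;
  view.extent[0] = t.shape[0];
  view.extent[1] = t.shape[1];
  if (t.strides != nullptr) {
    view.stride[0] = t.strides[0];
    view.stride[1] = t.strides[1];
  } else {
    // Compact row-major: the last dimension is contiguous.
    view.stride[0] = t.shape[1];
    view.stride[1] = 1;
  }
  return view;
}
```

Because the view only aliases the tensor's buffer, a write through the view is visible through the original pointer, and the caller has to keep the tensor's memory alive for as long as the view is used.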

Todo:

documentation

@davebayer

* Don't set current device in CUDA 13 and handle extended lambda
* Add extended lambda test
* Compiler workarounds
* Waive extended lambda test on NVRTC
* Apply suggestion from @davebayer

Co-authored-by: David Bayer <[email protected]>

…A#7008)

* Move algorithm cache to a central registry
* Add bench_select.py
* Add tests for stateful select and transform
* For the purposes of caching, hash DeviceArrayLike objects by pointer, shape, and dtype
* Update select benchmark
* Bump numba-cuda dependency to 0.23.0
* Add select example
* Lint
* Remove duplicate cache registry

Co-authored-by: Ashwin Srinath <[email protected]>

* use the sccache-dist build cluster for RAPIDS CI jobs [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
* run devcontainer-utils lifecycle scripts [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
* define GH_TOKEN [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
* cpu32 -> cpu16 [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
* remove preprocessor cache key prefix [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
* increase nofile ulimit [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]

github-actions · 2025-12-23T19:40:53Z

😬 CI Workflow Results

🟥 Finished in 1h 02m: Pass: 93%/91 | Total: 1d 00h | Max: 50m 48s | Hits: 97%/199572

See results here.

fbusato and others added 30 commits December 18, 2025 12:16

first version

750ca5a

add unit test

f040c10

documentation

464ccc2

Update libcudacxx/include/cuda/__mdspan/mdspan_to_dlpack.h

6f32ae9

Co-authored-by: David Bayer <[email protected]>

Merge branch 'mdspan-to-dlpack' of github.com:fbusato/cccl into mdspan-to-dlpack

3457d3a

add many types

ee05eda

remove operator->

4d2e0da

formatting

f290320

fix MSVC warning

7a22848

improve documentation

f78db30

fix MSVC warning

1467ab2

first version

d844f65

complete the implementation

3843556

add unit test

977909f

cuda.coop: Use cuda.core.experimental.Linker instead of internal numba-cuda Linker to link LTO (NVIDIA#7011)

b0e1fbc

Co-authored-by: Ashwin Srinath <[email protected]>

Make c2h vector comparisons constexpr (NVIDIA#7009)

50da3d4

improves comments on decoupled lookback example (NVIDIA#7015)

f8a4d06

Extract reduce_op_sync into a free function (NVIDIA#7004)

e9f0a13

This allows us to use it independently

Remove experimental namespace from cuda.core import (NVIDIA#7022)

362d316

reexpress completion signature transform alias to make clangd happy (NVIDIA#7026)

28d22c9

Qualify call to __launch_impl in launch.h to avoid ambiguity errors (NVIDIA#7024)

1e28e8c

Co-authored-by: pciolkosz <[email protected]>

Rework hierarchy levels (NVIDIA#6957)

f21a158

* Rework hierarchy levels
* add missing launches to native cluster level queries
* remove dependency on runtime storage

Co-authored-by: pciolkosz <[email protected]>

Use vectorized tuning for triad benchmark for dtypes of size 2 (NVIDIA#7019)

1ef85d4

[libcu++] Fix synchronous resource adapter property passing (NVIDIA#6976)

00a1b95

* Fix synchronous resource adapter property passing
* Hide pinned pool on older CUDA versions
* Workaround MSVC bug
* Missing maybe_unused

[libcu++] Remove _view from the shared memory getter name (NVIDIA#6997)

adc23f5

* Remove _view from the shared memory getter
* Forgot about cudax

[thrust] Ignore CUDA free errors in thrust memory resource (NVIDIA#7002)

33aa542

* Ignore CUDA free errors in thrust memory resource
* Add a comment

the <stdexcept> header must be included when using _CCCL_THROW, regardless of exception support (NVIDIA#7028)

6402bc6

Co-authored-by: David Bayer <[email protected]>

Error out when nvrtcc cannot parse cuda_thread_count (NVIDIA#7035)

5546b87

Allow all public headers to be included with host compilers only (NVIDIA#7012)

58aba1d

shwina and others added 11 commits December 22, 2025 17:38

[cuda.compute]: Fixes and updates to benchmarks (NVIDIA#6999)

e80cee2

* Move algorithm cache to a central registry
* Update select benchmark
* Update merge_sort benchmark

Co-authored-by: Ashwin Srinath <[email protected]>

Fix cuda::memcpy async edge cases and add more tests (NVIDIA#6608)

c40c68d

Explicitly set CCCL_TOPLEVEL_PROJECT to OFF when needed (NVIDIA#7016)

16bdfbf

[libcu++] Add explicit alignment specification in buffer (NVIDIA#7005)

11d32ec

* Add explicit alignment specification in buffer
* Fix shared resource test
* Missed alignment in deallocate

first version

52834f8

complete the implementation

dec5dca

add unit test

b38a6a7

Merge branch 'dlpack-to-mdspan' of github.com:fbusato/cccl into dlpack-to-mdspan

fe72fcc

fix unit test

5be1893

fbusato self-assigned this Dec 23, 2025

fbusato requested review from a team as code owners December 23, 2025 01:43

fbusato added the 3.2.0 Targeted for 3.2.0 release label Dec 23, 2025

fbusato requested a review from gevtushenko December 23, 2025 01:43

fbusato added this to CCCL Dec 23, 2025

fbusato requested a review from davebayer December 23, 2025 01:43

github-project-automation bot moved this to Todo in CCCL Dec 23, 2025

cccl-authenticator-app bot moved this from Todo to In Review in CCCL Dec 23, 2025

fbusato added 2 commits December 22, 2025 17:44

formatting

f0909df

minor fixes

d149dff

fix compiler warnings

e96ebea

10 participants