@fbusato fbusato commented Dec 23, 2025

Description

This PR implements conversion utilities that take a DLTensor and produce a host, device, or managed mdspan that views the same underlying memory, without copying.

The opposite conversion is implemented in "mdspan to DLPack" (#7027), which is also a prerequisite of this PR.

Todo:

  • documentation

fbusato and others added 30 commits December 18, 2025 12:16
This allows us to use it independently
* Rework hierarchy levels

* add missing launches to native cluster level queries

* remove dependency on runtime storage

---------

Co-authored-by: pciolkosz <[email protected]>
)

* Fix synchronous resource adapter property passing

* Hide pinned pool on older CUDA versions

* Workaround MSVC bug

* Missing maybe_unused
* Remove _view from the shared memory getter

* Forgot about cudax
* Ignore CUDA free errors in thrust memory resource

* Add a comment
* Don't set current device in CUDA 13 and handle extended lambda

* Add extended lambda test

* Compiler workarounds

* Waive extended lambda test on NVRTC

* Apply suggestion from @davebayer

---------

Co-authored-by: David Bayer <[email protected]>
…regardless of exception support (NVIDIA#7028)

Co-authored-by: David Bayer <[email protected]>
shwina and others added 11 commits December 22, 2025 17:38
* Move algorithm cache to a central registry

* Update select benchmark

* Update merge_sort benchmark

---------

Co-authored-by: Ashwin Srinath <[email protected]>
…A#7008)

* Move algorithm cache to a central registry

* Add bench_select.py

* Add tests for stateful select and transform

* For the purposes of caching, hash DeviceArrayLike objects by pointer, shape, and dtype

* Update select benchmark

* Bump numba-cuda dependency to 0.23.0

* Add select example

* Lint

* Remove duplicate cache registry

---------

Co-authored-by: Ashwin Srinath <[email protected]>
* Add explicit alignment specification in buffer

* Fix shared resource test

* Missed alignment in deallocate
* use the sccache-dist build cluster for RAPIDS CI jobs [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]

* run devcontainer-utils lifecycle scripts [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]

* define GH_TOKEN [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]

* cpu32 -> cpu16 [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]

* remove preprocessor cache key prefix [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]

* increase nofile ulimit [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
@fbusato fbusato self-assigned this Dec 23, 2025
@fbusato fbusato requested review from a team as code owners December 23, 2025 01:43
@fbusato fbusato added the 3.2.0 Targeted for 3.2.0 release label Dec 23, 2025
@fbusato fbusato requested a review from gevtushenko December 23, 2025 01:43
@fbusato fbusato added this to CCCL Dec 23, 2025
@fbusato fbusato requested a review from davebayer December 23, 2025 01:43
@github-project-automation github-project-automation bot moved this to Todo in CCCL Dec 23, 2025
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Dec 23, 2025
@github-actions
😬 CI Workflow Results

🟥 Finished in 1h 02m: Pass: 93%/91 | Total: 1d 00h | Max: 50m 48s | Hits: 97%/199572

See results here.


10 participants