DLPack to mdspan
#7047
Open: fbusato wants to merge 44 commits into NVIDIA:main from fbusato:dlpack-to-mdspan
+2,447
−2
Co-authored-by: David Bayer <[email protected]>
…a-cuda Linker to link LTO (NVIDIA#7011) Co-authored-by: Ashwin Srinath <[email protected]>
This allows us to use it independently
…NVIDIA#7024) Co-authored-by: pciolkosz <[email protected]>
* Rework hierarchy levels * add missing launches to native cluster level queries * remove dependency on runtime storage --------- Co-authored-by: pciolkosz <[email protected]>
* Remove _view from the shared memory getter * Forgot about cudax
* Ignore CUDA free errors in thrust memory resource * Add a comment
* Don't set current device in CUDA 13 and handle extended lambda * Add extended lambda test * Compiler workarounds * Waive extended lambda test on NVRTC * Apply suggestion from @davebayer --------- Co-authored-by: David Bayer <[email protected]>
…regardless of exception support (NVIDIA#7028) Co-authored-by: David Bayer <[email protected]>
* Move algorithm cache to a central registry * Update select benchmark * Update merge_sort benchmark --------- Co-authored-by: Ashwin Srinath <[email protected]>
…A#7008) * Move algorithm cache to a central registry * Add bench_select.py * Add tests for stateful select and transform * For the purposes of caching, hash DeviceArrayLike objects by pointer, shape, and dtype * Update select benchmark * Bump numba-cuda dependency to 0.23.0 * Add select example * Lint * Remove duplicate cache registry --------- Co-authored-by: Ashwin Srinath <[email protected]>
* Add explicit alignment specification in buffer * Fix shared resource test * Missed alignment in deallocate
* use the sccache-dist build cluster for RAPIDS CI jobs [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids] * run devcontainer-utils lifecycle scripts [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids] * define GH_TOKEN [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids] * cpu32 -> cpu16 [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids] * remove preprocessor cache key prefix [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids] * increase nofile ulimit [skip-matrix] [skip-vdc] [skip-docs] [skip-matx] [skip-pytorch] [test-rapids]
CI Workflow Results: 🟥 Finished in 1h 02m: Pass: 93%/91 | Total: 1d 00h | Max: 50m 48s | Hits: 97%/199572
Description
This PR implements conversion utilities that take a DLTensor view and produce a host, device, or managed mdspan over the same underlying memory.
The opposite conversion is implemented in "mdspan to DLPack" (#7027), which is also a prerequisite of this PR.
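For readers unfamiliar with the idea, here is a minimal, self-contained sketch of what such a conversion does conceptually. It is not the API added by this PR: `TensorDesc`, `View2D`, and `as_view_2d` are hypothetical names, `TensorDesc` only mirrors the `data`, `ndim`, `shape`, `strides`, and `byte_offset` fields of `DLTensor`, and `View2D` is a hand-rolled stand-in for a rank-2 strided mdspan. The sketch assumes strides are expressed in elements and that a null `strides` pointer means compact row-major layout. It does not check the element type or the device.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the DLTensor fields used here; not the real dlpack.h struct.
struct TensorDesc {
  void* data;
  std::int32_t ndim;
  const std::int64_t* shape;
  const std::int64_t* strides;  // in elements; nullptr is treated as compact row-major (assumption)
  std::uint64_t byte_offset;
};

// Hypothetical non-owning rank-2 strided view, playing the role of an mdspan.
template <class T>
struct View2D {
  T* ptr;
  std::int64_t extent[2];
  std::int64_t stride[2];

  T& operator()(std::int64_t i, std::int64_t j) const {
    return ptr[i * stride[0] + j * stride[1]];
  }
};

// Build a view over the tensor's memory without copying it.
// The caller is responsible for T matching the tensor's element type.
template <class T>
View2D<T> as_view_2d(const TensorDesc& t) {
  if (t.ndim != 2) {
    throw std::invalid_argument("expected a rank-2 tensor");
  }
  if (t.data == nullptr || t.shape == nullptr) {
    throw std::invalid_argument("null data or shape pointer");
  }
  // Apply the byte offset before reinterpreting the pointer as T*.
  T* base = reinterpret_cast<T*>(static_cast<char*>(t.data) + t.byte_offset);
  const std::int64_t s0 = t.strides ? t.strides[0] : t.shape[1];
  const std::int64_t s1 = t.strides ? t.strides[1] : 1;
  return View2D<T>{base, {t.shape[0], t.shape[1]}, {s0, s1}};
}
```

Because the view is non-owning, writes through it are visible in the original buffer, and the buffer must outlive the view. A real conversion would presumably also validate the element type and the device kind (host, device, or managed) before handing out a view.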
Todo: