Refine the packaging infrastructure. #207
Use pyproject.toml and scikit-build-core to drive the CMake build. Minor CMake modernization. Add some CMake infrastructure to try to handle the three target build types (CUDA, HIP, and CPU-only), but the project infrastructure may not be set up for non-CUDA builds at this point.
- Fix default accelerator framework.
- Only define `__USE_NOTEX` for AMD
Normalize some math functions to improve compatibility and consistency.
Try to use the hipify wrappers in `torch.utils`, if available, else try to call `hipify-clang` directly.
Prefer torch-based hipify into `/hipified_src`. Keep `hipify-clang` as an explicitly experimental fallback, and stage fallback inputs under a separate build-local `hipify_stage` tree.
Add `gpu_runtime.h` and `gpu_fft.h` compatibility shims with comments explaining why LEAP still needs them even after source translation.
Update the translated source set and include ordering so generated sources consistently see translated headers and copied support headers before the original src tree.
Harden tools/run_hipify_clang.py by recording retry-aware manifest files, removing stale outputs, retrying known -p/-o conflicts without compile_commands.json context, accepting stdout-only output as a compatibility fallback, and surfacing clearer diagnostics for likely CUDA-arch propagation failures inside hipify-clang.
Document the supported build and wheel paths in the README, including CPU, CUDA, and AMD usage, when visible GPUs are or are not required on the build host, and the caveats around isolated builds and the experimental hipify-clang fallback.
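The backend preference described in these commits (torch-based hipify first, `hipify-clang` as the experimental fallback) can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function names `torch_hipify_available` and `choose_hipify_backend` are hypothetical, and the sketch assumes that the torch wrapper lives at `torch.utils.hipify.hipify_python` and that `hipify-clang` is found on `PATH`.

```python
import importlib.util
import shutil


def torch_hipify_available():
    """Return True if torch's bundled hipify module can be located.

    Assumption: the wrapper is importable as torch.utils.hipify.hipify_python.
    """
    try:
        spec = importlib.util.find_spec("torch.utils.hipify.hipify_python")
    except (ImportError, ValueError):
        # Raised when a parent package such as `torch` is not installed.
        return False
    return spec is not None


def choose_hipify_backend(torch_ok, hipify_clang_path):
    """Pick a translation backend (hypothetical helper, not from the PR).

    Prefers the torch-based wrapper; falls back to hipify-clang only when
    torch's wrapper is unavailable; returns None when neither is usable.
    """
    if torch_ok:
        return "torch"
    if hipify_clang_path:
        return "hipify-clang"
    return None


if __name__ == "__main__":
    backend = choose_hipify_backend(
        torch_hipify_available(),
        shutil.which("hipify-clang"),  # None when the tool is not on PATH
    )
    print("selected hipify backend:", backend)
```

Keeping the selection logic separate from the detection calls makes the preference order easy to check without torch or ROCm installed.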
Commit 6de2200 is a pretty substantial change to restore automatic hipification that
Background
A complete migration off of CMake-driven hipify
We're trying to call hipify-clang but, at least in ROCm 7.2, we're having a hard time managing its output files correctly. In the meantime, we can use the
For best results, use the most recent ROCm available and the oldest supported CUDA available. With ROCm 7.2 and CUDA 12.9, the following seems to work:
    python -m build . -Ccmake.define.LEAP_GPU=AMD -Ccmake.define.CMAKE_CXX_COMPILER=`which amdclang++` --no-isolation
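For scripted builds, the same invocation can be assembled in Python instead of relying on shell backticks. This is a minimal sketch under stated assumptions: `wheel_build_command` is a hypothetical helper, not part of this PR, and `AMD` is the only `LEAP_GPU` value confirmed by the command above, so the value is passed through unchanged rather than validated.

```python
import shutil
import subprocess
import sys


def wheel_build_command(gpu, cxx_compiler=None, isolated=False):
    """Assemble the `python -m build` argument list (hypothetical helper).

    gpu          -- value for the LEAP_GPU CMake define, e.g. "AMD"
    cxx_compiler -- optional path for CMAKE_CXX_COMPILER
    isolated     -- when False, pass --no-isolation as in the command above
    """
    cmd = [sys.executable, "-m", "build", ".",
           f"-Ccmake.define.LEAP_GPU={gpu}"]
    if cxx_compiler:
        cmd.append(f"-Ccmake.define.CMAKE_CXX_COMPILER={cxx_compiler}")
    if not isolated:
        cmd.append("--no-isolation")
    return cmd


if __name__ == "__main__":
    # shutil.which replaces the shell's `which amdclang++` substitution.
    compiler = shutil.which("amdclang++")
    if compiler is None:
        sys.exit("amdclang++ not found on PATH")
    subprocess.run(wheel_build_command("AMD", compiler), check=True)
```

Passing the arguments as a list avoids shell quoting issues and makes a missing compiler fail before the build starts.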
Continues the CMake reorganization and migrates from setup.py to pyproject.toml to drive the build and packaging.
Includes some source code patches for more consistent behavior and some shim headers for better compatibility across CUDA, HIP, and CPU runtimes.