[WIP] [Feat] Ffts D2D transport dev#990
Conversation
|
|
||
| void FftsEngine::KeepAlive(std::shared_ptr<ContextBuffer> contexts) | ||
| { | ||
| pendingContexts_.emplace_back(std::move(contexts)); |
There was a problem hiding this comment.
💡 Memory Consideration: pendingContexts_ grows without bound if Synchronize() is not called frequently. Consider adding a size limit or a cleanup mechanism to prevent unbounded memory growth in long-running operations.
There was a problem hiding this comment.
Pull request overview
Introduces an Ascend FFTS Plus-based device-to-device transport library (ucm_transport_ffts) under ucm/transport/p2p/ffts. The transport exposes an asynchronous copy API that wraps an ACL fast-launch/fast-sync stream and submits SDMA contexts via rtFftsPlusTaskLaunchWithFlag.
Changes:
- Add public
FftsTransportAPI (Setup,WaitEvent,CopyAsync, batchedSubmit,Synchronize) and a privateFftsEnginethat builds SDMA contexts, splits batches by a 128 ready-context limit, and keeps context buffers alive until stream sync. - Add CMake discovery for Ascend CANN headers (
acl,runtime, profiling, toolchain) and libraries (ascendcl,runtime), gated onRUNTIME_ENVIRONMENT == ascend. - Add gtest-based unit tests covering setup ordering, single/batched copies, large-batch splitting, two-stage pipelining, and invalid/no-op input handling.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| ucm/transport/p2p/CMakeLists.txt | Conditionally include ffts subdirectory on Ascend builds. |
| ucm/transport/p2p/ffts/CMakeLists.txt | Locate Ascend CANN headers/libs and define the ucm_transport_ffts static library. |
| ucm/transport/p2p/ffts/include/ffts_transport.h | Public API: CopyDesc plus FftsTransport pimpl façade. |
| ucm/transport/p2p/ffts/detail/ffts_engine.h | Internal FftsEngine declaration with stream/state and pending context buffer storage. |
| ucm/transport/p2p/ffts/src/ffts_transport.cpp | Pimpl implementation forwarding to FftsEngine. |
| ucm/transport/p2p/ffts/src/ffts_engine.cpp | Engine logic: stream setup, SDMA context construction, FFTS Plus task launch, batch splitting, lifetime management. |
| ucm/transport/p2p/ffts/test/CMakeLists.txt | Build wiring for the gtest test binary. |
| ucm/transport/p2p/ffts/test/ffts_transport_test.cpp | Functional tests for the FFTS transport. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| FftsEngine(FftsEngine&&) = delete; | ||
| FftsEngine& operator=(FftsEngine&&) = delete; | ||
|
|
||
| Status Setup(int32_t deviceId); |
There was a problem hiding this comment.
is it necessary to also have a teardown/shutdown for this class?
There was a problem hiding this comment.
RAII cleanup is sufficient for the current FftsEngine. The destructor already releases pending context buffers and destroys the ACL stream. Global ACL runtime and device lifecycle are intentionally not managed by FftsEngine.
| { | ||
| std::vector<size_t> result; | ||
| size_t begin = 0; | ||
| while (begin <= value.size()) { |
There was a problem hiding this comment.
while (begin <= value.size()) could potentially cause issues. When begin == value.size(), the subsequent value.substr(begin, ...) would return an empty string, which throws an exception. While this works due to the break statement, the condition should ideally be while (begin < value.size()) for clarity and to avoid the edge case entirely. The current logic relies on the exception to handle the boundary.
Purpose
Introduce an Ascend FFTS Plus based D2D transport module for P2P cache data movement, providing an asynchronous copy interface backed by ACL stream execution and FFTS SDMA task submission.
Modifications
ucm_transport_fftsstatic library underucm/transport/p2p/ffts.ascendcl,runtime, FFTS runtime headers, profiling headers, and toolchain headers.FftsTransportAPI withSetup,WaitEvent,CopyAsync, batchedSubmit, andSynchronize.Test
FFTS transport tests covering:
Setup()is called.