Add Mooncake DataProto transfer backend #469
Open
zxpdemonio wants to merge 2 commits into
Wire Mooncake into the existing DataProto transfer backend path with a node-scoped client by default to reuse per-node store setup and registered buffer pools, while keeping process-local clients configurable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use Mooncake structured object transfer as the optional DataProto backend while keeping ROLL's existing transfer_backend.put API and RemoteBatch semantics. Add real RDMA-backed Mooncake tests without fake Mooncake modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
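For orientation, the backend boundary these commits preserve can be pictured roughly as below. This is a minimal in-memory sketch, not ROLL's or Mooncake's actual code: only the put(partition, row_ids, fields, batch_size) parameter list comes from this PR. The RemoteBatch fields, the get/delete signatures, and the InMemoryBackend class are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class RemoteBatch:
    """Hypothetical handle describing rows held by a backend (fields assumed)."""
    partition: str
    row_ids: Tuple[int, ...]
    field_names: Tuple[str, ...]
    batch_size: int


class InMemoryBackend:
    """Toy stand-in for a transfer backend; a real one would move data remotely."""

    def __init__(self) -> None:
        # (partition, row_id) -> {field name -> value for that row}
        self._rows: Dict[Tuple[str, int], Dict[str, Any]] = {}

    def put(self, partition: str, row_ids: List[int],
            fields: Dict[str, List[Any]], batch_size: int) -> RemoteBatch:
        # Every column must carry exactly one value per row.
        if len(row_ids) != batch_size:
            raise ValueError("row_ids must contain batch_size entries")
        for name, column in fields.items():
            if len(column) != batch_size:
                raise ValueError(f"field {name!r} must contain batch_size entries")
        for i, row_id in enumerate(row_ids):
            self._rows[(partition, row_id)] = {n: c[i] for n, c in fields.items()}
        return RemoteBatch(partition, tuple(row_ids), tuple(fields), batch_size)

    def get(self, handle: RemoteBatch) -> Dict[str, List[Any]]:
        # Rebuild columns in the row order recorded by the handle.
        return {
            name: [self._rows[(handle.partition, r)][name] for r in handle.row_ids]
            for name in handle.field_names
        }

    def delete(self, handle: RemoteBatch) -> None:
        # Deleting an already-removed row is treated as a no-op.
        for r in handle.row_ids:
            self._rows.pop((handle.partition, r), None)
```

The point of the sketch is that callers only ever see put and the returned handle, so swapping the storage behind them (for example, for Mooncake structured objects) need not change call sites.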
Summary
This PR adds Mooncake as an optional ROLL transfer backend for structured DataProto transfer.
The implementation keeps ROLL-side integration aligned with the existing transfer backend boundary:
- transfer_backend.put/get/delete and RemoteBatch semantics.
- mooncake package APIs.
- transfer_backend.put(...) API remains unchanged.
What Changed
Mooncake Backend
- mooncake.structured_object_store.MooncakeBundleTransfer.
- DataProto-style payloads as Mooncake structured objects.
- non_tensor_batch
- ColumnRemoteBatch
ROLL Compatibility
- transfer_backend.put(partition, row_ids, fields, batch_size) API.
- RemoteBatch/ColumnRemoteBatch semantics.
Node-scoped Client
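As a rough illustration of the scoping idea from the first commit message (a node-scoped client by default, with process-local clients still configurable), the sketch below caches clients by a scope key. Everything here is hypothetical: ClientRegistry, the "node"/"process" scope names, and the factory callback are assumptions for illustration and are not part of ROLL or Mooncake. It only models the keying; in a real deployment a node-scoped client would have to live somewhere that all processes on the node can reach.

```python
import os
import socket
from typing import Any, Callable, Dict, Hashable, Optional


class ClientRegistry:
    """Caches one client per scope key so expensive setup runs once per key."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory  # hypothetical: builds a store client
        self._clients: Dict[Hashable, Any] = {}

    def get(self, scope: str = "node",
            node_id: Optional[str] = None,
            pid: Optional[int] = None) -> Any:
        node_id = node_id if node_id is not None else socket.gethostname()
        pid = pid if pid is not None else os.getpid()
        if scope == "node":
            # All callers on the same node share one client.
            key: Hashable = ("node", node_id)
        elif scope == "process":
            # Each process gets its own client.
            key = ("process", node_id, pid)
        else:
            raise ValueError(f"unknown scope: {scope!r}")
        if key not in self._clients:
            self._clients[key] = self._factory()
        return self._clients[key]
```

With the "node" scope, two workers on the same host resolve to the same cached client, which is what would let per-node setup such as buffer registration happen once; with the "process" scope each worker pays that setup cost separately but shares nothing.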
Documentation
- DataProto backend.
Tests
- mooncake command/package names and standard Mooncake environment variables.
Validation
Result:
5 passed
Additional checks:
python -m py_compile \
  roll/distributed/scheduler/transfer_backend.py \
  tests/distributed/scheduler/test_mooncake_transfer_backend.py
git diff --check