[feat] hma connector supports GPU buffer MR for GPUDirect RDMA by relat-ivity · Pull Request #981 · ModelEngine-Group/unified-cache-management

relat-ivity · 2026-05-28T06:23:37Z

Purpose

Enable the HMA connector to provide GPU KV buffer address and size metadata to UCM stores, following the existing ucm_connector registration logic in PR #958.

Modifications

hma_connector.py: Collects the GPU buffer addresses and sizes of the vLLM KV cache and passes them to the UCM store for GDR (GPUDirect RDMA) pre-registration.
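As a rough illustration of the collection step described above, the sketch below gathers one (address, size) pair per KV cache tensor. It is not the PR's code: `FakeKVTensor` and `collect_gpu_buffers` are hypothetical names, the stand-in class only mimics the `data_ptr()`, `element_size()`, and `shape` members of `torch.Tensor`, and the per-block stride is assumed to come from a contiguous layout.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeKVTensor:
    """Hypothetical stand-in exposing the torch.Tensor members used below."""
    base_addr: int
    shape: tuple
    itemsize: int

    def data_ptr(self) -> int:
        return self.base_addr

    def element_size(self) -> int:
        return self.itemsize


def collect_gpu_buffers(tensors):
    """Return parallel lists of buffer base addresses and total sizes in bytes."""
    buffer_addrs, buffer_sizes = [], []
    for t in tensors:
        # Assumption: contiguous layout, so bytes per block is the product
        # of the non-block dimensions times the element size.
        block_stride = math.prod(t.shape[1:]) * t.element_size()
        buffer_addrs.append(int(t.data_ptr()))
        # Total buffer size = number of blocks (shape[0]) x bytes per block.
        buffer_sizes.append(int(t.shape[0]) * block_stride)
    return buffer_addrs, buffer_sizes


# One layer with 1024 blocks of shape (2, 16, 8, 128) in a 2-byte dtype.
layer0 = FakeKVTensor(base_addr=0x7F0000000000, shape=(1024, 2, 16, 8, 128), itemsize=2)
addrs, sizes = collect_gpu_buffers([layer0])
```

The two lists stay index-aligned, which is what the store-side registration would presumably rely on.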

Test

ygwpz

Follow-up Review - PR #981

Previous Concerns: All Addressed ✅

All 7 concerns from the previous review have been successfully addressed:

✅ Data integrity assertion (Line 167-169): Added assert len(buffer_addrs) == len(buffer_sizes)
✅ Code clarity (Line 29-30): Added comment explaining GPU buffer purpose
✅ Debug logging (Line 119-125): Added logging for GPU buffer registration
✅ Redundant int() conversion (Line 171): Simplified key creation
✅ Type safety (Line 112-116): Added validation for non-empty lists
✅ Overflow protection (Line 46): Changed to safe sum calculation
✅ Warning for missing layouts (Line 159-163): Added logging instead of silent skip
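The safeguards listed above (non-empty validation, the length assertion, and logging) could be combined roughly as follows. This is a sketch under assumptions, not the code from the PR: `validate_gpu_buffers` and the logger name are hypothetical, and the exact conditions checked in `hma_connector.py` may differ.

```python
import logging

logger = logging.getLogger("hma_connector_sketch")  # hypothetical logger name


def validate_gpu_buffers(buffer_addrs, buffer_sizes):
    """Return True if the lists look safe to register, False to skip registration."""
    if not buffer_addrs or not buffer_sizes:
        # Log instead of skipping silently.
        logger.warning("No GPU KV buffers collected; skipping GDR pre-registration.")
        return False
    assert len(buffer_addrs) == len(
        buffer_sizes
    ), "KV cache buffer addresses and sizes must have the same length."
    logger.debug(
        "Registering %d GPU KV buffers, %d bytes in total",
        len(buffer_addrs),
        sum(buffer_sizes),
    )
    return True
```

Returning a boolean rather than raising on the empty case keeps the connector usable when no GPU buffers are present, which is one possible design choice among several.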

New Observations (Minor - L3-L5)

Three minor suggestions for code quality improvements. See inline comments below.

Summary

The new commits thoroughly address all previous concerns with appropriate safeguards, logging, and validation. The implementation is now more robust and maintainable. Good work!

The three minor suggestions below are optional improvements that could be addressed in a follow-up PR if desired.

ygwpz · 2026-06-16T06:18:53Z

            tensor_size = math.prod([t.shape[i] for i in size_dims]) * t.element_size()
+            # GPU buffer sizes for GPUDirect RDMA registration in store.
+            # Total buffer size = number of blocks (shape[0]) × bytes per block stride.
+            buffer_sizes.append(int(t.shape[0]) * block_stride)


🔴 Critical: Potential integer overflow. When t.shape[0] is very large (e.g., millions of blocks) and block_stride is also large, the product can become very large. Python's int is arbitrary-precision, so the multiplication itself will not wrap, but the result may exceed what the store can accept for GPUDirect RDMA registration. Consider using int(t.shape[0]) * int(block_stride) and adding bounds checking to ensure the product doesn't exceed the expected limits.
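One way to read the bounds-checking suggestion is sketched below. Python integers are arbitrary-precision, so the product cannot wrap inside Python; the check only matters if the value is later handed to code that expects a fixed-width integer. The 64-bit limit, `UINT64_MAX`, and `checked_buffer_size` are assumptions for illustration and do not come from the PR or from the UCM store interface.

```python
UINT64_MAX = 2**64 - 1  # assumed upper bound; the real limit depends on the store


def checked_buffer_size(num_blocks, block_stride, limit=UINT64_MAX):
    """Compute num_blocks * block_stride and reject values outside (0, limit]."""
    size = int(num_blocks) * int(block_stride)
    if size <= 0 or size > limit:
        raise ValueError(
            f"GPU buffer size {size} is outside the accepted range (0, {limit}]"
        )
    return size


# Ten million blocks of 1 MiB each stay far below the assumed limit.
total_bytes = checked_buffer_size(10**7, 2**20)
```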

ygwpz · 2026-06-16T06:18:57Z

+            ), "KV cache buffer addresses and sizes must have the same length."
+            for addr, size in zip(buffer_addrs, buffer_sizes):
+                key = (addr, size)
+                if key in gpu_kv_buffer_set:


💡 Suggestion: For better readability, use addr and size variables directly instead of key[0] and key[1] in the append statements below:

    gpu_kv_buffer_set.add((addr, size))
    gpu_kv_buffer_addrs.append(addr)
    gpu_kv_buffer_sizes.append(size)
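Put together, the loop with this suggestion applied might look like the sketch below. It assumes that a pair already present in the set is simply skipped, which the visible diff context implies but does not show; `dedupe_gpu_buffers` is a hypothetical wrapper added only to make the fragment runnable.

```python
def dedupe_gpu_buffers(buffer_addrs, buffer_sizes):
    """Drop repeated (addr, size) pairs while preserving first-seen order."""
    assert len(buffer_addrs) == len(
        buffer_sizes
    ), "KV cache buffer addresses and sizes must have the same length."
    gpu_kv_buffer_set = set()
    gpu_kv_buffer_addrs, gpu_kv_buffer_sizes = [], []
    for addr, size in zip(buffer_addrs, buffer_sizes):
        if (addr, size) in gpu_kv_buffer_set:
            continue  # assumed behavior: a buffer shared by several layers is kept once
        gpu_kv_buffer_set.add((addr, size))
        gpu_kv_buffer_addrs.append(addr)
        gpu_kv_buffer_sizes.append(size)
    return gpu_kv_buffer_addrs, gpu_kv_buffer_sizes


unique_addrs, unique_sizes = dedupe_gpu_buffers(
    [0x1000, 0x2000, 0x1000], [64, 64, 64]
)
```

Keeping a set next to the two output lists gives constant-time membership checks while the lists retain a stable order.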

relat-ivity requested review from Infinite666, harrisonyhq, mag1c-h, qyh111 and ygwpz as code owners May 28, 2026 06:23

ygwpz reviewed May 29, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed May 29, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed May 29, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed May 29, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed May 29, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed Jun 4, 2026

Comment thread ucm/integration/vllm/hma_connector.py Outdated

ygwpz reviewed Jun 4, 2026

Comment thread ucm/integration/vllm/hma_connector.py

relat-ivity requested review from FangRun2, Tarrei, Wwwzff and flesher0813 as code owners June 5, 2026 08:12

relat-ivity marked this pull request as draft June 5, 2026 08:14

relat-ivity force-pushed the dsv4-gdr branch from 5e882f5 to 1510a85 Compare June 5, 2026 08:17

relat-ivity marked this pull request as ready for review June 5, 2026 08:17

relat-ivity force-pushed the dsv4-gdr branch from 1510a85 to 4f595bf Compare June 11, 2026 08:09

ygwpz reviewed Jun 13, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed Jun 13, 2026

Comment thread ucm/integration/vllm/hma_connector.py

ygwpz reviewed Jun 13, 2026

Comment thread ucm/integration/vllm/hma_connector.py

relat-ivity added 5 commits June 16, 2026 10:47

Add GPU MR logic to the hma connector

6962645

Add logging and assertions

4ca6d9e

Fix assertion formatting

2bbdfa0

Switch to a safe sum and add a warning

8d66c31

Update comments and safety checks

6737e8d

relat-ivity force-pushed the dsv4-gdr branch from fd2cbb9 to 6737e8d Compare June 16, 2026 02:48

relat-ivity added 2 commits June 16, 2026 10:53

Fix formatting as requested

222620c

Increase the task queue length

1f8aaa6

ygwpz reviewed Jun 16, 2026

relat-ivity wants to merge 7 commits into ModelEngine-Group:develop from relat-ivity:dsv4-gdr

2 participants
