-
-
Notifications
You must be signed in to change notification settings - Fork 11.3k
[Bugfix] Eliminate tuple inputs to submodules in graph partitioning #28533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses the issue of tuple inputs to submodules in graph partitioning by moving getitem operations to the producer's subgraph. The changes in vllm/compilation/backends.py correctly implement this logic, and the new test file tests/compile/test_graph_partition.py provides good coverage for both single and multiple consumer scenarios, ensuring the fix works as intended. The addition of the test to the .buildkite/test-pipeline.yaml ensures continuous validation. The overall change improves the robustness of the graph partitioning logic for AoT Autograd modules. No critical or high-severity issues were found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
This pull request has merge conflicts that must be resolved before it can be |
Move getitem operations to the same subgraph as their input to prevent submodules from receiving tuple arguments. This is done by reassigning getitem nodes during partition assignment before calling split_module. PyTorch AoT compile expects graphs that conform to AoTAutograd spec, which prohibits tuple-type inputs to (sub)modules. Signed-off-by: Yanan Cao <[email protected]>
Signed-off-by: Yanan Cao <[email protected]>
vllm/compilation/backends.py
Outdated
| if input_node in node_to_subgraph_id: | ||
| node_to_subgraph_id[node] = node_to_subgraph_id[input_node] | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should always be true, right?
| if input_node in node_to_subgraph_id: | |
| node_to_subgraph_id[node] = node_to_subgraph_id[input_node] | |
| continue | |
| assert input_node in node_to_subgraph_id | |
| node_to_subgraph_id[node] = node_to_subgraph_id[input_node] | |
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think placeholder nodes would not be in node_to_subgraph_id map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I think the claim is that we should never be getitem on a placeholder node or an output node so we can just assert that the input_node is in node_to_subgraph id?
Because Dynamo produces a graph where all placeholders should be Tensors or symints or scriptobjects
So I agree with Luka's suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 @ProExpertProg I had to relax the assert to only when producer node is not a placeholder, because getitem is also used to fetch item/slice from a tensor as well. In that case, there can be legitimate getitem calls on placeholders. This is seen in qwen2.
PTAL.
Signed-off-by: Yanan Cao <[email protected]>
Signed-off-by: Yanan Cao <[email protected]>
Signed-off-by: Yanan Cao <[email protected]>
zou3519
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code reasonable, I think Luka's comment makes sense
Signed-off-by: Yanan Cao <[email protected]>
Signed-off-by: Yanan Cao <[email protected]>
Head branch was pushed to by a user without write access
Move
getitemoperations to the same subgraph as their input to prevent submodules from receiving tuple arguments.This is to fix issues like #24915 where
split_graphpartitioned graph at a node that produces a tuple, which is then passed to subsequent submodules as input. This unfortunately does not conform to the calling convention of AoT Autograd modules, which requires flattened arguments.Signed-off-by: Yanan Cao [email protected]