-
Notifications
You must be signed in to change notification settings - Fork 795
[SYCL][Graph] Graph duplication optimizations in finalize() #20547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
reble
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@intel/llvm-gatekeepers please consider merging |
zhimingwang36
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
@intel/llvm-gatekeepers please consider merging |
1 similar comment
|
@intel/llvm-gatekeepers please consider merging |
|
@intel/llvm-gatekeepers please consider merging 😄 |
When invoking
auto g = graph.finalize()on a graph with a moderate amount of nodes, duplicating the nodes of the modifiable graph into the executable graph takes a non-trivial amount of time. This PR introduces a few changes to improve the performance ofduplicateNodes():std::dequeuewith astd::vectorso that it can be directly moved, omitting unnecessary O(n) allocationsreserve()for the vector of nodes and map that we are going to create, since we know the total number of nodes.The changes for 2. make the diff seem much larger than it actually is, but the majority of it is whitespace from moving the subgraph handling into an if statement.