Skip to content

Conversation

@ElaineBao
Copy link
Contributor

Description

MFDNN-14010.

  • Use an End op to represent an intermediate output (which has consumers within the partition as well as outside the partition)
  • Add patterns and related implementation for sdpa/gqa training backward w.r.t gradients for mask
  • Add test cases, modify benchdnn to support such cases' validation.

@ElaineBao ElaineBao self-assigned this Dec 2, 2025
@ElaineBao ElaineBao requested a review from a team as a code owner December 2, 2025 06:33
@ElaineBao ElaineBao added the component:graph-api Codeowner: @oneapi-src/onednn-graph label Dec 2, 2025
@ElaineBao ElaineBao requested a review from a team as a code owner December 2, 2025 06:33
@github-actions github-actions bot added the component:tests Codeowner: @oneapi-src/onednn-arch label Dec 2, 2025
data_type::s8, data_type::u8, data_type::s32,
data_type::undef}))
data_type::undef})
.set_shape_inference_function(infer_dummy_output_shape))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? By definition, there is no output for an End op.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The graph's infer shape function will iterate every op in the graph, we can certainly check if one op has output or not to avoid adding a dummy infer shape function for End op here, but I think it's not a better choice because in that way, every op will have to check that condition.

@ElaineBao
Copy link
Contributor Author

make test
set test_scope=NIGHTLY
disable benchdnn_all
enable benchdnn_graph

Copy link
Contributor

@TaoLv TaoLv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any changes required for document? What's the main difference in the fusion graph?

pgraph->create_input_port(2, matmul_dv, 1);
pgraph->create_input_port(2, matmul_v_do, 0);
})
.set_attr<FCreatePattern>("FCreatePattern",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious why a new pattern and not incorporating into existing one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because incorporating into existing one requires the pattern matcher's optional subgraph supports two consumers, which is currently not supported

if (cur_op_refs.size() == 2 && cur_op_refs[0].kind_ == "End") {
matmul_idx = 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better if:

while (cur_op_refs[matmul_idx].kind_ != "MatMul") {
    matmul_idx++;
}

Copy link
Contributor Author

@ElaineBao ElaineBao Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, becuase matmul_idx (maybe it's not a good name) doesn't mean the consumer must be matmul (below find_next_until indicates that), it means matmul is in the chain of that consumer. On the other hand, the End op is determined.

@ElaineBao ElaineBao force-pushed the yixin/sdpa_mask_grad branch from 597aa04 to e6644ff Compare December 8, 2025 01:31
@ElaineBao ElaineBao requested a review from a team as a code owner December 8, 2025 01:31
@github-actions github-actions bot added the documentation A request to change/fix/improve the documentation. Codeowner: @oneapi-src/onednn-doc label Dec 8, 2025
@ElaineBao
Copy link
Contributor Author

Any changes required for document? What's the main difference in the fusion graph?

Added document changes, please take another look @TaoLv

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:graph-api Codeowner: @oneapi-src/onednn-graph component:tests Codeowner: @oneapi-src/onednn-arch documentation A request to change/fix/improve the documentation. Codeowner: @oneapi-src/onednn-doc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants