Skip to content

Conversation

@Akashcodes732
Copy link
Contributor

@Akashcodes732 Akashcodes732 commented Nov 12, 2025

Purpose

Fix Power compilation failure: PR #27954
refactored the attention backend for CPU, but the build fails on Power due to the use of fp16, which is not supported on this architecture. This PR addresses that issue (also mentioned in my review on the original PR).

Update oneDNN to v3.10: The oneDNN PR uxlfoundation/oneDNN#4002
— included in v3.10 — fixes a compilation failure on Power9 systems. Bumping the version here resolves that issue as well.

cc: @bigPYJ1151

Akash Kaothalkar added 2 commits November 12, 2025 02:22
Signed-off-by: Akash Kaothalkar <[email protected]>
Signed-off-by: Akash Kaothalkar <[email protected]>
@mergify mergify bot added the ci/build label Nov 12, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to fix compilation errors on the PowerPC architecture. It correctly bumps the oneDNN version to v3.10, which includes a fix for Power9 systems. It also adds a preprocessor guard to disable an fp16 template specialization that is unsupported on PowerPC. While the fix for cpu_attn_impl.hpp is correct for PowerPC, it appears to be incomplete as the s390x architecture has similar limitations regarding fp16 vector support and will likely still fail to compile. I've added a comment with a suggested fix. Overall, the changes are in the right direction to improve cross-architecture compatibility.

using vec_t = vec_op::BF16Vec16;
};

#if !defined(__powerpc__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The fix is incomplete. The s390x architecture also lacks fp16 vector support (vec_op::FP16Vec16) and will fail to compile this specialization, similar to PowerPC. The preprocessor guard should be extended to also cover s390x. A similar guard #if defined(__powerpc64__) || defined(__s390x__) is used in csrc/cpu/dnnl_kernels.cpp, which indicates s390x has similar characteristics.

#if !defined(__powerpc__) && !defined(__s390x__)

@bigPYJ1151
Copy link
Member

If fp16 is not supported on Power, should we add a check in check_and_update_config of cpu.py?

@Akashcodes732
Copy link
Contributor Author

Hi @bigPYJ1151,

Thanks for the feedback. You are right, as currently for Power with dtype="float16" we do not get a proper error message. If possible, I would take this up in a seperate follow up PR as this could require some discussion.

Currently the tip is broken for Power due to the compilation failure.

@bigPYJ1151
Copy link
Member

Hi @fadara01 this PR will upgrade oneDNN to 3.10. Would you please help to check the compatibility of ACL? As I saw the 3.10 minimum requires ACL 52.4.0

Copy link
Contributor

@fadara01 fadara01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this @bigPYJ1151 - current version of ACL is indeed not compatible with oneDNN v3.10

@Akashcodes732 could you please update ACL version here to v52.6.0 which should be compatible with oneDNN v3.10

Signed-off-by: Akash Kaothalkar <[email protected]>
@Akashcodes732
Copy link
Contributor Author

Hi @bigPYJ1151 @fadara01 ,

I have updated the ACL version

@bigPYJ1151 bigPYJ1151 enabled auto-merge (squash) November 13, 2025 10:19
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 13, 2025
Copy link
Contributor

@fadara01 fadara01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/build ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants