@mstembera
Contributor

No functional change
bench: 2746404

@vondele
Member

vondele commented Oct 5, 2025

The original idea was that the -mfoo flags correspond to

# Set the file CPU x86_64 architecture
set_arch_x86_64() {
  if check_flags 'avx512f' 'avx512cd' 'avx512vl' 'avx512dq' 'avx512bw' 'avx512ifma' 'avx512vbmi' 'avx512vbmi2' 'avx512vpopcntdq' 'avx512bitalg' 'avx512vnni' 'vpclmulqdq' 'gfni' 'vaes'; then
    true_arch='x86-64-avx512icl'
  elif check_flags 'avx512vnni' 'avx512dq' 'avx512f' 'avx512bw' 'avx512vl'; then
    true_arch='x86-64-vnni512'
  elif check_flags 'avx512f' 'avx512bw'; then
    true_arch='x86-64-avx512'
  elif [ -z "${znver_1_2+1}" ] && check_flags 'bmi2'; then
    true_arch='x86-64-bmi2'
  elif check_flags 'avx2'; then
    true_arch='x86-64-avx2'
  elif check_flags 'sse41' && check_flags 'popcnt'; then
    true_arch='x86-64-sse41-popcnt'
  else
    true_arch='x86-64'
  fi
}
so that we have a 1-to-1 matching of these two and we are explicit. We might have moved away from that, but I think that is not a bad practice. What do you think?
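
To illustrate how such a detection chain behaves, here is a minimal, self-contained sketch. It is not the project's script: the body of `check_flags`, the space-separated `flags` variable, and the shortened `pick_arch` chain are all assumptions made for this example. The point is only that the first branch whose flags are all present wins, so the most specific arch has to be tested first.

```shell
# Hypothetical stand-in for the script's helper: succeeds only if every
# argument appears as a whole word in $flags (assumed space-separated).
check_flags() {
    for flag; do
        case " $flags " in
            *" $flag "*) ;;      # this flag is present, check the next one
            *) return 1 ;;       # one missing flag fails the whole check
        esac
    done
}

# Shortened version of the chain quoted above (assumption: only three
# levels are kept here to keep the example small).
pick_arch() {
    if check_flags 'avx512f' 'avx512bw'; then
        echo 'x86-64-avx512'
    elif check_flags 'avx2'; then
        echo 'x86-64-avx2'
    else
        echo 'x86-64'
    fi
}

flags='sse41 popcnt avx2'
pick_arch
```

With the sample `flags` value, the AVX-512 branch fails because `avx512f` is missing, and the `avx2` branch is selected.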

@mstembera
Contributor Author

mstembera commented Oct 5, 2025

Hmm, I know almost nothing about the scripts or where the flags fed to them come from, so I can't make any good suggestions there. I can only speak to the Makefile, which seems to treat VNNI512 and AVX512ICL inconsistently with the other archs. The normal Makefile setup seems to be that each successively more modern arch only adds new flags on top of the older archs, without redefining the older flags. For example, AVX2 assumes the SSE41 flags are already defined, SSE41 assumes the SSSE3 flags are already defined, and so on. VNNI512 and AVX512ICL seem to break this principle by redefining what was already defined. If this is somehow necessary because of how the scripts work, I can close this PR or make any changes you suggest.
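
The layering principle described here can be sketched in shell as follows. This is an illustration only, not the project's Makefile: the `arch_flags` function, the level names, and the exact grouping of `-m` flags per level are assumptions for the example. Each level contributes only the flags it introduces, and the result for a given arch is everything accumulated up to and including that level.

```shell
# Hypothetical sketch: print the cumulative compiler flags for an arch level.
# Each step appends only its own new flags; nothing is redefined.
arch_flags() {
    level=$1
    out=''
    for step in sse41 avx2 avx512 vnni512; do
        case $step in
            sse41)   out="$out -msse4.1" ;;
            avx2)    out="$out -mavx2" ;;
            avx512)  out="$out -mavx512f -mavx512bw" ;;
            vnni512) out="$out -mavx512vnni -mavx512dq -mavx512vl" ;;
        esac
        if [ "$step" = "$level" ]; then
            printf '%s\n' "${out# }"   # strip the leading space
            return 0
        fi
    done
    return 1                           # unknown level
}

arch_flags avx2
arch_flags vnni512
```

In this layout the VNNI512 level lists only the three flags it adds, because `-mavx512f` and `-mavx512bw` already arrive from the AVX512 level beneath it.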

No functional change
bench: 2912398