Optimize getting the palette index for palette compression using SIMD. #2405

IntegratedQuantum · 2025-12-15T20:43:02Z

The compiler is poor at optimizing search loops like this, since it doesn't have enough information (it must not access the memory beyond the last entry and thus is stuck iterating element by element).

So I decided to throw some SIMD at this. The easiest way to use SIMD here is to align the vector and blindly access memory beyond the length limit, and it is actually not much more complex than the non-SIMD code (see godbolt).
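To make the idea concrete for reviewers, here is a minimal sketch in C. It is not the code from this PR. The names `find_scalar`, `find_chunked` and `LANES`, the 32-bit entry type, and the rule that the caller pads the storage to a multiple of `LANES` entries are all assumptions made for illustration. It uses plain fixed-size chunks instead of explicit vector types. What it shows is that once reading past `len` is known to be safe, the inner loop has a fixed trip count and no early exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* hypothetical chunk width: 8 x 32-bit entries */

/* Scalar baseline: must not read past len, so it checks one entry at a time. */
static ptrdiff_t find_scalar(const uint32_t *palette, size_t len, uint32_t needle) {
    assert(palette != NULL);
    for (size_t i = 0; i < len; i++) {
        if (palette[i] == needle) return (ptrdiff_t)i;
    }
    return -1;
}

/* Chunked variant. ASSUMPTION: the caller allocated the palette with its
 * capacity rounded up to a multiple of LANES, so reading the tail of the
 * last chunk stays inside the allocation. */
static ptrdiff_t find_chunked(const uint32_t *palette, size_t len, uint32_t needle) {
    assert(palette != NULL);
    for (size_t base = 0; base < len; base += LANES) {
        unsigned mask = 0;
        /* Fixed trip count and no early exit: one comparison bit per lane. */
        for (size_t lane = 0; lane < LANES; lane++) {
            mask |= (unsigned)(palette[base + lane] == needle) << lane;
        }
        /* Discard lanes at or beyond len; the padding may hold stale values. */
        size_t valid = len - base;
        if (valid < LANES) mask &= (1u << valid) - 1u;
        if (mask != 0) {
            /* The lowest set bit is the first matching entry in this chunk. */
            size_t lane = 0;
            while (((mask >> lane) & 1u) == 0) lane++;
            return (ptrdiff_t)(base + lane);
        }
    }
    return -1;
}
```

The masking step is what keeps the over-read harmless: entries past `len` are compared but can never be reported as a match.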

I also did some measurements, but getting the palette is not a common operation, so the overall performance impact is below 1%. Measured alone, though, the function is about 17-50% faster depending on the use case.
But to be honest, this is mostly about fixing the worst-case performance (which of course I did not care to measure :P).

cleanup, move function, orelse, maybe some name improvements

fixes #318

Argmaster · 2025-12-15T21:09:53Z

How did you measure performance impact?

IntegratedQuantum · 2025-12-15T21:20:43Z

I looked at it with a sampling profiler, not terribly precise I know. Maybe I'll make a better worst-case benchmark for this in the future.

Optimize getting the palette index for palette compression using SIMD.

b36a21b

Argmaster added this to PRs to review Dec 15, 2025

Argmaster moved this to WIP/not ready for review in PRs to review Dec 15, 2025

Argmaster moved this from WIP/not ready for review to Easy to Review in PRs to review Dec 20, 2025

Argmaster moved this from Easy to Review to WIP/not ready for review in PRs to review Dec 20, 2025

IntegratedQuantum moved this from WIP/not ready for review to In review in PRs to review Dec 26, 2025
