Improve applyCompMatr summation accuracy by LubuSeb · Pull Request #781 · QuEST-Kit/QuEST

LubuSeb · 2026-06-05T15:13:32Z

Closes #598.

Summary

Replaces the naive inner accumulation in cpu_statevec_anyCtrlAnyTargDenseMatr_sub() with component-wise compensated summation over cpu_qcomp.re and cpu_qcomp.im.
Writes each output amplitude once after its local reduction rather than repeatedly updating amps[i] inside the summation loop.
Adds a deterministic complex-cancellation regression test for 4-target applyCompMatr, which exercises the 3+ target CompMatr path rather than the one/two-target specialisations.
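For readers unfamiliar with the technique, the first two points can be sketched as below. This is a minimal standalone illustration, not the patched QuEST code: `Amp`, `CompSum` and `compensatedDot` are hypothetical names, and `double` stands in for qreal. It also uses the Neumaier variant of compensated summation, which is an assumption on my part (the Notes below say Kahan); classic Kahan can still lose the small terms when a later term cancels a much larger running sum.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a complex amplitude with public re/im fields.
struct Amp { double re; double im; };

// Neumaier-style compensated accumulator for a single real component.
struct CompSum {
    double sum = 0.0;   // running floating-point sum
    double comp = 0.0;  // accumulated rounding error of the running sum

    void add(double x) {
        double t = sum + x;
        if (std::fabs(sum) >= std::fabs(x))
            comp += (sum - t) + x;  // low-order bits of x were lost
        else
            comp += (x - t) + sum;  // low-order bits of sum were lost
        sum = t;
    }

    double value() const { return sum + comp; }
};

// Computes sum_j row[j] * vec[j], compensating the real and imaginary
// components independently, and returns the result once at the end
// (mirroring "write each output amplitude once after its local reduction").
Amp compensatedDot(const std::vector<Amp>& row, const std::vector<Amp>& vec) {
    CompSum re, im;
    for (std::size_t j = 0; j < row.size(); ++j) {
        const Amp& m = row[j];
        const Amp& v = vec[j];
        re.add(m.re * v.re);
        re.add(-(m.im * v.im));
        im.add(m.re * v.im);
        im.add(m.im * v.re);
    }
    return Amp{re.value(), im.value()};
}
```

Note that any compensated scheme of this kind relies on strict floating-point semantics; flags such as GCC's -ffast-math allow the compiler to simplify the correction terms away.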

Notes

I saw the earlier closed attempt in #777. This version keeps the same narrow two-file scope, but uses direct qreal compensation for the real and imaginary components instead of relying on complex add/sub overloads in the hot loop. I also checked base_qcomp: the current operators are ordinary component-wise arithmetic, so independent real/imaginary Kahan compensation is compatible with the backend representation.

Local measurements

Configuration: Windows, GCC 13.2.0, Release, single CPU (QUEST_ENABLE_OMP=OFF, QUEST_ENABLE_MPI=OFF, QUEST_ENABLE_CUDA=OFF, QUEST_ENABLE_HIP=OFF). The benchmark applies a dense CompMatr whose first output row is [large+i*large, 1-i, ..., 1-i, -large-i*large] to an all-ones state. The expected first amplitude is (2^targets - 2) - i(2^targets - 2).
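To make the benchmark row concrete, here is a rough standalone sketch of why naive accumulation loses every small term. It is not the benchmark source: the helper names are hypothetical, `std::complex<double>` stands in for the QuEST amplitude type, the value of `large` is left to the caller because it is not stated above, and reporting the larger of the two per-component errors is my assumption about how "abs error" is measured.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Builds the row [large+i*large, 1-i, ..., 1-i, -large-i*large]
// with 2^targets entries.
std::vector<cplx> makeCancellationRow(int targets, double large) {
    std::size_t dim = std::size_t{1} << targets;
    std::vector<cplx> row(dim, cplx(1.0, -1.0));
    row.front() = cplx(large, large);
    row.back() = cplx(-large, -large);
    return row;
}

// Applies the row to an all-ones state with plain (uncompensated)
// accumulation and returns the larger per-component absolute error
// against the exact result (2^targets - 2) - i(2^targets - 2).
double naiveFirstAmpError(int targets, double large) {
    std::vector<cplx> row = makeCancellationRow(targets, large);
    const cplx stateAmp(1.0, 0.0);
    cplx sum(0.0, 0.0);
    for (const cplx& elem : row)
        sum += elem * stateAmp;
    double expected = static_cast<double>(row.size() - 2);
    return std::max(std::fabs(sum.real() - expected),
                    std::fabs(sum.imag() + expected));
}
```

When `large` is big enough that adding 1 to it is absorbed by rounding, the naive sum collapses to zero and the error equals 2^targets - 2, which matches the 14, 254 and 1022 values in the baseline column below.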

precision	targets	baseline abs error	baseline avg ms	patched avg ms
1	4	14	0.0114	0.0117
1	8	254	0.0779	0.2100
1	10	1022	1.0715	3.0297
2	4	14	0.0113	0.0121
2	8	254	0.0801	0.2008
2	10	1022	1.4134	3.2629
4	4	14	0.0137	0.0134
4	8	254	0.4391	0.4150
4	10	1022	7.1805	6.6778

The measurements show the expected accuracy improvement. The overhead is visible for larger single/double precision reductions, which seems consistent with the tradeoff described in the issue.

Testing

cmake -S . -B build-598-fp2 -G Ninja -D CMAKE_BUILD_TYPE=Release -D QUEST_BUILD_TESTS=ON -D QUEST_FLOAT_PRECISION=2 -D QUEST_ENABLE_OMP=OFF -D QUEST_ENABLE_MPI=OFF -D QUEST_ENABLE_CUDA=OFF -D QUEST_ENABLE_HIP=OFF
cmake --build build-598-fp2 --parallel
build-598-fp2/tests/tests.exe '*applyCompMatr*' --reporter compact
ctest --test-dir build-598-fp2 --output-on-failure -j 4
git diff --check

Results:

*applyCompMatr*: passed, 10 test cases / 10,003 assertions.
Full CPU-only double-precision ctest: passed.
git diff --check: passed; Git emitted only Windows line-ending conversion warnings.

Prepared with AI assistance; I reviewed the patch and ran the listed local checks.

TysonRayJones · 2026-06-06T00:55:26Z

How come you separate out the real and imaginary components, mr robo?

Improve applyCompMatr summation accuracy

f7b0946

LubuSeb wants to merge 1 commit into QuEST-Kit:devel from LubuSeb:codex/unitaryhack-quest-598-compensated