Improve applyCompMatr summation accuracy#781
Open
LubuSeb wants to merge 1 commit into
Open
Conversation
Member
|
How come you separate out the real and imaginary components, mr robo? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #598.
Summary
cpu_statevec_anyCtrlAnyTargDenseMatr_sub()with component-wise compensated summation overcpu_qcomp.reandcpu_qcomp.im.amps[i]inside the summation loop.applyCompMatr, which exercises the 3+ targetCompMatrpath rather than the one/two-target specialisations.Notes
I saw the earlier closed attempt in #777. This version keeps the same narrow two-file scope, but uses direct
qrealcompensation for the real and imaginary components instead of relying on complex add/sub overloads in the hot loop. I also checkedbase_qcomp: the current operators are ordinary component-wise arithmetic, so independent real/imaginary Kahan compensation is compatible with the backend representation.Local measurements
Configuration: Windows, GCC 13.2.0, Release, single CPU (
QUEST_ENABLE_OMP=OFF,QUEST_ENABLE_MPI=OFF,QUEST_ENABLE_CUDA=OFF,QUEST_ENABLE_HIP=OFF). The benchmark applies a denseCompMatrwhose first output row is[large+i*large, 1-i, ..., 1-i, -large-i*large]to an all-ones state. The expected first amplitude is(2^targets - 2) - i(2^targets - 2).The measurements show the expected accuracy improvement. The overhead is visible for larger single/double precision reductions, which seems consistent with the tradeoff described in the issue.
Testing
Results:
*applyCompMatr*: passed, 10 test cases / 10,003 assertions.ctest: passed.git diff --check: passed; Git emitted only Windows line-ending conversion warnings.Prepared with AI assistance; I reviewed the patch and ran the listed local checks.