Thanks for the contribution! I'm excited to see real users out there. IIUC this patch switches to the naive algorithm for computing variance. The original code uses Welford's algorithm for Put and Chan et al.'s algorithm for Merge. Both are numerically stable, meaning they minimize rounding error. The problem with the naive approach is that the running sum (and especially the sum of squares) can become very large relative to the individual values, and error creeps in when floating-point arithmetic mixes values at very different scales. The stable algorithms avoid this by never using the sum directly.

That said, numerical stability may not be that important. The 2.25x speedup just from switching to the naive algorithm may be worth it for many use cases, so there are trade-offs. I am not sure I want to outright replace a numerically stable implementation with an unstable-but-fast one. I will need to think about it more deeply. Case studies or macro benchmarks would help sway me either way.
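To make the stability concern concrete, here is a minimal Go sketch. It is not the library's actual code: the `welford` type, the `put`, `merge`, `variance`, and `naiveVariance` names, and the choice of population (rather than sample) variance are all assumptions made for illustration.

```go
package main

// welford is a hypothetical accumulator used only for this illustration.
type welford struct {
	n    int
	mean float64
	m2   float64 // sum of squared deviations from the running mean
}

// put folds one observation into the accumulator using Welford's update.
// It never stores the raw sum or the raw sum of squares.
func (w *welford) put(x float64) {
	w.n++
	delta := x - w.mean
	w.mean += delta / float64(w.n)
	w.m2 += delta * (x - w.mean)
}

// merge combines two accumulators with the pairwise update of Chan et al.
func merge(a, b welford) welford {
	if a.n == 0 {
		return b
	}
	if b.n == 0 {
		return a
	}
	n := a.n + b.n
	delta := b.mean - a.mean
	return welford{
		n:    n,
		mean: a.mean + delta*float64(b.n)/float64(n),
		m2:   a.m2 + b.m2 + delta*delta*float64(a.n)*float64(b.n)/float64(n),
	}
}

// variance returns the population variance (assumed convention), or 0 when empty.
func (w welford) variance() float64 {
	if w.n == 0 {
		return 0
	}
	return w.m2 / float64(w.n)
}

// naiveVariance uses the single-pass formula E[x^2] - E[x]^2.
// It subtracts two huge, nearly equal numbers when the data sits far from zero.
func naiveVariance(data []float64) float64 {
	if len(data) == 0 {
		return 0
	}
	var sum, sumsq float64
	for _, x := range data {
		sum += x
		sumsq += x * x
	}
	n := float64(len(data))
	mean := sum / n
	return sumsq/n - mean*mean
}
```

For the four values 1e9+4, 1e9+7, 1e9+13, and 1e9+16 the true population variance is 22.5. The Welford accumulator reproduces that, whether the values are fed in one stream or split into two halves and merged. With float64 the naive formula loses the answer to cancellation: the squares are around 1e18, where adjacent float64 values are 128 apart, so the result can be off by more than the variance itself and can even come out negative. For small, well-scaled inputs the two approaches agree closely, which is the trade-off under discussion.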

Also, the contributors file is unnecessary. Git tracks that metadata automatically.

By changing the method signatures from The original version does not modify the stats object. The change may be worth it, but it is backwards incompatible. (Not that this library is likely to have any real production users.)
Hi,
I love your project!!
I changed the stats package to a faster single-pass calculation.
`sumsq` is now truly a sum of squares, and `mean` was replaced with `sum`. This allows for a much faster `Put` method, which is where most of the time is spent. Determining `min` and `max` is faster too, as there is no overhead from calling functions in the `math` package.

Unit tests pass as they are. A local benchmark showed ~18 ns/op for the original package and ~8 ns/op for the new one. Every bit helps, especially in genetic algorithms.
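For readers without the diff at hand, the approach described above might look roughly like the following sketch. The `fastStats` type and its field and method names are hypothetical stand-ins rather than the package's real API, and population variance is assumed.

```go
package main

// fastStats is a hypothetical stand-in for the patched stats type.
type fastStats struct {
	n        int
	sum      float64 // replaces the running mean
	sumsq    float64 // plain sum of x*x
	min, max float64
}

// Put records one observation in a single pass.
// min and max use direct comparisons instead of math.Min and math.Max.
func (s *fastStats) Put(x float64) {
	if s.n == 0 || x < s.min {
		s.min = x
	}
	if s.n == 0 || x > s.max {
		s.max = x
	}
	s.n++
	s.sum += x
	s.sumsq += x * x
}

// Mean is derived on demand from the stored sum.
func (s *fastStats) Mean() float64 {
	if s.n == 0 {
		return 0
	}
	return s.sum / float64(s.n)
}

// Variance returns the population variance (assumed convention)
// computed as E[x^2] - E[x]^2.
func (s *fastStats) Variance() float64 {
	if s.n == 0 {
		return 0
	}
	mean := s.Mean()
	return s.sumsq/float64(s.n) - mean*mean
}
```

One design point worth checking in a change like this: direct comparisons treat NaN differently from `math.Min` and `math.Max`, which return NaN whenever either argument is NaN. With comparisons, a NaN that arrives after the first value is silently skipped for `min` and `max`, although it still poisons `sum` and `sumsq`.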