Faster stats package by bliksemstraal · Pull Request #15 · cbarrick/evo

bliksemstraal · 2021-01-06T08:08:27Z

Hi,

I love your project!!

I changed the stats package to use a faster single-pass calculation. sumsq is now truly a sum of squares, and mean has been replaced with sum. This allows for a much faster Put method, which is where most of the time is spent. Determining min and max is faster too, as there is no overhead from calling functions in the math package.
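To illustrate the description above, here is a minimal sketch of a single-pass accumulator of that shape. It is not the code from this PR: the type `naiveStats`, its field names, and the choice of population variance are assumptions made for the example.

```go
package main

import "fmt"

// naiveStats is a hypothetical accumulator, not the actual evo type.
type naiveStats struct {
	count float64 // number of values seen
	sum   float64 // running sum of the values
	sumsq float64 // running sum of the squared values
	min   float64
	max   float64
}

// Put folds x into the running totals in place.
// min and max use plain comparisons instead of math.Min and math.Max.
func (s *naiveStats) Put(x float64) {
	if s.count == 0 || x < s.min {
		s.min = x
	}
	if s.count == 0 || x > s.max {
		s.max = x
	}
	s.count++
	s.sum += x
	s.sumsq += x * x
}

// Mean returns sum / count.
func (s *naiveStats) Mean() float64 { return s.sum / s.count }

// Var returns the population variance as E[x^2] - E[x]^2.
func (s *naiveStats) Var() float64 {
	mean := s.Mean()
	return s.sumsq/s.count - mean*mean
}

func main() {
	var s naiveStats
	for _, x := range []float64{2, 4, 4, 4, 5, 5, 7, 9} {
		s.Put(x)
	}
	fmt.Println(s.Mean(), s.Var(), s.min, s.max) // 5 4 2 9
}
```

Put does only a couple of comparisons, two additions, and one multiplication per value, with no division, which is presumably where the speedup comes from.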

Unit tests pass as they are. A local benchmark showed ~18ns/op for the original package and ~8ns/op for the new one. Every bit helps, especially in genetic algorithms.

cbarrick · 2021-01-09T05:26:28Z

Thanks for the contribution! I'm excited to see real users out there.

IIUC this patch is using the naive algorithm for computing variance.

The original algorithm used here is Welford's algorithm for Put and Chan et al.'s algorithm for Merge.

These are numerically stable algorithms, which minimize rounding error. The idea is that the sum can be very large w.r.t. the other values, and error creeps in when performing computation on values at different scales. We can eliminate this by never using the sum directly.
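As a rough sketch of the two stable updates named above (not the actual evo source): the type `running`, its fields, and the use of population variance are assumptions for illustration. The naive formula is included only for comparison.

```go
package main

import "fmt"

// running is a hypothetical accumulator; evo's real Stats type may differ.
type running struct {
	n    float64 // number of observations
	mean float64 // running mean
	m2   float64 // sum of squared deviations from the running mean
}

// Put returns a copy of s updated with x (Welford's update).
func (s running) Put(x float64) running {
	s.n++
	delta := x - s.mean
	s.mean += delta / s.n
	s.m2 += delta * (x - s.mean)
	return s
}

// Merge combines two accumulators (Chan et al.'s pairwise update).
func (s running) Merge(t running) running {
	if s.n == 0 {
		return t
	}
	if t.n == 0 {
		return s
	}
	n := s.n + t.n
	delta := t.mean - s.mean
	return running{
		n:    n,
		mean: s.mean + delta*t.n/n,
		m2:   s.m2 + t.m2 + delta*delta*s.n*t.n/n,
	}
}

// Var returns the population variance.
func (s running) Var() float64 { return s.m2 / s.n }

// naiveVar computes E[x^2] - E[x]^2 from a raw sum and sum of squares.
func naiveVar(xs []float64) float64 {
	var sum, sumsq float64
	for _, x := range xs {
		sum += x
		sumsq += x * x
	}
	n := float64(len(xs))
	mean := sum / n
	return sumsq/n - mean*mean
}

func main() {
	// Small spread on top of a large offset; the exact population variance is 22.5.
	xs := []float64{1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16}
	var all, left, right running
	for i, x := range xs {
		all = all.Put(x)
		if i < 2 {
			left = left.Put(x)
		} else {
			right = right.Put(x)
		}
	}
	fmt.Println("welford:", all.Var())               // 22.5
	fmt.Println("merged: ", left.Merge(right).Var()) // 22.5
	fmt.Println("naive:  ", naiveVar(xs))            // not 22.5: the squares exceed float64 precision
}
```

With values this far from zero, each square is around 1e18, well past the 2^53 limit for exact integers in float64, so the naive subtraction loses the small variance entirely. Inputs closer to zero would show much less drift.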

That said, numerical stability may not be that important. The 2.25x speedup just from switching to the naive algorithm may be worth it for many use cases.

So there are trade-offs. I am not sure if I want to straight up replace a numerically stable implementation with an unstable-but-fast implementation. I will need to think about it more deeply.

Case studies or macro benchmarks would help sway me either way.

cbarrick · 2021-01-09T17:25:59Z

Also, the contributors file is unnecessary. Git tracks that metadata automatically.

cbarrick · 2021-01-09T17:30:51Z

By changing the method signatures from Stats to *Stats, you are changing the semantics.

The original version does not modify the stats object.

It may be worth it to change, but it is backwards incompatible.

(Not that this library is likely to have any real production users.)
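To make the semantic difference concrete, here is a small sketch using a hypothetical `counter` type (not the evo Stats type). A value receiver works on a copy and leaves the caller's value alone; a pointer receiver mutates it in place.

```go
package main

import "fmt"

// counter is a hypothetical stand-in for a stats accumulator.
type counter struct{ n int }

// PutValue has a value receiver: it updates a copy and returns it.
func (c counter) PutValue(x int) counter {
	c.n += x
	return c
}

// PutPointer has a pointer receiver: it updates the caller's value in place.
func (c *counter) PutPointer(x int) {
	c.n += x
}

func main() {
	a := counter{}
	b := a.PutValue(1)
	fmt.Println(a.n, b.n) // 0 1: a is unchanged

	a.PutPointer(1)
	fmt.Println(a.n) // 1: a was modified
}
```

Callers that relied on the old value being untouched after a call would silently see different results after a switch to pointer receivers, which is why the change is backwards incompatible.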

bliksemstraal added 2 commits January 6, 2021 08:59

changed internal calculations of stats package for more than 2x speedup

21c224c

Add bliksemstraal to contributors list

63c334c

bliksemstraal wants to merge 2 commits into cbarrick:master from bliksemstraal:faster-stats-package
