Fix ess imse update #16

bob-carpenter · 2023-02-06T21:12:53Z

This PR:

pulls autocorrelation calculations into their own file
adds more autocorrelation tests including exceptions
adds more tests for effective sample size including independent draws
adds more doc

It fixes issues #13 and #15.

codecov-commenter · 2023-02-06T21:13:58Z

Codecov Report

Merging #16 (ea6ca6e) into main (b6374b4) will increase coverage by 2.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
+ Coverage   95.76%   97.85%   +2.08%     
==========================================
  Files           9       11       +2     
  Lines         260      280      +20     
==========================================
+ Hits          249      274      +25     
+ Misses         11        6       -5

Impacted Files	Coverage Δ
bayes_kit/__init__.py	`100.00% <100.00%> (ø)`
bayes_kit/autocorr.py	`100.00% <100.00%> (ø)`
bayes_kit/ess.py	`100.00% <100.00%> (+9.25%)`	⬆️
bayes_kit/iat.py	`100.00% <100.00%> (ø)`

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

jsoules · 2023-02-09T15:15:53Z

bayes_kit/ess.py

    while n + 1 < N:
        if chain[n] + chain[n + 1] < 0:
            return n
-        n = n + 2
+        n += 2
    return N


Just to make sure this is working as intended:

If given [2, -1, -3, -5, 6, 12], should this function return 2 or 1?

As written it returns 2, since it always counts pairwise, but obviously -1 + -3 is -4 (< 0), and the docstring doesn't indicate that the pairs have to start on an even index.

bob-carpenter · 2023-02-17T16:06:21Z

This is ready to re-review. I got up to 100% test coverage, which was really useful for making me be more careful defining and testing boundary conditions. I renamed the "internal" function _end_pos_pairs after @magland pointed out the first name misspecified the function, then documented the boundary conditions, then fixed the function to match the tests. Edit: it also adds anti-correlated tests in a broader range of anti correlations.

bob-carpenter · 2023-02-17T16:07:50Z

Also, this will close issues #15 and #13 and #3 (I added a new issue for the ranked versions and updated #3).

jsoules

I had a couple proofreading notes, then some comments about comments.

bayes_kit/ess.py

jsoules · 2023-02-17T16:12:42Z

bayes_kit/ess.py

+    the following element is negative, or the length of the sequence if
+    there is no such element.  Equivalently, this is the upper bound for
+    the range of indexes that come in positive pairs starting from 0.
+    For exmaple, `_end_pos_pairs([1, -.5, .25, -.3]) == 2` because the


typo (exmaple)

bayes_kit/ess.py

jsoules · 2023-02-17T16:29:22Z

bayes_kit/ess.py

-    sigma_sq_hat = acor[0] + 2 * sum(acor[1:n])
-    ess = len(chain) / sigma_sq_hat
+    n = _end_pos_pairs(acor)
+    iat = 2 * acor[0:n].sum() - 1


Is iat a standard notation that we'd expect everybody to recognize? I don't recognize it (which isn't in itself surprising) but it also didn't appear as such when I skimmed the Wikipedia article on sample size determination, I can't find corresponding notation in the Stan reference page on effective sample size, and a google search for "iat statistics" or "iat effective sample size" just pulls up a bunch of stuff about the Implicit Association Test (i.e. psychology), so I think it would be pedagogically helpful to use a more descriptive name (or at least comment it).

(I'm okay with acor since it's pretty spelled out by the name and documentation of the function call that populates it.)

(Same note probably applies for the imse function below)

It's for "integrated autocorrelation time". Here's a blog post from @dfm with an explanation.

If "iat" is too short for a local variable name, could you please suggest the shortest thing you'll approve and I'll use that.

We could expose the IAT function itself. Sometimes people like to report that, too. Then the definition of ESS simplifies to ESS = chain_length / IAT. If you'd like me to do that, could you also please suggest a name you'd approve?

As with acorr, I'm not fussed if there's something more explicit right nearby, it was just as-is this wasn't enough for me to know the right things to google.

I really like the idea of exposing an integrated_autocorr_time function--it fixes the naming issue, breaks the math into semantically meaningful components, and also exposes some reusable code that people might want direct access to anyway. Three wins!

I'll update and add an IAT function and tests.

WardBrian · 2023-02-17T17:22:30Z

I'm not sure if you'd like to fix them here or in #10 or somewhere else, but I just wanted to note that mypy --strict . raises 61 errors in this branch. It's about 25 between ess.py, test_ess.py and test_autocorrelation.py, with the rest in the RHat code you don't really touch here

bob-carpenter · 2023-02-17T17:27:13Z

I'l fix the mypy tests. Sorry for skipping bits of the workflow.

Would there be a way to automate (a) black formatting, (b) strict mypy testing, and (c) anything else that's going to cause a PR to be rejected if I forget to do it manually?

WardBrian · 2023-02-17T17:28:41Z

I plan on automating that as soon as our current main branch would pass it, but a lot of the type errors in these files are already checked in there so it wouldn’t be feasible yet

bob-carpenter · 2023-02-17T17:33:49Z

Another question on doc. The extra line space is because emacs thinks the first line of doc should fit on a single line. Should I be forcing the first sentence of doc to fit on a single line? Is it OK if it runs over? black didn't reformat it, but emacs doesn't like it.

WardBrian · 2023-02-17T17:37:52Z

I believe the standard is if it takes more than one line, you should add a newline after the opening """

e.g.

"""A single line docstring"""

"""
A docstring which is much longer, taking up 
multiple lines or even including multiple sections
"""

Black doesn't touch docstrings, but the python emacs package might have this opinion built-in

bob-carpenter · 2023-02-17T17:44:34Z

Thanks. The examples for the Google-style doc start right after the """ but they're all one-liners. I can try to be more terse if that's the standard in Python.

WardBrian · 2023-02-17T17:46:16Z

I wouldn’t sacrifice the level of detail you want to avoid the extra newline. In any programmatic usage (e.g. by Sphinx) leading whitespace is removed from them

bob-carpenter · 2023-02-17T17:47:52Z

I just want to follow conventions which looks like terse first sentences. That's what javadoc wanted, too. I'll see what I can do.

bayes_kit/iat.py

jsoules · 2023-02-27T22:27:58Z

bayes_kit/iat.py

-    to `iat_imse()`.
+    Return an estimate of the integrated autocorrelation time (IAT)
+    of the specified Markov chain. Evaluated by delegating to the
+    initial monotone sequence estimator, `iat_imse(chain)`.


Would there be any reason (performance?) someone might want the ipse instead? Is it worth exposing the ability to choose that? (I'm guessing not really)

I'm OK getting rid of IPSE; even though it's a bit faster than IMSE, it has an overestimation bias. Do you think I should get rid of it?

I still think we should keep the ess_imse and iat_imse definitions and delegate ess and iat. I know at least for ess we will have further estimators that work cross chain. The other option is to not provide a default implementation. I'm always on the fence about choosing default implementations for users, but in the end, I think they're going to want o see ess() or iat() in most applied code.

I don't have a strong feeling about IPSE--it's there, it's implemented, it works; no reason to discard that work, especially if we might want to expose it later. The counterargument is that it's more surface area, and unreachable unused code gets out of date (not that any code in Python is ever truly unreachable).

Unfortunately, I'm kind of clueless whether anybody would in practice actually want to have a choice here.

And if we expose that choice, however we did so, I could imagine a tricky situation where we wind up having some deeply nested calls such that there's an inconsistency between the implementations used in different places--whether because it's hard to make sure the "use-this-implementation" flag gets passed around everywhere ess is called, or because users call ess_ipse in their own direct code but our version doesn't. Of course, again, dunno if such a scenario is actually likely in practice.

I guess it would come down to three questions:

Would anybody have a reason to use IPSE?

Are these two the only implementations worth implementing, or could we imagine adding more later?

Are these used only/mainly for evaluating sample quality after the fact, or are there cases where a sampling method might want to make use of these in a way that affects its behavior?

I talked to @WardBrian and concluded we should leave it in for pedagogical purposes.

I think the utility for IPSE is pedagogical and I don't think anyone would use one in an application unless they really needed the extra speed. The standard way to teach these estimators is to first describe the initial positive sequence method, which was in common use until IMSE was developed, then add a monotonicity condition on top of that.

Yes, we'll probably be coding other versions of these eventually, such as a Bayesian estimator that makes a parametric or non-parametric assumption about the shape of the autocorrelation vs. lag curve---it's an active area of research.

Yes, we will eventually want to use ESS at run time to set a minimum ESS target and run until it's hit (it won't affect the transitions, just when we stop)

Sounds good to me. All of this points to the same conclusion--leave it in for pedagogy (and so people could technically call it if they really wanted to), but have a standard pathway that delegates to a specific implementation.

test/test_iat.py

jsoules

This has been open for a while, but reviewing it again this morning I have nothing further on it and would be happy for it to be merged.

bob-carpenter added 3 commits February 2, 2023 15:59

update min corr pair; fixes #13

c35c44a

doc for autocorr, includes in init

3167571

boundary cond for ess, autocorr, fixes #13, fixes #15

6827f39

This was linked to issues Feb 6, 2023

small bug in ess_imse #13

Closed

in first_neg_pair_start, use n = n + 1 #15

Closed

jsoules reviewed Feb 9, 2023

View reviewed changes

magland mentioned this pull request Feb 15, 2023

update ESS calculation code to be consistent with bayes-kit flatironinstitute/mcmc-monitor#5

Open

Bob Carpenter and others added 3 commits February 16, 2023 19:46

fix merge conflict

c7c9221

ess tests for anticorr

c282e8a

test coverage: end_pos_pairs return

cfae7c8

jsoules reviewed Feb 17, 2023

View reviewed changes

clarify doc and format

3acce34

WardBrian mentioned this pull request Feb 17, 2023

Add mypy and black to CI #26

Merged

bob-carpenter added 2 commits February 17, 2023 14:00

checkpoint iat impls w/o tests

7018496

add IAT functions with tests

e094605

jsoules reviewed Feb 27, 2023

View reviewed changes

fix typos and doc correctness for iat

ea6ca6e

jsoules approved these changes May 30, 2023

View reviewed changes

bob-carpenter merged commit 2c6ed9d into main May 30, 2023

Fix ess imse update #16

Fix ess imse update #16

Uh oh!

Conversation

bob-carpenter commented Feb 6, 2023

Uh oh!

codecov-commenter commented Feb 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bob-carpenter commented Feb 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bob-carpenter commented Feb 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jsoules left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bob-carpenter Feb 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

WardBrian commented Feb 17, 2023

Uh oh!

bob-carpenter commented Feb 17, 2023

Uh oh!

WardBrian commented Feb 17, 2023

Uh oh!

bob-carpenter commented Feb 17, 2023

Uh oh!

WardBrian commented Feb 17, 2023

Uh oh!

bob-carpenter commented Feb 17, 2023

Uh oh!

WardBrian commented Feb 17, 2023

Uh oh!

bob-carpenter commented Feb 17, 2023

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bob-carpenter Mar 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jsoules left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov-commenter commented Feb 6, 2023 •

edited

Loading

bob-carpenter commented Feb 17, 2023 •

edited

Loading

bob-carpenter commented Feb 17, 2023 •

edited

Loading

bob-carpenter Feb 17, 2023 •

edited

Loading

bob-carpenter Mar 1, 2023 •

edited

Loading