-
Notifications
You must be signed in to change notification settings - Fork 4
Fix ess imse update #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #16 +/- ##
==========================================
+ Coverage 95.76% 97.85% +2.08%
==========================================
Files 9 11 +2
Lines 260 280 +20
==========================================
+ Hits 249 274 +25
+ Misses 11 6 -5
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
bayes_kit/ess.py
Outdated
| while n + 1 < N: | ||
| if chain[n] + chain[n + 1] < 0: | ||
| return n | ||
| n = n + 2 | ||
| n += 2 | ||
| return N |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure this is working as intended:
If given [2, -1, -3, -5, 6, 12], should this function return 2 or 1?
As written it returns 2, since it always counts pairwise, but obviously -1 + -3 is -4 (< 0), and the docstring doesn't indicate that the pairs have to start on an even index.
|
This is ready to re-review. I got up to 100% test coverage, which was really useful for making me be more careful defining and testing boundary conditions. I renamed the "internal" function |
jsoules
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a couple proofreading notes, then some comments about comments.
bayes_kit/ess.py
Outdated
| the following element is negative, or the length of the sequence if | ||
| there is no such element. Equivalently, this is the upper bound for | ||
| the range of indexes that come in positive pairs starting from 0. | ||
| For exmaple, `_end_pos_pairs([1, -.5, .25, -.3]) == 2` because the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo (exmaple)
bayes_kit/ess.py
Outdated
| sigma_sq_hat = acor[0] + 2 * sum(acor[1:n]) | ||
| ess = len(chain) / sigma_sq_hat | ||
| n = _end_pos_pairs(acor) | ||
| iat = 2 * acor[0:n].sum() - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is iat a standard notation that we'd expect everybody to recognize? I don't recognize it (which isn't in itself surprising) but it also didn't appear as such when I skimmed the Wikipedia article on sample size determination, I can't find corresponding notation in the Stan reference page on effective sample size, and a google search for "iat statistics" or "iat effective sample size" just pulls up a bunch of stuff about the Implicit Association Test (i.e. psychology), so I think it would be pedagogically helpful to use a more descriptive name (or at least comment it).
(I'm okay with acor since it's pretty spelled out by the name and documentation of the function call that populates it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Same note probably applies for the imse function below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for "integrated autocorrelation time". Here's a blog post from @dfm with an explanation.
If "iat" is too short for a local variable name, could you please suggest the shortest thing you'll approve and I'll use that.
We could expose the IAT function itself. Sometimes people like to report that, too. Then the definition of ESS simplifies to ESS = chain_length / IAT. If you'd like me to do that, could you also please suggest a name you'd approve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with acorr, I'm not fussed if there's something more explicit right nearby, it was just as-is this wasn't enough for me to know the right things to google.
I really like the idea of exposing an integrated_autocorr_time function--it fixes the naming issue, breaks the math into semantically meaningful components, and also exposes some reusable code that people might want direct access to anyway. Three wins!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update and add an IAT function and tests.
|
I'm not sure if you'd like to fix them here or in #10 or somewhere else, but I just wanted to note that |
|
I'l fix the mypy tests. Sorry for skipping bits of the workflow. Would there be a way to automate (a) black formatting, (b) strict mypy testing, and (c) anything else that's going to cause a PR to be rejected if I forget to do it manually? |
|
I plan on automating that as soon as our current |
|
Another question on doc. The extra line space is because emacs thinks the first line of doc should fit on a single line. Should I be forcing the first sentence of doc to fit on a single line? Is it OK if it runs over? black didn't reformat it, but emacs doesn't like it. |
|
I believe the standard is if it takes more than one line, you should add a newline after the opening e.g. """A single line docstring"""
"""
A docstring which is much longer, taking up
multiple lines or even including multiple sections
"""Black doesn't touch docstrings, but the python emacs package might have this opinion built-in |
|
Thanks. The examples for the Google-style doc start right after the |
|
I wouldn’t sacrifice the level of detail you want to avoid the extra newline. In any programmatic usage (e.g. by Sphinx) leading whitespace is removed from them |
|
I just want to follow conventions which looks like terse first sentences. That's what javadoc wanted, too. I'll see what I can do. |
| to `iat_imse()`. | ||
| Return an estimate of the integrated autocorrelation time (IAT) | ||
| of the specified Markov chain. Evaluated by delegating to the | ||
| initial monotone sequence estimator, `iat_imse(chain)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there be any reason (performance?) someone might want the ipse instead? Is it worth exposing the ability to choose that? (I'm guessing not really)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK getting rid of IPSE; even though it's a bit faster than IMSE, it has an overestimation bias. Do you think I should get rid of it?
I still think we should keep the ess_imse and iat_imse definitions and delegate ess and iat. I know at least for ess we will have further estimators that work cross chain. The other option is to not provide a default implementation. I'm always on the fence about choosing default implementations for users, but in the end, I think they're going to want o see ess() or iat() in most applied code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong feeling about IPSE--it's there, it's implemented, it works; no reason to discard that work, especially if we might want to expose it later. The counterargument is that it's more surface area, and unreachable unused code gets out of date (not that any code in Python is ever truly unreachable).
Unfortunately, I'm kind of clueless whether anybody would in practice actually want to have a choice here.
And if we expose that choice, however we did so, I could imagine a tricky situation where we wind up having some deeply nested calls such that there's an inconsistency between the implementations used in different places--whether because it's hard to make sure the "use-this-implementation" flag gets passed around everywhere ess is called, or because users call ess_ipse in their own direct code but our version doesn't. Of course, again, dunno if such a scenario is actually likely in practice.
I guess it would come down to three questions:
- Would anybody have a reason to use IPSE?
- Are these two the only implementations worth implementing, or could we imagine adding more later?
- Are these used only/mainly for evaluating sample quality after the fact, or are there cases where a sampling method might want to make use of these in a way that affects its behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked to @WardBrian and concluded we should leave it in for pedagogical purposes.
- I think the utility for IPSE is pedagogical and I don't think anyone would use one in an application unless they really needed the extra speed. The standard way to teach these estimators is to first describe the initial positive sequence method, which was in common use until IMSE was developed, then add a monotonicity condition on top of that.
- Yes, we'll probably be coding other versions of these eventually, such as a Bayesian estimator that makes a parametric or non-parametric assumption about the shape of the autocorrelation vs. lag curve---it's an active area of research.
- Yes, we will eventually want to use ESS at run time to set a minimum ESS target and run until it's hit (it won't affect the transitions, just when we stop)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me. All of this points to the same conclusion--leave it in for pedagogy (and so people could technically call it if they really wanted to), but have a standard pathway that delegates to a specific implementation.
jsoules
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been open for a while, but reviewing it again this morning I have nothing further on it and would be happy for it to be merged.
This PR:
It fixes issues #13 and #15.