Skip to content

Conversation

@bob-carpenter
Copy link
Collaborator

This PR:

  • pulls autocorrelation calculations into their own file
  • adds more autocorrelation tests including exceptions
  • adds more tests for effective sample size including independent draws
  • adds more doc

It fixes issues #13 and #15.

@codecov-commenter
Copy link

codecov-commenter commented Feb 6, 2023

Codecov Report

Merging #16 (ea6ca6e) into main (b6374b4) will increase coverage by 2.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
+ Coverage   95.76%   97.85%   +2.08%     
==========================================
  Files           9       11       +2     
  Lines         260      280      +20     
==========================================
+ Hits          249      274      +25     
+ Misses         11        6       -5     
Impacted Files Coverage Δ
bayes_kit/__init__.py 100.00% <100.00%> (ø)
bayes_kit/autocorr.py 100.00% <100.00%> (ø)
bayes_kit/ess.py 100.00% <100.00%> (+9.25%) ⬆️
bayes_kit/iat.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

This was linked to issues Feb 6, 2023
bayes_kit/ess.py Outdated
Comment on lines 23 to 27
while n + 1 < N:
if chain[n] + chain[n + 1] < 0:
return n
n = n + 2
n += 2
return N
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure this is working as intended:

If given [2, -1, -3, -5, 6, 12], should this function return 2 or 1?

As written it returns 2, since it always counts pairwise, but obviously -1 + -3 is -4 (< 0), and the docstring doesn't indicate that the pairs have to start on an even index.

@bob-carpenter
Copy link
Collaborator Author

bob-carpenter commented Feb 17, 2023

This is ready to re-review. I got up to 100% test coverage, which was really useful for making me be more careful defining and testing boundary conditions. I renamed the "internal" function _end_pos_pairs after @magland pointed out the first name misspecified the function, then documented the boundary conditions, then fixed the function to match the tests. Edit: it also adds anti-correlated tests in a broader range of anti correlations.

@bob-carpenter
Copy link
Collaborator Author

bob-carpenter commented Feb 17, 2023

Also, this will close issues #15 and #13 and #3 (I added a new issue for the ranked versions and updated #3).

Copy link
Collaborator

@jsoules jsoules left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a couple proofreading notes, then some comments about comments.

bayes_kit/ess.py Outdated
the following element is negative, or the length of the sequence if
there is no such element. Equivalently, this is the upper bound for
the range of indexes that come in positive pairs starting from 0.
For exmaple, `_end_pos_pairs([1, -.5, .25, -.3]) == 2` because the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo (exmaple)

bayes_kit/ess.py Outdated
sigma_sq_hat = acor[0] + 2 * sum(acor[1:n])
ess = len(chain) / sigma_sq_hat
n = _end_pos_pairs(acor)
iat = 2 * acor[0:n].sum() - 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is iat a standard notation that we'd expect everybody to recognize? I don't recognize it (which isn't in itself surprising) but it also didn't appear as such when I skimmed the Wikipedia article on sample size determination, I can't find corresponding notation in the Stan reference page on effective sample size, and a google search for "iat statistics" or "iat effective sample size" just pulls up a bunch of stuff about the Implicit Association Test (i.e. psychology), so I think it would be pedagogically helpful to use a more descriptive name (or at least comment it).

(I'm okay with acor since it's pretty spelled out by the name and documentation of the function call that populates it.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same note probably applies for the imse function below)

Copy link
Collaborator Author

@bob-carpenter bob-carpenter Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for "integrated autocorrelation time". Here's a blog post from @dfm with an explanation.

If "iat" is too short for a local variable name, could you please suggest the shortest thing you'll approve and I'll use that.

We could expose the IAT function itself. Sometimes people like to report that, too. Then the definition of ESS simplifies to ESS = chain_length / IAT. If you'd like me to do that, could you also please suggest a name you'd approve?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with acorr, I'm not fussed if there's something more explicit right nearby, it was just as-is this wasn't enough for me to know the right things to google.

I really like the idea of exposing an integrated_autocorr_time function--it fixes the naming issue, breaks the math into semantically meaningful components, and also exposes some reusable code that people might want direct access to anyway. Three wins!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update and add an IAT function and tests.

@WardBrian
Copy link
Collaborator

I'm not sure if you'd like to fix them here or in #10 or somewhere else, but I just wanted to note that mypy --strict . raises 61 errors in this branch. It's about 25 between ess.py, test_ess.py and test_autocorrelation.py, with the rest in the RHat code you don't really touch here

@bob-carpenter
Copy link
Collaborator Author

I'l fix the mypy tests. Sorry for skipping bits of the workflow.

Would there be a way to automate (a) black formatting, (b) strict mypy testing, and (c) anything else that's going to cause a PR to be rejected if I forget to do it manually?

@WardBrian
Copy link
Collaborator

I plan on automating that as soon as our current main branch would pass it, but a lot of the type errors in these files are already checked in there so it wouldn’t be feasible yet

@bob-carpenter
Copy link
Collaborator Author

Another question on doc. The extra line space is because emacs thinks the first line of doc should fit on a single line. Should I be forcing the first sentence of doc to fit on a single line? Is it OK if it runs over? black didn't reformat it, but emacs doesn't like it.

@WardBrian
Copy link
Collaborator

I believe the standard is if it takes more than one line, you should add a newline after the opening """

e.g.

"""A single line docstring"""

"""
A docstring which is much longer, taking up 
multiple lines or even including multiple sections
"""

Black doesn't touch docstrings, but the python emacs package might have this opinion built-in

@bob-carpenter
Copy link
Collaborator Author

Thanks. The examples for the Google-style doc start right after the """ but they're all one-liners. I can try to be more terse if that's the standard in Python.

@WardBrian
Copy link
Collaborator

I wouldn’t sacrifice the level of detail you want to avoid the extra newline. In any programmatic usage (e.g. by Sphinx) leading whitespace is removed from them

@bob-carpenter
Copy link
Collaborator Author

I just want to follow conventions which looks like terse first sentences. That's what javadoc wanted, too. I'll see what I can do.

to `iat_imse()`.
Return an estimate of the integrated autocorrelation time (IAT)
of the specified Markov chain. Evaluated by delegating to the
initial monotone sequence estimator, `iat_imse(chain)`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be any reason (performance?) someone might want the ipse instead? Is it worth exposing the ability to choose that? (I'm guessing not really)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK getting rid of IPSE; even though it's a bit faster than IMSE, it has an overestimation bias. Do you think I should get rid of it?

I still think we should keep the ess_imse and iat_imse definitions and delegate ess and iat. I know at least for ess we will have further estimators that work cross chain. The other option is to not provide a default implementation. I'm always on the fence about choosing default implementations for users, but in the end, I think they're going to want o see ess() or iat() in most applied code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong feeling about IPSE--it's there, it's implemented, it works; no reason to discard that work, especially if we might want to expose it later. The counterargument is that it's more surface area, and unreachable unused code gets out of date (not that any code in Python is ever truly unreachable).

Unfortunately, I'm kind of clueless whether anybody would in practice actually want to have a choice here.

And if we expose that choice, however we did so, I could imagine a tricky situation where we wind up having some deeply nested calls such that there's an inconsistency between the implementations used in different places--whether because it's hard to make sure the "use-this-implementation" flag gets passed around everywhere ess is called, or because users call ess_ipse in their own direct code but our version doesn't. Of course, again, dunno if such a scenario is actually likely in practice.

I guess it would come down to three questions:

  • Would anybody have a reason to use IPSE?
  • Are these two the only implementations worth implementing, or could we imagine adding more later?
  • Are these used only/mainly for evaluating sample quality after the fact, or are there cases where a sampling method might want to make use of these in a way that affects its behavior?

Copy link
Collaborator Author

@bob-carpenter bob-carpenter Mar 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked to @WardBrian and concluded we should leave it in for pedagogical purposes.

  • I think the utility for IPSE is pedagogical and I don't think anyone would use one in an application unless they really needed the extra speed. The standard way to teach these estimators is to first describe the initial positive sequence method, which was in common use until IMSE was developed, then add a monotonicity condition on top of that.
  • Yes, we'll probably be coding other versions of these eventually, such as a Bayesian estimator that makes a parametric or non-parametric assumption about the shape of the autocorrelation vs. lag curve---it's an active area of research.
  • Yes, we will eventually want to use ESS at run time to set a minimum ESS target and run until it's hit (it won't affect the transitions, just when we stop)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. All of this points to the same conclusion--leave it in for pedagogy (and so people could technically call it if they really wanted to), but have a standard pathway that delegates to a specific implementation.

Copy link
Collaborator

@jsoules jsoules left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been open for a while, but reviewing it again this morning I have nothing further on it and would be happy for it to be merged.

@bob-carpenter bob-carpenter merged commit 2c6ed9d into main May 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

in first_neg_pair_start, use n = n + 1 small bug in ess_imse

5 participants