Skip to content

Conversation

@ableegoldman
Copy link
Contributor

The main change here is the introduction of the CachedMetadata class, which as the name suggests caches some of the nontrivial repeated requests for things like the number of kafka partitions or sub-partitions in Cassandra. Also cleans up the sub-partitioner and related APIs a bit, mainly with regards to naming. We should try to stick to the conventions outlined in the SubPartitioner javadocs, as it gets a bit confusing to keep track of which "partition" is being referred to when.

This is related to the refactoring I need for the window store implementation, but indirectly enough that it makes sense to pull out into its own patch.

* Use this any time we need to repeatedly look up metadata that doesn't change, so we can
* cache the results rather than make admin/client requests every single time
*/
public class CachedMetadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class seems to be shared by all the stores, so it needs to be synchronized.

Copy link
Contributor

@rodesai rodesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage of the cache looks good to me, but the cache itself needs concurrency control.

ableegoldman added a commit that referenced this pull request Sep 16, 2023
This PR makes changelog truncation a store-level config, rather than something we implicitly derive/guess at the user intention based on the topic configs. This solves a few things that have been on my mind for a while

1. Don't accidentally wipe out users don't when they weren't expecting it 💀
2. Make it a top-level config so it's easier to notice for people who do want it
3. Since we currently aren't compatible with the source-topic optimization or disabling the changelog, we can point users who are concerned about changelog topic space/growth to use changelog truncation instead
4. Best of all, no need to make so many admin calls during state store initialization (with this and Cache partition counts for sub-partitioner and set definitions/naming conventions #130, we now have no superflous admin calls)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants