thanos/receive: add feature flag to put ext labels into TSDB #8546
base: main
Conversation
Force-pushed from 7c7a1f2 to 2bfaa32
Enable streaming by putting external labels into the TSDB. If the external labels ever change then the head is pruned during startup. Signed-off-by: Giedrius Statkevičius <[email protected]>
Force-pushed from 2bfaa32 to 51fbbe4
ringerc left a comment
If I understand correctly, this is a patch for thanos-receive to materialize external-labels in the TSDB rather than injecting them into Series gRPC.
It's analogous to the discussion we were involved in regarding Prometheus, external labels, Sidecar and remote-read. But this change is for helping with queries serviced by Receive, not Sidecar.
Sound about right?
If so: Won't changes also be required for the Store gateway to stream series from blocks known to contain embedded external labels? And in Compactor to preserve the external labels added by Receive, and to inject external labels when processing Sidecar-uploaded blocks?
Also, these extlabel-embedded blocks would only be from Receive, since Sidecar uploaded blocks won't have the external labels. Meaning a Thanos that mixes Receive and Sidecar will not benefit much from this change, because freshly uploaded uncompacted blocks from Sidecar will not have the external labels embedded yet.
To support streaming Series gRPC from Sidecar, a similar change would be required in Prometheus itself to materialize external labels in the TSDB. And some assessment made of the cost/performance impact of doing so.
To benefit from this proposed change it seems like it'd be necessary to deploy one or more agent-mode Prometheus instances for scraping, and configure them to remote-write to Thanos Receive. Is that the intended use model here, where Sidecar is omitted and Receive handles Series gRPC for fresh samples?
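If that's the intended model, the setup would look roughly like the sketch below: an agent-mode Prometheus scrapes locally and remote-writes everything to Receive, which then owns the fresh samples (and, with this flag, would write its own external labels into its TSDB). Hostnames, ports and paths here are illustrative placeholders, not taken from the PR.

```yaml
# prometheus.yml for an agent-mode Prometheus (started with --enable-feature=agent);
# it scrapes locally and forwards everything to Receive via remote write.
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ["localhost:9100"]
remote_write:
  # Thanos Receive's remote-write endpoint; host and port are placeholders.
  - url: http://thanos-receive.example.com:19291/api/v1/receive
```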
// WithExternalLabelsInTSDB enables putting external labels in the TSDB.
// This permits streaming from the TSDB to the querier.
// WithExternalLabelsInTSDB enables putting external labels in the TSDB.
// This ensures that external labels and normal labels are sorted correctly, so a Series gRPC request
// can be streamed without the need for a buffer-and-re-sort pass after injecting external labels.
### ext-labels-in-tsdb

If enabled then it will put the current external labels as "normal" labels inside of the TSDB. This also adds a special marker to the meta files in blocks so that it would be known whether external labels are part of the series inside of the TSDB.
Can you check if this proposed addition to the readme for the feature flag you've added looks sensible?
...
If all blocks matched by a request had this feature flag enabled at the time they were written, Receive can stream responses to Series gRPC requests from Query instead of buffering them in memory. This lowers Receive's peak memory consumption for high-cardinality or high-sample-count requests.
However, storing external labels in the TSDB will increase TSDB index sizes because each external label must appear in the inverted index for every series. TSDB blocks will be somewhat larger and need more disk I/O to read, and HEAD chunks will require somewhat more memory.
If the configured external labels are changed, Receive will flush the current HEAD block to disk and start a new HEAD block. No data is discarded.
oldLset := labels.FromMap(m.Thanos.Labels)
if labels.Equal(oldLset, curLset) {
	return nil
}

level.Info(t.logger).Log("msg", "changed external labelset detected, dumping the head block", "newLset", curLset.String(), "oldLset", oldLset.String())
Consider "flushing ... to disk" instead of "dumping"; the latter makes it sound like the data is being discarded.
}

if st.addExtLabelsToTSDB {
	allBlocksHaveExtension := func() {
+ // If all blocks have external labels embedded in them, we can skip building a matrix of results
+ // in-memory, injecting external labels and re-sorting them. The results may be streamed directly
+ // to the caller instead.
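To make the intent of that gate concrete, here is a rough, self-contained sketch; the type and names (Block, HasExtLabelsMarker, canStreamDirectly) are assumptions for illustration, not the PR's actual code:

```go
package main

import "fmt"

// Block stands in for a TSDB block's metadata; HasExtLabelsMarker is a
// hypothetical accessor for the marker this PR writes into the block's meta file.
type Block struct{ HasExtLabelsMarker bool }

// canStreamDirectly reports whether every matched block was written with
// external labels already materialized. Only then can the Series response
// be streamed as-is; otherwise the responder must buffer all series,
// inject the external labels and re-sort before replying.
func canStreamDirectly(blocks []Block) bool {
	for _, b := range blocks {
		if !b.HasExtLabelsMarker {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(canStreamDirectly([]Block{{true}, {true}}))  // true: stream directly
	fmt.Println(canStreamDirectly([]Block{{true}, {false}})) // false: buffer + re-sort
}
```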
Consider adding to the PR description that this relates to prior work on re-sorting responses with external labels in #8487, and memory-use increases as discussed in #7395. This change only affects blocks in Receive's memory, not blocks shipped to the object store and fetched via Store gateway, or blocks read from Prometheus via Sidecar. But Query will still benefit from the streaming response from Receive even if other components still need to buffer, inject labels and sort.
I don't know if this solves the problem completely. The key issue is that dropping an internal label can cause the set to become unsorted; I don't think where we store external labels changes this.
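A minimal sketch of the ordering effect being described here, using prometheus/model/labels (illustrative only, not code from this PR):

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/labels"
)

func main() {
	// Two series, each with labels sorted by name, ordered by their full label sets.
	a := labels.FromStrings("replica", "1", "zone", "b")
	b := labels.FromStrings("shard", "1")
	fmt.Println(labels.Compare(a, b) < 0) // true: "replica" < "shard", so a sorts first

	// Dropping the replica label from a flips the relative order of the two series,
	// which is why a stream that was sorted before the drop needs a buffered re-sort afterwards.
	a2 := labels.NewBuilder(a).Del("replica").Labels()
	fmt.Println(labels.Compare(a2, b) < 0) // false: "shard" < "zone", so b now sorts first
}
```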
Agree - I don't think this solves the issue - We should probably add external labels as a slice in the Series gRPC call and only allow dropping from that slice. Then we will keep the series order and can drop, I think.
We aren't dropping any labels anywhere with this patch. Could you elaborate a bit more?
I thought this PR is needed to enable streaming series without re-sorting, so my comment is related to that feature.
Yeah but with this patch "without replica labels" doesn't do anything anymore - we put sorted series in with external labels, get sorted series out. Your original comment said that dropping labels causes the set to be unsorted, but we don't do that anywhere, do we?
This is only true if the series data has collisions with the external labels. If I understand correctly, @fpetkovski is probably saying that this change has to either clobber existing values by replacing them with the configured external labels, or it has to respect the existing user values and fail to enforce the external labels. Is that right?

I increasingly think that conflicts between external and user labels should simply be made an error that results in ingestion, scrape, or query failure (depending on where the conflict arises). Or at least an annotation warning that query results may be invalid and incorrect. A pre-existing label with the same name as an external label would be permitted and accepted, but only if its value is the same as the configured external label. That'd make the whole mess go away throughout the stack. A feature flag could be used to introduce enforcement optionally, then turned on by default while still allowing people to disable it for a while before the option is removed entirely.

When enforcement is on, Sidecar wouldn't have to re-sort (it'd just check for and enforce labels instead, then inject them into the stream), so there would be no need for #8487. And we could probably add a similar feature flag to Prometheus for its remote-read external labels, or simply stop using Prometheus-level external labels in favour of having Sidecar inject them as it processes the metric stream it is forwarding.

Receive, the subject of this PR, could drop samples with conflicting external labels and increment a suitable counter to ensure this can be observed. Then it could discard all label pairs that match the configured external labels; there'd be no need to store them in the TSDB if they're already known from the external labels, since they can just be injected when streaming out the response without changing the sort order.

For cascaded setups, fan-outs, etc., the lower tiers would need one set of external label names, and the higher tiers another. E.g. the lower tier might have
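A rough sketch of the Receive-side behaviour described above (illustrative only; the names, types and the plain counter variable are assumptions, not anything in the PR):

```go
package main

import "fmt"

// Labels stands in for a series' label set (name -> value).
type Labels map[string]string

var conflictingSamplesDropped int // stands in for a Prometheus counter metric

// sanitize enforces the configured external labels on an incoming series:
//   - a series whose label value conflicts with an external label is dropped
//     (and counted) rather than silently clobbered;
//   - label pairs that merely duplicate an external label are stripped, since
//     they can be re-injected at response time without changing sort order.
func sanitize(series Labels, extLabels Labels) (Labels, bool) {
	out := Labels{}
	for name, value := range series {
		if extValue, isExt := extLabels[name]; isExt {
			if extValue != value {
				conflictingSamplesDropped++
				return nil, false // drop the sample, keep it observable via the counter
			}
			continue // same value as the external label: no need to store it
		}
		out[name] = value
	}
	return out, true
}

func main() {
	ext := Labels{"cluster": "eu-1"}
	fmt.Println(sanitize(Labels{"job": "node", "cluster": "eu-1"}, ext)) // map[job:node] true
	fmt.Println(sanitize(Labels{"job": "node", "cluster": "us-2"}, ext)) // map[] false
}
```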