
Conversation

@Naman-B-Parlecha
Contributor

Resolve: #573

Currently, the peak sample count is incorrectly reported as 1 because currStepSamples is reset to 0 at every step iteration, preventing proper accumulation of peak samples across series.

To fix this, we can maintain a vector that tracks the sample count per step across all series, adding up the samples at each step. Once the samples at every step of all series are processed, we can update the telemetry for all series.
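The per-step accumulation described above could look roughly like this (a minimal, self-contained sketch; peakSamples and its inputs are illustrative names, not the engine's actual types):

```go
package main

import "fmt"

// peakSamples accumulates the sample count at each step across all series,
// then returns the largest per-step total. This is a sketch of the idea,
// not the engine's real telemetry code.
func peakSamples(samplesPerSeries [][]int, numSteps int) int {
	stepTotals := make([]int, numSteps)
	for _, series := range samplesPerSeries {
		for step, n := range series {
			stepTotals[step] += n // add up samples at each step across series
		}
	}
	peak := 0
	for _, total := range stepTotals {
		if total > peak {
			peak = total
		}
	}
	return peak
}

func main() {
	// Two series with three steps each: step totals are 5, 3, 3.
	fmt.Println(peakSamples([][]int{{1, 2, 3}, {4, 1, 0}}, 3))
}
```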

Signed-off-by: Naman-B-Parlecha <[email protected]>
@Naman-B-Parlecha
Contributor Author

@yeya24 @MichaHoffmann PTAL!

@yeya24
Contributor

yeya24 commented Jul 7, 2025

I think it is better to do #573 (comment).
We can decouple updating peak samples from IncrementSamplesAtTimestamp.

@Naman-B-Parlecha
Contributor Author

Sure, will refactor!

@harry671003 harry671003 self-requested a review July 9, 2025 22:32
@harry671003
Contributor

Thanks for the work on this!

Before we finalize the implementation, I'd like to clarify what we mean by peak samples in the context of the Thanos engine.

In the Prometheus engine, peak samples is defined as the highest number of samples held in memory at any moment during query execution, across all operators. This is used to enforce the --query.max-samples limit and prevent memory exhaustion.

A few open questions for alignment:

  • Are we using the same definition of peak samples in the Thanos engine? That is, tracking the max number of samples in memory at any point during evaluation (across all operators)?
  • Or are we maintaining per-operator peak samples separately? If so, how are we aggregating or enforcing limits?
  • At what points in the engine are we calling UpdatePeak? Are we sure those points capture all relevant memory usage?

Having a shared definition with Prometheus helps make this limit consistent and easier to reason about. If we’re diverging, it would be good to document how and why.

Curious to hear thoughts.

@Naman-B-Parlecha
Contributor Author

Hey @harry671003, thanks for the clarification.
I'm not entirely sure at what point we should be updating the peak, but I believe we want to do something similar to what happens in Prometheus?

@yeya24 is more qualified to answer this, I believe, as he has worked closely with the engine.

@harry671003
Contributor

I thought a little bit more about this:

Let's define peak samples first. The peak samples for an operator are the total samples the operator processed in a single Next() call. The maximum peak samples would be the highest peak samples across all operators.

One way to implement this is to use the telemetry operator
https://github.com/thanos-io/promql-engine/blob/main/execution/telemetry/telemetry.go#L204


func (t *Operator) Next(ctx context.Context) ([]model.StepVector, error) {
	start := time.Now()
	defer func() { t.OperatorTelemetry.AddNextExecutionTime(time.Since(start)) }()

	totalSamplesBefore := t.OperatorTelemetry.Samples().TotalSamples() // samples before calling Next()
	out, err := t.inner.Next(ctx)
	if err != nil {
		return nil, err
	}
	totalSamplesAfter := t.OperatorTelemetry.Samples().TotalSamples() // samples after calling Next()
	t.OperatorTelemetry.UpdatePeak(totalSamplesAfter - totalSamplesBefore) // the diff is this call's peak samples

	return out, nil
}

Wdyt @yeya24 @Naman-B-Parlecha ?

@Naman-B-Parlecha
Contributor Author


Great, I was looking into how Prometheus updates its peak.
But I'm curious why the peak value is the diff of after and before.
Shouldn't it be the highest across the series? (This is how I'm updating it currently in this PR; not sure if I'm forgetting something.)

@harry671003
Contributor

@Naman-B-Parlecha

Yes, what you're doing is fundamentally the same. This is just a cleaner way of doing it.

  • We decouple IncrementSamples() and UpdatePeak()
  • We only do it in telemetry.Operator
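A minimal sketch of that decoupling might look like this (the samples type and its methods are illustrative, simplified from the real OperatorTelemetry interface):

```go
package main

import "fmt"

// samples is a hypothetical, simplified telemetry store: operators only
// call IncrementSamples(); the telemetry wrapper alone computes the
// per-Next() diff and calls UpdatePeak().
type samples struct {
	total int
	peak  int
}

func (s *samples) IncrementSamples(n int) { s.total += n }
func (s *samples) TotalSamples() int      { return s.total }

// UpdatePeak keeps the maximum samples processed in any single Next() call.
func (s *samples) UpdatePeak(n int) {
	if n > s.peak {
		s.peak = n
	}
}

func main() {
	s := &samples{}
	// Simulate three Next() calls processing 30, 80, and 50 samples.
	for _, n := range []int{30, 80, 50} {
		before := s.TotalSamples()
		s.IncrementSamples(n)                   // operator code only increments
		s.UpdatePeak(s.TotalSamples() - before) // wrapper updates the peak
	}
	fmt.Println(s.peak) // highest single-call count
}
```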

@Naman-B-Parlecha
Contributor Author


Thanks for the clarification 🙌
Should I refactor the code in this PR, or would you like me to make a fresh PR and add the changes there?

@harry671003
Contributor

Use the same PR.
Also include unit tests.

@Naman-B-Parlecha Naman-B-Parlecha force-pushed the NamanParlecha/FixPeakSample branch from f81684b to 0e0ae60 Compare July 19, 2025 15:54
Signed-off-by: Naman-B-Parlecha <[email protected]>
Signed-off-by: Naman-B-Parlecha <[email protected]>
@Naman-B-Parlecha Naman-B-Parlecha force-pushed the NamanParlecha/FixPeakSample branch from 0e0ae60 to 7d06931 Compare July 19, 2025 15:56
Contributor

@harry671003 harry671003 left a comment


Great work! Just some comments.

| | | | |---[concurrent(buff=2)]: max_series: 1 total_samples: 0 peak_samples: 0
| | | | | |---[matrixSelector] rate({[__name__="http_requests_total"]}[10m0s] 0 mod 2): max_series: 1 total_samples: 1010 peak_samples: 200
| | | | |---[concurrent(buff=2)]: max_series: 1 total_samples: 0 peak_samples: 0
| | | | | |---[matrixSelector] rate({[__name__="http_requests_total"]}[10m0s] 1 mod 2): max_series: 1 total_samples: 1010 peak_samples: 200
Contributor

@harry671003 harry671003 Jul 21, 2025


Is the peak samples correct here?
For the selector {[__name__="http_requests_total"]}[10m0s]

Conditions:

  • Steps batch is 10.
  • Selectors shards = 2

Calculation:

  • Each Next() call has 10 steps.

  • Each step has 1 series (due to sharding)

  • For each step:

    • Each series has a sample every 30 seconds.
    • In [10m] range, there will be 10*60s/30s = 20 samples
  • Total samples in each Next() call = 10 steps * 1 series * 20 samples = 200

Looks correct.
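The arithmetic above can be checked with a quick calculation (plain Go, not engine code):

```go
package main

import "fmt"

func main() {
	stepsPerNext := 10  // steps batch size
	seriesPerStep := 1  // one series per shard, with 2 shards
	rangeSeconds := 600 // [10m] matrix selector range
	scrapeInterval := 30 // one sample every 30 seconds per series

	samplesPerSeriesPerStep := rangeSeconds / scrapeInterval // 20
	peak := stepsPerNext * seriesPerStep * samplesPerSeriesPerStep
	fmt.Println(peak) // matches the reported peak_samples of 200
}
```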

@harry671003 harry671003 requested a review from yeya24 July 21, 2025 17:53
Signed-off-by: Naman-B-Parlecha <[email protected]>
@Naman-B-Parlecha Naman-B-Parlecha force-pushed the NamanParlecha/FixPeakSample branch from 625dbe1 to eb0809d Compare July 21, 2025 22:13
@Naman-B-Parlecha Naman-B-Parlecha force-pushed the NamanParlecha/FixPeakSample branch from 3096f45 to ef60fd8 Compare July 21, 2025 23:42
Contributor

@harry671003 harry671003 left a comment


LGTM! Great work. Thank you.

@Naman-B-Parlecha
Contributor Author

Naman-B-Parlecha commented Jul 21, 2025

Thanks for helping me out @harry671003 @yeya24 🙌

Contributor

@yeya24 yeya24 left a comment


Thanks @Naman-B-Parlecha and @harry671003! I like how this is implemented now and it seems clean.

Let's fix lint. Unit tests also failed, so we may need to take another look.

Signed-off-by: Naman-B-Parlecha <[email protected]>
@Naman-B-Parlecha
Contributor Author

@yeya24 PTAL

Signed-off-by: Naman-B-Parlecha <[email protected]>
@yeya24 yeya24 merged commit 91e6e32 into thanos-io:main Jul 26, 2025
11 of 12 checks passed

Development

Successfully merging this pull request may close these issues.

Incorrect peak sample stats
