-
Notifications
You must be signed in to change notification settings - Fork 423
feat: TraceIdRatioBasedSampler and other sampling fixes
#3501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: TraceIdRatioBasedSampler and other sampling fixes
#3501
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3501 +/- ##
==========================================
- Coverage 97.77% 97.65% -0.13%
==========================================
Files 421 426 +5
Lines 55763 55923 +160
Branches 1 1
==========================================
+ Hits 54524 54609 +85
- Misses 1239 1314 +75
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
bcd7c51 to
1e32477
Compare
2c085a1 to
037a29f
Compare
531c86e to
0356e15
Compare
TraceIdRatioBasedSampler and other sampling fixes
6f881da to
d2d44ae
Compare
88dfa7a to
f2b2208
Compare
bizob2828
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't fully reviewed but I have some concerns with our approach, especially when you layer on partial granularity/full granularity tracing
| * @returns {Sampler} A Sampler e.g. AdaptiveSampler or TraceIdRatioBasedSampler | ||
| */ | ||
| function determineSampler({ agent, config, sampler = 'root' }) { | ||
| const samplerDefinition = config?.distributed_tracing?.sampler?.[sampler] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like there has to be a better way to do this. This will not work for core tracing. Will we have to instantiate even more samplers?
| * in respect to this sampler's decision. | ||
| * @param {Transaction} transaction the transaction to update | ||
| */ | ||
| applySamplingDecision(transaction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear here. We've conflated a sampler with a sampler. Meaning previous samplers like adaptive sampler were called with shouldSample
| transaction.sampled = tracestate?.intrinsics ? tracestate.isSampled : null | ||
| transaction.priority = tracestate?.intrinsics ? tracestate.priority : null | ||
| function applySamplingStrategy({ transaction, tracestate = null, sampler }) { | ||
| if (sampler?.toString() !== 'AdaptiveSampler') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't all the sampling logic be encapsulated within the samplers? Having this hang off adaptive sampler feels wrong
| applySamplingDecision(transaction) { | ||
| if (!transaction) return | ||
| // eslint-disable-next-line sonarjs/pseudo-random | ||
| const initPriority = Math.random() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the sample logic as the adaptive, it's just that it calls its shouldSample method. Feels like this should be in the default Sampler class
| } | ||
|
|
||
| // Is it TraceIdRatioBased? | ||
| const ratioBasedSampler = samplerDefinition?.trace_id_ratio_based |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic should live in the config. we shouldn't have to check if it's a valid ratio here.
Description
Currently, the agent only has one real sampler, the
AdaptiveSampler. This PR adds a second sampler type that is akin to OpenTelemetry'sTraceIdRatioBasedSamplerand proper classes for the other sampling decisions,always_onandalways_off. For the later two sampling decisions, we only applied them if a valid tracestate and traceparent were present. Unfortunately, we still have to account for the deprecated-header or no-header case, which this PR fixes by better encapsulating sampling logic.There is now a
Samplerinterface which exposesapplySamplingDecision(transaction)which will updatepriorityandsampledon the transaction with respect to the sampler's decision. Previously, this logic was scattered aroundTransaction(lib/transaction/index.js). This has been reduced to a refactoredTransaction.prototype._calculatePriority()and a newapplySamplerStrategy(transaction, tracestate, sampler), which is justSampler.applySamplingDecisionwith a bit more logic.There are (currently, core tracing will add more) three ways a customer could use ratio-based sampling. There are defined in our config as
distributed_tracing.sampler.*. Under all of these, you would enable ratio-based-sampling by definingtrace_id_ratio_based.ratio.root: This is the main sampler for traces originating from the current service.remote_parent_sampled: The sampler for when the upstream service has sampled the trace.remote_parent_not_sampled: The sampler for when the upstream service has not sampled the trace.Therefore, this PR adds two samplers stored on the agent to better encapsulate the above logic:
agent.remoteParentSampledSamplerandagent.remoteParentNotSampledSampler. The root sampler kept the current name ofagent.transactionSampler. There is a lot of complexity around this, namely the possibly of > 1AdaptiveSamplers who may or may not share state. This is out of scope for this PR and will be solved by #3519.How to Test
Cross-agent tests are re-synced! The new tests uncovered some missing pieces with our sampling logic, outside of ratio-based sampling, which this PR solves.
Related Issues
Closes #3387 and closes #3480