-
Notifications
You must be signed in to change notification settings - Fork 596
feat: trace watchable messages #7403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
| } | ||
|
|
||
| // EnvoyGatewayTraces defines control plane tracing configurations. | ||
| type EnvoyGatewayTraces struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In EG, we use tracing in other place, better to keep consistency.
gateway/api/v1alpha1/envoygateway_types.go
Line 524 in 4c36666
| type RateLimitTracing struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't understand your point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name should be EnvoyGatewayTracing instead of EnvoyGatewayTraces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (45.85%) is below the target coverage (60.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #7403 +/- ##
==========================================
- Coverage 72.26% 71.11% -1.16%
==========================================
Files 231 275 +44
Lines 34084 34940 +856
==========================================
+ Hits 24632 24848 +216
- Misses 7676 8296 +620
- Partials 1776 1796 +20 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
thanks for working on this @shreealt is it possible to break this feature up into 2
|
| parentCtx = update.Value.Context | ||
| } | ||
|
|
||
| _, span := tracer.Start(parentCtx, "XdsRunner.subscribeAndTranslate") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add another child span to r.cache.GenerateNewSnapshot and use the trace id as the version inside it, so this id trickles into envoy proxy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, but I hope this doesn't come off as a surprise to users?
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
| r.Logger.Info("received an update", "key", update.Key) | ||
| val := update.Value | ||
| func(update message.Update[string, *resource.ControllerResourcesContext], errChan chan error) { | ||
| parentCtx := context.Background() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this defensive code needed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not? 🤔
is it guaranteed that these values will not be nil?
internal/gatewayapi/runner/runner.go
Outdated
| // so when a delete is triggered, delete all keys | ||
| if update.Delete || val == nil { | ||
| if update.Delete || valWrapper == nil || valWrapper.Resources == nil { | ||
| span.AddEvent("delete_all_keys") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a useful span
instead can we create two child spans
- one to track
translatetime - one to track
statusUpdate
|
|
||
| message.HandleSubscription(message.Metadata{Runner: r.Name(), Message: message.XDSIRMessageName}, c, | ||
| func(update message.Update[string, *ir.Xds], errChan chan error) { | ||
| func(update message.Update[string, *message.XdsIRWithContext], errChan chan error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to the gateway-api runner can we add a parent span and 2 child span for translate & updateSnapshot
internal/xds/cache/snapshotcache.go
Outdated
| var Hash = cachev3.IDHash{} | ||
| var ( | ||
| Hash = cachev3.IDHash{} | ||
| tracer = otel.Tracer("envoy-gateway/gateway-api") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tracer = otel.Tracer("envoy-gateway/gateway-api") | |
| tracer = otel.Tracer("envoy-gateway/xds") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also does the envoy-gateway prefix exist since thats the idiomatic way to do things ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, pretty much that.
| defer s.mu.Unlock() | ||
|
|
||
| version := s.newSnapshotVersion() | ||
| _, span := tracer.Start(ctx, "SnapshotCache.GenerateNewSnapshot") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we keep things consistent and add spans in runner instead of the within the func ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's cleaner this way? 🤔
there are two usages of this function in runner.go, so the current way is better than wrapping GenerateNewSnapshot in two places?
internal/xds/runner/runner.go
Outdated
| defaultMaxConnectionAgeGrace = 2 * time.Minute | ||
| ) | ||
|
|
||
| var tracer = otel.Tracer("envoy-gateway/gateway-api") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| var tracer = otel.Tracer("envoy-gateway/gateway-api") | |
| var tracer = otel.Tracer("envoy-gateway/xds") |
| traceLogger := r.Logger.WithTrace(parentCtx) | ||
| traceLogger.Info("received an update") | ||
|
|
||
| _, span := tracer.Start(parentCtx, "XdsRunner.subscribeAndTranslate") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add 2 child spans, one for translate and one for update snapshot
| enabled: false | ||
| opentelemetry-collector: | ||
| enabled: true | ||
| mode: deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you undo the changes in test
|
hey @shreealt left some comments, once those are addressed, can you also share before/after logs, tia ! |
|
@shreealt we'll also need |
Signed-off-by: Shreemaan Abhishek <[email protected]>
|
/retest |
Signed-off-by: Shreemaan Abhishek <[email protected]>
| emptyClusterName = "EmptyCluster" | ||
| ) | ||
|
|
||
| var tracer = otel.Tracer("envoy-gateway/xds/translator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets not add any Otel things to the lib, but in the runner please, in case these libs are made public and used purely for translation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Shreemaan Abhishek <[email protected]>
Signed-off-by: Shreemaan Abhishek <[email protected]>
ab45164 to
cd7fdd6
Compare
Signed-off-by: Shreemaan Abhishek <[email protected]>
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #7324
Release Notes: Yes/No