-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
ref(explorer): improve profile indexing query #103014
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| referrer=Referrer.SEER_RPC, | ||
| config=config, | ||
| sampling_mode="NORMAL", | ||
| ) | ||
|
|
||
| # Step 2: Collect all profiles and merge those with same profile_id and is_continuous | ||
| all_profiles = [] | ||
| profile_data = [] | ||
|
|
||
| for row in profiles_result.get("data", []): | ||
| span_id = row.get("span_id") | ||
| profile_id = row.get("profile.id") # Transaction profiles | ||
| profiler_id = row.get("profiler.id") # Continuous profiles | ||
| transaction_name = row.get("transaction") | ||
| start_ts = row.get("precise.start_ts") | ||
| end_ts = row.get("precise.finish_ts") | ||
|
|
||
| logger.info( | ||
| "Iterating over span to get profiles", | ||
| extra={ | ||
| "span_id": span_id, | ||
| "profile_id": profile_id, | ||
| "profiler_id": profiler_id, | ||
| "transaction_name": transaction_name, | ||
| }, | ||
| ) | ||
|
|
||
| if not span_id: | ||
| logger.info( | ||
| "Span doesn't have an id, skipping", | ||
| extra={"span_id": span_id}, | ||
| ) | ||
| continue | ||
| start_ts = row.get("min(precise.start_ts)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Query uses limit=5, preventing retrieval of all unique profiles for a trace, contradicting refactor's goal.
Severity: CRITICAL | Confidence: 1.00
🔍 Detailed Analysis
The Snuba query constructed to retrieve unique profiles and their time ranges incorrectly applies a limit=5. This hardcoded limit prevents the function from returning all unique profiles for a given trace if more than five distinct profiles are present, directly undermining the refactor's stated goal of obtaining all unique profiles.
💡 Suggested Fix
Remove or substantially increase the limit parameter in the Snuba query to ensure all unique profiles are fetched, aligning with the function's intended purpose.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/seer/explorer/index_data.py#L331-L353
Potential issue: The Snuba query constructed to retrieve unique profiles and their time
ranges incorrectly applies a `limit=5`. This hardcoded limit prevents the function from
returning all unique profiles for a given trace if more than five distinct profiles are
present, directly undermining the refactor's stated goal of obtaining all unique
profiles.
Did we get this right? 👍 / 👎 to inform future reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a bug
|
|
||
| # Determine if this is a continuous profile (profiler.id without profile.id) | ||
| is_continuous = profile_id is None and profiler_id is not None | ||
| is_continuous = profiler_id is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Correct Profile ID Prioritization
When a span has both profile.id and profiler.id, the code incorrectly prioritizes profiler_id and marks it as continuous. Transaction profiles (identified by profile.id) should take priority over continuous profiles (profiler.id). The correct logic is actual_profile_id = profile_id or profiler_id and is_continuous = profiler_id is not None and profile_id is None, matching the pattern used elsewhere in the codebase (e.g., autofix.py).
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #103014 +/- ##
===========================================
- Coverage 80.75% 80.64% -0.12%
===========================================
Files 9145 9140 -5
Lines 395020 392773 -2247
Branches 24968 24968
===========================================
- Hits 319002 316755 -2247
Misses 75617 75617
Partials 401 401 |
Improves `get_profiles_for_trace` to more reliably get all unique profiles and their full time ranges. Before we were likely to miss them since we just sampled 50 spans from the start of the trace. Also simplifies the logic after the query a lot. And should help avoid ReadTimeouts and exceeding Snuba quota
Improves
get_profiles_for_traceto more reliably get all unique profiles and their full time ranges. Before we were likely to miss them since we just sampled 50 spans from the start of the trace. Also simplifies the logic after the query a lot. And should help avoid ReadTimeouts and exceeding Snuba quota