Instrument QueryExecution.assertAnalyzed() to catch DataFrame analysis failures #11033
Draft
aboitreaud wants to merge 2 commits into master from
Benchmarks
Startup
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 61 metrics, 10 unstable metrics.
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.055 s) : 0, 1055442
Total [baseline] (8.831 s) : 0, 8831216
Agent [candidate] (1.06 s) : 0, 1060377
Total [candidate] (8.832 s) : 0, 8832422
section iast
Agent [baseline] (1.243 s) : 0, 1243309
Total [baseline] (9.612 s) : 0, 9611985
Agent [candidate] (1.242 s) : 0, 1241813
Total [candidate] (9.612 s) : 0, 9611658
gantt
title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.233 ms) : 0, 1233
crashtracking [candidate] (1.23 ms) : 0, 1230
BytebuddyAgent [baseline] (633.181 ms) : 0, 633181
BytebuddyAgent [candidate] (635.081 ms) : 0, 635081
AgentMeter [baseline] (29.482 ms) : 0, 29482
AgentMeter [candidate] (29.594 ms) : 0, 29594
GlobalTracer [baseline] (248.784 ms) : 0, 248784
GlobalTracer [candidate] (249.834 ms) : 0, 249834
AppSec [baseline] (32.31 ms) : 0, 32310
AppSec [candidate] (32.532 ms) : 0, 32532
Debugger [baseline] (59.024 ms) : 0, 59024
Debugger [candidate] (59.056 ms) : 0, 59056
Remote Config [baseline] (609.511 µs) : 0, 610
Remote Config [candidate] (587.771 µs) : 0, 588
Telemetry [baseline] (8.023 ms) : 0, 8023
Telemetry [candidate] (8.054 ms) : 0, 8054
Flare Poller [baseline] (6.62 ms) : 0, 6620
Flare Poller [candidate] (8.105 ms) : 0, 8105
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.238 ms) : 0, 1238
BytebuddyAgent [baseline] (817.038 ms) : 0, 817038
BytebuddyAgent [candidate] (817.133 ms) : 0, 817133
AgentMeter [baseline] (11.52 ms) : 0, 11520
AgentMeter [candidate] (11.525 ms) : 0, 11525
GlobalTracer [baseline] (241.35 ms) : 0, 241350
GlobalTracer [candidate] (240.873 ms) : 0, 240873
IAST [baseline] (27.783 ms) : 0, 27783
IAST [candidate] (30.199 ms) : 0, 30199
AppSec [baseline] (30.619 ms) : 0, 30619
AppSec [candidate] (27.771 ms) : 0, 27771
Debugger [baseline] (65.541 ms) : 0, 65541
Debugger [candidate] (64.391 ms) : 0, 64391
Remote Config [baseline] (542.52 µs) : 0, 543
Remote Config [candidate] (537.946 µs) : 0, 538
Telemetry [baseline] (7.862 ms) : 0, 7862
Telemetry [candidate] (7.765 ms) : 0, 7765
Flare Poller [baseline] (3.497 ms) : 0, 3497
Flare Poller [candidate] (3.447 ms) : 0, 3447
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.068 s) : 0, 1068175
Total [baseline] (11.058 s) : 0, 11058260
Agent [candidate] (1.06 s) : 0, 1059737
Total [candidate] (11.118 s) : 0, 11117814
section appsec
Agent [baseline] (1.271 s) : 0, 1271156
Total [baseline] (11.114 s) : 0, 11114427
Agent [candidate] (1.27 s) : 0, 1269525
Total [candidate] (10.988 s) : 0, 10988314
section iast
Agent [baseline] (1.233 s) : 0, 1233094
Total [baseline] (11.33 s) : 0, 11330077
Agent [candidate] (1.244 s) : 0, 1244352
Total [candidate] (11.381 s) : 0, 11380505
section profiling
Agent [baseline] (1.196 s) : 0, 1196046
Total [baseline] (10.992 s) : 0, 10991997
Agent [candidate] (1.189 s) : 0, 1189317
Total [candidate] (11.032 s) : 0, 11032063
gantt
title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.242 ms) : 0, 1242
crashtracking [candidate] (1.233 ms) : 0, 1233
BytebuddyAgent [baseline] (639.446 ms) : 0, 639446
BytebuddyAgent [candidate] (635.176 ms) : 0, 635176
AgentMeter [baseline] (29.946 ms) : 0, 29946
AgentMeter [candidate] (29.631 ms) : 0, 29631
GlobalTracer [baseline] (250.256 ms) : 0, 250256
GlobalTracer [candidate] (249.903 ms) : 0, 249903
AppSec [baseline] (32.542 ms) : 0, 32542
AppSec [candidate] (32.458 ms) : 0, 32458
Debugger [baseline] (59.822 ms) : 0, 59822
Debugger [candidate] (59.838 ms) : 0, 59838
Remote Config [baseline] (609.789 µs) : 0, 610
Remote Config [candidate] (594.801 µs) : 0, 595
Telemetry [baseline] (8.116 ms) : 0, 8116
Telemetry [candidate] (7.993 ms) : 0, 7993
Flare Poller [baseline] (9.898 ms) : 0, 9898
Flare Poller [candidate] (6.767 ms) : 0, 6767
section appsec
crashtracking [baseline] (1.249 ms) : 0, 1249
crashtracking [candidate] (1.24 ms) : 0, 1240
BytebuddyAgent [baseline] (680.563 ms) : 0, 680563
BytebuddyAgent [candidate] (679.342 ms) : 0, 679342
AgentMeter [baseline] (12.218 ms) : 0, 12218
AgentMeter [candidate] (12.223 ms) : 0, 12223
GlobalTracer [baseline] (251.265 ms) : 0, 251265
GlobalTracer [candidate] (250.871 ms) : 0, 250871
AppSec [baseline] (186.783 ms) : 0, 186783
AppSec [candidate] (186.928 ms) : 0, 186928
Debugger [baseline] (66.044 ms) : 0, 66044
Debugger [candidate] (66.063 ms) : 0, 66063
Remote Config [baseline] (583.32 µs) : 0, 583
Remote Config [candidate] (573.25 µs) : 0, 573
Telemetry [baseline] (7.862 ms) : 0, 7862
Telemetry [candidate] (7.9 ms) : 0, 7900
Flare Poller [baseline] (3.444 ms) : 0, 3444
Flare Poller [candidate] (3.465 ms) : 0, 3465
IAST [baseline] (24.547 ms) : 0, 24547
IAST [candidate] (24.493 ms) : 0, 24493
section iast
crashtracking [baseline] (1.223 ms) : 0, 1223
crashtracking [candidate] (1.229 ms) : 0, 1229
BytebuddyAgent [baseline] (810.41 ms) : 0, 810410
BytebuddyAgent [candidate] (816.698 ms) : 0, 816698
AgentMeter [baseline] (11.398 ms) : 0, 11398
AgentMeter [candidate] (11.568 ms) : 0, 11568
GlobalTracer [baseline] (238.993 ms) : 0, 238993
GlobalTracer [candidate] (241.785 ms) : 0, 241785
AppSec [baseline] (28.49 ms) : 0, 28490
AppSec [candidate] (28.01 ms) : 0, 28010
Debugger [baseline] (64.632 ms) : 0, 64632
Debugger [candidate] (67.267 ms) : 0, 67267
Remote Config [baseline] (534.497 µs) : 0, 534
Remote Config [candidate] (538.165 µs) : 0, 538
Telemetry [baseline] (7.82 ms) : 0, 7820
Telemetry [candidate] (7.925 ms) : 0, 7925
Flare Poller [baseline] (3.468 ms) : 0, 3468
Flare Poller [candidate] (3.423 ms) : 0, 3423
IAST [baseline] (30.104 ms) : 0, 30104
IAST [candidate] (29.641 ms) : 0, 29641
section profiling
ProfilingAgent [baseline] (94.216 ms) : 0, 94216
ProfilingAgent [candidate] (93.476 ms) : 0, 93476
crashtracking [baseline] (1.209 ms) : 0, 1209
crashtracking [candidate] (1.182 ms) : 0, 1182
BytebuddyAgent [baseline] (699.101 ms) : 0, 699101
BytebuddyAgent [candidate] (695.697 ms) : 0, 695697
AgentMeter [baseline] (9.224 ms) : 0, 9224
AgentMeter [candidate] (9.3 ms) : 0, 9300
GlobalTracer [baseline] (208.756 ms) : 0, 208756
GlobalTracer [candidate] (207.826 ms) : 0, 207826
AppSec [baseline] (33.045 ms) : 0, 33045
AppSec [candidate] (32.64 ms) : 0, 32640
Debugger [baseline] (66.568 ms) : 0, 66568
Debugger [candidate] (65.531 ms) : 0, 65531
Remote Config [baseline] (594.218 µs) : 0, 594
Remote Config [candidate] (588.482 µs) : 0, 588
Telemetry [baseline] (7.885 ms) : 0, 7885
Telemetry [candidate] (7.817 ms) : 0, 7817
Flare Poller [baseline] (3.595 ms) : 0, 3595
Flare Poller [candidate] (3.57 ms) : 0, 3570
Profiling [baseline] (94.784 ms) : 0, 94784
Profiling [candidate] (94.038 ms) : 0, 94038
Load
Parameters
See matching parameters
Summary
Found 2 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 17 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section baseline
no_agent (19.391 ms) : 19193, 19589
. : milestone, 19391,
appsec (19.031 ms) : 18838, 19224
. : milestone, 19031,
code_origins (17.884 ms) : 17710, 18059
. : milestone, 17884,
iast (17.627 ms) : 17457, 17798
. : milestone, 17627,
profiling (20.218 ms) : 20017, 20418
. : milestone, 20218,
tracing (17.935 ms) : 17755, 18115
. : milestone, 17935,
section candidate
no_agent (19.74 ms) : 19538, 19943
. : milestone, 19740,
appsec (18.82 ms) : 18631, 19008
. : milestone, 18820,
code_origins (17.702 ms) : 17528, 17875
. : milestone, 17702,
iast (17.978 ms) : 17800, 18155
. : milestone, 17978,
profiling (18.214 ms) : 18034, 18395
. : milestone, 18214,
tracing (18.046 ms) : 17868, 18225
. : milestone, 18046,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section baseline
no_agent (1.267 ms) : 1256, 1279
. : milestone, 1267,
iast (3.265 ms) : 3215, 3315
. : milestone, 3265,
iast_FULL (5.88 ms) : 5821, 5939
. : milestone, 5880,
iast_GLOBAL (3.754 ms) : 3686, 3822
. : milestone, 3754,
profiling (2.158 ms) : 2136, 2181
. : milestone, 2158,
tracing (1.916 ms) : 1901, 1932
. : milestone, 1916,
section candidate
no_agent (1.262 ms) : 1250, 1275
. : milestone, 1262,
iast (3.304 ms) : 3255, 3353
. : milestone, 3304,
iast_FULL (5.992 ms) : 5932, 6052
. : milestone, 5992,
iast_GLOBAL (3.699 ms) : 3637, 3760
. : milestone, 3699,
profiling (2.207 ms) : 2187, 2227
. : milestone, 2207,
tracing (1.903 ms) : 1887, 1918
. : milestone, 1903,
Dacapo
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section baseline
no_agent (1.492 ms) : 1481, 1504
. : milestone, 1492,
appsec (3.85 ms) : 3628, 4072
. : milestone, 3850,
iast (2.291 ms) : 2220, 2361
. : milestone, 2291,
iast_GLOBAL (2.343 ms) : 2272, 2414
. : milestone, 2343,
profiling (2.112 ms) : 2056, 2167
. : milestone, 2112,
tracing (2.11 ms) : 2055, 2165
. : milestone, 2110,
section candidate
no_agent (1.49 ms) : 1479, 1502
. : milestone, 1490,
appsec (3.788 ms) : 3570, 4005
. : milestone, 3788,
iast (2.291 ms) : 2221, 2361
. : milestone, 2291,
iast_GLOBAL (2.333 ms) : 2262, 2404
. : milestone, 2333,
profiling (2.128 ms) : 2072, 2184
. : milestone, 2128,
tracing (2.09 ms) : 2036, 2144
. : milestone, 2090,
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~68ebce2b54, baseline=1.62.0-SNAPSHOT~6880c80c48
dateFormat X
axisFormat %s
section baseline
no_agent (15.539 s) : 15539000, 15539000
. : milestone, 15539000,
appsec (14.974 s) : 14974000, 14974000
. : milestone, 14974000,
iast (18.324 s) : 18324000, 18324000
. : milestone, 18324000,
iast_GLOBAL (18.081 s) : 18081000, 18081000
. : milestone, 18081000,
profiling (15.029 s) : 15029000, 15029000
. : milestone, 15029000,
tracing (14.977 s) : 14977000, 14977000
. : milestone, 14977000,
section candidate
no_agent (15.48 s) : 15480000, 15480000
. : milestone, 15480000,
appsec (14.8 s) : 14800000, 14800000
. : milestone, 14800000,
iast (18.833 s) : 18833000, 18833000
. : milestone, 18833000,
iast_GLOBAL (17.727 s) : 17727000, 17727000
. : milestone, 17727000,
profiling (15.191 s) : 15191000, 15191000
. : milestone, 15191000,
tracing (15.015 s) : 15015000, 15015000
. : milestone, 15015000,
What Does This Do
Adds instrumentation on QueryExecution.assertAnalyzed() to catch Catalyst analysis failures from any entry point: SparkSession.sql(), Dataset.select(), Dataset.filter(), etc. Previously, only SparkSession.sql() failures were caught (PR #10981), but customer failures through DataFrame API operations (df.select()) were invisible to the tracer.

Also reverts the lastSqlFailed reset removal from the original branch: a successful Spark job now resets both lastJobFailed and lastSqlFailed, matching existing behavior and avoiding false positives for apps that catch and recover from SQL errors.

Debug logging is intentionally included: this build is for customer validation, not merge.
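The reset behavior described above can be illustrated with a minimal, standalone sketch. Only the flag names lastJobFailed and lastSqlFailed come from this PR; the class FailureFlagsSketch, its method names, and the rule that the application is reported as failed when either flag is still set are assumptions made for illustration, not the tracer's actual code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical stand-in for the tracer's failure bookkeeping.
 * Flag names follow the PR description; everything else is assumed.
 */
final class FailureFlagsSketch {
  private final AtomicBoolean lastJobFailed = new AtomicBoolean(false);
  private final AtomicBoolean lastSqlFailed = new AtomicBoolean(false);

  /** Records an analysis failure, e.g. one surfaced through assertAnalyzed(). */
  void onAnalysisFailure() {
    lastSqlFailed.set(true);
  }

  /** Records the end of a Spark job; a success clears both flags. */
  void onJobEnd(boolean succeeded) {
    if (succeeded) {
      lastJobFailed.set(false);
      lastSqlFailed.set(false);
    } else {
      lastJobFailed.set(true);
    }
  }

  /** Assumed rule: the application counts as failed if either flag is still set. */
  boolean applicationFailed() {
    return lastJobFailed.get() || lastSqlFailed.get();
  }
}
```

In this sketch, an app that catches an AnalysisException and then runs a successful job ends with applicationFailed() returning false, which is the recovery case the reset is meant to cover. An analysis failure with no later successful job leaves the flag set.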
Motivation
Customer BRE stderr logs showed spark.application span marked SUCCESS when the EMR step failed. The AnalysisException (UNRESOLVED_COLUMN) was thrown from Dataset.select() → QueryExecution.assertAnalyzed(), a path not covered by the existing SparkSqlFailureAdvice on SparkSession.sql().

Additional Notes
QueryExecution.assertAnalyzed() is stable (public void assertAnalyzed()) across Spark 3.5.x, 4.0.x, 4.1.x
SparkSqlFailureAdvice on SparkSession.sql() is kept (redundant but harmless, cleanup later)
sparkSession.range(1).toDF("id").select("nonexistent_column"))
Jira ticket: [PROJ-IDENT]