-
Notifications
You must be signed in to change notification settings - Fork 470
[WIP] chore(ray): add configure_logging helper to pass to Ray worker_process_setup_hook #15362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 255 ± 4 ms. The average import time from base is: 264 ± 5 ms. The import time difference between this PR and base is: -8.9 ± 0.2 ms. Import time breakdownThe following import paths have shrunk:
|
Performance SLOsComparing candidate imran-hendley/setup_logging (b1477c2) with baseline main (4678eef) 🟡 Near SLO Breach (2 suites)🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.589ms (SLO: <4.750ms -3.4%) vs baseline: -0.3% Memory: ✅ 64.026MB (SLO: <66.500MB -3.7%) vs baseline: +4.9% ✅ appsec-postTime: ✅ 6.615ms (SLO: <6.750ms 🟡 -2.0%) vs baseline: +0.1% Memory: ✅ 64.381MB (SLO: <66.500MB -3.2%) vs baseline: +4.9% ✅ appsec-telemetryTime: ✅ 4.588ms (SLO: <4.750ms -3.4%) vs baseline: -0.1% Memory: ✅ 64.088MB (SLO: <66.500MB -3.6%) vs baseline: +4.9% ✅ debuggerTime: ✅ 1.857ms (SLO: <2.000ms -7.1%) vs baseline: +0.3% Memory: ✅ 47.889MB (SLO: <49.500MB -3.3%) vs baseline: +5.0% ✅ iast-getTime: ✅ 1.856ms (SLO: <2.000ms -7.2%) vs baseline: -0.3% Memory: ✅ 44.597MB (SLO: <49.000MB -9.0%) vs baseline: +4.6% ✅ profilerTime: ✅ 1.925ms (SLO: <2.100ms -8.3%) vs baseline: -0.5% Memory: ✅ 48.441MB (SLO: <50.000MB -3.1%) vs baseline: +4.8% ✅ resource-renamingTime: ✅ 3.370ms (SLO: <3.650ms -7.7%) vs baseline: -0.3% Memory: ✅ 54.767MB (SLO: <56.000MB -2.2%) vs baseline: +5.0% ✅ tracerTime: ✅ 3.366ms (SLO: <3.650ms -7.8%) vs baseline: -0.1% Memory: ✅ 54.672MB (SLO: <56.500MB -3.2%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 3.364ms (SLO: <3.650ms -7.8%) vs baseline: +0.4% Memory: ✅ 54.654MB (SLO: <60.000MB -8.9%) vs baseline: +4.8% 🟡 recursivecomputation - 8/8✅ deepTime: ✅ 308.836ms (SLO: <320.950ms -3.8%) vs baseline: -0.3% Memory: ✅ 35.861MB (SLO: <36.500MB 🟡 -1.7%) vs baseline: +6.0% ✅ deep-profiledTime: ✅ 330.500ms (SLO: <359.150ms -8.0%) vs baseline: -0.3% Memory: ✅ 39.204MB (SLO: <40.500MB -3.2%) vs baseline: +4.7% ✅ mediumTime: ✅ 6.999ms (SLO: <7.400ms -5.4%) vs baseline: ~same Memory: ✅ 34.583MB (SLO: <35.500MB -2.6%) vs baseline: +4.6% ✅ shallowTime: ✅ 0.943ms (SLO: <1.050ms 📉 -10.2%) vs baseline: +0.3% Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +5.0%
|
Description
This PR adds a helper function to add a JSON-formatting handler to the root logger and clear all other handlers which can be passed to Ray's worker_process_setup_hook on ray.init to configure logging on other libraries of interest such as PyTorch, for example:
This helper function is NOT applied by default, so the user can decide whether it is appropriate for their specific needs. For example if the user has other tools expecting a specific log format they may not want to apply this.
However, we believe in most cases the user will want to format logs as JSON to enable tracer context injection and correlation with traces in the Dist AI Obs UI.
The
encoding="JSON"in logging_config only appear to affect messages directly emitted by the Ray logger - However most interesting messages for practical jobs are usually emitted by the main backend libraries used in the job, such as PyTorch, etc.Passing this function to worker_process_setup_hook results in it running LATE in the worker setup process, and unlike attempting the same thing in the --tracing-startup-hook (which runs early in the worker process setup) this configuration does not get blown away by Ray or third party libraries attempting something similar. According to the Ray docs, a worker_process_setup_hook is the recommended way to configure logging.
Testing
Risks
Minimal, since this is off by default.
Additional Notes