-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Capture code snippet from diagnostic compiler output #9362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* Built tentative test class SwiftBuildSystemOutputParser to handle the compiler output specifically * Added a handleDiagnostic method to possibly substitute the emitEvent local scope implementation of handling a SwiftBuildMessage diagnostic
* the flag `appendToOutputStream` helps us to determine whether a diagnostic is to be emitted or whether we'll be emitting the compiler output via OutputInfo * separate the emitEvent method into the SwiftBuildSystemMessageHandler
a20f020 to
f3aaabf
Compare
1dcaaeb to
c48e606
Compare
|
@swift-ci please test |
owenv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thie generally lgtm but I have some concerns about the regex-based parsing when we emit the textual compiler output.
- Perf - It's important this is fast so that it doesn't block the end of the build if a command produces huge quantities of output. It's hard to say if this will be a real issue without some testing
- We're re-parsing information which we're already getting from the compiler in structured form. I see the appeal of not reporting a diagnostic twice if multiple compile jobs report it though
| targetsByID[target.targetID] = target | ||
| } | ||
|
|
||
| mutating func target(for task: SwiftBuild.SwiftBuildMessage.TaskStartedInfo) throws -> SwiftBuild.SwiftBuildMessage.TargetStartedInfo? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be mutating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not, will modify this
| buildState.appendToBuffer(taskSignature, data: info.data) | ||
| } | ||
|
|
||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're unable to get a taskID, this may be target/global output, so I think we should print it the best we can instead of returning and silently dropping it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also suggested using a locationContext as the key, which might help.
It might be about time to clean up these deprecated APIs, too. I recall there was something missing from the new ones which blocked total removal.
|
|
||
| // If we've captured the compiler output with formatted diagnostics, emit them. | ||
| // if let buffer = buildState.dataBuffer(for: startedInfo) { | ||
| emitDiagnosticCompilerOutput(startedInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be something other than a compile task which emits output with different formatting than we expect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely. There are some guards in emitDiagnosticCompilerOutput that checks whether there is an existing data buffer that we've logged from the received SwiftBuildMessage.OutputInfo case so if this is anything else it will not be emitted.
Is it preferred to have this check be more explicit re: asserting this before calling this method?
|
|
||
| /// Split compiler output into individual diagnostic segments | ||
| /// Format: /path/to/file.swift:line:column: severity: message | ||
| private func splitIntoDiagnostics(_ output: String) -> [ParsedDiagnostic] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be doing any output parsing at this level, Swift Build already has an entire subsystem dedicated to doing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakepetroules I would agree, however the events we're receiving from SwiftBuild accumulates every diagnostic message into a singular data buffer -- when emitting this as-is, it's possible that we'd be coupling some info-level diagnostics with error-level diagnostics with no way to separate the severity. Splitting the string on a per-diagnostic basis is the only way to achieve this in this way.
The observability scope that we use to emit these diagnostics will capture the entire string blob, and we have to decide with what severity to emit the entire message. I'm not sure if there's an alternative path here that would give us the same ergonomics on the user front, but IMO it's preferred to separate each diagnostic and ensure that we're emitting them with the appropriate severity, rather than sending the entire string of possibly many diagnostics with varying severities.
| struct BuildState { | ||
| private var targetsByID: [Int: SwiftBuild.SwiftBuildMessage.TargetStartedInfo] = [:] | ||
| private var activeTasks: [Int: SwiftBuild.SwiftBuildMessage.TaskStartedInfo] = [:] | ||
| private var taskBuffer: [String: Data] = [:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of a string as the key, can you use a SwiftBuildMessage.LocationContext?
| buildSystem.delegate?.buildSystem(buildSystem, didFinishCommand: BuildSystemCommand(startedInfo, targetInfo: targetInfo)) | ||
| if let targetName = targetInfo?.targetName { | ||
| serializedDiagnosticPathsByTargetName[targetName, default: []].append(contentsOf: startedInfo.serializedDiagnosticsPaths.compactMap { | ||
| try? Basics.AbsolutePath(validating: $0.pathString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just let this throw, if this isn't an absolute path something has gone very wrong.
| self.observabilityScope.emit(info: "\(info.executionDescription)") | ||
| } | ||
|
|
||
| if self.logLevel.isVerbose { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not write any output on task started events and instead always defer its emission to task completed events. The reason is that multiple tasks can start in parallel, and the outputStream content emitted when the taskStarted and taskEnded events are emitted may not be reasonably ordered at all. so you could have:
- task 1 header
- task 2 output
- task 1 header
- task 2 output
...which will be really confusing since task 2's output will look like it belongs to task 1, and so on.
The observabilityScope stuff might be OK, provided that doesn't also end up in the textual output stream and is only for API clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation! I'm in agreement -- this is some legacy implementation that I moved from the inline method to this class, so it's possible that this kind of behaviour is sprinkled elsewhere in emitEvent. I'll keep an eye on that :)
I also pointed this out inline, though I'm not sure why the re-parsing relates to deduplication? We should be able to deduplicate diagnostics whether we parse them again or use the existing ones. |
This PR refactors diagnostic handling in the Swift build system by introducing a dedicated message handler and per-task output buffering to properly parse and emit compiler diagnostics individually.
Key Changes
SwiftBuildSystemMessageHandler
SwiftBuildMessageevents from the build operationPer-Task Data Buffering
taskBuffer: [String: Data]dictionary inBuildStateto capture compiler output per task signature.outputmessages arriveDiagnostic Parsing & Emission
splitIntoDiagnostics()method that uses regex to parse compiler output into individualdiagnostic segments (
ParsedDiagnostic)DiagnosticInfodepending on whether the flagappendToOutputStreamis true.carets, etc.)