-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
SARIF reporter #10759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
SARIF reporter #10759
Conversation
Pierre-Sassoulas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this is great.
I'm not a SARIF expert so I have some code placement comments, and I hope someone more knowledgeable in SARIF will have an opinion about the actual content.
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (97.93%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #10759 +/- ##
=======================================
Coverage 95.98% 95.99%
=======================================
Files 176 177 +1
Lines 19560 19657 +97
=======================================
+ Hits 18775 18870 +95
- Misses 785 787 +2
🚀 New features to boost your workflow:
|
Maybe @nvuillam is still active since they're the original requestor. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
d2110f6 to
940ce59
Compare
This comment has been minimized.
This comment has been minimized.
|
As long as the sarif validator validates the SARIF output, it's good :) |
This comment has been minimized.
This comment has been minimized.
997fd0c to
e64b059
Compare
SARIF is a unified file format used to exchange information between static analysis tools (like pylint) and various types of formatters, meta-runners, broadcasters / alert system, ... This implementation is ad-hoc, and non-validating. Spec v Github ------------- Turns out Github both doesn't implement all of SARIF (which makes sense) and requires a bunch of properties which the spec considers optional. The [official SARIF validator][] (linked to by both oasis and github) was used to validate the output of the reporter, ensuring that all the github requirements it flags are fulfilled, and fixing *some* of the validator's pet issues. As of now the following issues are left unaddressed: - azure requires `run.automationDetails`, looking at the spec I don't think it makes sense for the reporter to inject that, it's more up to the CI - the validator wants a `run.versionControlProvenance`, same as above - the validator wants rule names in PascalCase, lol - the validator wants templated result messages, but without pylint providing the args as part of the `Message` that's a bit of a chore - the validator wants `region` to include a snippet (the flagged content) - the validator wants `physicalLocation` to have a `contextRegion` (most likely with a snippet) On URIs ------- The reporter makes use of URIs for artifacts (~files). Per ["guidance on the use of artifactLocation objects"][3.4.7], `uri` *should* capture the deterministic part of the artifact location and `uriBaseId` *should* capture the non-deterministic part. However as far as I can tell pylint has no requirement (and no clean way to require) consistent resolution roots: `path` is just relative to the cwd, and there is no requirement to have project-level files to use pylint. This makes the use of relative uris dodgy, but absolute uris are pretty much always broken for the purpose of *interchange* so they're not really any better. As a side-note, Github [asserts][relative-uri-guidance] > While this [nb: `originalUriBaseIds`] is not required by GitHub for > the code scanning results to be displayed correctly, it is required > to produce a valid SARIF output when using relative URI references. However per [3.4.4][] this is incorrect, the `uriBaseId` can be resolved through end-user configuration, `originalUriBaseIds`, external information (e.g. envvars), or heuristics. It would be nice to document the "relative root" via `originalUriBaseIds` (which may be omitted for that purpose per [3.14.14][], but per the above claiming a consistent project root is dodgy. We *could* resolve known project files (e.g. pyproject.toml, tox.ini, etc...) in order to find a consistent root (project root, repo root, ...) and set / use that for relative URIs but that's a lot of additional complexity which I'm not sure is warranted at least for a first version. Fixes pylint-dev#5493 [3.4.4]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html#_Toc10540869 [3.4.7]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html#_Toc10540872 [3.14.14]: https://docs.oasis-open.org/sarif/sarif/v2.1.0/csprd01/sarif-v2.1.0-csprd01.html#_Toc10540936 [relative-uri-guidance]: https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#relative-uri-guidance-for-sarif-producers [official SARIF validator]: https://sarifweb.azurewebsites.net/
|
🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉 This comment was generated for commit ebf822c |
|
Not entirely sure how to fix the coverage issue as it's mostly for the conversion of windows paths to relative URIs. |
Pierre-Sassoulas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation LGTM, it seems we don't need to cover everything as long as we add sarif validation in the test suite ?
Not sure it's that useful: the type specs should be the useful (to pylint) subset of the json schema (modulo a rule or two which are painful to type), while the "official" validator with additional suggestions or service-specific requirements doesn't seem to be available as anything but a webservice of sorts? |
|
pylint maintainers will probably have to deal with bug fixes on this, so we need to have something to make sure this does not regress somehow. We do have windows specific CI jobs though, so we can cover the windows specific path. |
SARIF is a unified file format used to exchange information between static analysis tools (like pylint) and various types of formatters, meta-runners, broadcasters / alert system, ...
This implementation is ad-hoc, and non-validating.
Spec v Github
Turns out Github both doesn't implement all of SARIF (which makes sense) and requires a bunch of properties which the spec considers optional. The official SARIF validator (linked to by both oasis and github) was used to validate the output of the reporter, ensuring that all the github requirements it flags are fulfilled, and fixing some of the validator's pet issues.
As of now the following issues are left unaddressed:
run.automationDetails, looking at the spec I don't think it makes sense for the reporter to inject that, it's more up to the CIrun.versionControlProvenance, same as aboveMessagethat's a bit of a choreregionto include a snippet (the flagged content)physicalLocationto have acontextRegion(most likely with a snippet)On URIs
The reporter makes use of URIs for artifacts (~files). Per "guidance on the use of artifactLocation objects",
urishould capture the deterministic part of the artifact location anduriBaseIdshould capture the non-deterministic part. However as far as I can tell pylint has no requirement (and no clean way to require) consistent resolution roots:pathis just relative to the cwd, and there is no requirement to have project-level files to use pylint. This makes the use of relative uris dodgy, but absolute uris are pretty much always broken for the purpose of interchange so they're not really any better.As a side-note, Github asserts
However per 3.4.4 this is incorrect, the
uriBaseIdcan be resolved through end-user configuration,originalUriBaseIds, external information (e.g. envvars), or heuristics.It would be nice to document the "relative root" via
originalUriBaseIds(which may be omitted for that purpose per 3.14.14, but per the above claiming a consistent project root is dodgy.We could resolve known project files (e.g. pyproject.toml, tox.ini, etc...) in order to find a consistent root (project root, repo root, ...) and set / use that for relative URIs but that's a lot of additional complexity which I'm not sure is warranted at least for a first version.
Fixes #5493