Skip to content

Conversation

@jeremylong
Copy link
Collaborator

Description of Change

First of three PRs around cleaning up the differences between the generated and base suppression file. This PR enhances the build to download the current generated suppression file, packages it into the core lib, and loads both the base and generated suppression files during execution.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a mechanism to package a generated suppressions file with the application during build time by downloading it from a hosted URL. The downloaded file is then loaded alongside the existing base suppression file.

  • Adds Maven exec plugin to download suppressions from a hosted URL during the build process
  • Extends the suppression loading logic to include the downloaded generated-suppressions.xml file
  • Refactors the packaged suppression loading method to accept a filename parameter for reusability

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
pom.xml Adds version management for exec-maven-plugin
core/pom.xml Configures exec-maven-plugin to download suppressions file using wget during build
core/src/main/java/org/owasp/dependencycheck/analyzer/AbstractSuppressionAnalyzer.java Adds constant for generated suppressions filename and refactors loading logic to support both base and generated suppression files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

nhumblot
nhumblot previously approved these changes Nov 12, 2025
@chadlwilson chadlwilson self-assigned this Nov 12, 2025
@chadlwilson
Copy link
Collaborator

With this approach, ODC needs to load the generatedSuppressions twice, once off packaged, and again off "hosted" suppressions within the HostedSuppressionsDataSource - which will largely overlap. Does it de-duplicate these in memory at least? Is this a significant source of slowness?

It's slightly more complex, but I wonder if it would be cleaner to make this self contained somewhere within the below to just pre-populate the user's cache in the data directory on disk; to keep all logic and workflow related to these suppressions self-contained, and with consistent language?

With this approach it would happen

  • on first use, prior to any attempt to update
  • after hosted suppressions are purged
  • regardless of whether hosted suppressions updates are enabled.

if (!repoFile.isFile() || repoFile.length() <= 1L) {
repoEmpty = true;
LOGGER.warn("Hosted Suppressions file is empty or missing - attempting to force the update");
getSettings().setBoolean(Settings.KEYS.HOSTED_SUPPRESSIONS_FORCEUPDATE, true);
}

/**
* The file name of the generated suppression XML file.
*/
private static final String GENERATED_SUPPRESSION_FILE = "generated-suppressions.xml";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this approach, perhaps we could try and align on language to make it a bit less confusing.

Right now it is referred to as

  • "hosted" suppressions (main runtime terminology at runtime in ODC; but implies a remote source)
  • "generated" suppressions (really refers to the automation workflows and the special branch they are tracked on)
  • "published" suppressions (the file name when promoted to GitHub pages from source control, and bootstrapped here)

I wonder if referring to as dependencycheck-published-suppression-snapshot.xml would be better, or dependencycheck-hosted-suppression-snapshot.xml.

@chadlwilson chadlwilson removed their assignment Nov 18, 2025
Co-authored-by: Chad Wilson <[email protected]>
@nhumblot
Copy link
Collaborator

With this approach, ODC needs to load the generatedSuppressions twice, once off packaged, and again off "hosted" suppressions within the HostedSuppressionsDataSource - which will largely overlap. Does it de-duplicate these in memory at least? Is this a significant source of slowness?

It's slightly more complex, but I wonder if it would be cleaner to make this self contained somewhere within the below to just pre-populate the user's cache in the data directory on disk; to keep all logic and workflow related to these suppressions self-contained, and with consistent language?

With this approach it would happen

* on first use, prior to any attempt to update

* after hosted suppressions are purged

* regardless of whether hosted suppressions updates are enabled.

if (!repoFile.isFile() || repoFile.length() <= 1L) {
repoEmpty = true;
LOGGER.warn("Hosted Suppressions file is empty or missing - attempting to force the update");
getSettings().setBoolean(Settings.KEYS.HOSTED_SUPPRESSIONS_FORCEUPDATE, true);
}

On my machine, a forced update took 148ms to be loaded on 4G connection with a phone.

We seem to load all elements in a List<SuppressionRule> and we add them through a void addAll(List<SuppressionRule>) call. I don't see any de-duplication so my guess is we are duplicating suppression rules between base and hosted suppressions as of now. It does not change much from now, so I may say it can be a future improvement opportunity (impact would have to be more thoroughly investigated) that could be tracked, and not a necessary change in this PR 🤔

</testResources>
<plugins>
<plugin>
<groupId>com.googlecode.maven-download-plugin</groupId>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: this seems to generate the following warning when performing a mvn -s settings.xml clean install:

[WARNING] 
[WARNING] Some problems were encountered while building the effective model for org.owasp:dependency-check-core:jar:12.1.10-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for com.googlecode.maven-download-plugin:download-maven-plugin is missing. @ line 103, column 21
[WARNING] 
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING] 
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<groupId>com.googlecode.maven-download-plugin</groupId>
<groupId>io.github.download-maven-plugin</groupId>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, missed this in my proposed edit earlier - thx!

@chadlwilson
Copy link
Collaborator

chadlwilson commented Nov 19, 2025

My concern is more about evaluation/analysis rather than loading the file.

I haven't looked at how smart the algorithm is or how parallelized, but my concern would be that there is an O(n^2) issue here, e.g if every suppression is evaluated for every single possible dependency. We also don't really have any automation to ensure the regexes aren't insanely expensive which might have a multiplicative effect.

On a bigger project with, say, 1,000 dependencies (JavaScript!!) and lots of FPs due to use of CLI and generic dependency names, that might be an issue - either slowness or excessive junk/GC production.

Aside from perf, I also think it's a bit confusing to have 3 data sources in the code to reason about, which is why I suggest using this file essentially as an initial cache value for HostedSuppressions instead (when working offline, or if stale) since it is not valid/useful to have both the packaged version and a version received remotely. I'm happy to test and suggest a PR to this one if @jeremylong wants to see what it might look like.

That'd avoid having to do any de-duplication of individual suppressions since we can generally assume there are very limited duplications between base and hosted/generated/published (especially after Jeremy's adjacent cleanup PRs).

I think that helps us as it's already quite tough figuring out why we might not be able to replicate a given FP given combinations of (ODC version, ODC variant, enabled analyzers, enabled data sources, available data sources, hosted suppressions currentness, offline status, available local tools)

@jeremylong
Copy link
Collaborator Author

I like the idea of the embedded being the initial cache - especially if you don't mind putting together the PR ;).

I was also going to make sure there was a mechanism to prevent loading the same rule twice...

@jeremylong jeremylong added this to the 12.2.0 milestone Nov 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core changes to core

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants