Add ProjectDiscovery Cloud integration with changelog datastream #15760
packages/projectdiscovery_cloud/data_stream/vulnerability/agent/stream/stream.yml.hbs
clement-fouque
left a comment
I haven't been able to ingest data into the stack. I have the following error:
"error": {
  "message": [
    "failed eval: ERROR: <input>:26:32: no such overload\n | (resp.StatusCode == 200) ?\n | ...............................^",
    "Processor json with tag json_event_original in pipeline logs-projectdiscovery_cloud.changelogs-0.1.1 failed with message: field [original] not present as part of path [event.original]"
  ]
},

- name: base_url
  type: text
  title: ProjectDiscovery Cloud API Base URL
  description: The base URL for the ProjectDiscovery Cloud API (e.g., https://api.projectdiscovery.io)
  multi: false
  required: true
  show_user: true
  default: https://api.projectdiscovery.io
- name: api_key
  type: password
  title: API Key
  description: The API key for authenticating to ProjectDiscovery Cloud.
  multi: false
  required: true
  show_user: true
  secret: true
- name: team_id
  type: text
  title: Team ID
  description: The Team ID for your ProjectDiscovery Cloud account.
  multi: false
  required: true
  show_user: true
**Minimum versions:**
- Kibana: `^9.1.0`
- Elasticsearch: Compatible with Kibana version
- Elastic subscription: `platinum`
I believe this is incorrect, since the integration should be available in the basic/open-source subscription.
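For reference, the subscription tier is normally declared in the package `manifest.yml` as well as in the README. A minimal sketch, assuming the package-spec `conditions.elastic.subscription` key and that `basic` is the intended tier, as suggested in the comment above:

```yaml
# manifest.yml (fragment): illustrative sketch, not the PR's actual manifest
conditions:
  elastic:
    subscription: basic   # assumption: basic tier, per the review comment
```

If the README states a different tier than the manifest, the manifest is what the stack enforces, so the two are worth keeping in sync.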
@@ -0,0 +1,499 @@
# ProjectDiscovery Cloud Integration
Documentation is automatically generated if you define it in packages/XXX/_dev/build/docs/README.md
Force-pushed from 5d9cfc9 to 3c4a36e.
qcorporation
left a comment
I'll want to modify
- integrations/.github/ISSUE_TEMPLATE/integration_bug.yml
- integrations/.github/ISSUE_TEMPLATE/integration_feature_request.yml
warning: incomplete review - we'll need to come back to this
@@ -0,0 +1,21 @@
# newer versions go on top
You'll want to collapse this into one entry as there will be one PR to reference to merge into main
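A collapsed `changelog.yml` could look roughly like the sketch below. The version `0.1.1` matches the pipeline name in the error reported earlier in this thread and the PR link is this pull request; the description text is only a placeholder assumption:

```yaml
# changelog.yml: illustrative sketch of a single-entry changelog
# newer versions go on top
- version: "0.1.1"
  changes:
    - description: Initial release of the ProjectDiscovery Cloud integration.  # placeholder wording
      type: enhancement
      link: https://github.com/elastic/integrations/pull/15760
```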
@@ -0,0 +1,191 @@
# ProjectDiscovery Cloud Integration
You'll most likely want to follow the documentation template that's been created by our team: https://www.elastic.co/docs/extend/integrations/documentation-guidelines
Reach out to @mjwolf and the Docs II team as he'll have the ability to auto-generate this documentation for you.
  - security
conditions:
  kibana:
    version: "^9.1.0"
Do we want to offer this integration to 8.x stack as well?
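One way to cover both major versions is a semver range joined with `||`. A sketch, assuming 8.18 is the oldest 8.x release the CEL program has been verified against (that lower bound is an assumption and should be checked against the features the program uses):

```yaml
# manifest.yml (fragment): illustrative sketch
conditions:
  kibana:
    version: "^8.18.0 || ^9.0.0"   # assumption: 8.18+ and all 9.x
```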
type: integration
categories:
  - security
You might want to consider adding `- cloud` here.
title: ProjectDiscovery Cloud
description: Collect vulnerability changelogs and export results from ProjectDiscovery Cloud
inputs:
  - type: cel
Does this CEL input require SSL configuration?
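If SSL settings do turn out to be needed (for example a custom CA or a TLS-inspecting proxy), other CEL-based packages commonly expose them as an optional YAML variable in the data stream manifest. A hedged sketch; the title, description, commented-out defaults, and the CA path are illustrative assumptions, and the variable still has to be wired into `stream.yml.hbs`:

```yaml
# data_stream/<name>/manifest.yml (vars fragment): illustrative sketch
- name: ssl
  type: yaml
  title: SSL Configuration
  description: Optional SSL/TLS settings, for example a custom certificate authority.
  multi: false
  required: false
  show_user: false
  default: |
    #verification_mode: full
    #certificate_authorities:
    #  - /path/to/ca.pem   # hypothetical path
```

Leaving every default commented out means standard HTTPS endpoints keep working without any user input.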
  type: keyword
- name: remediation
  type: text
- name: reference
can we use vulnerability.reference?
  type: keyword
- name: category
  type: keyword
- name: request
can we use any of the ecs http.request.* fields?
  type: keyword
- name: request
  type: keyword
- name: response
can we use any of the http.response.* fields?
- name: projectdiscovery
  type: group
  fields:
    - name: target
could you use destination.* within ecs, possibly destination.address
  type: keyword
- name: vuln_hash
  type: keyword
- name: scan_id
can you use vulnerability.report_id
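Taken together, the ECS suggestions in the comments above could be expressed as `rename` processors in the ingest pipeline. This is only a sketch: the `json.*` source field names are assumptions based on the field definitions quoted in this review, and mapping the raw request and response to the `*.body.content` fields is one possible choice among the `http.request.*` / `http.response.*` fields:

```yaml
# ingest pipeline fragment: illustrative sketch; json.* source names are assumptions
processors:
  - rename:
      field: json.reference
      target_field: vulnerability.reference
      ignore_missing: true
  - rename:
      field: json.scan_id
      target_field: vulnerability.report_id
      ignore_missing: true
  - rename:
      field: json.target
      target_field: destination.address
      ignore_missing: true
  - rename:
      field: json.request
      target_field: http.request.body.content
      ignore_missing: true
  - rename:
      field: json.response
      target_field: http.response.body.content
      ignore_missing: true
```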
clement-fouque
left a comment
I completed my review by adding multiple comments. Please let me know if you need clarification.
# Custom vulnerability fields not in ECS 9.2.0
- name: vulnerability.status
  type: keyword
  description: The state of the vulnerability (e.g., open, closed, resolved).
- name: vulnerability.scanner.type
  type: keyword
  description: The type of vulnerability scanner used (e.g., nuclei).
As mentioned by Dan here:
If you can avoid adding capitalised fields to published products, that would be ideal
I believe they should be removed.
@@ -0,0 +1,96 @@
config_version: 2
interval: {{interval}}
resource.max_executions: 1000
`max_executions` has a default value of 1000. I think it should either be removed or exposed as a field in the configuration (example).
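If the setting is kept, exposing it as a manifest variable might look like the sketch below. The variable name, title, and description are hypothetical, and the template would still need to reference the variable instead of the hard-coded `1000`:

```yaml
# data_stream/<name>/manifest.yml (vars fragment): illustrative sketch, names are hypothetical
- name: max_executions
  type: integer
  title: Maximum Executions
  description: Maximum number of times the CEL program may run within a single interval.
  multi: false
  required: false
  show_user: false
  default: 1000
```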
- rename:
    field: json.vuln_status
    target_field: vulnerability.status
    ignore_missing: true
This field doesn't exist in the ECS vulnerability field set. It must be removed.
- set:
    field: vulnerability.scanner.type
    value: nuclei
This field doesn't exist in the ECS vulnerability field set. It must be removed.
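One way to drop the two non-ECS fields without losing information is to keep the status under a vendor namespace and use the ECS `vulnerability.scanner.vendor` field for the scanner. A sketch, assuming a `projectdiscovery.vuln_status` keyword field is also declared in the data stream's field definitions:

```yaml
# ingest pipeline fragment: illustrative sketch
processors:
  - set:
      field: projectdiscovery.vuln_status   # vendor namespace instead of vulnerability.status
      copy_from: json.vuln_status
      ignore_empty_value: true
  - set:
      field: vulnerability.scanner.vendor   # ECS field, replaces vulnerability.scanner.type
      value: ProjectDiscovery
```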
state.with(
  {
    "base": state.url.trim_right("/") + "/v1/scans/vuln/changelogs",
    "q": {
For readability, it would be better to rename `p` to `post` and `q` to `query`.
@@ -0,0 +1,33 @@
- external: ecs
I'm not sure it's required as ECS fields could be inherited (to be confirmed). For example, in the Qualys GAV, cloud ECS fields are not manually defined: https://github.com/elastic/integrations/tree/main/packages/qualys_gav/data_stream/asset/fields
multi: false
required: false
show_user: true
default: low
I believe default should be removed (or set to low,medium,high,critical).
- date:
    field: json.created_at
    target_field: '@timestamp'
    formats:
      - ISO8601
    if: ctx.json?.created_at != null && ctx.json.created_at != ''
    on_failure:
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
- date:
    field: json.updated_at
    target_field: '@timestamp'
    formats:
      - ISO8601
    if: ctx.json?.created_at == null && ctx.json?.updated_at != null && ctx.json.updated_at != ''
    on_failure:
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
I don't like the behaviour of modifying `@timestamp` based on the created_at or updated_at fields. It makes trending impossible, unless we create a transform that stores daily values.
I would be in favour of deleting these processors so that the full export is stored at each interval.
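An alternative that keeps the vendor timestamps without overriding `@timestamp` is to parse them into vendor fields. A sketch, assuming `projectdiscovery.created_at` and `projectdiscovery.updated_at` are declared as `date` fields in the data stream; `@timestamp` then keeps the collection time set by the agent:

```yaml
# ingest pipeline fragment: illustrative sketch; @timestamp is deliberately not touched
processors:
  - date:
      field: json.created_at
      target_field: projectdiscovery.created_at
      formats:
        - ISO8601
      if: ctx.json?.created_at != null && ctx.json.created_at != ''
  - date:
      field: json.updated_at
      target_field: projectdiscovery.updated_at
      formats:
        - ISO8601
      if: ctx.json?.updated_at != null && ctx.json.updated_at != ''
```

With this shape each interval produces a fresh snapshot of the export, which is what makes trending over time possible.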
- set:
    field: event.module
    value: projectdiscovery_cloud
- set:
    field: event.dataset
    value: projectdiscovery_cloud.changelogs
Can you align field names, dataset name and tags to either projectdiscovery or projectdiscovery_cloud?
@@ -0,0 +1,89 @@
title: Collect Vulnerability Results from ProjectDiscovery Cloud
Force-pushed from a68ca88 to ca7cc91.
- Implements changelogs datastream for incremental vulnerability updates
- Includes optional target field sanitizer (gated by 'sanitize_target' tag) to clean malformed API responses
- Pipeline tests for both pass-through and sanitized cases
- Full ECS mapping with vulnerability fields
- Supports configurable interval, batch_size, and time_window
- Auto-generated documentation in _dev/build/docs/README.md
…ngelog per Quan’s review (elastic#15760)

Summary
- Map HTTP bodies to ECS:
  - http.request.body.content
  - http.response.body.content
- Map target → destination.address with IP/domain derivation
- Map scan_id → vulnerability.report_id
- Use vulnerability.reference exclusively (remove vendor duplicate)
- Use host.hostname exclusively (remove vendor duplicate)
- Map vendor tags → ECS tags; remove json.tags
- Set vulnerability.scanner.vendor=ProjectDiscovery; remove non-ECS scanner.type
- Remove vendor-specific duplicate fields (projectdiscovery.*)
- Add cloud category to package manifest
- Widen Kibana version support to ^8.18.0 || ^9.0.0
- Collapse changelog into a single 0.1.1 entry (per package guidelines)

Deferred (intentional)
- SSL/TLS configuration: optional; defer until a concrete need (custom CA, TLS-inspecting proxy, self-signed certs). If required, follow standard CEL pattern (manifest var + resource.ssl wiring).
- Documentation template: defer until @mjwolf returns to coordinate auto-generation and template adoption (align badges/structure).
- Issue templates: repo-wide infra; Quan indicated these will be handled centrally.

Impact
- Improves ECS compliance and consistency across data streams
- Reduces noise by removing non-ECS and vendor-duplicate fields
- Improves package discoverability (cloud category) and broadens compatibility (8.18+)

References
- Addresses Quan’s requested changes
- Changelog entry linked to PR elastic#15760
Force-pushed from ca7cc91 to e6ac7cd.
Force-pushed from e6ac7cd to e3f83fc.
Force-pushed from e3f83fc to c3b370b.
Implement Clement's requested changes for changelogs data stream:
- Switch vulnerability field mappings from `rename` to `set/copy_from` for better transparency
- Remove non-ECS `vulnerability.status` field mapping
- Add vendor-specific `projectdiscovery.vuln_status` field to preserve status information
- Update event message template to use `{{projectdiscovery.vuln_status}}`
- Remove `vulnerability.status` field definition from schema (not in ECS spec)
- Add `projectdiscovery.vuln_status` keyword field to schema
This change ensures better ECS compliance while preserving all vendor-specific
data in the projectdiscovery namespace, aligning with the preserve_duplicate_custom_fields pattern.
Address review feedback from Clement Fouque, including critical system test fixes and four enhancement areas.

## System Test Fixes
Fixed CEL syntax errors blocking system tests:
- Added missing closing parentheses in state.with() calls
- Removed trailing commas in CEL object literals
- Removed publisher_pipeline.disable_host setting that blocked data routing
System tests now pass: changelogs (4 hits), export (1 hit)

## SSL/TLS Configuration Support
Added optional SSL configuration for enterprise environments:
- New `ssl` variable in both data stream manifests
- Supports verification_mode, certificate_authorities, ca_trusted_fingerprint
- Default: all commented out, doesn't affect standard HTTPS

## Export Timestamp Semantics
Changed export data stream to use ingestion time for @timestamp:
- Removed date processors that toggled between created_at/updated_at
- Vendor timestamps preserved in projectdiscovery.created_at/updated_at
- Enables proper trending for snapshot-style exports

## Field Organization
- Removed redundant ecs.yml files (inherited via index templates)
- Removed subobjects: false from export manifest (fixes field nesting)
- Removed default: low from severity filter (matches "export all" description)

## Test Results
- ✅ Package check: Pass
- ✅ Pipeline tests: Pass (changelogs 4/4, export 2/2)
- ✅ System tests: Pass (4 hits, 1 hit)

Fixes elastic#15061
Co-authored-by: Clement Fouque <[email protected]>
Force-pushed from 7d39066 to 88d1688.
…ry-cloud-integration
Force-pushed from 2e45cee to d0dc964.
💚 Build Succeeded
Description
This PR implements the ProjectDiscovery Cloud integration with the changelogs and export data streams.
Current Status:
Implementation Details:
/v1/scans/vuln/changelogs
Relates to #15061
TODO:
/v1/scans/results/export endpoint (all vulnerability results)