Utilities for downloading OSV data, enriching vulnerabilities with a recidivism metric, and cloning referenced source repositories locally.
Copy the default config and edit your local paths:
    cp recidivism.default.ini recidivism.ini

Both scripts read settings from recidivism.ini. If that file is missing, the
scripts print guidance and fall back to recidivism.default.ini.
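
The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the scripts' actual code: the function name `load_config`, the wording of the guidance message, and the `[paths]` section used in the usage note are assumptions.

```python
import configparser
from pathlib import Path


def load_config(directory="."):
    """Read recidivism.ini, falling back to recidivism.default.ini.

    Hypothetical helper: the real scripts may structure this differently.
    """
    base = Path(directory)
    local = base / "recidivism.ini"
    default = base / "recidivism.default.ini"

    if local.exists():
        chosen = local
    else:
        # The message text is illustrative only.
        print(f"{local.name} not found; copy {default.name} to "
              f"{local.name} and edit your local paths. Using defaults.")
        chosen = default

    parser = configparser.ConfigParser()
    # open() raises FileNotFoundError if the default file is missing too.
    with chosen.open(encoding="utf-8") as handle:
        parser.read_file(handle)
    return parser, chosen
```

A caller could then read a setting with something like `parser.get("paths", "osv_dir")`, assuming the ini file defines such a section and key.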
    python scripts/enrich_osv_recidivism.py \
        --output data/osv_recidivism.jsonl

This script:
- downloads the OSV dump (OSV-all.zip by default),
- extracts all vulnerabilities,
- computes a recidivism metric using CWE recurrence and repository/fix history,
- appends recidivism details to each vulnerability and writes JSONL output.
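
To make the extract, score, and write steps above concrete, here is a simplified sketch. It is not the script's actual metric: it only counts how many other records share a CWE ID and ignores repository/fix history entirely. The `recidivism` output key, the function names, and the assumption that CWE IDs appear under `database_specific.cwe_ids` (as they do in some OSV sources) are illustrative.

```python
import json
import zipfile
from collections import Counter


def iter_vulnerabilities(zip_path):
    """Yield each JSON record stored in an OSV dump archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith(".json"):
                with archive.open(name) as handle:
                    yield json.load(handle)


def cwe_ids(vuln):
    # Assumption: CWE IDs are listed under database_specific.cwe_ids.
    return (vuln.get("database_specific") or {}).get("cwe_ids") or []


def enrich(zip_path, output_path):
    """Write one JSON object per line with a placeholder recidivism field."""
    # Holds every record in memory; a full dump may need a two-pass approach.
    records = list(iter_vulnerabilities(zip_path))
    counts = Counter(cwe for v in records for cwe in set(cwe_ids(v)))

    with open(output_path, "w", encoding="utf-8") as out:
        for vuln in records:
            # Placeholder score: other records sharing any of this record's CWEs.
            recurrence = sum(counts[c] - 1 for c in set(cwe_ids(vuln)))
            vuln["recidivism"] = {"cwe_recurrence": recurrence}
            out.write(json.dumps(vuln) + "\n")
    return len(records)
```

Appending the score under a single new key leaves the original OSV fields untouched, so downstream tools that already parse OSV records keep working on the JSONL output.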

    python scripts/clone_osv_repositories.py \
        --osv-dir data/osv_dump \
        --target-dir data/repos \
        --update-existing

This script scans OSV vulnerabilities for GitHub source references and
clones or updates local copies for research workflows (organized as
<target-dir>/<owner>/<repo>).
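
As an illustration of how GitHub references might map onto the `<target-dir>/<owner>/<repo>` layout, the sketch below pulls candidate URLs from the `references[].url` and GIT-type `affected[].ranges[].repo` fields of an OSV record. The helper names and the regular expression are assumptions, and the sketch does not filter out non-repository paths such as `github.com/advisories/...`.

```python
import re
from pathlib import Path

# Matches the owner and repository segments of a GitHub URL.
GITHUB_RE = re.compile(
    r"https?://github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)"
)


def github_repos(vuln):
    """Return a set of (owner, repo) pairs referenced by one OSV record."""
    urls = [ref.get("url", "") for ref in vuln.get("references", [])]
    for affected in vuln.get("affected", []):
        for rng in affected.get("ranges", []):
            if rng.get("type") == "GIT":
                urls.append(rng.get("repo", ""))

    found = set()
    for url in urls:
        match = GITHUB_RE.match(url)
        if match:
            owner, repo = match.groups()
            if repo.endswith(".git"):
                repo = repo[:-4]  # drop the clone-URL suffix
            found.add((owner, repo))
    return found


def target_path(target_dir, owner, repo):
    """Local directory for a clone: <target-dir>/<owner>/<repo>."""
    return Path(target_dir) / owner / repo
```

Returning a set means a repository referenced several times in one record (for example by both a commit link and a range entry) is only cloned once.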

    python scripts/cleanup_empty_repos.py --path data/repos --dry-run

The script cleanup_empty_repos.py deletes empty repository directories that were created during the cloning process. These repositories either no longer exist or have been made private. This command performs a dry run and makes no permanent changes.

    python scripts/cleanup_empty_repos.py --path data/repos --yes

This command runs the script and removes the empty directories without prompting the user.
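
The dry-run versus delete distinction can be sketched as below. This assumes "empty" means a `<owner>/<repo>` directory with no entries at all; the actual script may use a different test, and the function names are hypothetical.

```python
from pathlib import Path


def find_empty_repos(root):
    """Return <owner>/<repo> directories under root that contain nothing."""
    empty = []
    for owner in sorted(Path(root).iterdir()):
        if not owner.is_dir():
            continue
        for repo in sorted(owner.iterdir()):
            if repo.is_dir() and not any(repo.iterdir()):
                empty.append(repo)
    return empty


def cleanup(root, dry_run=True):
    """List empty repo directories; delete them only when dry_run is False."""
    targets = find_empty_repos(root)
    for repo in targets:
        if dry_run:
            print(f"would remove {repo}")
        else:
            # rmdir() refuses to delete non-empty directories, which guards
            # against removing a clone that gained content in the meantime.
            repo.rmdir()
    return targets
```

Defaulting `dry_run` to `True` keeps the destructive path opt-in, mirroring the separate `--dry-run` and `--yes` invocations above.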