Interactive data visualizations and analysis of the November 2025 House Oversight Committee Epstein document release.
- Embedding Cluster Map - UMAP projection of all 69,290 document embeddings showing semantic clusters
- Network Graph - Co-occurrence network of 31 named entities with 110 connections
- Document Distribution - Breakdown by document type and volume
- RAG Mining Report - Comprehensive findings with fact-checking, source citations, and novelty classifications
- Visualization Summary - Statistics and methodology
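
As a rough illustration of what a co-occurrence network is, the sketch below counts how many text chunks mention each pair of entities. It is not the method used to build the graph in this repository: the entity names, the sample chunks, the `cooccurrence_edges` helper, and the case-insensitive substring matching are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity list and text chunks, for illustration only.
ENTITIES = ["Alpha Corp", "Beta Fund", "Gamma Trust"]
CHUNKS = [
    "Alpha Corp wrote to Beta Fund about the schedule.",
    "Beta Fund and Gamma Trust met; Alpha Corp was copied.",
    "Gamma Trust sent a memo.",
]


def cooccurrence_edges(chunks, entities):
    """Count how many chunks mention each unordered pair of entities."""
    counts = Counter()
    for text in chunks:
        lowered = text.lower()
        # Assumption: naive case-insensitive substring matching.
        present = sorted(e for e in entities if e.lower() in lowered)
        # Every pair of entities found in the same chunk gets one count.
        counts.update(combinations(present, 2))
    return counts


edges = cooccurrence_edges(CHUNKS, ENTITIES)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

Each counted pair can then be added to a networkx graph with `Graph.add_edge(a, b, weight=weight)` for layout and analysis.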
- Total Documents: 69,290 chunks
- Embedding Model: Ollama nomic-embed-text (768-dim)
- Source: House Oversight Committee Epstein Document Release (November 2025)
- BoxesBlue/epstein-files-nov11-25-house-post-ocr-embeddings - Full embeddings dataset
- tensonaut/EPSTEIN_FILES_20K - Original OCR'd documents
Chomsky timeline extended through May 2019 - Communications are documented through May 26, 2019 (weeks before the July 2019 arrest), including a December 2018 "all in" quote for a documentary project.
- Barry J. Cohen / JEGE Inc correspondence (October 2017)
- Nobel Charitable Trust Symposium (September 10, 2010)
- Seminar-POWER guest list specifics
- $35M Harvard donation claim (actual was $6.5M per Harvard investigation)
- Some operational details were already publicly known (Visoski testimony 2021, etc.)
See RAG Mining Report for full details.
# Generate visualizations
python epstein_visualizations.py
- umap-learn - Dimensionality reduction
- plotly - Interactive visualizations
- networkx - Graph analysis
- pandas - Data manipulation
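
Before running the script, it can help to confirm that the four dependencies are importable. The following is a minimal sketch using only the standard library; the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced here, not part of the repository. Note that the `umap-learn` package is imported under the module name `umap`.

```python
import importlib.util

# Maps pip package name -> import name. umap-learn is imported as `umap`.
REQUIRED = {
    "umap-learn": "umap",
    "plotly": "plotly",
    "networkx": "networkx",
    "pandas": "pandas",
}


def missing_packages(required):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages; try: pip install " + " ".join(missing))
else:
    print("All dependencies found.")
```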
These documents include both verified law enforcement records AND Epstein's own promotional materials. Cross-reference all findings with independent sources.
Data is from public government document releases. Visualizations and analysis are provided for research purposes.
Generated November 2025