Detect financial fraud in startups using federated graph intelligence and AI-generated audit reports.
SentinelFL enables investment firms to collaboratively identify fraud — circular transactions, shell companies, inflated revenue — without sharing confidential data. Upload financials, see the fraud network, get an AI-generated audit report.
┌────────────────────────────────────────────────────────────────┐
│ React + Vite + Tailwind CSS │
│ Dashboard · Companies · Graph Explorer · Reports · FL Status │
└──────────────────────┬─────────────────────────────────────────┘
│ REST API (proxied via Vite)
┌──────────────────────▼─────────────────────────────────────────┐
│ FastAPI Backend │
│ ┌──────────┐ ┌──────────────┐ ┌───────────┐ ┌──────────────┐ │
│ │ Graph │ │ Fraud │ │ ML │ │ Federated │ │
│ │ Builder │ │ Detector │ │ Models │ │ Learning │ │
│ │ NetworkX │ │ Rules Engine │ │ IsoForest │ │ FedAvg + DP │ │
│ └──────────┘ └──────────────┘ └───────────┘ └──────────────┘ │
│ ┌──────────────────┐ ┌─────────────────┐ ┌────────────────┐ │
│ │ Risk Scorer │ │ GenAI Reports │ │ Explainer │ │
│ │ Weighted Formula │ │ OpenAI/Template │ │ SHAP-like │ │
│ └──────────────────┘ └─────────────────┘ └────────────────┘ │
└────────────────────────────────────────────────────────────────┘
| Concept | In SentinelFL |
|---|---|
| Graph nodes | Startup / portfolio companies (company_id in companies.csv). |
| Transaction history | transactions.csv → aggregated stats, graph edges, anomaly detection. |
| Company reports | Structured fields in companies.csv (revenue vs. bank inflow, sector, status) — NLP + rules treat these as “report text” proxies; extend with real PDFs via NLPProcessor later. |
| Personal / director reports | directors.csv (overlaps, other_directorships) → network and overlap features. |
| NLP → fixed attributes | NLPProcessor + feature_schema.py produce one standardized row per company (transactions + company + director + graph + keyword/NLP signals). See GET /features and GET /features/{id}. |
| Fraud model | FraudModelTrainer (MLP) learns from pseudo-labels derived from red-flag features. |
| Federated learning | FederatedEngine groups companies into notional fund clients, runs FedAvg + DP on weights, produces a global model. |
| Score every company on the global model | fl_engine.score_companies() → fl_fraud_probability per company; blended into composite risk as historical_pattern_score (15%). |
| Bank-wide view | GET /portfolio/summary — all companies sorted by risk, counts, FL status. Per-company narrative: POST /generate-report. |
End-to-end flow in code: run_initial_analysis() in backend/main.py (graphs → anomalies → NLP features → FL train → fl_scores → RiskScorer + FraudDetector).
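
The "FedAvg + DP on weights" step from the table can be pictured with a small, dependency-free sketch. This is not the code in backend/federated/; the function names `clip_update` and `fedavg_with_dp`, the clipping norm, and the noise scale are illustrative assumptions, and a real engine would work on model tensors rather than plain lists.

```python
import math
import random


def clip_update(weights, max_norm):
    """Scale a weight vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm or norm == 0.0:
        return list(weights)
    scale = max_norm / norm
    return [w * scale for w in weights]


def fedavg_with_dp(client_weights, client_sizes, max_norm=1.0, noise_std=0.0, seed=None):
    """Average clipped client weights, weighted by sample count, then add Gaussian noise.

    client_weights: one flat list of floats per client (hypothetical layout).
    client_sizes:   number of training samples held by each client.
    """
    rng = random.Random(seed)
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        clipped = clip_update(weights, max_norm)
        for i, w in enumerate(clipped):
            averaged[i] += w * size / total
    # Noise is added once to the aggregate; noise_std=0.0 disables it.
    return [w + rng.gauss(0.0, noise_std) for w in averaged]


# Two notional fund clients; the larger client dominates the average.
global_weights = fedavg_with_dp(
    [[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3], max_norm=10.0, noise_std=0.0
)
```

Clipping bounds how much any single client can move the global model, which is what makes the added noise meaningful as a privacy mechanism; the right values for `max_norm` and `noise_std` depend on the privacy budget and are not specified here.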
- Python 3.10+
- Node.js 18+
pip install -r requirements.txt
uvicorn backend.main:app --host 127.0.0.1 --port 8000
The API will load sample data from data/ automatically and run initial fraud analysis on all 25 companies.
cd frontend
npm install
npm run dev
Navigate to http://localhost:5173 in your browser.
sentinelfl/
├── backend/
│ ├── main.py # FastAPI app with DataStore & startup analysis
│ ├── models.py # Pydantic data models
│ ├── routes/
│ │ ├── upload.py # POST /upload-data
│ │ ├── company.py # GET /companies, /company/{id}
│ │ ├── fraud.py # GET /fraud-analysis/{id}
│ │ ├── graph.py # GET /graph, /graph/{id}
│ │ ├── report.py # POST /generate-report, GET /report/{id}
│ │ ├── federated.py # GET /federated/status, POST /federated/simulate
│ │ ├── features.py # GET /features, /features/{id} — standardized dataset
│ │ ├── portfolio.py # GET /portfolio/summary — bank portfolio view
│ │ └── explain.py # GET /explain/{entity_id}
│ ├── services/
│ │ ├── fraud_detector.py # Rule-based fraud detection (6 detection methods)
│ │ ├── nlp_processor.py # NLP + structured features → fixed attribute rows
│ │ ├── feature_schema.py # Feature names and metadata for the ML dataset
│ │ └── risk_scorer.py # Composite risk scoring (includes FL probability)
│ ├── ml/
│ │ ├── anomaly_detector.py # Isolation Forest anomaly detection
│ │ ├── gnn_model.py # Graph-based risk scoring (PageRank, centrality)
│ │ ├── isolation_forest.py # Extended anomaly detector with feature extraction
│ │ └── explainer.py # SHAP-like feature importance explanations
│ ├── graph/
│ │ └── graph_builder.py # NetworkX graph construction from CSV data
│ ├── federated/
│ │ ├── simulator.py # FedAvg simulation with 5 financial institutions
│ │ ├── fedavg.py # Full federated learning with PyTorch MLP
│ │ └── dp_wrapper.py # Differential privacy noise injection
│ └── genai/
│ └── report_generator.py # GenAI audit reports (OpenAI or template fallback)
├── frontend/
│ ├── src/
│ │ ├── App.jsx # Sidebar layout with routing
│ │ ├── pages/
│ │ │ ├── Dashboard.jsx # Overview with stats, charts, top risks
│ │ │ ├── Companies.jsx # Searchable company listing with risk bars
│ │ │ ├── CompanyDetail.jsx # Deep-dive: risk gauge, signals, graph, SHAP
│ │ │ ├── UploadData.jsx # CSV drag-and-drop upload
│ │ │ ├── GraphExplorer.jsx # Interactive force-directed entity graph
│ │ │ ├── FederatedLearning.jsx # FL metrics, convergence chart, clients
│ │ │ └── Report.jsx # AI-generated audit report viewer
│ │ └── services/
│ │ └── api.js # API client for all endpoints
│ └── vite.config.js # Vite + Tailwind v4 + API proxy
├── data/
│ ├── transactions.csv # 10,000 transactions with planted fraud
│ ├── companies.csv # 25 companies (3 shell clusters, 2 inflated)
│ ├── directors.csv # 40 directors (5 controlling multiple entities)
│ └── generate_data.py # Dataset generation script
└── requirements.txt
| Method | Algorithm | What It Detects |
|---|---|---|
| Circular Transactions | DFS cycle detection | A → B → C → A money flows |
| Shared Directors | Director-company mapping | Shell company networks |
| Revenue Inflation | Revenue vs. bank inflow comparison | Fabricated growth |
| Fund Diversion | Egocentric network analysis | Money siphoned to personal accounts |
| Anomaly Detection | Isolation Forest (ML) | Statistically unusual transactions |
| Graph Risk | PageRank + Degree Centrality | Suspicious network positions |
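
As an illustration of the "DFS cycle detection" row above, here is a minimal sketch that finds circular money flows in a list of directed payments. It uses only the standard library rather than NetworkX, and the function name `find_cycles`, the `max_len` cutoff, and the extra node "Other Co" are hypothetical; the project's own detector may work differently.

```python
def find_cycles(edges, max_len=6):
    """Return simple directed cycles, each reported once from its smallest node."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    cycles = []

    def dfs(start, node, path):
        for nxt in sorted(graph[node]):
            if nxt == start and len(path) >= 2:
                # Closed a loop back to the starting node.
                cycles.append(path + [start])
            elif nxt > start and nxt not in path and len(path) < max_len:
                # Only visit nodes ordered after start, so each cycle appears once.
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return cycles


payments = [
    ("Alpha Corp", "Beta Ltd"),
    ("Beta Ltd", "Gamma Inc"),
    ("Gamma Inc", "Alpha Corp"),
    ("Gamma Inc", "Other Co"),  # hypothetical non-circular payment
]
cycles = find_cycles(payments)
```

For the sample edges, `cycles` holds the single loop Alpha Corp → Beta Ltd → Gamma Inc → Alpha Corp. The `max_len` bound keeps the search tractable on dense graphs, at the cost of missing longer loops.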
Risk Score = (0.30 × Anomaly) + (0.30 × Graph Risk) + (0.25 × Rule Violations) + (0.15 × Historical)
| Score Range | Level | Action |
|---|---|---|
| 0–30 | Low | Standard due diligence |
| 30–70 | Medium | Enhanced monitoring |
| 70–100 | High | Escalate to forensic audit |
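
The weighted formula and the level table can be combined in a few lines. This is a sketch under stated assumptions, not the code in risk_scorer.py: the component names are hypothetical, each component is assumed to be on a 0–100 scale, and a score that falls exactly on a boundary (30 or 70) is assigned to the higher band, which the table itself leaves open.

```python
# Weights taken from the formula above; the key names are illustrative.
WEIGHTS = {
    "anomaly": 0.30,
    "graph_risk": 0.30,
    "rule_violations": 0.25,
    "historical": 0.15,
}


def composite_risk(components):
    """Weighted sum of component scores, each assumed to be in [0, 100]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def risk_level(score):
    """Map a 0-100 score to a level; boundary values go to the higher band (assumption)."""
    if score >= 70:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"


example = {"anomaly": 80, "graph_risk": 60, "rule_violations": 40, "historical": 20}
score = composite_risk(example)  # 24 + 18 + 10 + 3, roughly 55
level = risk_level(score)
```

Because the weights sum to 1.0, the composite stays on the same 0–100 scale as its inputs, so the thresholds in the table apply directly.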
| Endpoint | Method | Description |
|---|---|---|
| /upload-data | POST | Upload CSV files for analysis |
| /companies | GET | List all companies with risk scores |
| /company/{id} | GET | Company profile with directors and flags |
| /fraud-analysis/{id} | GET | Full fraud analysis breakdown |
| /graph/{id} | GET | Entity graph data (ego network) |
| /graph | GET | Full entity graph |
| /generate-report | POST | Generate AI audit report |
| /report/{id} | GET | Retrieve generated report |
| /federated/status | GET | Federation metrics and accuracy |
| /federated/simulate | POST | Run a new FL training round |
| /features | GET | Standardized NLP/feature dataset for all companies |
| /features/{id} | GET | Feature vector + fl_fraud_probability for one company |
| /portfolio/summary | GET | Investment-bank view: all startups, risk mix, FL status |
| /explain/{id} | GET | SHAP-like feature importances |
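
The GET endpoints can be called from any HTTP client. The sketch below uses only the Python standard library and assumes the backend is running at http://127.0.0.1:8000 as in the setup steps. The helper names `endpoint_url` and `get_json` are hypothetical, the responses are assumed to be JSON, and the company ID "C001" is a placeholder, since the ID format is not documented here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # host and port from the uvicorn command


def endpoint_url(path_template, base_url=BASE_URL, **params):
    """Fill {name} placeholders in a path template and prepend the base URL."""
    path = path_template
    for key, value in params.items():
        path = path.replace("{" + key + "}", quote(str(value), safe=""))
    return base_url.rstrip("/") + path


def get_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Build URLs without touching the network; "C001" is a placeholder ID.
summary_url = endpoint_url("/portfolio/summary")
analysis_url = endpoint_url("/fraud-analysis/{id}", id="C001")
```

With the server running, `get_json(summary_url)` would return the portfolio summary. POST endpoints such as /generate-report are left out because their request bodies are not described in this document.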
The dataset includes deliberately planted fraud patterns:
- Circular transactions: Alpha Corp → Beta Ltd → Gamma Inc → Alpha Corp (30 rounds)
- Shell company clusters: Zeta/Eta/Theta, Rho/Sigma/Tau (dormant companies with 10x revenue inflation)
- Revenue inflation: Xi Global (100Cr reported, 30Cr actual), Upsilon Corp (85Cr reported, 22Cr actual)
- Fund diversion: Omicron Labs funneling money to personal accounts
- Shared directors: 5 directors controlling multiple entities across fraud clusters
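
The revenue inflation figures above make a convenient worked example of the "revenue vs. bank inflow" check. This sketch assumes the "actual" figures correspond to bank inflow; the function names and the 2.0 flag threshold are illustrative assumptions, not values taken from fraud_detector.py.

```python
def inflation_ratio(reported_revenue, bank_inflow):
    """Ratio of reported revenue to observed bank inflow (same currency unit)."""
    if bank_inflow <= 0:
        raise ValueError("bank_inflow must be positive")
    return reported_revenue / bank_inflow


def is_inflated(reported_revenue, bank_inflow, threshold=2.0):
    """Flag a company whose reported revenue exceeds inflow by the given factor."""
    return inflation_ratio(reported_revenue, bank_inflow) >= threshold


# Figures in crore, taken from the planted patterns listed above.
planted = {"Xi Global": (100, 30), "Upsilon Corp": (85, 22)}
ratios = {name: inflation_ratio(rep, act) for name, (rep, act) in planted.items()}
flags = {name: is_inflated(rep, act) for name, (rep, act) in planted.items()}
```

Xi Global comes out at roughly 3.3x and Upsilon Corp at roughly 3.9x, so both would be flagged under the assumed threshold.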
Set your OpenAI API key for AI-generated audit reports:
export OPENAI_API_KEY=your-key-here
Without the API key, the system generates comprehensive template-based reports.
| Component | Technology |
|---|---|
| Backend | FastAPI, Python 3.10+ |
| Frontend | React 19, Vite 8, Tailwind CSS v4 |
| Charts | Recharts |
| Icons | Lucide React |
| Graph Engine | NetworkX (in-memory) |
| ML | scikit-learn (Isolation Forest) |
| Federated Learning | Simulated FedAvg with DP |
| Reports | OpenAI GPT-4o-mini / Template engine |