SentinelFL — AI-Powered Federated Due Diligence Platform

Detect financial fraud in startups using federated graph intelligence and AI-generated audit reports.

SentinelFL enables investment firms to collaboratively identify fraud — circular transactions, shell companies, inflated revenue — without sharing confidential data. Upload financials, see the fraud network, get an AI-generated audit report.

Architecture

┌────────────────────────────────────────────────────────────────┐
│  React + Vite + Tailwind CSS                                   │
│  Dashboard · Companies · Graph Explorer · Reports · FL Status  │
└──────────────────────┬─────────────────────────────────────────┘
                       │ REST API (proxied via Vite)
┌──────────────────────▼─────────────────────────────────────────┐
│  FastAPI Backend                                                │
│  ┌──────────┐ ┌──────────────┐ ┌───────────┐ ┌──────────────┐ │
│  │  Graph    │ │  Fraud       │ │  ML       │ │  Federated   │ │
│  │  Builder  │ │  Detector    │ │  Models   │ │  Learning    │ │
│  │ NetworkX  │ │ Rules Engine │ │ IsoForest │ │ FedAvg + DP  │ │
│  └──────────┘ └──────────────┘ └───────────┘ └──────────────┘ │
│  ┌──────────────────┐ ┌─────────────────┐ ┌────────────────┐  │
│  │  Risk Scorer      │ │  GenAI Reports  │ │  Explainer     │  │
│  │ Weighted Formula  │ │ OpenAI/Template │ │  SHAP-like     │  │
│  └──────────────────┘ └─────────────────┘ └────────────────┘  │
└────────────────────────────────────────────────────────────────┘

Investment bank mental model

Concept	In SentinelFL
Graph nodes	Startup / portfolio companies (`company_id` in `companies.csv`).
Transaction history	`transactions.csv` → aggregated stats, graph edges, anomaly detection.
Company reports	Structured fields in `companies.csv` (revenue vs. bank inflow, sector, status). The NLP and rules layers treat these fields as proxies for report text; real PDF reports can be added later through `NLPProcessor`.
Personal / director reports	`directors.csv` (overlaps, `other_directorships`) → network and overlap features.
NLP → fixed attributes	`NLPProcessor` + `feature_schema.py` produce one standardized row per company (transactions + company + director + graph + keyword/NLP signals). See `GET /features` and `GET /features/{id}`.
Fraud model	`FraudModelTrainer` (MLP) learns from pseudo-labels derived from red-flag features.
Federated learning	`FederatedEngine` groups companies into notional fund clients, runs FedAvg + DP on weights, produces a global model.
Score every company on the global model	`fl_engine.score_companies()` → `fl_fraud_probability` per company; blended into the composite risk score as `historical_pattern_score` (weight 15%).
Bank-wide view	`GET /portfolio/summary` — all companies sorted by risk, counts, FL status. Per-company narrative: `POST /generate-report`.

End-to-end flow in code: `run_initial_analysis()` in `backend/main.py` (graphs → anomalies → NLP features → FL train → `fl_scores` → `RiskScorer` + `FraudDetector`).
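
The FedAvg + DP step in this flow can be pictured roughly as follows. This is a minimal standalone sketch, not the code in `backend/federated/`: the function names (`clip`, `fedavg_dp`), the clipping norm, and the Gaussian noise scale are all illustrative assumptions, and the noise is not calibrated to any privacy budget.

```python
import math
import random


def clip(weights, max_norm):
    """Scale a weight vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm or norm == 0.0:
        return list(weights)
    return [w * max_norm / norm for w in weights]


def fedavg_dp(client_weights, client_sizes, max_norm=1.0, noise_std=0.0, rng=None):
    """Sample-size-weighted average of clipped client weights, plus Gaussian noise.

    client_weights: equal-length weight vectors, one per notional fund client.
    client_sizes: number of local training samples held by each client.
    """
    rng = rng or random.Random(0)
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        clipped = clip(weights, max_norm)
        for i in range(dim):
            global_weights[i] += clipped[i] * size / total
    # Hypothetical DP step: add zero-mean Gaussian noise to every coordinate.
    return [w + rng.gauss(0.0, noise_std) for w in global_weights]
```

Clipping bounds how much any single client can move the global model, which is what makes the added noise meaningful. A client holding more samples pulls the average further toward its own weights.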

Quick Start

Prerequisites

Python 3.10+
Node.js 18+

1. Install backend dependencies

pip install -r requirements.txt

2. Start the backend server

uvicorn backend.main:app --host 127.0.0.1 --port 8000

On startup, the API loads the sample data from `data/` and runs the initial fraud analysis on all 25 companies.
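
To check that the backend is responding, a small standard-library client along these lines may help. It is a sketch only: `build_url` and `get_json` are hypothetical helper names, and nothing is assumed about the shape of the JSON the endpoints return.

```python
import json
import urllib.request

# Host and port taken from the uvicorn command in step 2.
BASE_URL = "http://127.0.0.1:8000"


def build_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path, base_url=BASE_URL, timeout=10):
    """GET an endpoint and decode the JSON body."""
    with urllib.request.urlopen(build_url(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `get_json("/companies")` or `get_json("/portfolio/summary")` should return the parsed response for those endpoints.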

3. Install and start the frontend

cd frontend
npm install
npm run dev

4. Open the dashboard

Navigate to http://localhost:5173 in your browser.

Project Structure

sentinelfl/
├── backend/
│   ├── main.py                # FastAPI app with DataStore & startup analysis
│   ├── models.py              # Pydantic data models
│   ├── routes/
│   │   ├── upload.py          # POST /upload-data
│   │   ├── company.py         # GET /companies, /company/{id}
│   │   ├── fraud.py           # GET /fraud-analysis/{id}
│   │   ├── graph.py           # GET /graph, /graph/{id}
│   │   ├── report.py          # POST /generate-report, GET /report/{id}
│   │   ├── federated.py       # GET /federated/status, POST /federated/simulate
│   │   ├── features.py        # GET /features, /features/{id} — standardized dataset
│   │   ├── portfolio.py       # GET /portfolio/summary — bank portfolio view
│   │   └── explain.py         # GET /explain/{entity_id}
│   ├── services/
│   │   ├── fraud_detector.py  # Rule-based fraud detection (6 detection methods)
│   │   ├── nlp_processor.py   # NLP + structured features → fixed attribute rows
│   │   ├── feature_schema.py  # Feature names and metadata for the ML dataset
│   │   └── risk_scorer.py     # Composite risk scoring (includes FL probability)
│   ├── ml/
│   │   ├── anomaly_detector.py    # Isolation Forest anomaly detection
│   │   ├── gnn_model.py          # Graph-based risk scoring (PageRank, centrality)
│   │   ├── isolation_forest.py   # Extended anomaly detector with feature extraction
│   │   └── explainer.py          # SHAP-like feature importance explanations
│   ├── graph/
│   │   └── graph_builder.py   # NetworkX graph construction from CSV data
│   ├── federated/
│   │   ├── simulator.py       # FedAvg simulation with 5 financial institutions
│   │   ├── fedavg.py          # Full federated learning with PyTorch MLP
│   │   └── dp_wrapper.py      # Differential privacy noise injection
│   └── genai/
│       └── report_generator.py # GenAI audit reports (OpenAI or template fallback)
├── frontend/
│   ├── src/
│   │   ├── App.jsx            # Sidebar layout with routing
│   │   ├── pages/
│   │   │   ├── Dashboard.jsx      # Overview with stats, charts, top risks
│   │   │   ├── Companies.jsx      # Searchable company listing with risk bars
│   │   │   ├── CompanyDetail.jsx  # Deep-dive: risk gauge, signals, graph, SHAP
│   │   │   ├── UploadData.jsx     # CSV drag-and-drop upload
│   │   │   ├── GraphExplorer.jsx  # Interactive force-directed entity graph
│   │   │   ├── FederatedLearning.jsx  # FL metrics, convergence chart, clients
│   │   │   └── Report.jsx        # AI-generated audit report viewer
│   │   └── services/
│   │       └── api.js         # API client for all endpoints
│   └── vite.config.js         # Vite + Tailwind v4 + API proxy
├── data/
│   ├── transactions.csv       # 10,000 transactions with planted fraud
│   ├── companies.csv          # 25 companies (3 shell clusters, 2 inflated)
│   ├── directors.csv          # 40 directors (5 controlling multiple entities)
│   └── generate_data.py       # Dataset generation script
└── requirements.txt

Fraud Detection Methods

Method	Algorithm	What It Detects
Circular Transactions	DFS cycle detection	A → B → C → A money flows
Shared Directors	Director-company mapping	Shell company networks
Revenue Inflation	Revenue vs. bank inflow comparison	Fabricated growth
Fund Diversion	Egocentric network analysis	Money siphoned to personal accounts
Anomaly Detection	Isolation Forest (ML)	Statistically unusual transactions
Graph Risk	PageRank + Degree Centrality	Suspicious network positions
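
As an illustration of the first row, DFS cycle detection over a directed payment graph might look like the sketch below. It is not the code in `fraud_detector.py`: the function name `find_cycles`, the `(sender, receiver)` edge format, and the `max_len` cap are assumptions made for the example.

```python
def find_cycles(edges, max_len=6):
    """Return simple directed cycles, each listed once from its smallest node.

    edges: iterable of (sender, receiver) pairs with comparable node names.
    max_len: maximum number of distinct nodes allowed in one cycle.
    Self-loops are ignored; a cycle needs at least two distinct nodes.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in sorted(graph[node]):
            if nxt == start and len(path) >= 2:
                # Closed the loop back to the starting node.
                cycles.append(path + [start])
            elif nxt > start and nxt not in on_path and len(path) < max_len:
                # Only visit nodes larger than start so each cycle is found once.
                on_path.add(nxt)
                dfs(start, nxt, path + [nxt], on_path)
                on_path.discard(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles
```

Called with the edges Alpha Corp → Beta Ltd, Beta Ltd → Gamma Inc, and Gamma Inc → Alpha Corp, the function returns a single cycle that starts and ends at Alpha Corp.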

Risk Score Formula

Risk Score = (0.30 × Anomaly) + (0.30 × Graph Risk) + (0.25 × Rule Violations) + (0.15 × Historical)

Score Range	Level	Action
0–30	Low	Standard due diligence
30–70	Medium	Enhanced monitoring
70–100	High	Escalate to forensic audit
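
The formula and the level table can be combined as in the sketch below. Two things are assumptions rather than documented behaviour: each component score is taken to be on a 0–100 scale, and because the ranges in the table share their endpoints, a score of exactly 30 is treated as Medium and exactly 70 as High. The names `composite_risk` and `risk_level` are hypothetical.

```python
# Weights copied from the risk score formula.
WEIGHTS = {
    "anomaly": 0.30,
    "graph_risk": 0.30,
    "rule_violations": 0.25,
    "historical": 0.15,
}


def composite_risk(components):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    score = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return max(0.0, min(100.0, score))


def risk_level(score):
    """Map a score to a level; the boundary handling is an assumption."""
    if score < 30:
        return "Low"
    if score < 70:
        return "Medium"
    return "High"
```

For example, component scores of 80 (anomaly), 60 (graph risk), 40 (rule violations), and 20 (historical) give 24 + 18 + 10 + 3 = 55, which falls in the Medium band.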

API Endpoints

Endpoint	Method	Description
`/upload-data`	POST	Upload CSV files for analysis
`/companies`	GET	List all companies with risk scores
`/company/{id}`	GET	Company profile with directors and flags
`/fraud-analysis/{id}`	GET	Full fraud analysis breakdown
`/graph/{id}`	GET	Entity graph data (ego network)
`/graph`	GET	Full entity graph
`/generate-report`	POST	Generate AI audit report
`/report/{id}`	GET	Retrieve generated report
`/federated/status`	GET	Federation metrics and accuracy
`/federated/simulate`	POST	Run a new FL training round
`/features`	GET	Standardized NLP/feature dataset for all companies
`/features/{id}`	GET	Feature vector + `fl_fraud_probability` for one company
`/portfolio/summary`	GET	Investment-bank view: all startups, risk mix, FL status
`/explain/{id}`	GET	SHAP-like feature importances

Sample Data (Planted Fraud)

The dataset includes deliberately planted fraud patterns:

Circular transactions: Alpha Corp → Beta Ltd → Gamma Inc → Alpha Corp (30 rounds)
Shell company clusters: Zeta/Eta/Theta, Rho/Sigma/Tau (dormant companies with 10x revenue inflation)
Revenue inflation: Xi Global (100Cr reported, 30Cr actual), Upsilon Corp (85Cr reported, 22Cr actual)
Fund diversion: Omicron Labs funneling money to personal accounts
Shared directors: 5 directors controlling multiple entities across fraud clusters
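
The revenue inflation cases above can be checked with a simple ratio of reported revenue to observed inflow. The sketch below is illustrative: it treats the "actual" figures as bank inflow, and the 2.0 flagging threshold and the names `inflation_ratio` and `is_inflated` are assumptions, not values taken from `fraud_detector.py`.

```python
def inflation_ratio(reported, inflow):
    """Reported revenue divided by observed bank inflow (same units)."""
    if inflow <= 0:
        raise ValueError("bank inflow must be positive")
    return reported / inflow


def is_inflated(reported, inflow, threshold=2.0):
    """Flag a company whose reported revenue is at least threshold x its inflow."""
    return inflation_ratio(reported, inflow) >= threshold


# (reported, actual) figures in Cr, copied from the planted cases above.
SAMPLES = {"Xi Global": (100, 30), "Upsilon Corp": (85, 22)}

flags = {name: is_inflated(*figures) for name, figures in SAMPLES.items()}
```

Both planted companies report more than three times their actual inflow, so they are flagged under this threshold.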

Optional: Enable GenAI Reports

Set your OpenAI API key for AI-generated audit reports:

export OPENAI_API_KEY=your-key-here

Without the API key, the system generates comprehensive template-based reports.
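
The fallback behaviour can be thought of as a check on the environment variable. How `report_generator.py` actually makes the decision is not shown in this README, so the sketch below (with the hypothetical name `report_mode`) only illustrates the idea.

```python
import os


def report_mode(env=None):
    """Return "openai" if OPENAI_API_KEY is set and non-empty, else "template"."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return "openai" if key else "template"
```

Passing a plain dict instead of `os.environ` makes it easy to try both branches without changing the shell environment.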

Tech Stack

Component	Technology
Backend	FastAPI, Python 3.10+
Frontend	React 19, Vite 8, Tailwind CSS v4
Charts	Recharts
Icons	Lucide React
Graph Engine	NetworkX (in-memory)
ML	scikit-learn (Isolation Forest), PyTorch (MLP)
Federated Learning	Simulated FedAvg with DP
Reports	OpenAI GPT-4o-mini / Template engine
