AgentForesight reframes multi-agent failure analysis from post-hoc diagnosis of completed trajectories to online auditing of unfolding ones. At each step of an unfolding trajectory, an auditor observes only the current prefix and must either continue the run or alarm at the earliest decisive error, opening a runtime intervention window before downstream propagation locks in failure.
We release AFTraj-2K, a curated corpus of 2,276 multi-agent trajectories (1,162 safe + 1,114 unsafe) across Coding, Math, and Agentic domains, and AgentForesight-7B, a compact online auditor trained with a coarse-to-fine reinforcement learning recipe.
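The online auditing protocol described above can be sketched as a loop over trajectory prefixes. This is a minimal illustration, not the repository's implementation: the `Verdict` type, the `Auditor` callable, the step dictionary keys, and the toy `keyword_auditor` are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Verdict:
    alarm: bool                  # True means "stop the run at this step"
    agent: Optional[str] = None  # hypothetical field: agent blamed for the error


# Hypothetical auditor interface: it receives only the prefix seen so far.
Auditor = Callable[[Sequence[dict]], Verdict]


def audit_online(steps: Sequence[dict], auditor: Auditor) -> Optional[int]:
    """Return the 0-based index of the first alarmed step, or None if never flagged."""
    for t in range(len(steps)):
        prefix = steps[: t + 1]  # the auditor never sees future steps
        if auditor(prefix).alarm:
            return t             # an intervention window would open here
    return None


def keyword_auditor(prefix: Sequence[dict]) -> Verdict:
    """Toy stand-in for a learned auditor: alarms when the latest step mentions 'error'."""
    last = prefix[-1]
    if "error" in last["content"].lower():
        return Verdict(alarm=True, agent=last["agent"])
    return Verdict(alarm=False)


trajectory = [
    {"agent": "planner", "content": "Split the task into two subtasks."},
    {"agent": "coder", "content": "Off-by-one error in the loop bound."},
    {"agent": "tester", "content": "Tests fail."},
]
print(audit_online(trajectory, keyword_auditor))
```

In this toy run the alarm fires at the second step (index 1), before the tester's failure is observed, which is the kind of early stop the protocol is meant to allow.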
- Online auditing protocol: We introduce online auditing, a deployment-time reframing of agentic failure analysis that audits unfolding trajectories step by step rather than diagnosing them after failure.
- AFTraj-2K dataset: We construct AFTraj-2K, a curated corpus of agentic trajectories spanning Coding, Math, and Agentic domains, pairing strictly filtered safe runs with multi-judge verified failure runs annotated at their decisive error step.
- A compact online auditor: We develop AgentForesight-7B, a compact online auditor trained via a coarse-to-fine RL recipe that first equips it with a risk-anticipation prior at the failure boundary, then sharpens this prior into precise step-level localization by optimizing for structure, timing, and attribution.
AgentForesight-7B outperforms larger proprietary judges: 66.44 overall Exact-F1 on AFTraj-2K, +19.9 points above DeepSeek-V4-Pro, and a 3$\times$ tighter Absolute Step Shift (ASS).
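This README does not spell out how ASS is computed. As a rough illustration only, the sketch below assumes it is the mean absolute difference between the step at which the auditor alarms and the annotated decisive error step, taken over runs where an alarm fired. The function name and the choice to skip runs with no alarm are assumptions; `inference/metrics.py` remains the authoritative definition.

```python
from typing import Optional, Sequence


def absolute_step_shift(
    predicted: Sequence[Optional[int]], annotated: Sequence[int]
) -> float:
    """Mean |predicted - annotated| over runs where the auditor raised an alarm.

    Assumed definition for illustration; runs with no alarm (None) are skipped.
    """
    shifts = [abs(p - a) for p, a in zip(predicted, annotated) if p is not None]
    if not shifts:
        raise ValueError("no alarmed runs to score")
    return sum(shifts) / len(shifts)


# Three unsafe runs: an exact hit, an alarm two steps late, and a missed run.
print(absolute_step_shift([4, 7, None], [4, 5, 9]))  # 1.0
```

Under this reading, a lower value means alarms land closer to the decisive error step, which is what "tighter" refers to above.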
A unified corpus of multi-agent trajectories collected, filtered, and annotated for online auditing.
Hosted on 🤗HuggingFace: ZBox008003/AFTraj.
| Domain | Safe | Unsafe | Total |
|---|---|---|---|
| Math | 396 | 397 | 793 |
| Coding | 361 | 247 | 608 |
| Agentic | 405 | 470 | 875 |
| TOTAL | 1,162 | 1,114 | 2,276 |
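As a quick sanity check on the table, the per-domain counts can be summed and the class balance computed directly. The snippet below hard-codes the numbers from the table rather than reading the dataset, so it only verifies the arithmetic; the variable names are illustrative.

```python
# (safe, unsafe) counts per domain, copied from the table above.
counts = {
    "Math": (396, 397),
    "Coding": (361, 247),
    "Agentic": (405, 470),
}

safe_total = sum(safe for safe, _ in counts.values())
unsafe_total = sum(unsafe for _, unsafe in counts.values())
grand_total = safe_total + unsafe_total

# Per-domain totals and the share of unsafe runs in each domain.
for domain, (safe, unsafe) in counts.items():
    total = safe + unsafe
    print(f"{domain:8s} total={total:4d} unsafe_share={unsafe / total:.1%}")

print(f"{'TOTAL':8s} total={grand_total:4d} unsafe_share={unsafe_total / grand_total:.1%}")
```

The totals match the TOTAL row (1,162 safe, 1,114 unsafe, 2,276 overall). Coding is the only domain where safe runs clearly outnumber unsafe ones, which may matter when comparing per-domain scores.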
git clone https://github.com/ZBox1005/AgentForesight.git
cd AgentForesight
pip install -r requirements.txt

from huggingface_hub import snapshot_download
import pandas as pd
local_dir = snapshot_download(repo_id="ZBox008003/AFTraj", repo_type="dataset")
safe = pd.read_parquet(f"{local_dir}/aftraj_safe.parquet")
unsafe = pd.read_parquet(f"{local_dir}/aftraj_unsafe.parquet")
print(safe.shape, unsafe.shape)
print(unsafe.iloc[0][["conv_id", "domain", "mistake_step", "mistake_agent"]])

Local model (transformers):
python -m inference.infer_local \
    --model-path <hf_repo_or_local_path> \
    --data-dir <path_to_dataset> \
    --output-dir ./outputs/auditor_local

OpenAI-compatible API (GPT-4.1, DeepSeek, vLLM-served local model, ...):

export OPENAI_API_KEY=sk-...
python -m inference.infer_api \
    --model gpt-4.1 \
    --data-dir <path_to_dataset> \
    --output-dir ./outputs/gpt41

Add --paper-test-split to restrict the run to the held-out test split (332 trajectories) used in the paper.
AgentForesight-7B reaches 66.44 overall Exact-F1, +19.88 points above the strongest proprietary baseline, DeepSeek-V4-Pro, with a 3$\times$ tighter Absolute Step Shift (ASS).
The AgentForesight-7B checkpoint will be released on HuggingFace upon paper acceptance.
AgentForesight/
├── README.md
├── LICENSE
├── requirements.txt
├── inference/
│   ├── prompts.py        # auditor system prompt + chat-template builder + parser
│   ├── data.py           # parquet loader (with paper_test_split flag)
│   ├── metrics.py        # Exact-F1 / ASS / FAR / Step-Acc + macro-domain bucketing
│   ├── infer_local.py    # local-model auditor inference (transformers)
│   └── infer_api.py      # OpenAI-compatible API auditor inference
└── assets/
    ├── logo_full.png
    ├── pipeline.png
    └── main_table.png
If you find this work useful, please cite:
@article{zhang2026agentforesight,
title={AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems},
author={Zhang, Boxuan and Zhu, Jianing and Shi, Zeru and Liu, Dongfang and Tang, Ruixiang},
journal={arXiv preprint arXiv:2605.08715},
year={2026}
}

- Code (inference/): MIT License, see LICENSE.
- Dataset (HuggingFace ZBox008003/AFTraj): CC BY 4.0.


