Training sparse autoencoders (SAEs) on vision transformers (ViTs), implemented in PyTorch.
This framework was developed for a series of projects that apply SAEs to vision models.
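At its core, an SAE reconstructs model activations through an overcomplete, sparsity-regularized bottleneck. The sketch below is a minimal illustration of that idea in PyTorch; the class and parameter names are hypothetical and do not reflect saev's actual API.

```python
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    """Minimal ReLU sparse autoencoder over ViT activations (illustrative only)."""

    def __init__(self, d_model: int, d_sae: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_sae)  # project up to an overcomplete dictionary
        self.dec = nn.Linear(d_sae, d_model)  # project back to activation space

    def forward(self, x: torch.Tensor):
        f = torch.relu(self.enc(x))  # non-negative latent code
        x_hat = self.dec(f)          # reconstruction
        return x_hat, f


# Toy usage: a batch of 4 "ViT activations" with hidden size 768,
# expanded to a 16x overcomplete dictionary.
sae = SparseAutoencoder(d_model=768, d_sae=768 * 16)
x = torch.randn(4, 768)
x_hat, f = sae(x)
# Typical training loss: reconstruction error plus an L1 sparsity penalty.
loss = (x_hat - x).pow(2).mean() + 1e-3 * f.abs().mean()
```

The L1 penalty on the latent code `f` is what pushes most dictionary units to zero on any given input, yielding sparse, interpretable features.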
- [Feb 2025] Interpretable and Testable Vision Features via Sparse Autoencoders
- [Nov 2025] Towards Open-Ended Visual Scientific Discovery with Sparse Autoencoders
Trained SAE checkpoints are available at:
To cite this software, please use:
```bibtex
@software{stevens2025saev,
  author = {Stevens, Samuel},
  license = {CC-BY-4.0},
  month = apr,
  title = {{saev}},
  url = {https://github.com/OSU-NLP-Group/saev},
  year = {2025}
}
```