
Troubleshooting

Anuar Sharafudinov edited this page Oct 31, 2025 · 6 revisions

flash-attn (flash attention 2) fails to install

Try pip install psutil followed by pip install --no-build-isolation flash-attn. If that doesn't help, make sure your hardware actually supports flash-attn. If it doesn't, ollm will still work, but without long-context support.
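If the install succeeds but you are unsure whether your GPU can use it, a quick sketch like the following can help: it checks that the package is importable and that the GPU is Ampere (SM 8.0) or newer, which flash attention 2 targets. The helper name `flash_attn_usable` is ours, not part of ollm or flash-attn.

```python
# Hedged sketch: is flash-attn installed, and is the GPU new enough
# for flash attention 2 (Ampere / compute capability 8.0 or later)?
import importlib.util

def flash_attn_usable() -> bool:
    if importlib.util.find_spec("flash_attn") is None:
        return False  # package not installed
    try:
        import torch
    except ImportError:
        return False  # flash-attn requires PyTorch
    if not torch.cuda.is_available():
        return False  # no CUDA GPU visible
    major, _minor = torch.cuda.get_device_capability()
    return major >= 8  # Ampere (SM 8.0) or newer

print(flash_attn_usable())
```

A False result here just means ollm falls back to the path without long-context support.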

kvikio fails to install

kvikio works only with NVIDIA GPUs and is optional. It accelerates data transfers from SSD to GPU, improving overall loading performance.
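Because kvikio is optional, one way to handle a failed install is simply to probe for it at runtime and let loading fall back to ordinary file I/O. A minimal sketch (the `HAS_KVIKIO` flag name is ours):

```python
# kvikio is optional: detect it at import time and report whether the
# accelerated SSD-to-GPU transfer path is available on this machine.
import importlib.util

HAS_KVIKIO = importlib.util.find_spec("kvikio") is not None
print("kvikio available:", HAS_KVIKIO)
```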

flash-linear-attention fails to install

This library is used only by the qwen3-next model and is optional. It accelerates computations and is supported on most modern hardware.
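Since only qwen3-next needs it, a graceful-degradation sketch is to guard the import and raise only when that model is actually requested. We assume the package exposes the `fla` module name (as in the flash-linear-attention repo); the `check_model_deps` helper is hypothetical, not part of ollm.

```python
# flash-linear-attention is only needed for qwen3-next, so treat it as
# an optional import and fail late, only when that model is requested.
try:
    import fla  # flash-linear-attention's import name (assumed)
    HAS_FLA = True
except ImportError:
    HAS_FLA = False

def check_model_deps(model_name: str) -> None:
    # Hypothetical helper: raise only if the missing dep is required.
    if model_name == "qwen3-next" and not HAS_FLA:
        raise RuntimeError(
            "qwen3-next requires flash-linear-attention: "
            "pip install flash-linear-attention"
        )

check_model_deps("gpt-oss")  # models other than qwen3-next are unaffected
```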
