Community
YouTube: oLLM - Run 80GB Model on 8GB VRAM Locally - Hands-on Demo
Medium: I Ran an 80-Billion-Parameter AI Model on My 8GB GPU - My Experience with oLLM
Medium: oLLM vs Ollama: Democratizing Local AI Inference in 2025
Medium: Revolutionizing Large-Context LLM Inference: A Deep Dive into the oLLM Python Library
News article: Meet oLLM: A Lightweight Python Library that brings 100K-Context LLM Inference to 8 GB Consumer GPUs
Article with a performance comparison