The essential tools for local LLM development — categorized and compared.
Model Discovery

| Tool | Platform | Description | GPU support |
|---|---|---|---|
| llmfit | All (Rust) | Hardware-aware model finder, TUI | Any |

Model Runners

| Tool | Platform | Best for | GPU support |
|---|---|---|---|
| Ollama | macOS, Linux, Windows | Beginners, quick setup | CUDA, Metal, ROCm |
| llama.cpp | All (C/C++) | Servers, custom pipelines | CUDA, Metal, Vulkan, ROCm, SYCL |
| DwarfStar 4 | macOS, Linux | DeepSeek V4 Flash optimized | Metal, CUDA |
| LM Studio | macOS, Windows, Linux | GUI users | CUDA, Metal |
| vLLM | Linux | Production serving | CUDA, ROCm |
| MLX | macOS (Apple Silicon) | Mac-native dev | Apple Silicon GPU |
| llama.rn | iOS, Android | Mobile inference | Metal, Vulkan |

Frontends & UIs

| Tool | Description | Key feature |
|---|---|---|
| Open WebUI | Self-hosted ChatGPT clone | RAG, tools, multi-user |
| SillyTavern | Character chat frontend | Roleplay, character cards |
| AnythingLLM | All-in-one desktop app | RAG, agents, multi-model |
| Jan | Open-source ChatGPT alternative | Offline-first, extensions |
| GPT4All | Desktop local AI | No GPU required |

Model Sources
Monitoring & Observability