llama-farm/llamafarm
Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes
LlamaFarm - Edge AI for Everyone
Enterprise AI capabilities on your own hardware. No cloud required.
LlamaFarm is an open-source AI platform that runs entirely on your hardware. Build RAG applications, train custom classifiers, detect anomalies, and run document processing—all locally with complete privacy.
- 🔒 Complete Privacy — Your data never leaves your device
- 💰 No API Costs — Use open-source models without per-token fees
- 🌐 Offline Capable — Works without internet once models are downloaded
- ⚡ Hardware Optimized — Automatic GPU/NPU acceleration on Apple Silicon, NVIDIA, and AMD
Desktop App Downloads
Get started instantly — no command line required:
What Can You Build?
| Capability | Description |
|---|---|
| RAG (Retrieval-Augmented Generation) | Ingest PDFs, docs, CSVs and query them with AI |
| Custom Classifiers | Train text classifiers with 8-16 examples using SetFit |
| Anomaly Detection | Detect outliers in logs, metrics, or transactions |
| OCR & Document Extraction | Extract text and structured data from images and PDFs |
| Named Entity Recognition | Find people, organizations, and locations |
| Multi-Model Runtime | Switch between Ollama, OpenAI, vLLM, or local GGUF models |
Video demo (90 seconds): https://youtu.be/W7MHGyN0MdQ
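To make the anomaly-detection row above concrete, here is a toy z-score outlier check over a metric series. This is only a sketch of the general idea, not LlamaFarm's actual detector:

```python
# Toy z-score outlier detector -- illustrates the idea behind anomaly
# detection on logs/metrics; LlamaFarm's own detector is more capable.
from statistics import mean, stdev

def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

latencies_ms = [12, 14, 13, 15, 12, 13, 250, 14, 13, 12]
print(zscore_outliers(latencies_ms, threshold=2.0))  # -> [(6, 250)]
```

The 250 ms spike stands out against the otherwise tight distribution, so it is the only point flagged.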
Quickstart
Option 1: Desktop App
Download the desktop app above and run it. No additional setup required.
Option 2: CLI + Development Mode
1. Install the CLI

   macOS / Linux:

   ```bash
   curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
   ```

   Windows (PowerShell):

   ```powershell
   irm https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.ps1 | iex
   ```

   Or download directly from releases.

2. Create and run a project

   ```bash
   lf init my-project   # Generates llamafarm.yaml
   lf start             # Starts services and opens Designer UI
   ```

3. Chat with your AI

   ```bash
   lf chat                       # Interactive chat
   lf chat "Hello, LlamaFarm!"   # One-off message
   ```

The Designer web interface is available at http://localhost:8000.
Option 3: Development from Source
```bash
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm

# Install Nx globally and initialize the workspace
npm install -g nx
nx init --useDotNxInstallation --interactive=false  # Required on first clone

# Start all services (run each in a separate terminal)
nx start server             # FastAPI server (port 8000)
nx start rag                # RAG worker for document processing
nx start universal-runtime  # ML models, OCR, embeddings (port 11540)
```
Architecture
LlamaFarm consists of three main services:
| Service | Port | Purpose |
|---|---|---|
| Server | 8000 | FastAPI REST API, Designer web UI, project management |
| RAG Worker | - | Celery worker for async document processing |
| Universal Runtime | 11540 | ML model inference, embeddings, OCR, anomaly detection |
All configuration lives in llamafarm.yaml—no scattered settings or hidden defaults.
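The service/port table above can be sketched as a small Python helper that checks whether the Designer UI is reachable. The ports come from the table; the health check simply requests the server root at http://localhost:8000, since the README does not document specific health endpoints:

```python
# Map LlamaFarm services to their local endpoints (ports per the
# architecture table; the RAG worker is a Celery worker with no HTTP port).
import urllib.request

SERVICE_PORTS = {
    "server": 8000,            # FastAPI REST API + Designer web UI
    "universal-runtime": 11540,  # ML inference, embeddings, OCR
}

def service_url(name: str) -> str:
    """Return the local base URL for a named LlamaFarm service."""
    return f"http://localhost:{SERVICE_PORTS[name]}"

def designer_is_up(timeout: float = 2.0) -> bool:
    """True if the Designer UI (served by the FastAPI server) responds."""
    try:
        with urllib.request.urlopen(service_url("server"), timeout=timeout) as resp:
            return resp.status < 500
    except OSError:
        return False

if __name__ == "__main__":
    print("Designer reachable:", designer_is_up())
```

Run it after `lf start` (or after starting the services from source) to confirm the server came up.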
Runtime Options
Universal Runtime (Recommended)
The Universal Runtime provides access to HuggingFace models plus specialized ML capabilities:
- Text Generation - Any HuggingFace text model
- Embeddings - sentence-transformers and other embedding models
- OCR - Text extraction from images/PDFs (Surya, EasyOCR, PaddleOCR, Tesseract)
- Document Extraction - Forms, invoices, receipts via vision models
- Text Classification - Pre-trained or custom models via SetFit
- Named Entity Recognition - Extract people, organizations, locations
- Reranking - Cross-encoder models for improved RAG quality
- **Anomaly
...