llama-farm/llamafarm
Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes
LlamaFarm - Edge AI for Everyone
Enterprise AI capabilities on your own hardware. No cloud required.
LlamaFarm is an open-source AI platform that runs entirely on your hardware. Build RAG applications, train custom classifiers, detect anomalies, and run document processing—all locally with complete privacy.
- 🔒 Complete Privacy — Your data never leaves your device
- 💰 No API Costs — Use open-source models without per-token fees
- 🌐 Offline Capable — Works without internet once models are downloaded
- ⚡ Hardware Optimized — Automatic GPU/NPU acceleration on Apple Silicon, NVIDIA, and AMD
Desktop App Downloads
Get started instantly — no command line required:
What Can You Build?
| Capability | Description |
|---|---|
| RAG (Retrieval-Augmented Generation) | Ingest PDFs, docs, CSVs and query them with AI |
| Custom Classifiers | Train text classifiers with 8-16 examples using SetFit |
| Anomaly Detection | Detect outliers in logs, metrics, or transactions |
| OCR & Document Extraction | Extract text and structured data from images and PDFs |
| Named Entity Recognition | Find people, organizations, and locations |
| Multi-Model Runtime | Switch between Ollama, OpenAI, vLLM, or local GGUF models |
Video demo (90 seconds): https://youtu.be/W7MHGyN0MdQ
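To make the anomaly-detection row above concrete, here is a toy z-score outlier check over a metric series. This is only a sketch of the general idea, not LlamaFarm's actual detector:

```python
# Toy z-score outlier detector -- illustrates the idea behind anomaly
# detection on logs/metrics; LlamaFarm's own detector is more capable.
from statistics import mean, stdev

def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

latencies_ms = [12, 14, 13, 15, 12, 13, 250, 14, 13, 12]
print(zscore_outliers(latencies_ms, threshold=2.0))  # -> [(6, 250)]
```

The 250 ms spike stands out against the otherwise tight distribution, so it is the only point flagged.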
Quickstart
Option 1: Desktop App
Download the desktop app above and run it. No additional setup required.
Option 2: CLI + Development Mode
1. Install the CLI

   macOS / Linux:

   ```bash
   curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
   ```

   Windows (PowerShell):

   ```powershell
   irm https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.ps1 | iex
   ```

   Or download directly from releases.

2. Create and run a project

   ```bash
   lf init my-project   # Generates llamafarm.yaml
   lf start             # Starts services and opens Designer UI
   ```

3. Chat with your AI

   ```bash
   lf chat                       # Interactive chat
   lf chat "Hello, LlamaFarm!"   # One-off message
   ```

The Designer web interface is available at http://localhost:8000.
Option 3: Development from Source
```bash
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm

# Install Nx globally and initialize the workspace
npm install -g nx
nx init --useDotNxInstallation --interactive=false  # Required on first clone

# Start all services (run each in a separate terminal)
nx start server             # FastAPI server (port 8000)
nx start rag                # RAG worker for document processing
nx start universal-runtime  # ML models, OCR, embeddings (port 11540)
```
Architecture
LlamaFarm consists of three main services:
| Service | Port | Purpose |
|---|---|---|
| Server | 8000 | FastAPI REST API, Designer web UI, project management |
| RAG Worker | - | Celery worker for async document processing |
| Universal Runtime | 11540 | ML model inference, embeddings, OCR, anomaly detection |
All configuration lives in llamafarm.yaml—no scattered settings or hidden defaults.
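The service/port table above can be sketched as a small Python helper that checks whether the Designer UI is reachable. The ports come from the table; the health check simply requests the server root at http://localhost:8000, since the README does not document specific health endpoints:

```python
# Map LlamaFarm services to their local endpoints (ports per the
# architecture table; the RAG worker is a Celery worker with no HTTP port).
import urllib.request

SERVICE_PORTS = {
    "server": 8000,            # FastAPI REST API + Designer web UI
    "universal-runtime": 11540,  # ML inference, embeddings, OCR
}

def service_url(name: str) -> str:
    """Return the local base URL for a named LlamaFarm service."""
    return f"http://localhost:{SERVICE_PORTS[name]}"

def designer_is_up(timeout: float = 2.0) -> bool:
    """True if the Designer UI (served by the FastAPI server) responds."""
    try:
        with urllib.request.urlopen(service_url("server"), timeout=timeout) as resp:
            return resp.status < 500
    except OSError:
        return False

if __name__ == "__main__":
    print("Designer reachable:", designer_is_up())
```

Run it after `lf start` (or after starting the services from source) to confirm the server came up.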
Runtime Options
Universal Runtime (Recommended)
The Universal Runtime provides access to HuggingFace models plus specialized ML capabilities:
- Text Generation - Any HuggingFace text model
- Embeddings - sentence-transformers and other embedding models
- OCR - Text extraction from images/PDFs (Surya, EasyOCR, PaddleOCR, Tesseract)
- Document Extraction - Forms, invoices, receipts via vision models
- Text Classification - Pre-trained or custom models via SetFit
- Named Entity Recognition - Extract people, organizations, locations
- Reranking - Cross-encoder models for improved RAG quality
- **Anomaly
...