# RLM CLI

Recursive Language Models (RLM) CLI enables LLMs to handle near-infinite context by recursively decomposing inputs and calling themselves over the parts. Supports files, directories, URLs, and stdin.

```sh
npx skills add https://github.com/rawwerks/rlm-cli --skill rlmSKILL.md
```
## Installation

```sh
pip install rlm-cli     # or: pipx install rlm-cli
uvx rlm-cli ask ...     # run without installing
```

Set an API key for your backend:

```sh
export OPENROUTER_API_KEY=...   # default backend (openrouter)
export OPENAI_API_KEY=...       # for --backend openai
export ANTHROPIC_API_KEY=...    # for --backend anthropic
```
## Commands
### ask - Query with context

```sh
rlm ask <inputs> -q "question"
```
Inputs (combinable):

| Type | Example | Notes |
|---|---|---|
| Directory | `rlm ask . -q "..."` | Recursive; respects `.gitignore` |
| File | `rlm ask main.py -q "..."` | Single file |
| URL | `rlm ask https://x.com -q "..."` | Auto-converts to markdown |
| stdin | `git diff \| rlm ask - -q "..."` | `-` reads from pipe |
| Literal | `rlm ask "text" -q "..." --literal` | Treat as raw text |
| Multiple | `rlm ask a.py b.py -q "..."` | Combine any types |
Options:

| Flag | Description |
|---|---|
| `-q "..."` | Question/prompt (required) |
| `--backend` | Provider: `openrouter` (default), `openai`, `anthropic` |
| `--model NAME` | Model override (format: `provider/model` or just `model`) |
| `--json` | Machine-readable output |
| `--output-format` | Output format: `text`, `json`, or `json-tree` |
| `--summary` | Show execution summary with depth statistics |
| `--extensions .py .ts` | Filter by extension |
| `--include`/`--exclude` | Glob patterns |
| `--max-iterations N` | Limit REPL iterations (default: 30) |
| `--max-depth N` | Recursive RLM depth (default: 1 = no recursion) |
| `--max-budget N.NN` | Spending limit in USD (requires OpenRouter) |
| `--max-timeout N` | Time limit in seconds |
| `--max-tokens N` | Total token limit (input + output) |
| `--max-errors N` | Consecutive-error limit before stopping |
| `--no-index` | Skip auto-indexing |
| `--exa` | Enable Exa web search (requires `EXA_API_KEY`) |
| `--inject-file FILE` | Execute Python code between iterations |
JSON output structure:

```json
{"ok": true, "exit_code": 0, "result": {"response": "..."}, "stats": {...}}
```
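For scripting, the `--json` payload can be consumed with any JSON parser. A minimal sketch in Python, using a sample string that follows the documented shape (the values are illustrative, not real output):

```python
import json

# Sample payload matching the documented --json shape (illustrative values)
raw = '{"ok": true, "exit_code": 0, "result": {"response": "LGTM"}, "stats": {}}'

payload = json.loads(raw)
if payload["ok"]:
    # The model's answer lives under result.response
    print(payload["result"]["response"])
```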
JSON-tree output (`--output-format=json-tree`) adds an execution tree showing nested RLM calls:

```json
{
  "result": {
    "response": "...",
    "tree": {
      "depth": 0,
      "model": "openai/gpt-4",
      "duration": 2.3,
      "cost": 0.05,
      "iterations": [...],
      "children": [...]
    }
  }
}
```
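Because each node carries `depth`, `cost`, and `children`, the tree can be aggregated with a simple recursive walk. A sketch that totals cost per depth, using a hand-built tree in the documented shape (the numbers are made up for illustration):

```python
# Walk a json-tree node and total cost per depth level.
# Assumes only the documented fields: depth, cost, children.
def cost_by_depth(node, totals=None):
    totals = {} if totals is None else totals
    totals[node["depth"]] = totals.get(node["depth"], 0.0) + node["cost"]
    for child in node.get("children", []):
        cost_by_depth(child, totals)
    return totals

# Illustrative tree: one root call that spawned two depth-1 children
tree = {
    "depth": 0, "cost": 0.0047, "children": [
        {"depth": 1, "cost": 0.0004, "children": []},
        {"depth": 1, "cost": 0.0003, "children": []},
    ],
}
print(cost_by_depth(tree))
```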
Summary output (`--summary`) shows depth-wise statistics after completion:

- JSON mode: adds a `summary` field to `stats`
- Text mode: prints the summary to stderr

```text
=== RLM Execution Summary ===
Total depth: 2 | Nodes: 3 | Cost: $0.0054 | Duration: 17.38s
Depth 0: 1 call(s) ($0.0047, 13.94s)
Depth 1: 2 call(s) ($0.0007, 3.44s)
```
### complete - Query without context

```sh
rlm complete "prompt text"
rlm complete "Generate SQL" --json --backend openai
```
### search - Search indexed files

```sh
rlm search "query" [options]
```

| Flag | Description |
|---|---|
| `--limit N` | Max results (default: 20) |
| `--language python` | Filter by language |
| `--paths-only` | Output file paths only |
| `--json` | JSON output |

Auto-indexes on first use. Manual index: `rlm index .`
### index - Build search index

```sh
rlm index .             # Index current dir
rlm index ./src --force # Force full reindex
```
### doctor - Check setup

```sh
rlm doctor        # Check config, API keys, deps
rlm doctor --json
```
## Workflows

Git diff review:

```sh
git diff | rlm ask - -q "Review for bugs"
git diff --cached | rlm ask - -q "Ready to commit?"
git diff HEAD~3 | rlm ask - -q "Summarize changes"
```

Codebase analysis:

```sh
rlm ask . -q "Explain architecture"
rlm ask src/ -q "How does auth work?" --extensions .py
```

Search + analyze:

```sh
rlm search "database" --paths-only
rlm ask src/db.py -q "How is connection pooling done?"
```

Compare files:

```sh
rlm ask old.py new.py -q "What changed?"
```
## Configuration

Precedence: CLI flags > env vars > config file > defaults

Config locations: `./rlm.yaml`, `./.rlm.yaml`, `~/.config/rlm/config.yaml`

```yaml
backend: openrouter
model: google/gemini-3-flash-preview
max_iterations: 30
```

Environment variables:

- `RLM_BACKEND` - Default backend
- `RLM_MODEL` - Default model
- `RLM_CONFIG` - Config file path
- `RLM_JSON=1` - Always output JSON
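The precedence chain above means a CLI flag always wins, falling back through environment, config file, and built-in defaults. A toy resolver sketching that order (this is an illustration of the documented precedence, not rlm's actual implementation; `resolve` and `DEFAULTS` are hypothetical names):

```python
import os

# Hypothetical built-in defaults, mirroring the sample config above
DEFAULTS = {"backend": "openrouter", "max_iterations": 30}

def resolve(key, cli_flags, config, env_prefix="RLM_"):
    """Return a setting following: CLI flags > env vars > config file > defaults."""
    if key in cli_flags:
        return cli_flags[key]
    env_val = os.environ.get(env_prefix + key.upper())
    if env_val is not None:
        return env_val
    if key in config:
        return config[key]
    return DEFAULTS.get(key)

# A --backend flag beats both RLM_BACKEND and the config file
os.environ["RLM_BACKEND"] = "anthropic"
print(resolve("backend", {"backend": "openai"}, {"backend": "openrouter"}))  # openai
print(resolve("backend", {}, {"backend": "openrouter"}))                     # anthropic
```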
## Recursion and Budget Limits

### Recursive RLM (`--max-depth`)

Enables recursive `llm_query()` calls, where child RLMs process sub-tasks:

```sh
# 2 levels of recursion
rlm ask . -q "Research thoroughly" --max-depth 2

# With budget cap
rlm ask . -q "Analyze codebase" --max-depth 3 --max-budget 0.50
```
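The depth semantics can be pictured as a recursion budget that each child call decrements: depth 1 answers directly, deeper budgets let a call fan out sub-tasks. A toy sketch of that behavior (purely illustrative; `rlm_answer` and the fixed two-way split are made up, not rlm's decomposition logic):

```python
# Toy model of --max-depth: depth 1 = no recursion; each child RLM
# runs with one less depth budget until it must answer as a leaf.
def rlm_answer(task, max_depth):
    if max_depth <= 1:
        return [task]  # leaf: answer directly, no child calls allowed
    # Pretend decomposition into two sub-tasks (real RLM decides this itself)
    subtasks = [f"{task}/part{i}" for i in (1, 2)]
    answered = []
    for sub in subtasks:
        answered += rlm_answer(sub, max_depth - 1)  # child RLM call
    return answered

print(rlm_answer("analyze", 1))       # ['analyze']  (default: no recursion)
print(len(rlm_answer("analyze", 3)))  # 4 leaves across two recursive levels
```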
### Budget Control (`--max-budget`)
Limit spending per completion. Raises `BudgetExceededErr
...