Agent Registry

Lazy-loading system for Claude Code agents that reduces agent-related context window usage by 70-90%

Claude Code loads every installed agent into every conversation, no matter how large your agent collection grows.

With dozens or hundreds of agents installed, this creates token overhead that wastes your context window on agents you'll never use in that session.

Agent Registry solves this with on-demand loading: index your agents once, then load only what you need.

The Problem

Claude Code's default behavior loads all agents upfront into every conversation:

Token overhead: ~117 tokens per agent × agent count = wasted context
Scales poorly: 50 agents ≈ 5.8k, 150 agents ≈ 17.5k, 300+ agents ≈ 35k+ tokens
Context waste: Typically only 1-3 agents are relevant per conversation
All or nothing: You pay the full cost even if you use zero agents
Slow startup: Processing hundreds of agent files delays conversation start
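
The scaling figures above follow from a simple linear estimate. A minimal sketch, assuming the ~117 tokens-per-agent average quoted in this README (the function name is illustrative and not part of the skill):

```python
TOKENS_PER_AGENT = 117  # approximate average quoted in this README


def eager_overhead(agent_count: int, tokens_per_agent: int = TOKENS_PER_AGENT) -> int:
    """Estimate the tokens consumed when every agent is loaded upfront."""
    return agent_count * tokens_per_agent


for count in (50, 150, 300):
    # Print the estimate in thousands of tokens for each collection size.
    print(f"{count} agents ~ {eager_overhead(count) / 1000:.1f}k tokens")
```

Your own average may differ, since agent descriptions vary in length; pass a different `tokens_per_agent` to model that.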

Real-World Impact: Before & After

Here's the actual difference from a real Claude Code session with 140 agents:

❌ Before: All Agents Loaded

[Screenshot: context usage before Agent Registry]

Context consumption:

🔴 Custom agents: 16.4k tokens (8.2%)
Total: 76k/200k (38%)
Problem: ~14k of those tokens wasted on unused agents

✅ After: Agent Registry

[Screenshot: context usage after Agent Registry]

Context consumption:

🟢 Custom agents: 2.7k tokens (1.4%)
Total: 42k/200k (21%)
Savings: 13.7k tokens freed = 83% reduction

Bottom line: with Agent Registry, total context usage dropped by 34k tokens (76k → 42k, or 38% → 21%), giving you about 43% more free workspace (79k → 113k available) for your actual code and conversations.

Testing methodology: Both screenshots were captured from the same repository in separate Claude Code sessions. Each session was started fresh using the /clear command to ensure zero existing context, providing accurate baseline measurements of agent-related token overhead.

The Solution

Agent Registry shifts from eager loading to lazy loading:

Before: Load ALL agents → Context Window → Use 1-2 agents
        (~16-35k tokens)    (limited)      (~200-300 tokens)

        ❌ Wastes 90%+ of agent tokens on unused agents

After:  Search registry → Load specific agent → Use what you need
        (~2-4k tokens)   (instant)          (~200-300 tokens)

        ✅ Saves 70-90% of agent-related tokens

The math (140 agents example):

Before: 16.4k tokens (all agents loaded)
After: 2.7k tokens (registry index loaded, agents on-demand)
Savings: 13.7k tokens saved → 83% reduction

Scaling examples:

50 agents: Save ~3.3k tokens (5.8k → 2.5k) ≈ 57% reduction
150 agents: Save ~14.5k tokens (17.5k → 3k) ≈ 83% reduction
300 agents: Save ~31.5k tokens (35k → 3.5k) ≈ 90% reduction
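
The reductions quoted in this section are plain before/after arithmetic. A small sketch for checking them against your own measurements (the helper name is hypothetical; the sample numbers are the 140-agent figures from this README):

```python
from typing import Tuple


def reduction(before_tokens: float, after_tokens: float) -> Tuple[float, float]:
    """Return (tokens saved, percent reduction) for a before/after pair."""
    saved = before_tokens - after_tokens
    return saved, 100 * saved / before_tokens


# 140-agent example: 16.4k tokens before, 2.7k tokens after.
saved, percent = reduction(16_400, 2_700)
print(f"saved {saved:.0f} tokens ({percent:.1f}% reduction)")
```

Plug in the "Custom agents" figure from your own sessions before and after migration to see where your collection lands.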

What This Skill Provides

🔍 Smart Search (BM25 + Keyword Matching)

Find agents by intent, not by name:

python scripts/search_agents.py "code review security"
# Returns: security-auditor (0.89), code-reviewer (0.71)

python scripts/search_agents_paged.py "backend api" --page 1 --page-size 10
# Paginated results for large agent collections

Supported:

Intent-based search using BM25 algorithm
Keyword search with fuzzy matching
Relevance scoring (0.0-1.0)
Pagination for 100+ agent results
JSON output mode for scripting
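
To make the BM25 idea concrete, here is a minimal, self-contained sketch of Okapi BM25 ranking over agent summaries. It is not the code in `scripts/search_agents.py`: the agent descriptions are made up, the `k1` and `b` values are common defaults, and dividing by the top score to land in 0.0-1.0 is an assumption (the real scripts may normalize differently).

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-alphanumerics (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, docs: Dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> List[Tuple[str, float]]:
    """Rank docs by BM25, with scores scaled so the best match is 1.0."""
    if not docs:
        return []
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / n_docs or 1.0

    # Document frequency: in how many docs each term appears.
    doc_freq: Counter = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    scores: Dict[str, float] = {}
    for name, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores[name] = score

    top = max(scores.values()) or 1.0  # assumed normalization, see lead-in
    ranked = [(name, s / top) for name, s in scores.items() if s > 0]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical agent summaries, for illustration only.
agents = {
    "security-auditor": "Audits code for security vulnerabilities and reviews authentication flows",
    "code-reviewer": "Reviews code changes for quality and style",
    "docs-writer": "Writes API documentation and tutorials",
}
for name, score in bm25_rank("code review security", agents):
    print(f"{name} ({score:.2f})")
```

Because this sketch does no stemming, "review" does not match "reviews"; fuzzy or keyword matching, as listed above, is what would close that gap.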

✨ Interactive Migration UI

Beautiful checkbox interface with advanced selection:

Multi-level Select All: Global, per-category, per-page selection
Pagination: Automatic 10-item pages for large collections (100+ agents)
Visual indicators: 🟢 <1k tokens, 🟡 1-3k, 🔴 >3k
Category grouping: Auto-organized by subdirectory structure
Keyboard navigation: ↑↓ navigate, Space toggle, Enter confirm
Selection persistence: Selections preserved across page navigation
Graceful fallback: Text input mode if questionary unavailable

Supported:

Checkbox UI with questionary
Page-based navigation (◀ Previous / ▶ Next)
Finish selection workflow
Text-based fallback mode
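
The paging and fallback behavior described above can be sketched roughly as follows. This is an illustration, not the skill's migration script: the function names and the `1,3` / `all` text syntax are assumptions, and only `questionary.checkbox(...).ask()` is a real library call.

```python
from typing import List

PAGE_SIZE = 10  # matches the 10-item pages described above


def paginate(items: List[str], page_size: int = PAGE_SIZE) -> List[List[str]]:
    """Split a flat list of agent names into fixed-size pages."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def parse_text_selection(raw: str, page: List[str]) -> List[str]:
    """Text fallback: '1,3' picks by 1-based position, 'all' picks the whole page."""
    raw = raw.strip().lower()
    if raw == "all":
        return list(page)
    picked = []
    for part in raw.split(","):
        part = part.strip()
        if part.isdecimal() and 1 <= int(part) <= len(page):
            picked.append(page[int(part) - 1])
    return picked


def select_on_page(page: List[str]) -> List[str]:
    """Use a questionary checkbox if available, otherwise plain text input."""
    try:
        import questionary  # optional dependency
    except ImportError:
        for position, name in enumerate(page, start=1):
            print(f"{position}. {name}")
        return parse_text_selection(input("Select (e.g. 1,3 or all): "), page)
    # .ask() returns None if the user cancels, so fall back to an empty list.
    return questionary.checkbox("Select agents to migrate:", choices=page).ask() or []


def select_agents(all_agents: List[str]) -> List[str]:
    """Walk every page, accumulating selections so they persist across pages."""
    selected: List[str] = []
    for page in paginate(all_agents):
        selected.extend(select_on_page(page))
    return selected
```

Keeping the parsing logic in its own function means the text fallback can be tested without a terminal.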

📊 Lightweight Index

Registry stores only metadata — not full agent content:

Agent name and summary
Keywords for search matching
Token estimates for capacity planning
File paths for lazy loading
Content hashes for change detection
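
As an illustration of what one index entry might look like, here is a minimal sketch covering the five fields listed above. The field names, the SHA-256 hash, and the four-characters-per-token estimate are all assumptions, not the registry's actual schema.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class IndexEntry:
    name: str
    summary: str
    keywords: List[str]
    token_estimate: int
    path: str
    content_hash: str


def _hash_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_entry(path: Path, summary: str, keywords: List[str]) -> IndexEntry:
    """Index one agent file: keep metadata only, not the full content."""
    text = path.read_text(encoding="utf-8")
    return IndexEntry(
        name=path.stem,
        summary=summary,
        keywords=keywords,
        token_estimate=len(text) // 4,  # rough heuristic, assumed here
        path=str(path),
        content_hash=_hash_text(text),
    )


def is_stale(entry: IndexEntry) -> bool:
    """Change detection: has the file changed since it was indexed?"""
    current = Path(entry.path).read_text(encoding="utf-8")
    return _hash_text(current) != entry.content_hash


def load_agent(entry: IndexEntry) -> str:
    """Lazy loading: read the full agent file only when it is needed."""
    return Path(entry.path).read_text(encoding="utf-8")
```

The point of the design is that only the small entries live in context; the file behind `path` is read when a search actually selects that agent.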

Index size scales slowly:

50 agents ≈ 2k tokens
150 agents ≈ 3-4k tokens
300 agents ≈ 6-8k tokens

Much smaller than loading all agents:

Traditional: ~117 tokens/agent × count
Registry: ~20-25 tokens/agent in index

Installation

Prerequisites

Python 3.7+ (required)
Node.js 14+ (for NPX installation method)
Git (for traditional installation)

Method 1: NPX (Recommended)

Install via add-skill (one command):

npx add-skill MaTriXy/Agent-Registry

Or install gl

...
