matrixy/agent-registry

Lazy-loading system for Claude Code agents

6 stars · 0 forks · Updated Jan 20, 2026
npx skills add matrixy/agent-registry


Agent Registry

Lazy-loading system for Claude Code agents that reduces agent-related context window usage by 70-90%

Claude Code loads every installed agent into every conversation, no matter how large your collection grows.

With dozens or hundreds of agents installed, this creates token overhead that wastes your context window on agents you'll never use in that session.

Agent Registry solves this with on-demand loading: index your agents once, then load only what you need.

The Problem

Claude Code's default behavior loads all agents upfront into every conversation:

  • Token overhead: ~117 tokens per agent × agent count = wasted context
  • Scales poorly: 50 agents ≈ 5.8k, 150 agents ≈ 17.5k, 300+ agents ≈ 35k+ tokens
  • Context waste: Typically only 1-3 agents are relevant per conversation
  • All or nothing: You pay the full cost even if you use zero agents
  • Slow startup: Processing hundreds of agent files delays conversation start

Real-World Impact: Before & After

Here's the actual difference from a real Claude Code session with 140 agents:

❌ Before: All Agents Loaded

[Screenshot: context usage before Agent Registry]

Context consumption:

  • 🔴 Custom agents: 16.4k tokens (8.2%)
  • Total: 76k/200k (38%)
  • Problem: 14k tokens wasted on unused agents

✅ After: Agent Registry

[Screenshot: context usage after Agent Registry]

Context consumption:

  • 🟢 Custom agents: 2.7k tokens (1.4%)
  • Total: 42k/200k (21%)
  • Savings: 13.7k tokens freed = 83% reduction

Bottom line: Agent Registry freed up 34k tokens in total context (38% → 21%), giving you about 43% more free workspace (79k → 113k available) for your actual code and conversations.

Testing methodology: Both screenshots were captured from the same repository in separate Claude Code sessions. Each session was started fresh using the /clear command to ensure zero existing context, providing accurate baseline measurements of agent-related token overhead.

The Solution

Agent Registry shifts from eager loading to lazy loading:

Before: Load ALL agents → Context Window → Use 1-2 agents
        (~16-35k tokens)    (limited)      (~200-300 tokens)

        ❌ Wastes 90%+ of agent tokens on unused agents

After:  Search registry → Load specific agent → Use what you need
        (~2-4k tokens)   (instant)          (~200-300 tokens)

        ✅ Saves 70-90% of agent-related tokens

The math (140 agents example):

  • Before: 16.4k tokens (all agents loaded)
  • After: 2.7k tokens (registry index loaded, agents on-demand)
  • Savings: 13.7k tokens saved → 83% reduction

Scaling examples:

  • 50 agents: Save ~3.3k tokens (5.8k → 2.5k) ≈ 57% reduction
  • 150 agents: Save ~14k tokens (17.5k → 3k) = 80% reduction
  • 300 agents: Save ~30k tokens (35k → 3.5k) = 85-90% reduction

What This Skill Provides

🔍 Smart Search (BM25 + Keyword Matching)

Find agents by intent, not by name:

python scripts/search_agents.py "code review security"
# Returns: security-auditor (0.89), code-reviewer (0.71)

python scripts/search_agents_paged.py "backend api" --page 1 --page-size 10
# Paginated results for large agent collections

Supported:

  • Intent-based search using BM25 algorithm
  • Fuzzy keyword matching
  • Relevance scoring (0.0-1.0)
  • Pagination for 100+ agent results
  • JSON output mode for scripting
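
To make the ranking idea concrete, here is a minimal BM25 sketch in pure Python. It is not the code in scripts/search_agents.py: the AGENTS mini-index, the tokenizer, the k1 and b defaults, and the divide-by-best-score scaling into 0.0-1.0 are all assumptions chosen for illustration, and the real script may normalize scores differently.

```python
import math
import re
from collections import Counter

# Hypothetical mini-index: agent name -> searchable text (summary + keywords).
AGENTS = {
    "security-auditor": "audit code for security vulnerabilities and review risky changes",
    "code-reviewer": "review code quality style and maintainability",
    "api-designer": "design backend rest api endpoints",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return {name: raw BM25 score} for every document in docs."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for name, doc in tokens.items():
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return scores

def search(query, docs, top_k=5):
    """Rank docs; scale scores into 0.0-1.0 by dividing by the best score (assumption)."""
    raw = bm25_scores(query, docs)
    best = max(raw.values()) or 1.0
    ranked = sorted(raw.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, round(score / best, 2)) for name, score in ranked[:top_k] if score > 0]

results = search("code review security", AGENTS)
```

With this toy index, the query ranks security-auditor ahead of code-reviewer and drops api-designer, which shares no terms with the query. Because of the chosen scaling, the top hit always scores 1.0 here.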

✨ Interactive Migration UI

Beautiful checkbox interface with advanced selection:

  • Multi-level Select All: Global, per-category, per-page selection
  • Pagination: Automatic 10-item pages for large collections (100+ agents)
  • Visual indicators: 🟢 <1k tokens, 🟡 1-3k, 🔴 >3k
  • Category grouping: Auto-organized by subdirectory structure
  • Keyboard navigation: ↑↓ navigate, Space toggle, Enter confirm
  • Selection persistence: Selections preserved across page navigation
  • Graceful fallback: Text input mode if questionary unavailable

Supported:

  • Checkbox UI with questionary
  • Page-based navigation (◀ Previous / ▶ Next)
  • Finish selection workflow
  • Text-based fallback mode
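
The paging, selection persistence, and fallback behaviour described above can be sketched as follows. This is an illustrative outline, not the skill's actual migration script: the helper names (token_badge, paginate, apply_page_selection, ask_page) are hypothetical, and treating exactly 1k and 3k tokens as 🟡 is an assumption about the boundaries.

```python
# Optional dependency: fall back to plain text input when it is missing.
try:
    import questionary
except ImportError:
    questionary = None

PAGE_SIZE = 10

def token_badge(tokens):
    """Map a token estimate to the visual indicators listed above."""
    if tokens < 1000:
        return "🟢"
    if tokens <= 3000:
        return "🟡"
    return "🔴"

def paginate(items, page, page_size=PAGE_SIZE):
    """Return the slice of items for a zero-based page number."""
    start = page * page_size
    return items[start:start + page_size]

def apply_page_selection(selected, page_items, checked_on_page):
    """Merge one page's checkbox result into the global selection set.

    Choices made on other pages are kept; this page's items are replaced
    by whatever is currently checked on it.
    """
    kept_from_other_pages = selected - set(page_items)
    return kept_from_other_pages | set(checked_on_page)

def ask_page(page_items, selected):
    """Prompt for one page; earlier choices are shown pre-checked."""
    if questionary is not None:
        choices = [
            questionary.Choice(title=name, checked=name in selected)
            for name in page_items
        ]
        answer = questionary.checkbox("Select agents to migrate:", choices=choices).ask()
        return answer or []
    # Text fallback: comma-separated names; unknown names are ignored.
    raw = input(f"Agents to migrate ({', '.join(page_items)}): ")
    wanted = {part.strip() for part in raw.split(",")}
    return [name for name in page_items if name in wanted]
```

Keeping the selection in one set and merging each page's result into it is what lets a user move back and forth between pages without losing earlier choices.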

📊 Lightweight Index

Registry stores only metadata — not full agent content:

  • Agent name and summary
  • Keywords for search matching
  • Token estimates for capacity planning
  • File paths for lazy loading
  • Content hashes for change detection
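
As a rough illustration of what one metadata-only entry could look like, the sketch below builds a record with the fields listed above. The field names, the 4-characters-per-token estimate, and the way the summary and keywords are derived are assumptions for the example; the registry's real schema may differ.

```python
import hashlib
from pathlib import Path

def estimate_tokens(text):
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)

def content_hash(text):
    """SHA-256 of the file contents, used only for change detection."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def build_entry(path):
    """Build a metadata-only record for one agent file."""
    text = path.read_text(encoding="utf-8")
    # Assumption: the first non-empty line serves as a short summary.
    first_line = next((ln.strip() for ln in text.splitlines() if ln.strip()), "")
    return {
        "name": path.stem,
        "summary": first_line[:120],
        # Assumption: keywords come from the hyphenated file name.
        "keywords": sorted(set(path.stem.split("-"))),
        "token_estimate": estimate_tokens(text),
        "path": str(path),
        "content_hash": content_hash(text),
    }

def is_stale(entry):
    """True when the file on disk no longer matches the stored hash."""
    current = Path(entry["path"]).read_text(encoding="utf-8")
    return content_hash(current) != entry["content_hash"]

def load_agent(entry):
    """Lazy load: read the full agent body only when it is requested."""
    return Path(entry["path"]).read_text(encoding="utf-8")
```

The index never stores the agent body itself. The stored path is what makes on-demand loading possible, and the hash lets a re-index skip files that have not changed.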

Index size scales slowly:

  • 50 agents ≈ 2k tokens
  • 150 agents ≈ 3-4k tokens
  • 300 agents ≈ 6-8k tokens

Much smaller than loading all agents:

  • Traditional: ~117 tokens/agent × count
  • Registry: ~20-25 tokens/agent in index
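
The per-agent figures above can be turned into a quick back-of-the-envelope calculator. This sketch assumes both costs scale linearly with agent count and ignores any fixed overhead, so its index estimates will not exactly match the index sizes listed earlier; the function names are illustrative only.

```python
EAGER_TOKENS_PER_AGENT = 117       # per-agent cost quoted above
INDEX_TOKENS_PER_AGENT = (20, 25)  # per-agent index range quoted above

def eager_cost(agent_count):
    """Tokens spent when every agent is loaded upfront."""
    return agent_count * EAGER_TOKENS_PER_AGENT

def index_cost(agent_count):
    """(low, high) tokens spent on the registry index alone."""
    low, high = INDEX_TOKENS_PER_AGENT
    return agent_count * low, agent_count * high

def reduction_range(agent_count):
    """(min, max) percentage saved by loading the index instead of all agents."""
    eager = eager_cost(agent_count)
    low, high = index_cost(agent_count)
    return round(100 * (1 - high / eager)), round(100 * (1 - low / eager))

for n in (50, 150, 300):
    print(n, eager_cost(n), index_cost(n), reduction_range(n))
```

Under this linear model the reduction does not depend on the agent count: 20-25 tokens against 117 tokens per agent works out to roughly a 79-83% saving, before counting the few hundred tokens for each agent that is actually loaded.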

Installation

Prerequisites

  • Python 3.7+ (required)
  • Node.js 14+ (for NPX installation method)
  • Git (for traditional installation)

Method 1: NPX (Recommended)

Install via add-skill (one command):

npx add-skill MaTriXy/Agent-Registry

Or install gl

...


Publisher

matrixy

Statistics

Stars: 6
Forks: 0
Open Issues: 0
Created: Jan 19, 2026