auto-claude-optimization

from adaptationio/skrillz


1 star · 0 forks · Updated Jan 16, 2026
```
npx skills add https://github.com/adaptationio/skrillz --skill auto-claude-optimization
```

SKILL.md

Auto-Claude Optimization

Performance tuning, cost reduction, and efficiency improvements.

Performance Overview

Key Metrics

| Metric | Impact | Optimization |
| --- | --- | --- |
| API latency | Build speed | Model selection, caching |
| Token usage | Cost | Prompt efficiency, context limits |
| Memory queries | Speed | Embedding model, index tuning |
| Build iterations | Time | Spec quality, QA settings |

Model Optimization

Model Selection

| Model | Speed | Cost | Quality | Use Case |
| --- | --- | --- | --- | --- |
| claude-opus-4-5-20251101 | Slow | High | Best | Complex features |
| claude-sonnet-4-5-20250929 | Fast | Medium | Good | Standard features |

```
# Override the model in .env
AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929
```
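The override can be resolved in code along these lines. This is an illustrative sketch, not the project's actual selection logic: `select_model` and the complexity-to-model mapping are assumptions; only the `AUTO_BUILD_MODEL` variable and model IDs come from the table above.

```python
import os

# Hypothetical model selection: prefer the AUTO_BUILD_MODEL override,
# otherwise pick by task complexity (this mapping is an assumption).
DEFAULT_MODELS = {
    "simple": "claude-sonnet-4-5-20250929",  # fast, cheaper
    "complex": "claude-opus-4-5-20251101",   # slow, highest quality
}

def select_model(complexity: str = "simple") -> str:
    """Honor the AUTO_BUILD_MODEL override, else pick by complexity."""
    return os.environ.get("AUTO_BUILD_MODEL") or DEFAULT_MODELS[complexity]
```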

Extended Thinking Tokens

Configure thinking budget per agent:

| Agent | Default | Recommended |
| --- | --- | --- |
| Spec creation | 16000 | Keep default for quality |
| Planning | 5000 | Reduce to 3000 for speed |
| Coding | 0 | Keep disabled |
| QA review | 10000 | Reduce to 5000 for speed |

```
# In agent configuration
max_thinking_tokens=5000  # or None to disable
```
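The recommended budgets above can be captured in one place. A minimal sketch, assuming a plain dict keyed by agent name: the agent keys and the `resolve_thinking_budget` helper are illustrative, not the project's actual configuration API.

```python
# Hypothetical per-agent thinking budgets using the recommended values;
# None disables extended thinking entirely.
THINKING_BUDGETS = {
    "spec_creation": 16000,  # keep default for quality
    "planning": 3000,        # reduced from 5000 for speed
    "coding": None,          # keep disabled
    "qa_review": 5000,       # reduced from 10000 for speed
}

def resolve_thinking_budget(agent: str):
    """Return max_thinking_tokens for an agent, or None to disable."""
    return THINKING_BUDGETS.get(agent)
```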

Token Optimization

Reduce Context Size

1. Smaller spec files

   ```
   # Keep specs concise
   # Bad: a 5000-word spec
   # Good: a 500-word spec with clear criteria
   ```

2. Limit codebase scanning

   ```
   # In context/builder.py
   MAX_CONTEXT_FILES = 50  # Reduce from 100
   ```

3. Use targeted searches

   ```
   # Instead of a full codebase scan,
   # focus on relevant directories
   ```
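Points 2 and 3 combine naturally: cap the file count while only walking directories you already know are relevant. A sketch under those assumptions; `collect_context_files` is illustrative and only `MAX_CONTEXT_FILES` appears in the real `context/builder.py`.

```python
from pathlib import Path

MAX_CONTEXT_FILES = 50  # reduced from 100

def collect_context_files(root, relevant_dirs, limit=MAX_CONTEXT_FILES):
    """Gather source files from relevant directories only, capped at limit.

    Hypothetical helper: walks each named directory in order and stops as
    soon as the cap is reached, so irrelevant trees are never scanned.
    """
    files = []
    for d in relevant_dirs:
        for path in sorted(Path(root, d).rglob("*.py")):
            files.append(path)
            if len(files) >= limit:
                return files
    return files
```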
    

Efficient Prompts

Optimize system prompts in apps/backend/prompts/:

```
<!-- Bad: verbose -->
You are an expert software developer who specializes in building
high-quality, production-ready applications. You have extensive
experience with many programming languages and frameworks...
```

```
<!-- Good: concise -->
Expert full-stack developer. Build production-quality code.
Follow existing patterns. Test thoroughly.
```
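To see the saving, compare rough token counts of the two prompts. The ~4-characters-per-token ratio is a common rule of thumb for English text, not an exact tokenizer, and `approx_tokens` is an illustrative helper.

```python
def approx_tokens(text: str) -> int:
    """Estimate token count at roughly 4 characters per token."""
    return max(1, len(text) // 4)

verbose = ("You are an expert software developer who specializes in "
           "building high-quality, production-ready applications. You "
           "have extensive experience with many programming languages "
           "and frameworks...")
concise = ("Expert full-stack developer. Build production-quality code. "
           "Follow existing patterns. Test thoroughly.")
```

Multiplied across every agent turn in a build, the difference compounds quickly.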

Memory Optimization

```
# Use an efficient embedding model
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

# Or go offline with a smaller model
OLLAMA_EMBEDDING_MODEL=all-minilm
OLLAMA_EMBEDDING_DIM=384
```

Speed Optimization

Parallel Execution

```
# Enable more parallel agents (default: 4)
MAX_PARALLEL_AGENTS=8
```
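A cap like this is typically enforced with a bounded worker pool. A minimal sketch, assuming `MAX_PARALLEL_AGENTS` is read from the environment; `run_agents` and the `run_agent` callable are illustrative stand-ins for the project's actual agent runner.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical: honor the MAX_PARALLEL_AGENTS env var (default 4).
MAX_PARALLEL_AGENTS = int(os.environ.get("MAX_PARALLEL_AGENTS", "4"))

def run_agents(tasks, run_agent):
    """Execute agent tasks with at most MAX_PARALLEL_AGENTS in flight.

    ThreadPoolExecutor.map preserves input order in its results.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_AGENTS) as pool:
        return list(pool.map(run_agent, tasks))
```

More parallelism helps only when tasks are independent; dependent build steps still serialize.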

Reduce QA Iterations

```
# Limit QA loop iterations
MAX_QA_ITERATIONS=10  # Default: 50

# Skip QA for quick iterations
python run.py --spec 001 --skip-qa
```
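The iteration cap bounds a build/review cycle that would otherwise run until QA passes. A hypothetical sketch of that loop: `qa_loop`, `build`, and `review` are illustrative names, not the project's API; only `MAX_QA_ITERATIONS` comes from the config above.

```python
MAX_QA_ITERATIONS = 10  # reduced from the default of 50

def qa_loop(build, review, max_iterations=MAX_QA_ITERATIONS):
    """Re-run build/review until QA passes or the cap is hit.

    build and review are callables; review returns True on a passing
    build. Returns the number of iterations used, or None if the cap
    was reached without passing.
    """
    for i in range(1, max_iterations + 1):
        build()
        if review():
            return i
    return None
```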

Faster Spec Creation

```
# Force simple complexity for quick tasks
python spec_runner.py --task "Fix typo" --complexity simple

# Skip the research phase
SKIP_RESEARCH_PHASE=true python spec_runner.py --task "..."
```

API Timeout Tuning

```
# Reduce the timeout for faster failure detection
API_TIMEOUT_MS=120000  # 2 minutes (default: 10 minutes)
```

Cost Management

Monitor Token Usage

```
# Enable cost tracking
ENABLE_COST_TRACKING=true

# View a usage report
python usage_report.py --spec 001
```

Cost Reduction Strategies

1. Use cheaper models for simple tasks

   ```
   # For simple specs
   AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929 python spec_runner.py --task "..."
   ```

2. Limit the context window

   ```
   MAX_CONTEXT_TOKENS=50000  # Reduce from 100000
   ```

3. Batch similar tasks

   ```
   # Create specs together, run them together
   python spec_runner.py --task "Add feature A"
   python spec_runner.py --task "Add feature B"
   python run.py --spec 001
   python run.py --spec 002
   ```

4. Use local models for memory

   ```
   # Ollama for memory (free)
   GRAPHITI_LLM_PROVIDER=ollama
   GRAPHITI_EMBEDDER_PROVIDER=ollama
   ```
    

Cost Estimation

| Operation | Estimated Tokens | Cost (Opus) | Cost (Sonnet) |
| --- | --- | --- | --- |
| Simple spec | 10k | ~$0.30 | ~$0.06 |
| Standard spec | 50k | ~$1.50 | ~$0.30 |
| Complex spec | 200k | ~$6.00 | ~$1.20 |
| Build (simple) | 50k | ~$1.50 | ~$0.30 |
| Build (standard) | 200k | ~$6.00 | ~$1.20 |
| Build (complex) | 500k | ~$15.00 | ~$3.00 |
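The table implies blended rates of roughly $30 per million tokens for Opus and $6 for Sonnet. A back-of-the-envelope estimator under that assumption; real pricing bills input and output tokens separately, so treat these as ballpark figures, and `estimate_cost` is an illustrative helper, not a project utility.

```python
# Blended per-million-token rates implied by the estimates above
# (an approximation; actual pricing splits input/output tokens).
RATE_PER_MILLION = {"opus": 30.00, "sonnet": 6.00}

def estimate_cost(tokens: int, model: str = "sonnet") -> float:
    """Return an approximate dollar cost for a token count."""
    return tokens / 1_000_000 * RATE_PER_MILLION[model]
```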

Memory System Optimization

Embedding Performance

```
# Faster embeddings
OPENAI_EMBEDDING_MODEL=text-embedding-3-small  # 1536 dim, fast

# Higher quality (slower)
OPENAI_EMBEDDING_MODEL=text-embedding-3-large  # 3072 dim

# Offline (fastest, free)
OLLAMA_EMBEDDING_MODEL=all-minilm
OLLAMA_EMBEDDING_DIM=384
```

Query Optimization

```
# Limit search results
memory.search("query", limit=10)  # Instead of 100

# Use semantic caching
ENABLE_MEMORY_CACHE=true
```
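The caching idea can be sketched as a thin wrapper around the memory object. This is a simplification: `CachedMemory` is hypothetical, the `memory.search` signature is taken from the snippet above, and an exact-match dict stands in for true semantic caching (which would also match similar, not just identical, queries).

```python
class CachedMemory:
    """Illustrative exact-match cache in front of a memory backend."""

    def __init__(self, memory):
        self._memory = memory
        self._cache = {}

    def search(self, query: str, limit: int = 10):
        key = (query, limit)
        if key not in self._cache:
            # Only hit the real backend on a cache miss.
            self._cache[key] = self._memory.search(query, limit=limit)
        return self._cache[key]
```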

Database Maintenance

```
# Compact the database periodically
python -c "from integrations.graphiti.memory import compact_database; compact_database()"

# Clear old episodes
python query_memory.py --cleanup --older-than 30d
```

Build Efficiency

Spec Quality = Build Speed

High-quality specs reduce iterations.
