claude-cost-optimization

from adaptationio/skrillz


Updated Jan 16, 2026
npx skills add https://github.com/adaptationio/skrillz --skill claude-cost-optimization

SKILL.md

Claude Cost Optimization

Overview

Cost optimization is critical for production Claude deployments. A single inefficiently-designed agent can cost hundreds or thousands of dollars monthly, while optimized implementations cost 10-90% less for identical functionality. This skill provides a comprehensive workflow for measuring, analyzing, and optimizing token costs.

Why This Matters:

  • Token usage is typically the largest expense in a Claude deployment
  • Small improvements compound over millions of API calls
  • Context optimization alone saves 60-90% on long conversations
  • Model + effort selection can reduce costs 5-10x for specific tasks

Key Savings Available:

  • Effort parameter: 20-70% token reduction (same model, different reasoning depth)
  • Context editing: 60-90% reduction on long conversations
  • Tool optimization: 37-85% reduction with advanced tool patterns
  • Prompt caching: 90% reduction on repeated content
  • Model selection: 2-5x cost difference between models

When to Use This Skill

Use claude-cost-optimization when you need to:

  • Track Token Costs: Understand exactly what your Claude implementation costs
  • Identify Expensive Patterns: Find which operations consume the most tokens
  • Measure ROI: Calculate the business value of your Claude integration
  • Optimize for Production: Reduce costs before deploying expensive agents
  • Analyze Cost Drivers: Break down costs by model, feature, endpoint, or time period
  • Plan Budget: Forecast future costs based on growth projections
  • Implement Optimizations: Apply proven techniques (caching, batching, context editing)
  • Set Alerts: Monitor costs and get notified of anomalies or budget overruns

5-Step Optimization Workflow

Step 1: Measure Baseline Usage

Establish your current cost baseline before optimization.

What to Measure:

- Total monthly tokens (input + output)
- Cost breakdown by model
- Top 10 most expensive operations
- Average tokens per request
- Peak usage times and patterns

How to Measure (using Admin API):

from anthropic import Anthropic

client = Anthropic()

# Pull the last 30 days of usage from the Admin API
response = client.beta.admin.usage_metrics.list(
    limit=30,
    sort_by="date",
)

total_input_tokens = sum(m.input_tokens for m in response.data)
total_output_tokens = sum(m.output_tokens for m in response.data)

# Cost estimate at Opus 4.5 rates ($5/M input, $25/M output);
# adjust the rates if your traffic uses other models.
total_cost = (total_input_tokens * 5 / 1_000_000) + (total_output_tokens * 25 / 1_000_000)
print(f"Monthly cost: ${total_cost:.2f}")

Where to Start: See references/usage-tracking.md for detailed Admin API integration

Step 2: Analyze Cost Drivers

Understand where your costs actually come from.

Identify Expensive Patterns:

  1. Which operations use the most tokens?
  2. Which models cost the most?
  3. Are you using caching effectively?
  4. Are context windows growing unnecessarily?
  5. Are you making redundant API calls?

Create Cost Breakdown (example):

Agent reasoning loops: 45% of costs
File analysis: 25% of costs
Web search: 15% of costs
Classification tasks: 10% of costs
Other: 5% of costs

Key Metrics to Calculate:

  • Cost per transaction
  • Tokens per transaction
  • Cost per business outcome
  • Cost trend (week-over-week)
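The metrics above can be sketched as a small helper, assuming you already log per-request token counts (the field names are illustrative; the rates are the Opus 4.5 prices quoted in this document):

```python
from dataclasses import dataclass

# Illustrative per-token rates (Opus 4.5: $5/M input, $25/M output).
INPUT_RATE = 5 / 1_000_000
OUTPUT_RATE = 25 / 1_000_000

@dataclass
class RequestLog:
    input_tokens: int
    output_tokens: int

def cost(r: RequestLog) -> float:
    return r.input_tokens * INPUT_RATE + r.output_tokens * OUTPUT_RATE

def cost_per_transaction(logs: list[RequestLog]) -> float:
    """Average dollar cost across logged requests."""
    return sum(cost(r) for r in logs) / len(logs)

def week_over_week_trend(this_week: float, last_week: float) -> float:
    """Fractional change in cost, e.g. 0.15 means +15% week-over-week."""
    return (this_week - last_week) / last_week

logs = [RequestLog(12_000, 1_500), RequestLog(8_000, 900)]
print(f"Cost per transaction: ${cost_per_transaction(logs):.4f}")  # → $0.0800
```

Cost per business outcome is the same calculation with the denominator swapped for a business metric (tickets resolved, documents processed, and so on).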

Step 3: Apply Optimizations

Apply targeted optimizations to your biggest cost drivers.

Effort Parameter (if using Opus 4.5):

  • Complex reasoning: high effort (default)
  • Balanced tasks: medium effort (20-40% savings)
  • Simple classification: low effort (50-70% savings)

Context Editing (for long conversations):

  • Automatic tool result clearing (saves 60-90% on tool-heavy sessions)
  • Client-side compaction (summarize older turns before resending the conversation)
  • Memory tool integration (enables effectively unbounded conversations)
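Client-side compaction can be as simple as replacing older turns with a single summary message before resending the conversation. A minimal sketch; the `summarize` step here is a placeholder you would replace with a real summarization call (e.g. to a cheap model):

```python
def compact_history(messages: list[dict], keep_last: int = 6) -> list[dict]:
    """Replace all but the last `keep_last` messages with one summary turn."""
    if len(messages) <= keep_last:
        return messages
    older, recent = messages[:-keep_last], messages[-keep_last:]

    def summarize(msgs: list[dict]) -> str:
        # Placeholder: a real implementation would generate this summary
        # with a cheap model call, or drop large tool results entirely.
        return f"[Summary of {len(msgs)} earlier messages]"

    summary_turn = {"role": "user", "content": summarize(older)}
    return [summary_turn] + recent
```

Because the summarized prefix no longer grows with conversation length, input tokens per request stay roughly constant instead of growing linearly.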

Tool Optimization (for large tool sets):

  • Tool search with deferred loading (supports 10K+ tools)
  • Programmatic calling (37% token reduction on data processing)
  • Tool examples (improve accuracy 72% → 90%)

Prompt Caching (for repeated content):

  • Cache system prompts (90% cost reduction on cached portion)
  • Cache repeated files/documents
  • Cache tool definitions
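Prompt caching works by marking the end of the stable prefix with a `cache_control` breakpoint; subsequent requests that share that prefix read it from cache at a fraction of the normal input price. A minimal sketch that builds the `system` parameter for the Messages API (the helper name is ours, not part of the SDK):

```python
def cached_system_blocks(stable_text: str) -> list[dict]:
    """Wrap a large, stable system prompt in a block with a cache breakpoint."""
    return [
        {
            "type": "text",
            "text": stable_text,
            # Everything up to and including this block is cached; later
            # requests sharing the prefix pay the (much cheaper) cache-read rate.
            "cache_control": {"type": "ephemeral"},
        }
    ]

system = cached_system_blocks("...large, stable instructions or reference docs...")
```

Pass the result as `system=system` to `client.messages.create(...)`; the response's `usage` fields report cache creation and cache read tokens so you can verify the cache is being hit.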

Model Selection:

  • Opus 4.5: $5/M input, $25/M output (complex tasks)
  • Sonnet 4.5: (see references for pricing)
  • Haiku 4.5: (see references for pricing)
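Comparing models on cost is a one-line calculation once the rates are in a table. A sketch using the Opus 4.5 rates quoted above; the Sonnet and Haiku entries are deliberately left as placeholders to be filled in from the references:

```python
# Per-million-token rates in USD. Opus 4.5 rates are from the text above;
# fill in the other models from the pricing references.
PRICING = {
    "opus-4.5": {"input": 5.00, "output": 25.00},
    # "sonnet-4.5": {"input": ..., "output": ...},
    # "haiku-4.5": {"input": ..., "output": ...},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the given model's rates."""
    rates = PRICING[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

print(f"${request_cost('opus-4.5', 50_000, 2_000):.4f}")  # → $0.3000
```

Running your actual token profile through each model's rates makes the 2-5x cost difference concrete for your workload.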

Step 4: Track Improvements

Monitor cost reductions and efficiency gains after optimizations.

Metrics to Track:

  • Cost per transaction (before vs after)
  • Total token reduction percentage
  • Quality impact (did results improve or worsen?)
  • Implementation difficulty and time

Measurement Period: Track for 1-2 weeks per optimization to see impact

Example Impact:

Optimization: Client-side compaction on long research tasks
Before: 450K tokens/request, $11.25 cost
After:  180K tokens/request, $4.50 cost
Savings: 60% cost reduction
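The before/after comparison reduces to a simple percentage; a sketch using the figures from the compaction example above:

```python
def savings_pct(before_cost: float, after_cost: float) -> float:
    """Percentage cost reduction achieved by an optimization."""
    return (before_cost - after_cost) / before_cost * 100

# Figures from the compaction example: $11.25/request before, $4.50 after.
print(f"{savings_pct(11.25, 4.50):.0f}% cost reduction")  # → 60% cost reduction
```

Track the same calculation per optimization over the 1-2 week measurement window, alongside a quality check, so a cost win that degrades results is caught early.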

Step 5: Report ROI

Calculate business value of your optimizations.

ROI Calculation:

Monthly Savings = (Baseline Daily Cost − Optimized Daily Cost) × 30
Implementation Hours = time spent implementing the optimization
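The calculation above can be sketched in a few lines; the payback-period helper is our addition (the rest of this section is truncated in the source), included only as one common way to express ROI:

```python
def monthly_savings(baseline_daily_cost: float, optimized_daily_cost: float) -> float:
    """Dollars saved per month, assuming a 30-day month."""
    return (baseline_daily_cost - optimized_daily_cost) * 30

def payback_days(implementation_hours: float, hourly_rate: float,
                 baseline_daily_cost: float, optimized_daily_cost: float) -> float:
    """Days until the one-time implementation cost is recovered (an assumed metric)."""
    daily_savings = baseline_daily_cost - optimized_daily_cost
    return (implementation_hours * hourly_rate) / daily_savings

print(monthly_savings(37.5, 15.0))  # → 675.0
```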

...