skill-improvement-from-observability

from adaptationio/skrillz

1 star, 0 forks, updated Jan 16, 2026
npx skills add https://github.com/adaptationio/skrillz --skill skill-improvement-from-observability

SKILL.md

Skill Improvement from Observability

The Self-Improvement Loop: Enhanced Telemetry → Pattern Analysis → Skill Updates → Better Performance

Data Source

Primary: {job="claude_code_enhanced"} in Loki (from enhanced-telemetry hooks)

Workflow

1. Collect Observability Insights

Use observability-analyzer with enhanced telemetry:

# Session analytics
{job="claude_code_enhanced", event_type="session_end"} | json

# Error patterns
{job="claude_code_enhanced", event_type="tool_result", status="error"} | json

# Tool sequences
{job="claude_code_enhanced", event_type="tool_call"} | json

# Prompt patterns
{job="claude_code_enhanced", event_type="user_prompt"} | json
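The queries above can also be sent programmatically. The following is a minimal sketch that only builds a request URL for Loki's `query_range` HTTP endpoint; the base address `http://localhost:3100` and the helper name `build_query_range_url` are assumptions for illustration, not part of this skill.

```python
from urllib.parse import urlencode

# Assumed Loki address; replace with the address of your own deployment.
LOKI_URL = "http://localhost:3100"


def build_query_range_url(logql: str, start_ns: int, end_ns: int,
                          limit: int = 1000, base_url: str = LOKI_URL) -> str:
    """Return a GET URL for Loki's /loki/api/v1/query_range endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = {
        "query": logql,
        "start": start_ns,
        "end": end_ns,
        "limit": limit,
        "direction": "backward",  # newest entries first
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


SESSION_END = '{job="claude_code_enhanced", event_type="session_end"} | json'
url = build_query_range_url(SESSION_END, 0, 1_000_000_000)
```

The URL can then be fetched with any HTTP client; the sketch stops short of sending the request so that it stays independent of a running Loki instance.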

2. Run Pattern Detection

Use observability-pattern-detector operations:

  • detect-failures → Error patterns by tool
  • detect-tool-sequences → Inefficient tool chains
  • detect-conversation-patterns → User behavior insights
  • detect-context-issues → Context management problems
  • detect-waste → Redundant operations

3. Extract Actionable Patterns

Filter high-impact issues from enhanced telemetry:

Error Analysis:

sum by (tool, error_type) (count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error"} | json [7d]))

Tool Inefficiency:

# Repeated Read→Read patterns (waste)
{job="claude_code_enhanced", event_type="tool_call"} | json | previous_tool="Read" and tool_name="Read"
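If the telemetry does not carry a `previous_tool` field, the same Read→Read check can be approximated client-side. The sketch below counts consecutive tool pairs per session; it assumes each event is a dict with `session_id` and `tool_name` keys (`session_id` is a hypothetical field name) and that events arrive in chronological order.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def count_tool_pairs(events: Iterable[dict]) -> Counter:
    """Count consecutive (previous_tool, tool) pairs within each session."""
    per_session: Dict[str, List[str]] = {}
    for event in events:
        per_session.setdefault(event["session_id"], []).append(event["tool_name"])

    pairs: Counter = Counter()
    for tools in per_session.values():
        # zip(tools, tools[1:]) yields each tool together with its successor.
        pairs.update(zip(tools, tools[1:]))
    return pairs


# Made-up sample events, for illustration only.
sample = [
    {"session_id": "s1", "tool_name": "Glob"},
    {"session_id": "s1", "tool_name": "Read"},
    {"session_id": "s1", "tool_name": "Read"},
    {"session_id": "s1", "tool_name": "Read"},
    {"session_id": "s2", "tool_name": "Read"},
    {"session_id": "s2", "tool_name": "Edit"},
]
pairs = count_tool_pairs(sample)
```

Grouping by session first keeps the last tool of one session from being paired with the first tool of the next.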

Context Issues:

# Auto compaction frequency
count_over_time({job="claude_code_enhanced", event_type="context_compact", trigger="auto"} [7d])

4. Map Patterns to Skills

| Pattern | Likely Skill | Action |
| --- | --- | --- |
| Bash command errors | bash-related skills | Add existence checks |
| File not found | file operation skills | Add path validation |
| Repeated Glob→Read | search skills | Optimize file discovery |
| High context usage | context-heavy skills | Add chunking |
| Many debugging prompts | core skills | Improve error messages |
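The mapping can be kept as a small lookup so that detected patterns resolve to a candidate skill and action automatically. This is only a sketch: the names `PATTERN_MAP` and `suggest` are hypothetical, and the keys are copied from the mapping above.

```python
from typing import Optional, Tuple

# Pattern -> (likely skill, action), copied from the mapping above.
PATTERN_MAP = {
    "Bash command errors": ("bash-related skills", "Add existence checks"),
    "File not found": ("file operation skills", "Add path validation"),
    "Repeated Glob→Read": ("search skills", "Optimize file discovery"),
    "High context usage": ("context-heavy skills", "Add chunking"),
    "Many debugging prompts": ("core skills", "Improve error messages"),
}


def suggest(pattern: str) -> Optional[Tuple[str, str]]:
    """Return (likely skill, action) for a known pattern, else None."""
    return PATTERN_MAP.get(pattern)
```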

5. Generate Improvement Recommendations

Based on enhanced telemetry patterns:

{
  "improvement": {
    "pattern": "File not found errors",
    "occurrences": 45,
    "source_query": "{job=\"claude_code_enhanced\", event_type=\"tool_result\", status=\"error\"} | json | error_type=~\".*not found.*\"",
    "affected_skills": ["file-operations"],
    "recommendation": "Add file existence check before Read/Edit operations",
    "implementation": "Add pathlib.Path(file).exists() check",
    "priority": "high",
    "expected_impact": "Reduce errors by 80%"
  }
}
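The `implementation` field above points at a `pathlib` check. A minimal sketch of such a guard follows; the function name `read_if_exists` is hypothetical, and it uses `is_file()`, a slightly stricter variant of `exists()` that also rejects directories.

```python
from pathlib import Path
from typing import Optional


def read_if_exists(file: str) -> Optional[str]:
    """Return the file's text, or None if the path is not a regular file."""
    path = Path(file)
    if not path.is_file():
        # Missing path or a directory: report the absence instead of raising.
        return None
    return path.read_text(encoding="utf-8")
```

Returning `None` lets the caller decide how to report the missing file. The check and the read are two separate steps, so a file deleted in between would still raise.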

6. Track Effectiveness

After improvements are deployed, measure:

# Error count over the last 7 days (run before and after deployment, then compare)
sum(count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error"} | json [7d]))

# Tool success rate improvement
sum(count_over_time({job="claude_code_enhanced", event_type="tool_result", status="success"} | json [7d])) /
sum(count_over_time({job="claude_code_enhanced", event_type="tool_result"} | json [7d]))
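Once the before and after counts are available, the comparison itself is simple arithmetic. The helpers below are a sketch with hypothetical names; the counts are assumed to come from running the queries above over equal-length windows.

```python
def percent_reduction(before: int, after: int) -> float:
    """Percentage drop from `before` to `after` (negative if the count rose)."""
    if before <= 0:
        raise ValueError("before must be positive to compute a reduction")
    return 100.0 * (before - after) / before


def success_rate(successes: int, total: int) -> float:
    """Fraction of successful tool results; 0.0 when there is no data."""
    return successes / total if total else 0.0
```

For example, 12 events before and 2 after gives `percent_reduction(12, 2)`, roughly 83.3.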

Example Improvement Flows

Flow 1: Error Reduction

Telemetry: "npm not found" × 45 in tool_result errors
    ↓
Pattern: Bash tool failures with npm commands
    ↓
Recommendation: Add npm availability check
    ↓
skill-updater applies changes
    ↓
Telemetry tracks: npm errors = 0 after deployment
    ↓
Result: ✅ 100% reduction

Flow 2: Context Optimization

Telemetry: Auto-compaction triggered 12 times in 7 days
    ↓
Pattern: Large file reads accumulating tokens
    ↓
Recommendation: Add file chunking for large reads
    ↓
skill-updater applies changes
    ↓
Telemetry tracks: Auto-compactions = 2 after deployment
    ↓
Result: ✅ 83% reduction
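The chunking recommendation in this flow could take many forms. One minimal sketch, assuming line-based chunks and a hypothetical helper `chunk_lines`, is:

```python
from typing import Iterable, Iterator, List


def chunk_lines(lines: Iterable[str], max_lines: int = 200) -> Iterator[str]:
    """Yield the input as consecutive chunks of at most max_lines lines."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    chunk: List[str] = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == max_lines:
            yield "".join(chunk)
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield "".join(chunk)
```

Passing an open file object as `lines` streams the file without loading it all at once. The default of 200 lines is an arbitrary illustration, not a value from the telemetry.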

Flow 3: Tool Sequence Optimization

Telemetry: Glob→Read→Glob→Read pattern 89 times
    ↓
Pattern: Redundant file discovery
    ↓
Recommendation: Cache glob results within session
    ↓
skill-updater applies changes
    ↓
Telemetry tracks: Redundant Glob calls reduced by 70%
    ↓
Result: ✅ Faster file operations
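Caching glob results within a session can be sketched with the standard library's `functools.lru_cache`. The wrapper name `cached_glob` is hypothetical, and the sketch assumes the file tree does not change between calls.

```python
import glob
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=256)
def cached_glob(pattern: str) -> Tuple[str, ...]:
    """Return sorted matches for pattern, reusing earlier results."""
    # A tuple is returned so callers cannot mutate the cached value.
    return tuple(sorted(glob.glob(pattern, recursive=True)))
```

Calling `cached_glob.cache_clear()` after files are created or deleted avoids stale results, since the cache does not observe the filesystem.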

Key Queries for Improvement Analysis

High-Impact Errors

topk(10, sum by (tool, error_type) (count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error"} | json [7d])))

Session Quality Issues

# High error sessions
{job="claude_code_enhanced", event_type="session_end"} | json | error_count > 5

# Low productivity sessions (high turns, few tool calls)
{job="claude_code_enhanced", event_type="session_end"} | json | turn_count > 20 and tools_used < 5

Tool Efficiency

# Tool usage distribution
sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [7d]))

# Error rate by tool
sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error"} | json [7d])) /
sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_result"} | json [7d]))
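The same error rate can be computed client-side from exported `tool_result` events. The sketch below assumes each event is a dict with `tool` and `status` keys, mirroring the labels used in the queries above; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def error_rate_by_tool(results: Iterable[dict]) -> Dict[str, float]:
    """Return errors / total for every tool seen in the results."""
    totals: Counter = Counter()
    errors: Counter = Counter()
    for result in results:
        totals[result["tool"]] += 1
        if result["status"] == "error":
            errors[result["tool"]] += 1
    # Tools without errors are kept and reported with a rate of 0.0.
    return {tool: errors[tool] / count for tool, count in totals.items()}


# Made-up sample events, for illustration only.
rates = error_rate_by_tool([
    {"tool": "Bash", "status": "error"},
    {"tool": "Bash", "status": "success"},
    {"tool": "Read", "status": "success"},
])
```

Unlike a division between two vectors in the query language, where tools that appear on only one side are likely to drop out of the result, this version reports error-free tools explicitly.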

User Beh

...
