testing-validator

from adaptationio/skrillz

1 star · 0 forks · Updated Jan 16, 2026
npx skills add https://github.com/adaptationio/skrillz --skill testing-validator

SKILL.md

Testing Validator

Overview

testing-validator provides functional testing for Claude Code skills. It uses five systematic testing operations to check that a skill actually works in practice.

Purpose: Functional validation - ensure skills work correctly, not just look good

The 5 Testing Operations:

  1. Functional Testing - Core skill functionality works as intended
  2. Example Validation - All code/command examples execute successfully
  3. Integration Testing - Skills work correctly with dependencies and compositions
  4. Regression Testing - Updates don't break existing functionality
  5. Edge Case Testing - Handles unusual scenarios and boundary conditions

Complement to review-multi:

  • review-multi: Quality assessment (structure, content, patterns, usability) - "Is it good?"
  • testing-validator: Functional validation (does it work, examples execute, integrations function) - "Does it work?"
  • Together: Complete validation (quality + functionality)

Key Benefits:

  • Automated example execution (catch broken examples)
  • Integration validation (ensure skills compose correctly)
  • Regression prevention (detect breaks from updates)
  • Edge case coverage (handle unusual scenarios)
  • Systematic testing (consistent, repeatable)
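As an illustration of automated example checking, here is a minimal sketch that pulls fenced code blocks out of a skill's Markdown and syntax-checks the Python ones. The function names (`extract_examples`, `check_python_examples`) are hypothetical and not part of testing-validator. The sketch only compiles each example, it does not run it, so it catches syntax errors but not runtime failures.

```python
import re

# Build the fence marker programmatically so this example can itself sit inside a fence.
FENCE = "`" * 3

# Opening fence with optional language tag, lazily matched body, closing fence on its own line.
FENCE_RE = re.compile(
    rf"^{FENCE}(\w*)[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_examples(markdown_text):
    """Return (language, code) pairs for every fenced block in the text."""
    return [(m.group(1).lower(), m.group(2)) for m in FENCE_RE.finditer(markdown_text)]


def check_python_examples(markdown_text):
    """Compile each Python example; return (block_number, ok, message) tuples."""
    results = []
    for index, (lang, code) in enumerate(extract_examples(markdown_text), start=1):
        if lang not in ("python", "py"):
            continue  # other languages would need their own checker
        try:
            compile(code, f"<example {index}>", "exec")
            results.append((index, True, ""))
        except SyntaxError as exc:
            results.append((index, False, f"line {exc.lineno}: {exc.msg}"))
    return results
```

Calling `check_python_examples(open("SKILL.md").read())` would give one tuple per Python block. Any tuple whose second element is `False` points at a broken example.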

When to Use

Use testing-validator when:

  1. Pre-Deployment Testing - Validate functionality before release
  2. Example Validation - Ensure all examples execute correctly
  3. Integration Validation - Test workflow skills and dependencies
  4. Post-Update Testing - Regression testing after changes
  5. Comprehensive QA - Combined with review-multi for complete validation
  6. CI/CD Integration - Automated testing in pipelines
  7. Edge Case Validation - Test boundary conditions and unusual scenarios
  8. Functional Certification - Certify skills work correctly in practice

Prerequisites

  • A skill to test
  • An environment in which the skill's examples can be executed
  • Time allocation:
    • Quick Check: 15-30 minutes
    • Single Operation: 20-90 minutes
    • Comprehensive Testing: 2-4 hours

Operations

Operation 1: Functional Testing

Purpose: Validate core skill functionality works as intended

When to Use This Operation:

  • Testing if skill achieves stated purpose
  • Validating core functionality
  • Checking if instructions lead to successful outcomes
  • Pre-deployment functional validation

Automation Level: 30% automated (script checks), 70% manual (scenario execution)

Process:

  1. Select Test Scenarios

    • Choose 2-3 scenarios from "When to Use" section
    • Prioritize: primary use case + common case + edge case
    • Ensure scenarios cover main functionality
  2. Execute Scenarios

    • Follow the skill's instructions exactly as written
    • Complete the intended task
    • Document results (success/partial/failure)
    • Note any issues encountered
  3. Validate Outputs

    • Does skill produce expected outputs?
    • Are outputs useful and correct?
    • Do outputs match documentation?
  4. Check Error Handling

    • What happens with errors?
    • Are error messages helpful?
    • Can users recover from errors?
  5. Assess Functionality

    • Does skill achieve stated purpose?
    • Is functionality complete?
    • Are there functional gaps?
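The process above is largely manual, but the results can be recorded in a consistent shape. The following is a minimal sketch of one way to do that. `ScenarioResult` and `format_report` are hypothetical names, and the field set is an assumption based on what the steps ask the tester to document.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioResult:
    name: str      # short scenario description
    kind: str      # "primary", "common", or "edge"
    outcome: str   # "success", "partial", or "failure"
    minutes: int   # time spent executing the scenario
    issues: list = field(default_factory=list)  # free-text notes on problems


def format_report(skill_name, results):
    """Render scenario results as a plain-text report."""
    title = f"Functional Testing: {skill_name}"
    lines = [title, "=" * len(title), ""]
    for number, r in enumerate(results, start=1):
        lines.append(f"Scenario {number}: {r.name} ({r.kind})")
        lines.append(f"- Result: {r.outcome.upper()}")
        lines.append(f"- Time: {r.minutes} minutes")
        for issue in r.issues:
            lines.append(f"- Issue: {issue}")
        lines.append("")
    return "\n".join(lines)
```

Recording one `ScenarioResult` per executed scenario keeps the outcome, time, and issues together, which makes later comparison between test runs easier.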

Validation Checklist:

  • Primary use case tested (from "When to Use")
  • Common use case tested
  • Edge case tested (if applicable)
  • All scenarios completed successfully
  • Outputs correct and useful
  • Error handling works (if errors encountered)
  • Functionality complete (no gaps)
  • Skill achieves stated purpose
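The checklist can also be tracked as data. This is a small sketch under the assumption that each item is marked `True` (done), `False` (not done), or `None` (not applicable, for the "if applicable" items). `CHECKLIST` and `open_items` are hypothetical names.

```python
# Checklist items copied from the list above.
CHECKLIST = [
    'Primary use case tested (from "When to Use")',
    "Common use case tested",
    "Edge case tested (if applicable)",
    "All scenarios completed successfully",
    "Outputs correct and useful",
    "Error handling works (if errors encountered)",
    "Functionality complete (no gaps)",
    "Skill achieves stated purpose",
]


def open_items(status):
    """Return items still open: marked False or missing; None counts as not applicable."""
    return [item for item in CHECKLIST if status.get(item, False) is False]
```

An empty list from `open_items` means every applicable item has been checked off.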

Test Results:

  • PASS: All scenarios succeed, functionality complete
  • PARTIAL: Some scenarios succeed, minor issues
  • FAIL: Scenarios fail, functionality broken
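The three result levels can be derived mechanically from per-scenario outcomes. The sketch below is one possible reading of the definitions above, not the skill's own logic: it assumes outcomes are recorded as lowercase strings, that "handled" (a gracefully handled edge case) counts as passing, and that a run with no passing scenario is a FAIL. `overall_result` is a hypothetical name.

```python
PASSING = {"success", "handled"}  # assumption: a gracefully handled edge case counts as passing


def overall_result(outcomes):
    """Map a list of scenario outcomes to PASS, PARTIAL, or FAIL."""
    if not outcomes:
        raise ValueError("at least one scenario outcome is required")
    passed = [o in PASSING for o in outcomes]
    if all(passed):
        return "PASS"     # every scenario succeeded
    if any(passed):
        return "PARTIAL"  # some scenarios succeeded
    return "FAIL"         # no scenario succeeded
```

Whether a PARTIAL result should block a release is a judgment call the definitions leave to the tester. This sketch only reports the level.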

Outputs:

  • Test result (PASS/PARTIAL/FAIL)
  • Scenario execution results
  • Functional issues identified (if any)
  • Recommendations for fixes

Time Estimate: 30-90 minutes

Example:

Functional Testing: skill-researcher
====================================

Test Scenarios:
1. Primary: Research GitHub API integration patterns
2. Common: Research for skill development planning
3. Edge: Research with no results found

Scenario 1: GitHub API Integration Research
- Executed: Operation 2 (GitHub Repository Research)
- Result: ✅ SUCCESS
- Time: 25 minutes
- Output: Found 5 repositories, extracted patterns
- Functionality: Achieved purpose (research complete)

Scenario 2: Skill Development Research
- Executed: All 5 operations (including Web, GitHub, Docs, Synthesis)
- Result: ✅ SUCCESS
- Time: 60 minutes
- Output: Research synthesis with 4 sources, 3 patterns
- Functionality: Fully achieved purpose

Scenario 3: No Results Edge Case
- Executed: Web search for obscure topic
- Result: ✅ HANDLED
- Time: 10 minutes
- Output: "No results found" with guidance to adjust search
- Error Handling: Good (helpful message, suggests alternatives)

Overall Func

...