auditing-bdd-tests
from viktor-silakov/bdd-best-practices
BDD best practices skill for AI Agents
npx skills add https://github.com/viktor-silakov/bdd-best-practices --skill auditing-bdd-tests
SKILL.md
BDD Test Solution Audit
Goal: evaluate specification executability, flake resistance, maintainability, semantic/a11y quality, and AI-agent operability.
Adaptive Workflow
The workflow adapts to repository size, which is auto-detected.
┌─────────────────────────────────────────────────────────────────┐
│ 1. DISCOVER → 2. ANALYZE → 3. SCORE → 4. REPORT → 5. ROADMAP │
└─────────────────────────────────────────────────────────────────┘
↑ │
└──────────── Skip steps for small repos ──────────────┘
| Repo Size | Steps | Sampling | Questions |
|---|---|---|---|
| Small (≤20 scenarios) | 1→3→4 | None | 1 question |
| Medium (21–100) | 1→2→3→4→5 | 30–50% | 2 questions |
| Large (>100) | Full | Stratified | 3 questions |
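The size table can be read as a simple lookup from scenario count to workflow plan. The sketch below is one possible encoding, not the skill's actual implementation: `planWorkflow` and `WorkflowPlan` are hypothetical names, the stage names come from the pipeline diagram, "Full" is assumed to mean all five stages, and a repo with exactly 100 scenarios is assumed to count as medium.

```typescript
type RepoSize = "small" | "medium" | "large";

interface WorkflowPlan {
  size: RepoSize;
  stages: string[];   // pipeline stages to run, in order
  sampling: string;   // sampling mode from the table
  questions: number;  // how many questions to ask the user
}

const ALL_STAGES = ["DISCOVER", "ANALYZE", "SCORE", "REPORT", "ROADMAP"];

// Hypothetical helper: maps a scenario count to the matching table row.
function planWorkflow(scenarioCount: number): WorkflowPlan {
  if (scenarioCount <= 20) {
    // Small repos skip stages 2 and 5 (1→3→4 in the table).
    return { size: "small", stages: ["DISCOVER", "SCORE", "REPORT"], sampling: "none", questions: 1 };
  }
  if (scenarioCount <= 100) {
    return { size: "medium", stages: ALL_STAGES.slice(), sampling: "30-50%", questions: 2 };
  }
  return { size: "large", stages: ALL_STAGES.slice(), sampling: "stratified", questions: 3 };
}
```

For example, `planWorkflow(12)` yields a small-repo plan with three stages and one question.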
Step 1: Discovery & Auto-Inference
Target: {argument OR cwd}
Auto-detect (no user input needed):
| What | How to Detect |
|---|---|
| Stack | playwright.config.* → Playwright; playwright-bdd in package.json → playwright-bdd |
| Size | Count *.feature files and Scenario: lines |
| History | Check .bddready/history/index.json exists |
| CI | Check .github/workflows/, Jenkinsfile, .gitlab-ci.yml |
| Artifacts | Check playwright.config.* for trace/video/screenshot settings |
Output immediately:
Target: {path}
Stack: {stack} (auto-detected)
Size: {small/medium/large} ({N} features, {M} scenarios)
History: {yes/no} | CI: {yes/no} | Artifacts: {configured/missing}
See modules/discovery.md for detailed detection rules.
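As a rough illustration of the size detection and the `Size:` output line, the sketch below counts `Scenario:` lines in already-loaded feature file contents. The function names are hypothetical, file reading is left out, and `Scenario Outline:` lines are deliberately not counted because the table only mentions `Scenario:`; the detailed rules in modules/discovery.md may differ.

```typescript
// Counts lines that begin with "Scenario:" (leading whitespace ignored).
// Assumption: "Scenario Outline:" lines are not counted.
function countScenarios(featureText: string): number {
  return featureText
    .split(/\r?\n/)
    .filter((line) => /^\s*Scenario:/.test(line)).length;
}

// Builds the "Size: ..." line from the contents of all *.feature files.
function sizeLine(featureTexts: string[]): string {
  const scenarios = featureTexts.reduce((sum, text) => sum + countScenarios(text), 0);
  const size = scenarios <= 20 ? "small" : scenarios <= 100 ? "medium" : "large";
  return `Size: ${size} (${featureTexts.length} features, ${scenarios} scenarios)`;
}
```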
Step 2: Sampling (Medium/Large repos only)
Skip for small repos — analyze all scenarios.
For medium/large repos, use stratified sampling. See modules/sampling.md.
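The actual sampling rules live in modules/sampling.md, which is not reproduced here. As a generic illustration of stratified sampling only, the sketch below treats each feature directory as a stratum and takes the same fraction from each, keeping at least one scenario per non-empty stratum. `Stratum`, `sampleStratified`, and the deterministic "first N" pick are all assumptions made for the example.

```typescript
interface Stratum {
  name: string;        // e.g. a feature directory such as "features/auth"
  scenarios: string[]; // scenario titles found in that stratum
}

// Takes the same fraction from every stratum, with a minimum of one
// scenario per non-empty stratum so no area is left out of the audit.
function sampleStratified(strata: Stratum[], fraction: number): Stratum[] {
  if (fraction <= 0 || fraction > 1) {
    throw new RangeError("fraction must be in (0, 1]");
  }
  return strata.map((s) => {
    const quota =
      s.scenarios.length === 0 ? 0 : Math.max(1, Math.ceil(s.scenarios.length * fraction));
    // Deterministic pick for reproducibility; a real audit might shuffle first.
    return { name: s.name, scenarios: s.scenarios.slice(0, quota) };
  });
}
```

With a fraction of 0.4, a stratum of 8 scenarios contributes 4 and a stratum of 12 contributes 5.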
Progress Indicator (Medium/Large repos)
For repositories with 50+ scenarios, show progress during analysis:
Analyzing BDD Test Solution...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0%
[■■■■■■■■■■░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 25%
✓ Discovery complete (playwright-bdd detected)
→ Analyzing features/auth/*.feature (8 scenarios)
[■■■■■■■■■■■■■■■■■■■■░░░░░░░░░░░░░░░░░░░░] 50%
→ Analyzing features/checkout/*.feature (12 scenarios)
[■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■░░░░░░░░░░] 75%
→ Scoring aspects...
[■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■] 100%
✓ Analysis complete
Progress stages:
- Discovery (10%)
- Feature file analysis (10-70%, proportional to file count)
- Step definition analysis (70-85%)
- Scoring (85-95%)
- Report generation (95-100%)
Update progress after each feature file or major step.
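The proportional 10-70% range for feature file analysis can be computed with a linear interpolation. The following sketch shows that arithmetic plus a bar renderer; the function names are hypothetical, and the 40-cell bar width is inferred from the sample output above rather than stated by the skill.

```typescript
const BAR_WIDTH = 40; // assumption: inferred from the sample progress bars

// Maps "files analyzed so far" onto the 10-70% band of the overall progress.
function featureAnalysisProgress(filesDone: number, filesTotal: number): number {
  if (filesTotal <= 0) return 70; // nothing to analyze: the stage is complete
  const fraction = Math.min(Math.max(filesDone / filesTotal, 0), 1);
  return Math.round(10 + fraction * 60);
}

// Renders a bar such as "[■■■■■■■■■■░░...░] 25%".
function renderBar(percent: number): string {
  const clamped = Math.min(Math.max(percent, 0), 100);
  const filled = Math.round((clamped / 100) * BAR_WIDTH);
  let bar = "";
  for (let i = 0; i < BAR_WIDTH; i++) {
    bar += i < filled ? "■" : "░";
  }
  return `[${bar}] ${Math.round(clamped)}%`;
}
```

Calling `renderBar(featureAnalysisProgress(5, 10))` would draw the bar at 40%, the midpoint of the feature analysis stage.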
Step 3: Score Aspects
Score each aspect using rubrics from criteria/aspects.md.
Aspects and weights:
| # | Aspect | Weight |
|---|---|---|
| 1 | Executable Gherkin | 16% |
| 2 | Step Definitions Quality | 14% |
| 3 | Test Architecture | 14% |
| 4 | Selector Strategy | 12% |
| 5 | Waiting & Flake Resistance | 14% |
| 6 | Data & Environment | 10% |
| 7 | CI, Reporting & Artifacts | 10% |
| 8 | AI-Agent Operability | 10% |
Scoring: 0 (bad) / 5 (partial) / 10 (good) per criterion.
See modules/scoring.md for calculation formulas.
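The exact formulas are defined in modules/scoring.md and are not shown here. As one plausible reading, the sketch below assumes an aspect score is the mean of its criterion scores (each 0, 5, or 10) and the overall score is the weighted mean of the aspect scores, scaled to 0-100. The weights are taken from the table above; the function names and both formulas are assumptions.

```typescript
type CriterionScore = 0 | 5 | 10;

interface AspectWeight {
  aspect: string;
  weight: number; // percent; the eight weights sum to 100
}

const ASPECT_WEIGHTS: AspectWeight[] = [
  { aspect: "Executable Gherkin", weight: 16 },
  { aspect: "Step Definitions Quality", weight: 14 },
  { aspect: "Test Architecture", weight: 14 },
  { aspect: "Selector Strategy", weight: 12 },
  { aspect: "Waiting & Flake Resistance", weight: 14 },
  { aspect: "Data & Environment", weight: 10 },
  { aspect: "CI, Reporting & Artifacts", weight: 10 },
  { aspect: "AI-Agent Operability", weight: 10 },
];

// Assumed formula: mean of the criterion scores, on a 0-10 scale.
function aspectScore(criteria: CriterionScore[]): number {
  if (criteria.length === 0) throw new Error("an aspect needs at least one criterion");
  let sum = 0;
  for (const c of criteria) sum += c;
  return sum / criteria.length;
}

// Assumed formula: weighted mean of aspect scores, scaled to 0-100.
// `scores` must follow the order of ASPECT_WEIGHTS.
function overallScore(scores: number[]): number {
  if (scores.length !== ASPECT_WEIGHTS.length) {
    throw new Error("expected one score per aspect");
  }
  let weighted = 0;
  let totalWeight = 0;
  for (let i = 0; i < ASPECT_WEIGHTS.length; i++) {
    weighted += ASPECT_WEIGHTS[i].weight * scores[i];
    totalWeight += ASPECT_WEIGHTS[i].weight;
  }
  return (weighted * 10) / totalWeight;
}
```

Under these assumptions, a suite that scores 10 everywhere except 0 on Executable Gherkin lands at 84, which reflects that aspect's 16% weight.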
Step 4: Report
4.1 Terminal Output (Always)
Print ASCII dashboard with scores and issues. See modules/output-formats.md.
4.2 Issues by Severity
Classify using reference/severity.md:
- 🔴 CRITICAL — blocks reliable execution
- 🟡 WARNING — hinders speed/maintainability
- 🔵 INFO — optimizations
Every issue MUST have:
- Evidence (file path, pattern, or code snippet)
- Impact (why it matters)
- Effort estimate (Low/Medium/High)
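The three required fields lend themselves to a typed record with a small completeness check. This is only a sketch: the `Issue` shape, its extra fields (`id`, `title`), and `missingRequiredFields` are hypothetical and not taken from the skill's report schema.

```typescript
type Severity = "CRITICAL" | "WARNING" | "INFO";
type Effort = "Low" | "Medium" | "High";

// Hypothetical issue record; only evidence, impact, and effort are
// mandated by the text above.
interface Issue {
  id: string;
  severity: Severity;
  title: string;
  evidence: string; // file path, pattern, or code snippet
  impact: string;   // why it matters
  effort: Effort;
}

// Returns the names of required fields that are absent or blank.
function missingRequiredFields(issue: Partial<Issue>): string[] {
  const missing: string[] = [];
  if (!issue.evidence || issue.evidence.trim() === "") missing.push("evidence");
  if (!issue.impact || issue.impact.trim() === "") missing.push("impact");
  if (!issue.effort) missing.push("effort");
  return missing;
}
```

An issue would be reported only when `missingRequiredFields(issue)` returns an empty array.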
4.3 Save Reports
Save to .bddready/history/reports/:
- {REPORT_ID}.json: machine-readable
- {REPORT_ID}.md: human-readable
Update .bddready/history/index.json for delta tracking.
4.4 HTML Report (Offer to User)
After showing terminal output, ask:
Would you like me to generate an interactive HTML report?
If yes, run:
node scripts/render-html.mjs .bddready/history/reports/{REPORT_ID}.json .bddready/history/reports/{REPORT_ID}.html
Interactive Fix Mode
After showing issues, offer to fix quick wins immediately.
Trigger Conditions
Offer interactive fixes when:
- At least 1 CRITICAL issue with Effort: Low
- Issue has clear, automatable fix pattern
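The trigger conditions can be expressed as a filter over the issue list. The sketch below assumes both conditions must hold for the same issue, which the text does not state explicitly; `AuditIssue`, `hasAutomatableFix`, and the function names are hypothetical.

```typescript
interface AuditIssue {
  id: string;
  severity: "CRITICAL" | "WARNING" | "INFO";
  effort: "Low" | "Medium" | "High";
  hasAutomatableFix: boolean; // hypothetical flag: a clear fix pattern exists
}

// Issues that qualify for an interactive quick fix.
// Assumption: severity, effort, and fix pattern must all match on one issue.
function quickFixCandidates(issues: AuditIssue[]): AuditIssue[] {
  return issues.filter(
    (i) => i.severity === "CRITICAL" && i.effort === "Low" && i.hasAutomatableFix
  );
}

// Interactive fix mode is offered when at least one candidate exists.
function shouldOfferQuickFix(issues: AuditIssue[]): boolean {
  return quickFixCandidates(issues).length >= 1;
}
```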
Flow
╔══════════════════════════════════════════════════════════════════╗
║ QUICK FIX AVAILABLE ║
╠══════════════════════════════════════════════════════════════════╣
║ [C1] Flake Resistance: Found 7 arbitrary sleeps ║
║ Fix: Replace `wait X seconds` with condition waits ║
║ Effort: Low | Files: 3 ║
║ ║
║ → Fix C1 now? [y/n/skip all] ║
╚═══════════════════════════════════════════
...