hierarchical-matching-systems
from petekp/agent-skills
Agent skills for UI development, parameter tuning, and design iteration
npx skills add https://github.com/petekp/agent-skills --skill hierarchical-matching-systemsSKILL.md
Hierarchical Matching Systems
This skill provides rigid diagnostic and architectural procedures for hierarchical matching systems. Follow checklists exactly—matching bugs often hide in skipped steps.
Quick Reference
- Algorithm selection: See references/decision-guide.md
- Algorithm details: See references/algorithms.md
1. Problem Classification Checklist
Before any work, classify the problem. Check ALL that apply:
□ TWO-SIDED: Both sides have preferences (students↔schools, workers↔jobs)
□ ONE-SIDED: Only one side has preferences (tasks→workers, items→bins)
□ HIERARCHICAL: Entities exist at multiple levels (org→dept→team→person)
□ WEIGHTED: Matches have costs/scores to optimize
□ CONSTRAINED: Hard limits exist (capacity, exclusions, required pairings)
□ STABLE: Solution must resist defection (no blocking pairs)
□ OPTIMAL: Solution must minimize/maximize objective function
□ FUZZY: Entities may partially match (entity resolution, deduplication)
Classification determines algorithm family. Proceed to Section 2 for architecture or Section 3 for debugging.
2. Architecture Procedure
Follow these phases in order when designing a new matching system.
Phase 2.1: Requirements Translation
Convert each business requirement into formal constraints:
| Business Requirement | Constraint Type | Formal Expression |
|---|---|---|
| "Each student gets one school" | Capacity | ` |
| "Schools have seat limits" | Capacity | ` |
| "Siblings must be together" | Coupling | school(s1) = school(s2) if siblings(s1,s2) |
| "Student X cannot attend Y" | Exclusion | (X, Y) ∉ matches |
| "Priority for residents" | Preference ordering | rank(resident) < rank(non-resident) |
Checklist:
□ List ALL business requirements
□ Classify each as: capacity | coupling | exclusion | ordering | soft preference
□ Identify conflicts between requirements (document tradeoffs)
□ Distinguish HARD constraints (must satisfy) from SOFT (optimize toward)
□ Validate translations with stakeholder examples
Phase 2.2: Algorithm Selection
Use references/decision-guide.md to select algorithm. Verify:
□ Algorithm handles all HARD constraints
□ Algorithm can optimize SOFT constraints (or document gaps)
□ Complexity acceptable for data size (see references/algorithms.md)
□ Stability requirements met (if two-sided)
□ Optimality requirements met (if weighted)
Phase 2.3: Data Model Design
Define entities, relationships, and preference representation:
□ Entity schema for each side (attributes, identifiers)
□ Preference representation (ranked list | score matrix | pairwise comparisons)
□ Constraint encoding (how exclusions/couplings are stored)
□ Hierarchy representation (if multi-level: tree | DAG | adjacency list)
□ Tie-breaking rules (deterministic ordering for equal preferences)
Phase 2.4: Interface Contracts
Specify inputs, outputs, and invariants:
Input Contract:
□ Preference format and validation rules
□ Constraint format and validation rules
□ Required vs optional fields
□ How missing preferences are handled (reject | default rank | exclude)
Output Contract:
□ Match format (pairs | assignment map | ranked list)
□ Unmatched entity handling (explicit list | null matches | error)
□ Match metadata (scores, stability proof, constraint satisfaction report)
Invariants:
□ Determinism: same input → same output (document randomness if any)
□ Completeness: all entities matched OR explicitly unmatched
□ Validity: all matches satisfy hard constraints
Phase 2.5: Testing Strategy
Define validation before implementation:
□ Unit tests for preference parsing and constraint validation
□ Property tests: stability, optimality, constraint satisfaction
□ Edge cases: empty inputs, single entity, all tied preferences
□ Regression tests from known-good examples
□ Performance benchmarks at target scale
3. Debugging Procedure
Follow this diagnostic sequence for any matching issue. Do not skip steps.
Phase 3.1: Symptom Classification
Identify the symptom category:
| Symptom | Category | Go To |
|---|---|---|
| Same inputs, different outputs | INSTABILITY | 3.2 |
| Matches violate business rules | CONSTRAINT VIOLATION | 3.3 |
| Matches technically valid but "wrong" | PREFERENCE MISALIGNMENT | 3.4 |
| Errors with nested/hierarchical data | HIERARCHY BUG | 3.5 |
| Poor performance at scale | PERFORMANCE | 3.6 |
Phase 3.2: Instability Diagnosis
Root causes of non-deterministic matches:
□ RANDOMNESS: Check for unseeded RNG in tie-breaking
→ Fix: Use deterministic tie-breaker (lexicographic ID, timestamp)
□ FLOATING POINT: Check score comparisons for floating point issues
→ Fix: Use epsilon compa
...