Hierarchical Matching Systems

This skill provides rigid diagnostic and architectural procedures for hierarchical matching systems. Follow checklists exactly—matching bugs often hide in skipped steps.

Quick Reference

Algorithm selection: See references/decision-guide.md
Algorithm details: See references/algorithms.md

1. Problem Classification Checklist

Before any work, classify the problem. Check ALL that apply:

□ TWO-SIDED: Both sides have preferences (students↔schools, workers↔jobs)
□ ONE-SIDED: Only one side has preferences (tasks→workers, items→bins)
□ HIERARCHICAL: Entities exist at multiple levels (org→dept→team→person)
□ WEIGHTED: Matches have costs/scores to optimize
□ CONSTRAINED: Hard limits exist (capacity, exclusions, required pairings)
□ STABLE: Solution must resist defection (no blocking pairs)
□ OPTIMAL: Solution must minimize/maximize objective function
□ FUZZY: Entities may partially match (entity resolution, deduplication)

Classification determines algorithm family. Proceed to Section 2 for architecture or Section 3 for debugging.

2. Architecture Procedure

Follow these phases in order when designing a new matching system.

Phase 2.1: Requirements Translation

Convert each business requirement into formal constraints:

Business Requirement	Constraint Type	Formal Expression
"Each student gets one school"	Capacity	`
"Schools have seat limits"	Capacity	`
"Siblings must be together"	Coupling	`school(s1) = school(s2) if siblings(s1,s2)`
"Student X cannot attend Y"	Exclusion	`(X, Y) ∉ matches`
"Priority for residents"	Preference ordering	`rank(resident) < rank(non-resident)`

Checklist:

□ List ALL business requirements
□ Classify each as: capacity | coupling | exclusion | ordering | soft preference
□ Identify conflicts between requirements (document tradeoffs)
□ Distinguish HARD constraints (must satisfy) from SOFT (optimize toward)
□ Validate translations with stakeholder examples

Phase 2.2: Algorithm Selection

Use references/decision-guide.md to select algorithm. Verify:

□ Algorithm handles all HARD constraints
□ Algorithm can optimize SOFT constraints (or document gaps)
□ Complexity acceptable for data size (see references/algorithms.md)
□ Stability requirements met (if two-sided)
□ Optimality requirements met (if weighted)

Phase 2.3: Data Model Design

Define entities, relationships, and preference representation:

□ Entity schema for each side (attributes, identifiers)
□ Preference representation (ranked list | score matrix | pairwise comparisons)
□ Constraint encoding (how exclusions/couplings are stored)
□ Hierarchy representation (if multi-level: tree | DAG | adjacency list)
□ Tie-breaking rules (deterministic ordering for equal preferences)

Phase 2.4: Interface Contracts

Specify inputs, outputs, and invariants:

Input Contract:

□ Preference format and validation rules
□ Constraint format and validation rules
□ Required vs optional fields
□ How missing preferences are handled (reject | default rank | exclude)

Output Contract:

□ Match format (pairs | assignment map | ranked list)
□ Unmatched entity handling (explicit list | null matches | error)
□ Match metadata (scores, stability proof, constraint satisfaction report)

Invariants:

□ Determinism: same input → same output (document randomness if any)
□ Completeness: all entities matched OR explicitly unmatched
□ Validity: all matches satisfy hard constraints

Phase 2.5: Testing Strategy

Define validation before implementation:

□ Unit tests for preference parsing and constraint validation
□ Property tests: stability, optimality, constraint satisfaction
□ Edge cases: empty inputs, single entity, all tied preferences
□ Regression tests from known-good examples
□ Performance benchmarks at target scale

3. Debugging Procedure

Follow this diagnostic sequence for any matching issue. Do not skip steps.

Phase 3.1: Symptom Classification

Identify the symptom category:

Symptom	Category	Go To
Same inputs, different outputs	INSTABILITY	3.2
Matches violate business rules	CONSTRAINT VIOLATION	3.3
Matches technically valid but "wrong"	PREFERENCE MISALIGNMENT	3.4
Errors with nested/hierarchical data	HIERARCHY BUG	3.5
Poor performance at scale	PERFORMANCE	3.6

Phase 3.2: Instability Diagnosis

Root causes of non-deterministic matches:

□ RANDOMNESS: Check for unseeded RNG in tie-breaking
   → Fix: Use deterministic tie-breaker (lexicographic ID, timestamp)

□ FLOATING POINT: Check score comparisons for floating point issues
   → Fix: Use epsilon compa

...

hierarchical-matching-systems

SKILL.md