# agent-architecture-analysis

Claude Code plugin for code review skills and verification workflows. Supports Python, Go, React, FastAPI, BubbleTea, and AI frameworks (Pydantic AI, LangGraph, Vercel AI SDK).

```bash
npx skills add https://github.com/existential-birds/beagle --skill agent-architecture-analysis
```
# 12-Factor Agents Compliance Analysis

Reference: [12-Factor Agents](https://github.com/humanlayer/12-factor-agents)
## Input Parameters

| Parameter | Description | Required |
|---|---|---|
| `codebase_path` | Root path of the codebase to analyze | Required |
| `docs_path` | Path to documentation directory (for existing analyses) | Optional |
## Analysis Framework

### Factor 1: Natural Language to Tool Calls

**Principle:** Convert natural language inputs into structured, deterministic tool calls using schema-validated outputs.
Search Patterns:
# Look for Pydantic schemas
grep -r "class.*BaseModel" --include="*.py"
grep -r "TaskDAG\|TaskResponse\|ToolCall" --include="*.py"
# Look for JSON schema generation
grep -r "model_json_schema\|json_schema" --include="*.py"
# Look for structured output generation
grep -r "output_type\|response_model" --include="*.py"
**File Patterns:** `**/agents/*.py`, `**/schemas/*.py`, `**/models/*.py`

**Compliance Criteria:**
| Level | Criteria |
|---|---|
| Strong | All LLM outputs use Pydantic/dataclass schemas with validators |
| Partial | Some outputs typed, but dict returns or unvalidated strings exist |
| Weak | LLM returns raw strings parsed manually or with regex |
**Anti-patterns:**

- `json.loads(llm_response)` without schema validation
- `output.split()` or regex parsing of LLM responses
- `dict[str, Any]` return types from agents
- No validation between LLM output and handler execution
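
For reference, a minimal sketch of the Strong level using the Pydantic v2 API; `ToolCall`, `parse_tool_call`, and the tool names are hypothetical, not part of this skill:

```python
from pydantic import BaseModel, ValidationError, field_validator


class ToolCall(BaseModel):
    """Structured output the LLM must produce."""

    tool_name: str
    arguments: dict[str, str]

    @field_validator("tool_name")
    @classmethod
    def known_tool(cls, v: str) -> str:
        # Reject hallucinated tools before anything executes.
        if v not in {"search", "summarize"}:
            raise ValueError(f"unknown tool: {v}")
        return v


def parse_tool_call(raw: str) -> ToolCall:
    # Parse and validate in one step; invalid LLM output never
    # reaches a handler.
    try:
        return ToolCall.model_validate_json(raw)
    except ValidationError as exc:
        raise ValueError(f"LLM output failed validation: {exc}") from exc
```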
### Factor 2: Own Your Prompts

**Principle:** Treat prompts as first-class code you control, version, and iterate on.
Search Patterns:
# Look for embedded prompts
grep -r "SYSTEM_PROMPT\|system_prompt" --include="*.py"
grep -r '""".*You are' --include="*.py"
# Look for template systems
grep -r "jinja\|Jinja\|render_template" --include="*.py"
find . -name "*.jinja2" -o -name "*.j2"
# Look for prompt directories
find . -type d -name "prompts"
**File Patterns:** `**/prompts/**`, `**/templates/**`, `**/agents/*.py`

**Compliance Criteria:**
| Level | Criteria |
|---|---|
| Strong | Prompts in separate files, templated (Jinja2), versioned |
| Partial | Prompts as module constants, some parameterization |
| Weak | Prompts hardcoded inline in functions, f-strings only |
**Anti-patterns:**

- `f"You are a {role}..."` inline in agent methods
- Prompts mixed with business logic
- No way to iterate on prompts without code changes
- No prompt versioning or A/B testing capability
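
A minimal sketch of the Strong level, assuming Jinja2 and a hypothetical `prompts/reviewer.j2` template file:

```python
from jinja2 import Environment, FileSystemLoader

# Templates live in a versioned prompts/ directory, separate from code.
env = Environment(loader=FileSystemLoader("prompts"))


def render_system_prompt(role: str, constraints: list[str]) -> str:
    # Editing prompts/reviewer.j2 needs no code change, and the file's
    # git history doubles as prompt versioning.
    return env.get_template("reviewer.j2").render(
        role=role, constraints=constraints
    )
```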
### Factor 3: Own Your Context Window

**Principle:** Control how history, state, and tool results are formatted for the LLM.
Search Patterns:
# Look for context/message management
grep -r "AgentMessage\|ChatMessage\|messages" --include="*.py"
grep -r "context_window\|context_compiler" --include="*.py"
# Look for custom serialization
grep -r "to_xml\|to_context\|serialize" --include="*.py"
# Look for token management
grep -r "token_count\|max_tokens\|truncate" --include="*.py"
**File Patterns:** `**/context/*.py`, `**/state/*.py`, `**/core/*.py`

**Compliance Criteria:**
| Level | Criteria |
|---|---|
| Strong | Custom context format, token optimization, typed events, compaction |
| Partial | Basic message history with some structure |
| Weak | Raw message accumulation, standard OpenAI format only |
**Anti-patterns:**
- Unbounded message accumulation
- Large artifacts embedded inline (diffs, files)
- No agent-specific context filtering
- Same context for all agent types
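
As a sketch of the Strong level, a compacted, agent-specific context compiler; the event type and the XML-ish rendering are illustrative, not from this skill:

```python
from dataclasses import dataclass


@dataclass
class ToolEvent:
    tool: str
    summary: str  # compacted result, not the full artifact


def compile_context(events: list[ToolEvent], max_events: int = 20) -> str:
    # Keep only recent events and render them in a compact,
    # agent-specific format instead of raw provider messages.
    recent = events[-max_events:]
    return "\n".join(
        f'<event tool="{e.tool}">{e.summary}</event>' for e in recent
    )
```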
### Factor 4: Tools Are Structured Outputs

**Principle:** Tools produce schema-validated JSON that triggers deterministic code, not magic function calls.
Search Patterns:
# Look for tool/response schemas
grep -r "class.*Response.*BaseModel" --include="*.py"
grep -r "ToolResult\|ToolOutput" --include="*.py"
# Look for deterministic handlers
grep -r "def handle_\|def execute_" --include="*.py"
# Look for validation layer
grep -r "model_validate\|parse_obj" --include="*.py"
**File Patterns:** `**/tools/*.py`, `**/handlers/*.py`, `**/agents/*.py`

**Compliance Criteria:**
| Level | Criteria |
|---|---|
| Strong | All tool outputs schema-validated, handlers type-safe |
| Partial | Most tools typed, some loose dict returns |
| Weak | Tools return arbitrary dicts, no validation layer |
**Anti-patterns:**

- Tool handlers that directly execute LLM output
- `eval()` or `exec()` on LLM-generated code
- No separation between decision (LLM) and execution (code)
- Magic method dispatch based on string matching
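
A minimal sketch of the decision/execution split; `DeployRequest`, `handle_deploy`, and `HANDLERS` are hypothetical names (Pydantic v2 API):

```python
from pydantic import BaseModel


class DeployRequest(BaseModel):
    service: str
    version: str


def handle_deploy(req: DeployRequest) -> str:
    # Deterministic code path: testable, no LLM past this point.
    return f"deploying {req.service}@{req.version}"


# Explicit registry: the LLM's validated output selects a handler;
# there is no eval(), exec(), or string-based method dispatch.
HANDLERS = {"deploy": (DeployRequest, handle_deploy)}


def execute(tool_name: str, payload: dict) -> str:
    schema, handler = HANDLERS[tool_name]
    return handler(schema.model_validate(payload))
```

The registry keeps the LLM's role to producing a validated payload; everything after validation is plain code.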
### Factor 5: Unify Execution State

**Principle:** Merge execution state (step, retries) with business state (messages, results).
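
A minimal sketch of what unified state can look like, assuming Pydantic; the field names are illustrative:

```python
from pydantic import BaseModel, Field


class AgentThread(BaseModel):
    # Execution state
    current_step: str = "start"
    retry_count: int = 0
    # Business state, in the same serializable unit, so the whole
    # thread can be checkpointed and resumed together
    messages: list[dict] = Field(default_factory=list)
    results: dict[str, str] = Field(default_factory=dict)
```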
Search Patterns:
# Look for state models
grep -r "ExecutionState\|WorkflowState\|Thread" --include="*.py"
# Look for dual state systems
grep -r "checkpoint\|MemorySaver" --include="*.py"
grep
...