Every major AI agent framework puts the LLM in the decision loop. Herald is the only architecture where deterministic code owns every routing and tool-selection decision. Here's what that means in practice.
| Capability | Claude CLI | AutoGPT | ChatGPT | Herald / Skeptic |
|---|---|---|---|---|
| Cost per turn | $0.01 - $0.10 | High (many API calls per task) | $0.01 - $0.05 | $0.00 (local hardware) |
| Hallucination control | System Prompts | Self-Reflection | System Prompts | Fact-anchored output gating & Judge |
| Routing | LLM tool calling | LLM agent loop | LLM tool calling | 5-stage deterministic pipeline (100% of known intents) |
| Tool selection | LLM decides | LLM decides | LLM decides | Deterministic code |
| LLM calls per turn | 2-5+ | 5-20+ | 1-3+ | 0-1 |
| Offline capability | None | None | None | Full (degraded mode) |
| Decision auditability | Partial (chain traces) | Minimal | Partial (run logs) | 100% (code paths) |
| Hallucinated tool calls | Possible | Frequent | Possible | Structurally impossible |
| Typical latency | 1-5s | 10-60s | 1-3s | Sub-100ms routing (+ TTS init delay) |
| Heavy tasks (research, code) | 5-30s | 30-120s | 5-30s | 5-30s |
| World-state model | None (stateless) | Task list only | Thread context | Persistent structured world-state |
| Multi-model coordination | Manual chains | Single model loop | Single model | 10 model seats (7 distinct models) |
| Output verification | None built-in | Self-reflection (same model) | None built-in | Evidence-grounded judge |
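The routing, tool-selection, and "LLM calls per turn" rows above follow from one design choice: known intents are matched and dispatched entirely in code, and the LLM is reached only as a fallback. Here is a minimal sketch of that idea; all names (`ROUTES`, `get_weather`, `route`) are hypothetical illustrations, not Herald's actual API.

```python
import re
from typing import Callable

# Hypothetical tool handlers -- plain functions, so a hallucinated
# tool call is structurally impossible: only code invokes them.
def get_weather(text: str) -> str:
    return "weather: sunny"  # placeholder tool result

def set_timer(text: str) -> str:
    return "timer set"

# Deterministic intent table: pattern -> handler.
ROUTES: list[tuple[re.Pattern, Callable[[str], str]]] = [
    (re.compile(r"\bweather\b", re.I), get_weather),
    (re.compile(r"\btimer\b", re.I), set_timer),
]

def route(text: str) -> tuple[str, int]:
    """Return (response, llm_calls). Known intents never touch the LLM."""
    for pattern, handler in ROUTES:
        if pattern.search(text):
            return handler(text), 0   # deterministic path: 0 LLM calls
    # Unknown intent: defer to exactly one LLM call (not shown here).
    return "(fallback to single LLM call)", 1

route("what's the weather?")  # -> ("weather: sunny", 0)
```

Because dispatch is a table lookup rather than a model decision, every path is auditable by reading the code, and a known intent costs zero LLM calls per turn.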
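The "fact-anchored output gating" and "evidence-grounded judge" rows can be sketched the same way. The following is an illustrative assumption about how such a gate might work (simple token-overlap scoring, hypothetical names `is_grounded` and `gate`), not Herald's actual judge:

```python
def is_grounded(claim: str, evidence: list[str], threshold: float = 0.5) -> bool:
    """Illustrative check: does any evidence string cover enough of the claim?"""
    claim_tokens = set(claim.lower().split())
    if not claim_tokens:
        return False
    best = 0.0
    for fact in evidence:
        fact_tokens = set(fact.lower().split())
        overlap = len(claim_tokens & fact_tokens) / len(claim_tokens)
        best = max(best, overlap)
    return best >= threshold

def gate(claims: list[str], evidence: list[str]) -> list[str]:
    """Emit only claims the judge can anchor to evidence; drop the rest."""
    return [c for c in claims if is_grounded(c, evidence)]
```

Unlike self-reflection, where the same model grades its own output, the gate here is ordinary code operating on retrieved evidence, so a claim with no supporting fact simply never reaches the user.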
The question isn't whether this works. Next: meet the full OS layer, 10 seats coordinated in a sequential relay.