From Constrained LLM Renderer to Local Assistant Operating System
Every AI agent framework puts the LLM at the center of every decision. Herald inverts this entirely: a five-stage deterministic cascade routes requests, selects tools, and assembles facts across ten specialist model seats. The LLM renders once, at the end, constrained to rephrasing what the code already knows. A sequential relay manages VRAM so all ten seats share a single consumer GPU.
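To make the inversion concrete, here is a minimal sketch of the pattern: deterministic code handles routing, tool selection, and fact assembly, and only the final render step would touch an LLM. This is an illustrative assumption, not Herald's actual code; every name here (`Context`, `route`, `select_tools`, `assemble_facts`, `render`) is hypothetical, and the cascade is condensed from five stages to three deterministic stages plus the render.

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    query: str
    seat: str = ""                                  # which specialist seat handles this
    tools: list[str] = field(default_factory=list)  # tools chosen deterministically
    facts: list[str] = field(default_factory=list)  # structured facts from tool runs

def route(ctx: Context) -> Context:
    # Stage 1: deterministic routing, e.g. keyword match -- no LLM involved.
    ctx.seat = "weather" if "weather" in ctx.query.lower() else "general"
    return ctx

def select_tools(ctx: Context) -> Context:
    # Stage 2: tool selection is a lookup table per seat, not a model call.
    ctx.tools = {"weather": ["forecast_api"], "general": []}[ctx.seat]
    return ctx

def assemble_facts(ctx: Context) -> Context:
    # Stage 3: run tools and collect structured facts (stubbed here).
    ctx.facts = [f"{tool}: (result)" for tool in ctx.tools]
    return ctx

def render(ctx: Context) -> str:
    # Final stage: the only place an LLM would run, constrained to
    # rephrasing the assembled facts. Stubbed as plain string formatting.
    return f"[{ctx.seat}] " + "; ".join(ctx.facts or ["no tools needed"])

def cascade(query: str) -> str:
    ctx = Context(query)
    for stage in (route, select_tools, assemble_facts):
        ctx = stage(ctx)
    return render(ctx)
```

The key property: by the time `render` runs, every decision has already been made by plain code, so the model can only rephrase, not decide.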
The question isn't whether this works.
Herald is the pattern. Skeptic is the operating system. 10 model seats. Zero cloud dependency.