Design evals for people who can see the problem.

AHD turns taste into repeatable checks: briefs, rules, live runs, and critique hooks for teams tired of reviewing another plausible interface by hand.

It is not a lecture about bad generated pages. It is a working framework for naming the failure modes, running the same brief across models, and keeping the artifacts inspectable.

The output is deliberately plain: code, tokens, rules, manifests, and a clear account of the parts that still need keys or infrastructure.

eval-live

$ ahd eval-live swiss-editorial --brief briefs/landing.yml --models <specs> --n 10
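To make the shape of that command concrete, here is a minimal sketch of what "the same brief across models, n samples each" could look like as a loop. It is not AHD's implementation: `run_eval`, `fake_generate`, `fake_lint`, the model names, and the rule id `no-gradient-hero` are all hypothetical stand-ins, and the generator is a stub rather than a real model call.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical signatures: a generator takes (model, brief, sample index)
# and returns page markup; a linter takes markup and returns rule ids it violates.
Generate = Callable[[str, str, int], str]
Lint = Callable[[str], List[str]]


def run_eval(brief: str, models: List[str], n: int,
             generate: Generate, lint: Lint) -> Dict[str, Counter]:
    """Run one brief n times per model and tally rule violations per model."""
    results: Dict[str, Counter] = {}
    for model in models:
        tally: Counter = Counter()
        for sample in range(n):
            page = generate(model, brief, sample)
            tally.update(lint(page))  # count each violated rule id
        results[model] = tally
    return results


def fake_generate(model: str, brief: str, sample: int) -> str:
    # Stub standing in for a live model call (assumption, not a real API).
    if model == "model-a" and sample % 2 == 0:
        return "<h1 style='background: linear-gradient(red, blue)'>Hello</h1>"
    return "<h1>Hello</h1>"


def fake_lint(page: str) -> List[str]:
    # One illustrative rule; the rule id is made up for this sketch.
    violations = []
    if "linear-gradient" in page:
        violations.append("no-gradient-hero")
    return violations


report = run_eval("briefs/landing.yml", ["model-a", "model-b"], 10,
                  fake_generate, fake_lint)
for model, tally in report.items():
    print(model, dict(tally))
```

Keeping the generator and linter as plain callables means the same loop can run against stubs in a test and against live calls once keys are available, and the per-model tallies stay inspectable as ordinary data.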

What ships

brief compiler
28-rule linter
eval harness
vision-critic scaffold
MCP server
editor plugins
eight tokens

What is gated

live frontier-model calls (needs API keys)
live vision critique (needs multimodal key + screenshot pipeline)
standalone npm packages for the editor plugins
additional tokens

AHD is for teams who want generated design to face the same kind of review pressure as code: explicit fixtures, repeatable runs, visible constraints, and no decorative proof that cannot be inspected.

read-the-code