Design evals for people who can see the problem.

AHD turns taste into repeatable checks: briefs, rules, live runs, and critique hooks for teams tired of reviewing another plausible interface by hand.

It is not a lecture about bad generated pages. It is a working framework for naming the failure modes, running the same brief across models, and keeping the artifacts inspectable.

The output is deliberately plain: code, tokens, rules, manifests, and the parts that still need keys or infrastructure.

What ships

  • brief compiler
  • 28-rule linter
  • eval harness
  • vision-critic scaffold
  • MCP server
  • editor plugins
  • eight tokens

What is gated

  • live frontier-model calls (needs API keys)
  • live vision critique (needs multimodal key + screenshot pipeline)
  • standalone npm packages for the editor plugins
  • additional tokens

AHD is for teams who want generated design to face the same kind of review pressure as code: explicit fixtures, repeatable runs, visible constraints, and no decorative proof that cannot be inspected.

read-the-code