Artificial Human Design
A framework for evaluating web design with explicit rules, not model optimism. You write a brief in YAML, AHD compiles it, runs a 28-rule linter, and scores candidates against a vision-critic scaffold. No rankings without reasoning.
01
ONE LINER
bash
ahd — eval-live
$ ahd eval-live swiss-editorial \
--brief briefs/landing.yml \
--models claude-sonnet-4,gpt-4o,gemini-2.5-pro \
--n 10
02
What ships
- OK Brief compiler
- OK 28-rule linter —typography, color contrast, motion, reflow, focus, alt text
- OK Eval harness —parallel generation, rule enforcement, score aggregation
- OK Vision-critic scaffold —prompt templates, structured output schema, no live calls
- OK MCP server —stdio transport, tool definitions for brief inspection
- OK Editor plugins —VS Code, Zed; .wasm for nvim in progress
- OK Eight tokens —curated design vocabulary, versioned in git
03
What is gated
- KEY Live frontier-model calls —requires your own API keys
- KEY Live vision critique —requires multimodal key + screenshot pipeline
- WIP Standalone npm packages for editor plugins
- WIP Additional tokens