v0.4 · 2026-04-22 · research preview

A design framework that evaluates the model before it ships the page.

You have already seen the pattern this month: the pill badge, the two centred CTAs, the three equal cards, the shimmer on the word it cannot stop shimmering. AHD treats that pattern as a measurable failure mode, not a taste argument.

The framework compiles a brief into constraints, lints the output against 28 typographic and layout rules, and runs a live eval across frontier models so you can see, on the same brief, which ones produce editorial work and which ones produce the house style of 2024.

Read the code on GitHub → Forgejo mirror See the one-liner

The eval § 01 · one command

Given a brief at briefs/landing.yml, AHD renders n candidates per model, scores each against the Swiss-editorial rule set, and writes a diffable report. The command is the whole interface:

bash · ahd/cli~/ahd

$ ahd eval-live swiss-editorial --brief briefs/landing.yml --models <specs> --n 10

<specs> is a comma-separated list of model identifiers. The harness fans out in parallel, caches screenshots, and refuses to score anything the 28-rule linter rejects outright.

What ships § 02 · today, in the repo

Every item below is present at HEAD, tested, and runs with no external credentials.

brief compiler: Parses a YAML brief into typed constraints, token references, and a structural contract the renderer must satisfy.
28-rule linter: Static checks on the rendered DOM and CSS: measure, hierarchy, baseline, contrast, and the specific anti-patterns listed above.
eval harness: Runs n candidates per model per brief, records prompts, outputs, and lint deltas as a signed JSONL trace.
vision-critic scaffold: The prompts, rubric, and report format for visual critique. Wire your own multimodal key to make it live.
MCP server: Exposes brief, tokens, linter, and harness as tools any MCP-aware editor or agent can call.
editor plugins: Source for VS Code, Zed, and Neovim plugins. Install from the monorepo; no registry dependency.
eight tokens: The starter palette: two type scales, three spacing rhythms, one measure, one rule weight, one accent.

What is gated § 03 · external dependencies

These exist in the codebase but are switched off by default. They are not vapour; they are things we will not ship keys or infrastructure for.

frontier-model calls: Live runs against hosted LLMs. Needs your own API keys. Local models via Ollama run without gating.
live vision critique: Needs a multimodal key and the screenshot pipeline (headless Chromium + a storage target). Offline scoring works without it.
npm packages: Standalone, versioned editor-plugin releases. Gated on a signing story we have not finished writing.
additional tokens: Beyond the starter eight. Held back until the compiler can enforce them without the linter turning into a style guide.

Posture § 04 · what this is not

AHD is not a template library, a component kit, or a wrapper over a single model. It does not promise a faster pipeline. It promises a smaller one, with the evaluator treated as a first-class artefact rather than a demo tab.

If the rule set is wrong, fork it. The 28 rules are a file. Swiss-editorial is one profile; the harness accepts others. The point is that there is a rule set at all, and that the model is asked to justify itself against it.