pagevspage.com
// methodology
// how we score

Five dimensions.
One framework each.

Every score is anchored to a named, established framework — not vibes. Same scrape input, same score on a re-run.

dimensions.json
5 dimensions
dimension

Headline Clarity

framework
MECLABS Conversion Heuristic

Flint McGlaughlin's heuristic — C = 4m + 3v + 2(i-f) - 2a — names the H1 as the load-bearing element of value (v) and friction (f). We reward H1s that immediately answer who this is for and what they get; we penalize H1s that lean on the subhead to disambiguate.

dimension

Value Proposition

framework
April Dunford's positioning

We score against Dunford's positioning components: a named competitive alternative, a unique attribute, the value that attribute unlocks, and the best-fit customer. Category-speak with no named alternative or no defensible attribute scores below 50.

We reward CTAs that name the outcome ('Start my 3-minute audit', 'Generate my report'). Generic verbs that name no outcome ('Get Started', 'Learn More', 'Sign Up') default to the 40–59 band regardless of placement.

Specific and verifiable beats voluminous: 'Used by 12,400 Shopify stores' beats a wall of 30 unlabelled logos. Unverifiable claims ('Loved by thousands') and logo walls without context lose points.

dimension

Visual Hierarchy

framework
NN/g F-pattern + Gestalt proximity

Based on the screenshot and headline ordering, does the eye's first fixation land on the value proposition? Or on a nav item, hero illustration, or decoration? Pages whose F-pattern's first horizontal sweep crosses the value prop score higher.

rubric.md
0–100 bands
// per-dimension scoring scale
  • 90–100World-class. You'd screenshot this in a deck.
  • 75–89Strong. Minor gaps.
  • 60–74Solid but forgettable.
  • 40–59Noticeable weaknesses.
  • 20–39Actively hurts conversion.
  • 0–19Broken.

The overall score is the rounded arithmetic mean of the five dimension scores. The model never sets the overall directly — that removes an entire class of "fudged the total to match the gut verdict" failures.

disclosure.json
full transparency
// the parts we don't hide
model
claude-sonnet-4-6
temperature
0 (no creative sampling)
scoring
average of 5 dimensions, no hand-tuning
prompt version
v0.4.0-2026-04-17
reproducibility
Same scrape input → same score on re-run, gated by an internal eval harness with stddev ≤ 2.
non_negotiables.md
guardrails
// rules baked into the prompt
  • Every analysis paragraph must contain at least two direct quotes from the actual page text. No quote, no score.
  • When a score crosses below 60 or above 85, the analysis must name the framework being applied. That is the evidence you paid for.
  • Each "steal this" is a concrete, copy-pasteable tactic — implementable in under 15 minutes — not an abstraction like "clarify your value prop."
  • Banned phrases: could be improved, might benefit from, consider adding, overall,. If the model writes one, the report is rejected.