© 2026 Banja Labs. All rights reserved.

Privacy PolicyTerms of Use

Banja Lab / Benchmarks / Test

RESPO-0007 · Websites · medium

CTA buttons that stack full-width on mobile and sit in a row on desktop

The same task, run on 27 models. Compare the outputs side by side, or open any one in a popup to inspect it.

Top result: claude-opus-4-8 (low reasoning) at 0.0% composite. Lowest: deepseek-v4-flash at 0.0%.

How it ran
  • Each model was given the brief below in a fresh, isolated session with no access to our tools, and returned a single self-contained index.html (inline CSS and JS, no external requests, no build step).
  • The rendered output was scored 1 to 5 on brief fidelity, visual design, craft, and impact by a four-family vision panel (Anthropic: Claude Opus 4.8; OpenAI: GPT-5.5; Google: Gemini 3.1 Pro; xAI: Grok 4.3), with every judge given the same prompt so that the scores are comparable. The published judge score is leave-one-family-out: a model is never scored by a judge from its own family, which removes same-family self-preference.
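
The leave-one-family-out rule described above can be sketched as follows. This is a minimal illustration, not the benchmark's actual scoring code: the function name, the judge identifiers used as dictionary keys, the sample scores, and the use of a plain mean to combine the remaining judges are all assumptions, since the page does not say how the panel's scores are aggregated.

```python
from statistics import mean

# Judge -> family, following the panel named in the text.
# The identifier strings themselves are assumed for illustration.
JUDGE_FAMILY = {
    "claude-opus-4-8": "Anthropic",
    "gpt-5.5": "OpenAI",
    "gemini-3.1-pro": "Google",
    "grok-4.3": "xAI",
}


def leave_one_family_out(model_family, judge_scores):
    """Combine judge scores, skipping any judge from the model's own family.

    Assumption: the remaining scores are combined with a simple mean.
    """
    kept = [
        score
        for judge, score in judge_scores.items()
        if JUDGE_FAMILY[judge] != model_family
    ]
    if not kept:
        raise ValueError("no eligible judges for this model family")
    return mean(kept)


# Hypothetical 1-to-5 scores from the four judges for one rendered output.
scores = {"claude-opus-4-8": 5, "gpt-5.5": 4, "gemini-3.1-pro": 3, "grok-4.3": 4}

# An Anthropic model drops the Anthropic judge and keeps the other three.
anthropic_score = leave_one_family_out("Anthropic", scores)

# A DeepSeek model has no same-family judge, so all four scores count.
deepseek_score = leave_one_family_out("DeepSeek", scores)
```

Under this sketch, models from families outside the panel (Zhipu, Moonshot, DeepSeek) are scored by all four judges, while models from the four panel families are scored by three.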

The brief

Build a single self-contained HTML file (`index.html`) that renders with no build step and no network calls (inline all CSS and JS; no external fonts, scripts, or images). Build a hero section with an <h1>, a paragraph, and a pair of call-to-action links inside a container with id="actions": a primary link (id="primary") and a secondary link (id="secondary"). Make the pair responsive:

- on a phone (around 360px wide) the two links stack vertically into a single column, each link stretched to the full width of the container (so the secondary link sits below the primary one, at the same left edge),
- on a tablet and desktop (around 768px and 1440px wide) the two links sit side by side on one row, each sized to its own content (so the secondary link is to the right of the primary one, on the same row).

The page must not scroll horizontally at any of 360px, 768px, or 1440px wide. Use plain, accessible markup.
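
For orientation, one way a response to this brief might look is sketched below, wrapped in a small Python script that holds the page as a string and runs a few static checks. Everything beyond the ids and element types named in the brief is an assumption made for illustration: the copy, the `hero` class, the 600px breakpoint, and the `static_checks` helper are hypothetical and are not taken from any model's output or from the benchmark's grader. The checks only inspect the markup; they do not confirm the rendered layout, which would need a browser at 360px, 768px, and 1440px.

```python
import re

# Hypothetical minimal page for the brief. Flexbox in a column stretches each
# link to the container width; switching to a row sizes each link to its content.
PAGE = """<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Hero</title>
<style>
  *, *::before, *::after { box-sizing: border-box; }
  body { margin: 0; font-family: system-ui, sans-serif; }
  .hero { max-width: 60rem; margin: 0 auto; padding: 2rem 1rem; }
  /* Phone first: single column, links stretched to full width. */
  #actions { display: flex; flex-direction: column; gap: 0.75rem; }
  #actions a { display: block; padding: 0.75rem 1.25rem; text-align: center; }
  /* Assumed breakpoint: from 600px up, one row with content-sized links. */
  @media (min-width: 600px) {
    #actions { flex-direction: row; align-items: center; }
    #actions a { flex: 0 0 auto; }
  }
</style>
</head>
<body>
<main class="hero">
  <h1>Ship faster</h1>
  <p>A short supporting sentence.</p>
  <div id="actions">
    <a id="primary" href="#start">Get started</a>
    <a id="secondary" href="#learn">Learn more</a>
  </div>
</main>
</body>
</html>
"""


def static_checks(html):
    """Markup-only checks; they say nothing about the rendered layout."""
    return {
        "has_required_ids": all(
            f'id="{name}"' in html for name in ("actions", "primary", "secondary")
        ),
        "primary_before_secondary": html.index('id="primary"')
        < html.index('id="secondary"'),
        "has_h1": "<h1>" in html,
        # No src/href pointing at http(s):// or protocol-relative URLs.
        "no_external_urls": re.search(
            r'(?:src|href)\s*=\s*["\'](?:https?:)?//', html
        )
        is None,
    }


results = static_checks(PAGE)
```

Writing `PAGE` to `index.html` and opening it at the three widths is still the only way to confirm the stacking, the row layout, and the absence of horizontal scrolling.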

Anthropic · claude-opus-4-8 · Low reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-opus-4-8 · Medium reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-opus-4-8 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-opus-4-8 · Extra-high reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-opus-4-8 · Max reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-sonnet-4-6 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-sonnet-5 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-fable-5 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-haiku-4-5 · High reasoning · Composite 0.0% · Objective 0.0%
Zhipu · glm-5.2 · default reasoning · Composite 0.0% · Objective 0.0%
Moonshot · kimi-k2.7-code · default reasoning · Composite 0.0% · Objective 0.0%
OpenAI · gpt-5.5 · High reasoning · Composite 0.0% · Objective 0.0%
OpenAI · gpt-5.4-mini · High reasoning · Composite 0.0% · Objective 0.0%
Google · gemini-3.1-pro-preview · High reasoning · Composite 0.0% · Objective 0.0%
Google · gemini-3.5-flash · default reasoning · Composite 0.0% · Objective 0.0%
Google · gemini-3.1-flash-lite · default reasoning · Composite 0.0% · Objective 0.0%
xAI · grok-4.3 · default reasoning · Composite 0.0% · Objective 0.0%
xAI · grok-4.20-reasoning · default reasoning · Composite 0.0% · Objective 0.0%
xAI · grok-build-0.1 · default reasoning · Composite 0.0% · Objective 0.0%
xAI · grok-composer-2.5-fast · default reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-opus-4-8 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-sonnet-4-6 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-sonnet-5 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-fable-5 · High reasoning · Composite 0.0% · Objective 0.0%
Anthropic · claude-haiku-4-5 · default reasoning · Composite 0.0% · Objective 0.0%
DeepSeek · deepseek-v4-pro · default reasoning · Composite 0.0% · Objective 0.0%
DeepSeek · deepseek-v4-flash · default reasoning · Composite 0.0% · Objective 0.0%