© 2026 Banja Labs. All rights reserved.


AYGAT-0002 · UI components · hard

Keyboard-operable select-only combobox with a listbox popup

The same task, run on 27 models. Compare the outputs side by side, or open any one in a popup to inspect it.

Top result: claude-opus-4-8 (extra-high reasoning) at 100.0% composite, a score shared by three entries. Lowest: deepseek-v4-pro at 0.0%, a score shared by eleven entries. 27 models compared on this task.

How it ran
  • Each model was given the brief below in a fresh, isolated session with no access to our tools, and returned a single self-contained index.html (inline CSS and JS, no external requests, no build step).
  • The rendered output was scored 1 to 5 on brief fidelity, visual design, craft, and impact by a four-family vision panel - Anthropic (Claude Opus 4.8), OpenAI (GPT-5.5), Google (Gemini 3.1 Pro), and xAI (Grok 4.3) - using one identical prompt so the scores compare. The published judge score is leave-one-family-out: a model is never scored by a judge of its own family, so same-family self-preference is removed.
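
The leave-one-family-out rule above can be sketched as a small function. This is only an illustration: the page does not say how the four criteria or the remaining judges are combined, so the unweighted mean used here is an assumption, and `JudgeScore` and `leaveOneFamilyOut` are hypothetical names.

```typescript
// One judge's 1-to-5 scores on the four criteria named in the text.
interface JudgeScore {
  judgeFamily: string; // e.g. "Anthropic", "OpenAI", "Google", "xAI"
  briefFidelity: number;
  visualDesign: number;
  craft: number;
  impact: number;
}

// Returns the mean score from judges outside the model's own family,
// or null when no eligible judge remains.
// Assumption: criteria and judges are averaged with equal weight.
function leaveOneFamilyOut(scores: JudgeScore[], modelFamily: string): number | null {
  const eligible = scores.filter((s) => s.judgeFamily !== modelFamily);
  if (eligible.length === 0) return null;
  const perJudge = eligible.map(
    (s) => (s.briefFidelity + s.visualDesign + s.craft + s.impact) / 4
  );
  return perJudge.reduce((sum, v) => sum + v, 0) / perJudge.length;
}
```

Under this reading, a model from a family on the panel is scored by three judges, while a model from a family outside the panel (Zhipu or DeepSeek, for instance) is scored by all four.
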
The brief

Build a single self-contained page as one HTML file (`index.html`) that renders with no build step and no network calls (inline all CSS and JS, no external fonts or scripts). Build a select-only combobox (a button that opens a listbox of options, NOT a native <select>).

Requirements:
- The trigger has id="combo", role="combobox", aria-haspopup="listbox", and aria-expanded that is "false" while closed and "true" while the listbox is open.
- The popup is a role="listbox" with id="listbox" containing at least four role="option" items; one option starts with aria-selected="true" and the rest aria-selected="false". Give the options the ids opt-starter, opt-team, opt-business, opt-enterprise.
- With the combobox focused, ArrowDown opens the listbox (if closed) and moves the active option down; ArrowUp moves it up. The combobox keeps aria-activedescendant pointing at the id of the currently active option while open.
- Enter (or Space) selects the active option: its aria-selected becomes "true" (and the previously selected option becomes "false"), the listbox closes (aria-expanded "false"), and focus returns to the combobox. Escape closes the listbox without selecting.

The combobox must be fully operable with the keyboard alone. Use plain, accessible markup.
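
The keyboard behaviour in the brief can be modelled as a small state machine, independent of the DOM. The sketch below is one possible reading, not a reference solution: `ComboState`, `handleKey`, and `ariaAttributes` are hypothetical names. The brief does not spell out several edge cases, so these are assumptions: an arrow key on a closed combobox opens it on the selected option without moving further, movement stops at the first and last option, and Enter, Space, and Escape do nothing while closed.

```typescript
// Option ids taken from the brief.
const OPTION_IDS = ["opt-starter", "opt-team", "opt-business", "opt-enterprise"] as const;

type ComboKey = "ArrowDown" | "ArrowUp" | "Enter" | " " | "Escape";

interface ComboState {
  expanded: boolean;     // mirrors aria-expanded on #combo
  activeIndex: number;   // option aria-activedescendant points at while open
  selectedIndex: number; // option carrying aria-selected="true"
}

function handleKey(state: ComboState, key: ComboKey): ComboState {
  const last = OPTION_IDS.length - 1;

  if (!state.expanded) {
    // Assumption: arrows open the listbox on the selected option;
    // other keys leave a closed combobox unchanged.
    if (key === "ArrowDown" || key === "ArrowUp") {
      return { ...state, expanded: true, activeIndex: state.selectedIndex };
    }
    return state;
  }

  switch (key) {
    case "ArrowDown": // assumption: stop at the last option rather than wrap
      return { ...state, activeIndex: Math.min(state.activeIndex + 1, last) };
    case "ArrowUp": // assumption: stop at the first option rather than wrap
      return { ...state, activeIndex: Math.max(state.activeIndex - 1, 0) };
    case "Enter":
    case " ":
      // Select the active option and close the listbox.
      return { expanded: false, activeIndex: state.activeIndex, selectedIndex: state.activeIndex };
    case "Escape":
      // Close without changing the selection.
      return { ...state, expanded: false, activeIndex: state.selectedIndex };
  }
}

// Derives the attribute values the brief asks for from the current state.
function ariaAttributes(state: ComboState) {
  return {
    expanded: state.expanded ? "true" : "false",
    activeDescendant: state.expanded ? OPTION_IDS[state.activeIndex] : null,
    selectedId: OPTION_IDS[state.selectedIndex],
  };
}
```

In a page, a keydown listener on the combobox element would call `handleKey` and then write the derived values back to aria-expanded, aria-activedescendant, and each option's aria-selected. Because DOM focus never leaves the combobox in this pattern, "focus returns to the combobox" holds without extra work.
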

| Provider | Model | Reasoning | Composite | Objective |
|---|---|---|---|---|
| Anthropic | claude-opus-4-8 | Extra-high | 100.0% | 100.0% |
| Anthropic | claude-opus-4-8 | Max | 100.0% | 100.0% |
| Zhipu | glm-5.2 | default | 100.0% | 100.0% |
| Anthropic | claude-fable-5 | High | 99.1% | 99.1% |
| Anthropic | claude-opus-4-8 | Medium | 98.1% | 98.1% |
| Anthropic | claude-opus-4-8 | High | 98.1% | 98.1% |
| Anthropic | claude-opus-4-8 | High | 98.1% | 98.1% |
| Anthropic | claude-sonnet-5 | High | 97.2% | 97.2% |
| Anthropic | claude-haiku-4-5 | High | 97.2% | 97.2% |
| Anthropic | claude-sonnet-5 | High | 97.2% | 97.2% |
| Anthropic | claude-fable-5 | High | 97.2% | 97.2% |
| Google | gemini-3.5-flash | default | 94.4% | 94.4% |
| xAI | grok-4.20-reasoning | default | 94.4% | 94.4% |
| Anthropic | claude-sonnet-4-6 | High | 88.8% | 88.8% |
| Anthropic | claude-haiku-4-5 | default | 88.8% | 88.8% |
| DeepSeek | deepseek-v4-flash | default | 88.8% | 88.8% |
| Anthropic | claude-opus-4-8 | Low | 0.0% | 0.0% |
| Moonshot | kimi-k2.7-code | default | 0.0% | 0.0% |
| OpenAI | gpt-5.5 | High | 0.0% | 0.0% |
| OpenAI | gpt-5.4-mini | High | 0.0% | 0.0% |
| Google | gemini-3.1-pro-preview | High | 0.0% | 0.0% |
| Google | gemini-3.1-flash-lite | default | 0.0% | 0.0% |
| xAI | grok-4.3 | default | 0.0% | 0.0% |
| xAI | grok-build-0.1 | default | 0.0% | 0.0% |
| xAI | grok-composer-2.5-fast | default | 0.0% | 0.0% |
| Anthropic | claude-sonnet-4-6 | High | 0.0% | 0.0% |
| DeepSeek | deepseek-v4-pro | default | 0.0% | 0.0% |

The claude-sonnet-4-6 entry at 0.0% is listed with a full-run link only and no output link.