Banja Lab / Benchmarks / Test
The same task, run on 28 models. Compare the outputs side by side, or open any one in a popup to inspect it.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: claude-sonnet-4-6 at 0.0%.
Build a single self-contained page as one HTML file (`index.html`) that renders with no build step and no network calls (inline all CSS and JS, no external fonts or scripts). The page combines two interactive parts:

- a pricing section with three tiers and a monthly/annual billing toggle implemented as an accessible switch (role="switch" with aria state, id="billing"); clicking the toggle must change every displayed price between its monthly and annual value, and
- a FAQ section (id="faq") of at least three question/answer items where each answer is collapsed by default; clicking a question must reveal (make visible) its own answer, and clicking it again may collapse it. The first answer must have id="a1".

Use plain, readable, accessible markup.
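For orientation, the two behaviours the task asks for can be sketched in a few lines of inline script. This is one possible approach, not a reference solution and not taken from any of the compared outputs. The tier names and prices, the `data-tier` attribute, and the use of `aria-controls`, `aria-expanded`, and `hidden` for the FAQ are all assumptions made for illustration; only `id="billing"`, `role="switch"`, `id="faq"`, and `id="a1"` come from the task text.

```javascript
// Assumed markup (hypothetical, not specified by the task):
//   <button id="billing" role="switch" aria-checked="false">Annual billing</button>
//   <span data-tier="0"></span> <span data-tier="1"></span> <span data-tier="2"></span>
//   <section id="faq">
//     <button aria-expanded="false" aria-controls="a1">Question 1</button>
//     <div id="a1" hidden>Answer 1</div> ...
//   </section>

// Hypothetical tier data; the task does not specify names or prices.
const TIERS = [
  { name: "Starter", monthly: 9, annual: 90 },
  { name: "Team", monthly: 29, annual: 290 },
  { name: "Business", monthly: 79, annual: 790 },
];

// Return the label shown for one tier under the given billing mode.
function priceLabel(tier, annual) {
  return annual ? `$${tier.annual}/yr` : `$${tier.monthly}/mo`;
}

// Compute the next aria-checked value; anything other than "true" counts as off.
function nextChecked(current) {
  return current === "true" ? "false" : "true";
}

// DOM wiring, skipped when no document is available (for example under Node).
if (typeof document !== "undefined") {
  const billing = document.getElementById("billing");
  if (billing) {
    const priceEls = document.querySelectorAll("[data-tier]");
    // Rewrite every displayed price from the current switch state.
    const render = () => {
      const annual = billing.getAttribute("aria-checked") === "true";
      priceEls.forEach((el) => {
        el.textContent = priceLabel(TIERS[Number(el.dataset.tier)], annual);
      });
    };
    billing.addEventListener("click", () => {
      billing.setAttribute(
        "aria-checked",
        nextChecked(billing.getAttribute("aria-checked"))
      );
      render();
    });
    render();
  }

  // Each question button shows or hides only the answer it controls.
  document.querySelectorAll("#faq button[aria-controls]").forEach((btn) => {
    btn.addEventListener("click", () => {
      const answer = document.getElementById(btn.getAttribute("aria-controls"));
      const open = btn.getAttribute("aria-expanded") === "true";
      btn.setAttribute("aria-expanded", String(!open));
      answer.hidden = open;
    });
  });
}
```

Keeping the switch state in `aria-checked` and deriving every price from it means the visible prices and the state announced to assistive technology cannot drift apart. Using a native `<button>` for the switch and for each question gives keyboard activation without extra key handlers.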