Banja Lab / Benchmarks / Test
The same task, run on 28 models. Compare the outputs side by side, or open any one in a popup to inspect it.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: grok-4.3 at 96.1%.
Build a self-contained results section as a single HTML file (`index.html`) that renders with no build step and no network calls (inline all CSS and JS).

Requirements:
- a clear section heading,
- exactly three cards laid out as a responsive grid; each card pairs a headline statistic with a short testimonial quote and an attribution (name and role),
- on a narrow screen the cards stack to a single column with no horizontal scroll.

This is a visual piece: aim for balanced spacing, a clear hierarchy, and tasteful type and colour. Use plain, readable markup. No external fonts or scripts.
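As a rough illustration of what the task asks for, the sketch below builds one possible page as a string and then checks two of the stated requirements mechanically. It is not taken from any of the compared model outputs. The card text, class names, colours, the 40rem breakpoint, and the helpers `render_card`, `build_page`, and `audit_page` are all assumptions made for this example.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical placeholder content; replace with real statistics and quotes.
CARDS = [
    {"stat": "Stat A", "quote": "Placeholder quote one.", "name": "Name One", "role": "Role One"},
    {"stat": "Stat B", "quote": "Placeholder quote two.", "name": "Name Two", "role": "Role Two"},
    {"stat": "Stat C", "quote": "Placeholder quote three.", "name": "Name Three", "role": "Role Three"},
]

# Inline stylesheet: three equal columns, one column below an assumed 40rem breakpoint.
# minmax(0, 1fr) lets columns shrink instead of forcing horizontal scroll.
CSS = """
*{box-sizing:border-box}
body{margin:0;font-family:system-ui,sans-serif;color:#1f2933;background:#f7f7f5}
.results{max-width:72rem;margin:0 auto;padding:3rem 1.25rem}
.results h2{margin:0 0 2rem;font-size:clamp(1.5rem,4vw,2.25rem)}
.cards{display:grid;gap:1.5rem;grid-template-columns:repeat(3,minmax(0,1fr))}
.card{margin:0;padding:1.5rem;background:#fff;border:1px solid #e2e2dc;border-radius:.75rem}
.stat{margin:0 0 .75rem;font-size:2.5rem;font-weight:700}
blockquote{margin:0 0 1rem}
figcaption{font-size:.875rem;color:#52606d}
@media (max-width:40rem){.cards{grid-template-columns:1fr}}
"""


def render_card(card):
    """Return one card: headline statistic, quote, and attribution."""
    return (
        '<figure class="card">'
        f'<p class="stat">{escape(card["stat"])}</p>'
        f'<blockquote>{escape(card["quote"])}</blockquote>'
        f'<figcaption>{escape(card["name"])}, {escape(card["role"])}</figcaption>'
        "</figure>"
    )


def build_page(heading, cards):
    """Return a complete HTML document with all CSS inlined and no scripts."""
    if len(cards) != 3:
        raise ValueError("the task asks for exactly three cards")
    body = "\n".join(render_card(c) for c in cards)
    return (
        '<!doctype html>\n<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        '<meta name="viewport" content="width=device-width, initial-scale=1">\n'
        f"<title>{escape(heading)}</title>\n<style>{CSS}</style>\n</head>\n<body>\n"
        f'<section class="results">\n<h2>{escape(heading)}</h2>\n'
        f'<div class="cards">\n{body}\n</div>\n</section>\n</body>\n</html>\n'
    )


class _Audit(HTMLParser):
    """Count elements with class "card" and collect any src/href attributes."""

    def __init__(self):
        super().__init__()
        self.cards = 0
        self.external = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value is not None and "card" in value.split():
                self.cards += 1
            if name in ("src", "href"):
                self.external.append((tag, value))


def audit_page(page):
    """Return the card count and any attributes that could trigger a network call."""
    parser = _Audit()
    parser.feed(page)
    return {"cards": parser.cards, "external": parser.external}
```

To produce the file, one could write the result of `build_page("Results", CARDS)` to `index.html` with `pathlib.Path.write_text`. The audit is only a coarse check: it does not catch `url(...)` or `@import` inside the stylesheet, and it says nothing about the visual qualities the prompt asks for, which still need a human eye.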