The same task was run on 28 models, and their outputs are compared side by side.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: grok-build-0.1 at 22.2% composite.
Build a single self-contained dashboard page as one HTML file (`index.html`) that renders with no build step and no network calls (inline all CSS and JS, no external fonts or scripts). Requirements:

- a heading and a data table with id="orders", a header row of at least three columns, and at least five seeded data rows already filled in (no fetch, no placeholders),
- the column headers are clickable, and clicking a header re-sorts the table rows by that column (so the order of the rows in the table body actually changes), and
- clicking the same header again may reverse the sort direction.

Seed real-looking values. Use plain, readable, accessible markup.
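The sorting requirement can be met in several ways; the following is a minimal sketch of one of them, not the reference solution or any model's output. The helper names (`toNumber`, `compareCells`, `sortRows`, `attachSorting`) are hypothetical, and the sketch assumes the table has a `<thead>` with one header row and a single `<tbody>`. Cells that look numeric (after stripping `$`, commas, and spaces) are compared as numbers, and everything else is compared as text.

```javascript
// Hypothetical helpers: the task statement does not prescribe an implementation.

// Parse a cell as a number, ignoring "$", "," and whitespace; NaN if not numeric.
function toNumber(text) {
  const cleaned = text.replace(/[$,\s]/g, "");
  if (cleaned === "") return NaN;
  return Number(cleaned);
}

// Compare two cell strings: numerically when both are numeric, otherwise as text.
function compareCells(a, b) {
  const na = toNumber(a);
  const nb = toNumber(b);
  if (!Number.isNaN(na) && !Number.isNaN(nb)) return na - nb;
  return a.localeCompare(b);
}

// Pure version: return a new array of rows (arrays of strings) sorted by one column.
function sortRows(rows, columnIndex, ascending = true) {
  const dir = ascending ? 1 : -1;
  return [...rows].sort(
    (x, y) => dir * compareCells(x[columnIndex], y[columnIndex])
  );
}

// DOM wiring: assumes <table> has a <thead> with one row and a single <tbody>.
function attachSorting(table) {
  const tbody = table.tBodies[0];
  const headers = Array.from(table.tHead.rows[0].cells);
  let lastColumn = -1;
  let ascending = true;

  headers.forEach((th, index) => {
    th.addEventListener("click", () => {
      // A second click on the same header reverses the direction.
      ascending = index === lastColumn ? !ascending : true;
      lastColumn = index;
      const dir = ascending ? 1 : -1;

      const rows = Array.from(tbody.rows).sort(
        (r1, r2) =>
          dir *
          compareCells(r1.cells[index].textContent, r2.cells[index].textContent)
      );
      // appendChild moves an existing node, so this reorders the rows in place.
      rows.forEach((row) => tbody.appendChild(row));

      // Expose the sort state to assistive technology.
      headers.forEach((h) => h.removeAttribute("aria-sort"));
      th.setAttribute("aria-sort", ascending ? "ascending" : "descending");
    });
  });
}

// Only touch the DOM when one exists, so the pure helpers also run outside a browser.
if (typeof document !== "undefined") {
  const table = document.getElementById("orders");
  if (table) attachSorting(table);
}
```

Placed in an inline `<script>` at the end of `<body>`, this would attach to the `orders` table without any network calls. A plain click handler on `<th>` is not reachable by keyboard, so wrapping each header label in a `<button>` would be a reasonable step toward the accessible markup the task asks for.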