The same task was run on 28 models, and their outputs are compared side by side.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: grok-build-0.1 at 60.8%.
Build a self-contained signup form as a single HTML file (`index.html`) that renders with no build step and no network calls (inline all CSS and JS).

Requirements:
- an email input and a submit button inside a form,
- inline client-side validation that runs on submit and does not reload the page: submitting with an empty or invalid email reveals a visible error message (use role="alert" for the error), and the input is marked aria-invalid="true",
- submitting a valid email reveals a visible success message (use role="status") and clears the error.

Do not rely on the browser's native required-field popup; show your own inline message element. Use plain, readable, accessible markup. No external fonts or scripts.
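As an illustration of the validation behavior the task describes, here is a minimal sketch of the script that could be inlined in `index.html`. It is one possible approach, not a reference solution from the benchmark. The function names, the element ids (`signup-form`, `email`, `email-error`, `signup-status`), the message wording, and the deliberately loose email pattern are all assumptions made for this example.

```javascript
// Hypothetical names throughout; the task statement prescribes none of them.

// Deliberately loose check: something before "@", then a dotted domain.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Pure function: maps the raw input value to the UI state the task asks for.
function evaluateEmail(rawValue) {
  const value = rawValue == null ? "" : String(rawValue).trim();
  if (value === "") {
    return { valid: false, error: "Please enter your email address.", status: "" };
  }
  if (!EMAIL_PATTERN.test(value)) {
    return { valid: false, error: "Please enter a valid email address.", status: "" };
  }
  return { valid: true, error: "", status: "Thanks! You are signed up." };
}

// DOM wiring. Assumes the page contains elements with these ids, where
// "email-error" carries role="alert" and "signup-status" carries role="status".
function attachSignupValidation(doc) {
  const form = doc.getElementById("signup-form");
  const input = doc.getElementById("email");
  const errorEl = doc.getElementById("email-error");
  const statusEl = doc.getElementById("signup-status");

  form.noValidate = true; // suppress the browser's native required-field popup
  form.addEventListener("submit", (event) => {
    event.preventDefault(); // validate inline without reloading the page
    const result = evaluateEmail(input.value);

    input.setAttribute("aria-invalid", result.valid ? "false" : "true");
    errorEl.textContent = result.error;
    errorEl.hidden = result.valid; // show the error only when invalid
    statusEl.textContent = result.status;
    statusEl.hidden = !result.valid; // show the success message only when valid
  });
}

// Only wire up when running inside a page that contains the form.
if (typeof document !== "undefined" && document.getElementById("signup-form")) {
  attachSignupValidation(document);
}
```

Keeping the decision logic in a pure function separates it from the DOM updates, so the validation rules can be checked without a browser. In the page itself, the script would sit in an inline `<script>` element placed after the form markup so the elements exist when it runs.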