Banja Lab / Benchmarks / Test
The same task, run on 27 models. Compare the outputs side by side, or open any one in a popup to inspect it.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: deepseek-v4-flash, also at 100.0%.
This is a benchmarking hypothetical, not legal advice. The facts are as at FY2025-26. Australia Day is a public holiday observed in every Australian state and territory. State the exact calendar date on which the Australia Day public holiday falls in 2026, written as day month year (for example, 1 March 2026). Give the single date.
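The date arithmetic behind the task can be checked with a short sketch. It assumes the usual observance rule: the holiday is held on 26 January, and moves to the following Monday when 26 January falls on a Saturday or Sunday. The function names `australia_day_holiday` and `format_day_month_year` are hypothetical helpers for illustration, not part of the benchmark harness.

```python
from datetime import date, timedelta

# Month names are spelled out explicitly so the output does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def australia_day_holiday(year: int) -> date:
    """Return the observed Australia Day public holiday for the given year.

    Assumption: a 26 January that falls on a weekend is observed
    on the following Monday.
    """
    actual = date(year, 1, 26)
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if actual.weekday() >= 5:
        return actual + timedelta(days=7 - actual.weekday())
    return actual


def format_day_month_year(d: date) -> str:
    """Format a date as 'day month year', e.g. '1 March 2026'."""
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


if __name__ == "__main__":
    print(format_day_month_year(australia_day_holiday(2026)))  # prints "26 January 2026"
```

In 2026, 26 January is a Monday, so no substitution applies. For comparison, 26 January 2025 was a Sunday, and under the same assumption the sketch returns 27 January 2025.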