The same task, run on 28 models.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: claude-haiku-4-5 at 0.0%.
Implement a Python function `fizzbuzz(n)` that returns a list of strings for the integers 1 through n inclusive. For each integer i:
- return "FizzBuzz" if i is divisible by both 3 and 5,
- else "Fizz" if divisible by 3,
- else "Buzz" if divisible by 5,
- otherwise the integer as a string (e.g. "7").

If n < 1, return an empty list. Use only the Python standard library. Write your solution to `solution.py`.
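For reference, the task statement above can be satisfied by something like the following. This is a minimal sketch written from the prompt alone, not the output of any benchmarked model and not the grader's reference solution. It assumes `n` is an integer, and it checks divisibility by 15 as a shorthand for "divisible by both 3 and 5".

```python
def fizzbuzz(n):
    """Return the FizzBuzz strings for the integers 1 through n inclusive."""
    result = []
    # range(1, n + 1) is empty when n < 1, so that case needs no special branch.
    for i in range(1, n + 1):
        if i % 15 == 0:
            # Divisible by both 3 and 5.
            result.append("FizzBuzz")
        elif i % 3 == 0:
            result.append("Fizz")
        elif i % 5 == 0:
            result.append("Buzz")
        else:
            result.append(str(i))
    return result
```

For example, `fizzbuzz(5)` returns `["1", "2", "Fizz", "4", "Buzz"]`, and `fizzbuzz(0)` returns `[]`.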