The same task, run on 28 models, with the outputs compared side by side.
Top result: claude-opus-4-8 (low reasoning) at 100.0% composite. Lowest: deepseek-v4-flash, also at 100.0%, so all 28 models tie at the displayed precision.
Draw a simple bar chart in SVG for the four values [3, 7, 5, 9]. Use exactly four rectangles, one per value, sitting on a shared baseline, with each bar's height proportional to its value (so the value 9 bar is the tallest and the value 3 bar is the shortest). Use a 120x110 viewBox. Use vector primitives only - no raster images and no text. Write the chart to `chart.svg`.
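For reference, here is a minimal Python sketch of one way to satisfy the task. It is not any model's actual output. The viewBox, the four values, and the `chart.svg` filename come from the task; the baseline position, bar width, gap, margin, fill color, and the helper names `bar_rects` and `build_svg` are assumptions chosen for illustration.

```python
from pathlib import Path

VALUES = [3, 7, 5, 9]          # from the task
WIDTH, HEIGHT = 120, 110       # viewBox size from the task

# Layout constants below are assumptions, not part of the task.
BASELINE_Y = 100               # shared baseline, 10 units above the bottom edge
MAX_BAR_HEIGHT = 90            # height given to the largest value
BAR_WIDTH = 20
GAP = 8
LEFT_MARGIN = 8                # 8 + 4*20 + 3*8 = 112, leaving 8 units on the right


def bar_rects(values):
    """Return one (x, y, width, height) tuple per value.

    Heights are proportional to the values, and every bar's bottom edge
    (y + height) lands on BASELINE_Y.
    """
    scale = MAX_BAR_HEIGHT / max(values)
    rects = []
    for i, value in enumerate(values):
        height = value * scale
        x = LEFT_MARGIN + i * (BAR_WIDTH + GAP)
        y = BASELINE_Y - height  # SVG y grows downward, so taller bars start higher
        rects.append((x, y, BAR_WIDTH, height))
    return rects


def build_svg(values):
    """Build the SVG document as a string, using only <rect> elements."""
    lines = [
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {WIDTH} {HEIGHT}">'
    ]
    for x, y, w, h in bar_rects(values):
        lines.append(
            f'  <rect x="{x:g}" y="{y:g}" width="{w:g}" height="{h:g}" fill="steelblue"/>'
        )
    lines.append("</svg>")
    return "\n".join(lines) + "\n"


svg = build_svg(VALUES)
Path("chart.svg").write_text(svg, encoding="utf-8")
```

Because SVG's y axis points downward, each bar's `y` is computed as the baseline minus its height, which is what keeps the four bottoms aligned while the tops vary with the values.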