
SCENE-0001 · svg-scene · hard

Sydney Harbour at golden hour (SVG scene)

The same task, run on 28 models. Compare the outputs side by side, or open any one in a popup to inspect it.

Top result: claude-opus-4-8 (extra-high reasoning) at 72.9% composite. Lowest: claude-fable-5 at 6.3%. 28 models compared on this task.

How it ran
  • Each model was given the brief below in a fresh, isolated session with no access to our tools, and returned its answer from scratch.
  • The rendered output was scored 1 to 5 on brief fidelity, visual design, craft, and impact by a four-family vision panel - Anthropic (Claude Opus 4.8), OpenAI (GPT-5.5), Google (Gemini 3.1 Pro), and xAI (Grok 4.3) - using one identical prompt so the scores compare. The published judge score is leave-one-family-out: a model is never scored by a judge of its own family, so same-family self-preference is removed.
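The leave-one-family-out rule can be sketched in a few lines. This is a minimal illustration, not the benchmark's actual code: the function names are hypothetical, the xAI score in the usage example is made up, and the rescaling of a 1-to-5 judge score to a composite percentage is an assumption (the page does not state the formula, although published pairs such as 3.5/5 and 62.5% are consistent with it).

```python
from statistics import mean


def leave_one_family_out(panel_scores, model_family):
    """Average the panel scores, skipping the judge from the model's own family.

    panel_scores maps judge family name -> score on the 1-5 scale.
    """
    kept = [score for family, score in panel_scores.items()
            if family != model_family]
    if not kept:
        raise ValueError("no judge left after excluding the model's family")
    return mean(kept)


def composite_percent(judge_score, lowest=1.0, highest=5.0):
    """ASSUMPTION: composite is the judge score rescaled linearly to 0-100%."""
    return 100.0 * (judge_score - lowest) / (highest - lowest)


# Hypothetical panel: the xAI value is invented for illustration only.
panel = {"Anthropic": 4.0, "OpenAI": 4.0, "Google": 3.0, "xAI": 4.5}

# An Anthropic model is scored only by the other three families.
anthropic_model_score = leave_one_family_out(panel, "Anthropic")

# A model from a family with no judge on the panel keeps all four scores.
outside_model_score = leave_one_family_out(panel, "Moonshot")
```

Under this sketch a model whose family has no judge on the panel is simply averaged over all four scores, which is one plausible reading of the rule above.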

The brief

Compose ONE single, self-contained SVG illustration of Sydney Harbour at golden hour - a busy, fullscreen, edge-to-edge scene. Use viewBox="0 0 1600 900". Vector primitives ONLY: no <image>, no raster, no data: URIs, no base64, no <foreignObject>, no <script>, no external references, and no <text> (the scene must be carried entirely by shapes). It must render correctly opened directly in a browser. You are composing blind - you will not see your own render - so reason carefully about coordinates, gradient stops, layering and one consistent light direction.

LIGHT: a single warm sun low on the RIGHT (about x=1240, just above the horizon), so every object's right-facing surfaces are warm-lit and left-facing surfaces fall to cool shadow, and every reflection drops straight DOWN from its source onto the water.

THE SCENE, back to front (build real depth - far things are hazier, bluer, lower-contrast and smaller):
  1. Sky: a vertical gradient, deep dusk indigo/violet at the top melting through rose and amber to a hot pale-gold band at the horizon (~y=560).
  2. Sun: a radial-gradient glow disc low-right with a soft blurred bloom bleeding into the sky.
  3. The MOON rising opposite the sun, low on the LEFT - a pale disc in the darker part of the sky.
  4. Far north-shore headland: a low, hazy, desaturated band along the horizon (aerial perspective).
  5. City skyline: a cluster of varied silhouetted skyscrapers behind and between the landmarks, a few windows catching the last warm light, partly occluded by the Bridge deck.
  6. The Sydney Harbour Bridge spanning the right two-thirds: ONE great parabolic steel arch (not a semicircle, not a suspension catenary) rising clear of a flat deck slung below it on a lattice of vertical hangers and cross-bracing, with four square pylons. Rim-lit on its sun side, shadowed body.
  7. The Sydney Opera House left-of-centre on its low stepped podium: the interlocking, asymmetric, NESTED shell sails of decreasing size (not symmetric tents, not plain arcs), nearer shells overlapping farther ones, each shell warm-lit on its right cheek and cool in shadow on its left.
  8. The water: a gradient plane (warm amber near the sun, cooling to deep indigo in the foreground), with a vertical SUN-GLITTER column of many short bright slivers under the sun that shorten and spread with distance, and broken, blurred, vertically-squashed REFLECTIONS of the Bridge, the sails, the skyline towers and the moon sitting directly beneath each source and fading with distance.
  9. Harbour traffic: at least three distinct ferries at staggered depths (nearer ones larger, each with a short mirrored reflection and a warm-lit window strip) and at least SIX sailing yachts of varying size scattered across the water, their triangular sails catching the light.
  10. A flock of birds as small dark silhouettes wheeling in the mid-sky, smaller and higher toward the horizon.
  11. Foreground foreshore: dark sandstone rocks along the bottom edge with a small human figure (a person or a kayaker) for scale, framing the lower corner as a dark repoussoir.

HIDDEN EASTER EGGS (small, deliberate, rewarded - tuck these in so a careful viewer discovers them):
  - the Southern Cross constellation (four bright stars plus the two pointers) emerging in the darker upper-LEFT of the sky;
  - a dolphin fin breaking the water surface somewhere in the mid-water;
  - one tiny sailing yacht with a green-and-gold (Australian) sail among the white ones;
  - a small seaplane or hot-air balloon drifting high in the sky.

TECHNIQUE you must actually use: at least two gradient types (a linear sky gradient AND a radial sun glow), a blur filter (for the sun bloom, the soft water reflections and the far-headland haze), and a clipPath or mask to keep the glitter column and the reflections confined to the water below the horizon.

Spend your detail budget on the two landmarks, the depth, and the reflections. Aim for a dense, layered, atmospheric, premium scene that reads instantly as Sydney Harbour at dusk and rewards a long look.
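The mechanical parts of the brief (the fixed viewBox, the banned elements, and the required gradient, blur and clipping techniques) lend themselves to an automated check. The following is a minimal sketch rather than anything the benchmark is known to run: `check_brief` and `SAMPLE` are hypothetical names, treating `feGaussianBlur` as "the blur filter" is an assumption, and the check does not cover external references or any of the visual requirements, which only the judges can assess.

```python
import xml.etree.ElementTree as ET

# Elements the brief bans outright.
FORBIDDEN = {"image", "foreignObject", "script", "text"}


def local(tag):
    """Strip the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def check_brief(svg_source):
    """Return a list of rule violations; an empty list means none were found."""
    root = ET.fromstring(svg_source)
    tags = {local(el.tag) for el in root.iter()}
    problems = []
    if root.get("viewBox") != "0 0 1600 900":
        problems.append("viewBox must be '0 0 1600 900'")
    for name in sorted(FORBIDDEN & tags):
        problems.append(f"forbidden element <{name}>")
    if "data:" in svg_source:
        problems.append("data: URI found")
    # ASSUMPTION: 'a blur filter' is taken to mean feGaussianBlur.
    for required in ("linearGradient", "radialGradient", "feGaussianBlur"):
        if required not in tags:
            problems.append(f"missing <{required}>")
    if not ({"clipPath", "mask"} & tags):
        problems.append("missing <clipPath> or <mask>")
    return problems


# A deliberately tiny scene that uses each required technique once.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1600 900">
  <defs>
    <linearGradient id="sky" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0" stop-color="#2a1f5c"/>
      <stop offset="1" stop-color="#ffe2a0"/>
    </linearGradient>
    <radialGradient id="sun">
      <stop offset="0" stop-color="#fff4c8"/>
      <stop offset="1" stop-color="#ffb347" stop-opacity="0"/>
    </radialGradient>
    <filter id="bloom"><feGaussianBlur stdDeviation="12"/></filter>
    <clipPath id="water">
      <rect x="0" y="560" width="1600" height="340"/>
    </clipPath>
  </defs>
  <rect width="1600" height="560" fill="url(#sky)"/>
  <circle cx="1240" cy="530" r="90" fill="url(#sun)" filter="url(#bloom)"/>
  <rect y="560" width="1600" height="340" fill="#1b2450"/>
  <g clip-path="url(#water)">
    <rect x="1225" y="570" width="30" height="4" fill="#ffe2a0"/>
  </g>
</svg>"""

violations = check_brief(SAMPLE)
```

Note that the sample's water rectangle runs from y=560 to y=900, so the scene fills the whole 1600 by 900 viewBox; several judge comments below describe outputs that left the bottom of the canvas unfilled.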

Anthropic · claude-opus-4-8
Extra-high reasoning
Composite 72.9% · Judge 3.9/5
Judge panel: Anthropic 4.0/5 · OpenAI 4.0/5 · Google 3.0/5
single-judge (Claude) 4.0/5 → leave-one-family-out 3.9/5

Anthropic: Nearly every brief element is present and correct: dusk sky gradient, left moon, parabolic Bridge with deck/hangers/pylons, nested Opera House shells, warm-to-cool water with a sun-glitter column and reflections, multiple ferries and 6+ yachts at staggered depths, birds, foreshore rocks with a tiny figure, plus all four easter eggs (Southern Cross, dolphin fin, green-and-gold sail, seaplane). The colour grading and depth are atmospheric and premium-feeling. The main flaw is a large dead black band across the bottom ~20% where the scene fails to fill the viewBox edge-to-edge, cropping the foreg

OpenAI: The scene includes most requested Sydney Harbour elements: Opera House, Harbour Bridge, skyline, moon, birds, ferries, yachts, dolphin fin, foreground rocks/person, and several easter eggs. It has strong atmosphere and color, but the sun itself is not clearly visible, the Bridge pylons/cross-bracing are simplified, and the large black empty band at the bottom prevents it from feeling truly fullscreen edge-to-edge.

Google: The illustration successfully includes nearly all requested elements and easter eggs, establishing a recognizable scene with appropriate gradients and lighting. However, the execution is hampered by basic geometric shapes, an unexplained floating arc above the bridge, and a large unrendered area at the bottom.

Anthropic · claude-opus-4-8
High reasoning
Composite 68.8% · Judge 3.8/5
Judge panel: Anthropic 3.3/5 · OpenAI 4.0/5 · Google 3.0/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.8/5

Anthropic: Most brief elements are present: the indigo-to-gold sky, left moon, parabolic bridge with deck and hangers, an Opera-House cluster, white yachts, a bird flock, foreground rocks with a small figure, and upper-left stars plus a faint aircraft. However several key details are weak or off: the sun reads as a flat glow band rather than a bloomed radial disc, the Opera sails are simplified fan-like arcs rather than nested interlocking shells, and the central reflection column renders as an oddly geometric white staircase that looks like a broken/janky artifact rather than scattered sun glitter. The

OpenAI: The scene includes almost all requested elements: Harbour Bridge, Opera House, moon, skyline, boats, yachts, birds, foreground figure, reflections, and the easter eggs are visible. Some brief specifics are weakened, especially the missing/hidden sun disc, limited bridge lattice/cross-bracing and pylons, and somewhat simplified Opera House shells, but the overall golden-hour Sydney read is strong and polished.

Google: The composition successfully incorporates all requested elements and hidden Easter eggs, adhering strongly to the prompt. However, the vector execution is somewhat basic, featuring inaccurate landmark shapes, blocky water reflections, and a noticeable stray curve extending from the bridge. These technical flaws detract from the overall premium feel despite the pleasant atmospheric lighting.

Anthropic · claude-opus-4-8
Max reasoning
Composite 66.7% · Judge 3.7/5
Judge panel: Anthropic 3.0/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 3.0/5 → leave-one-family-out 3.7/5

Anthropic: Most brief elements are present and recognisable: the warm dusk sky gradient, right-side sun with a glitter column, left moon, parabolic arch bridge with pylons and hangers, nested Opera House sails, city skyline, multiple ferries, six-plus yachts (including the green-gold Australian sail), birds, a dolphin fin, an aircraft, and a foreshore figure - so fidelity is strong. However craft is hurt badly by a large dead-black band filling the bottom third of the canvas, which breaks the edge-to-edge composition and reads as a rendering/viewBox error rather than foreshore. Reflections are minimal (o

OpenAI: The scene includes most requested Sydney Harbour elements: gradient sky, moon, hazy horizon, skyline, Bridge, Opera House, water reflections, boats, birds, foreshore figure, and several easter eggs. The composition is attractive and atmospheric, though the sun itself is not clearly visible, the Bridge/Opera House detailing is somewhat simplified, and the large black area below the artwork undermines the fullscreen edge-to-edge requirement.

Google: The scene captures most requested elements and Easter eggs, utilizing the correct lighting palette and gradients. However, a massive black block obscures the bottom third, breaking the full-screen layout, and the Opera House geometry is inaccurate.

Anthropic · claude-fable-5
High reasoning
Composite 65.6% · Judge 3.6/5
Judge panel: Anthropic 3.3/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.6/5

Anthropic: Nearly every briefed element is present and legible: the dusk sky gradient, low-right sun with glitter column, left-side moon, city skyline with a tower, a proper parabolic Bridge arch with lattice and pylons, multiple green ferries, 6+ yachts, wheeling birds, a kayaker for scale, and all four easter eggs (Southern Cross stars, dolphin fin, green-gold sail, hot-air balloon), which is strong fidelity. The colour and atmosphere read instantly as Sydney Harbour at golden hour and the composition of the upper scene is coherent and tasteful. But the render fails the explicit edge-to-edge/fullscreen

OpenAI: The scene includes nearly all requested Sydney Harbour elements: gradient sky, sun and moon, Opera House, Harbour Bridge, skyline, reflections, ferries, yachts, birds, foreshore figure, and visible easter eggs. Composition and colour are strong, though some landmark/reflection details are simplified and the large black band at the bottom prevents it from feeling fully edge-to-edge or premium. Craft is generally clean and atmospheric, but not quite at the highest level of density or realism requested.

Google: The illustration includes most requested elements and easter eggs, capturing the basic dusk atmosphere. However, it fails to implement the required landmark reflections and lighting rules, while a large, unexplained black band at the bottom severely disrupts the overall composition.

Anthropic · claude-opus-4-8
Medium reasoning
Composite 64.6% · Judge 3.6/5
Judge panel: Anthropic 3.3/5 · OpenAI 4.0/5 · Google 2.0/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.6/5

Anthropic: Many requested elements are present and reasonably placed: linear sky gradient through indigo/rose/amber to a gold horizon band, a sun glow, a left-side moon, a city skyline, water with a sun-glitter column, multiple ferries and at least six yachts (including a green-and-gold one), wheeling birds, a foreground headland with a tiny figure, plus easter eggs (Southern Cross upper-left, a balloon/seaplane high in the sky). But two of the three hero subjects are wrong: the bridge reads as a single thin parabolic arch over a row of buildings rather than the Harbour Bridge's deck-on-pylons lattice, a

OpenAI: The scene includes nearly all required elements: Sydney Harbour Bridge, Opera House, skyline, moon, water reflections, boats, yachts, birds, foreground rocks, kayaker, dolphin fin, green-and-gold sail, stars, and a small aircraft. It is atmospheric and well-composed, though some brief-specific details are simplified or imperfect, especially the bridge pylons/lattice, the nested Opera House sail complexity, the visible sun disc, and the precision/density of reflections.

Google: The composition includes most requested elements and easter eggs, but suffers from major structural issues like rogue path lines projecting from the bridge and Opera House. A missing sun disc and a massive black void at the bottom break the edge-to-edge requirement, severely undermining the scene's polish and accuracy.

Moonshot · kimi-k2.7-code
default reasoning
Composite 64.1% · Judge 3.6/5
Judge panel: Anthropic 3.8/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 3.8/5 → leave-one-family-out 3.6/5

Anthropic: Nearly all brief elements are present and correct: vertical dusk gradient, low-right sun with a vertical glitter column, upper-left moon and Southern Cross stars, skyline, an arched bridge with deck and pylons, staggered ferries and several yachts including the green-and-gold one, birds, a seaplane top-left, and dark foreshore rocks with a tiny figure. Colour grading and composition are atmospheric and read instantly as a harbour at dusk, giving real impact. Craft is the weak point: the Opera House reads as a vague swoosh rather than convincing nested asymmetric sails, the bridge arch is thin

OpenAI: The scene includes nearly all requested Sydney Harbour elements, including the bridge, Opera House, skyline, moon, boats, birds, foreground figure, reflections, and several easter eggs, though the sun/glow and four bridge pylons are not especially convincing. The composition and golden-hour palette are attractive and readable, with good depth, but some reflections and foreground areas feel simplified or overly dark, keeping it just below excellent.

Google: The composition successfully incorporates the majority of requested elements, including the specific Easter eggs and lighting direction. However, the execution relies on overly primitive, blocky shapes and features broken structural details on the landmarks, significantly limiting its overall visual appeal.

Anthropic · claude-opus-4-8
Low reasoning
Composite 62.5% · Judge 3.5/5
Judge panel: Anthropic 3.3/5 · OpenAI 3.8/5 · Google 2.3/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.5/5

Anthropic: The scene reads clearly as a harbour at dusk with a strong warm-to-indigo sky gradient, a glowing sun, moon, far headland, skyline, a parabolic bridge arch, sailing yachts, ferries, birds, a green-and-gold sail, a seaplane, star cluster and a foreground figure - good fidelity on many elements and a pleasing, coherent palette. However it misses or fumbles key brief items: the Opera House sails are absent (no recognisable nested shells left-of-centre), and the light direction is inconsistent - the sun glitter column and stepped reflection sit centre/left of frame rather than under a right-side s

OpenAI: The scene includes nearly all required Sydney Harbour elements and easter eggs: Opera House, Harbour Bridge, moon, birds, ferries, yachts, dolphin fin, green-and-gold sail, and a seaplane, with convincing golden-hour colour and water glitter. Some brief details are only partially realized, especially the bridge lattice/four pylons, the sun disc, richer object reflections, and the screenshot is not truly edge-to-edge due to the large black area below. Overall it reads clearly and attractively as Sydney Harbour, but the finish and density fall short of a premium fully polished build.

Google: Most requested elements and Easter eggs are included, but a massive black void at the bottom fails the edge-to-edge requirement. The greyish sun and rigid, simplistic sun-glitter heavily compromise the intended atmospheric lighting and overall polish.

Zhipu · glm-5.2
default reasoning
Composite 62.5% · Judge 3.5/5
Judge panel: Anthropic 3.0/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 3.0/5 → leave-one-family-out 3.5/5

Anthropic: The scene reads instantly as a harbour at dusk with a credible sky gradient, parabolic bridge arch, moon, skyline, scattered yachts, birds and foreground rocks, so most macro elements are present. But the two signature landmarks miss their spec: the Opera House reads as one rounded dome rather than nested asymmetric sails, and the bridge deck/pylon lattice is thin. The sun glow is weak with no clear glitter column, reflections are muted and generic, and the required easter eggs (Southern Cross, dolphin, green-gold sail, seaplane) are not discernible, leaving a competent but not premium result.

OpenAI: The scene includes nearly all requested components: recognizable Opera House and Harbour Bridge, skyline, moon, sun glow, glitter/reflections, boats/yachts, birds, foreground rocks, and multiple easter eggs. Composition and atmosphere are strong, though the bridge pylons/reflections are somewhat simplified and the large flat black band at the bottom hurts the fullscreen edge-to-edge finish and premium polish.

Google: Most requested elements and easter eggs are present, but the structural execution of the landmarks is quite poor. Craft suffers significantly from stray floating lines near the bridge and an improperly implemented, highly visible bounding shape for the reflections instead of a true clip path. The massive untextured foreground block also detracts from the overall visual balance.

OpenAI · gpt-5.5
High reasoning
Composite 62.5% · Judge 3.5/5
Judge panel: Anthropic 3.3/5 · OpenAI 4.0/5 · Google 2.5/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.5/5

Anthropic: Many brief elements land: a dusk sky gradient, left-side moon, a few upper-left stars, the white Opera House sails left-of-center, an arching bridge over a city silhouette on the right, multiple yachts with reflections at staggered depths, mid-sky birds, dark foreground rocks, and what looks like a balloon top-right. But several headline requirements are weak or missing - the warm low-right sun and its bloom/glitter column barely read, the bridge is a thin shallow arch with no legible deck, hangers, lattice or four pylons (not the great parabolic steel arch specified), reflections are simplist

OpenAI: The scene clearly reads as Sydney Harbour at golden hour with the Opera House, Harbour Bridge, skyline, moon, birds, boats, reflections, foreground rocks, and several easter eggs present. It falls short of full fidelity because the sun disc is not clearly visible, the bridge pylons are underdeveloped, and the required small human figure is hard to identify or absent. The composition is atmospheric and polished overall, though some landmark detailing and reflections are simplified rather than truly premium.

Google: Includes most requested elements (skyline, bridge, Opera House, moon, hidden items), but misses the specific sun placement/glow on the right, contradicting the light direction. The bridge lattice is poorly drawn, and the water reflections are messy and detached.

Google · gemini-3.5-flash
default reasoning
Composite 62.5% · Judge 3.5/5
Judge panel: Anthropic 3.3/5 · OpenAI 4.3/5 · Google 2.3/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.5/5

Anthropic: The scene reads instantly as Sydney Harbour at dusk with a strong, coherent warm-right light direction: vertical indigo-to-amber sky gradient, a radial sun glow with a vertical glitter column on the water, crescent moon and stars upper-left, a recognisable Bridge arch and Opera House cluster, and dark repoussoir rocks in the lower corners. But fidelity to the dense brief is partial: the Opera House is flat green-gold triangular tents rather than nested overlapping warm-lit shells, the Bridge lacks visible lattice hangers/cross-bracing and four clear pylons, harbour traffic is sparse (only a co

OpenAI: The scene clearly reads as Sydney Harbour at golden hour with the Opera House, Harbour Bridge, skyline, moon, sun glitter, boats, birds, foreground rocks, and several easter eggs visible. Some brief details are weaker or incomplete, especially the bridge pylons/lattice complexity, fully broken reflections beneath all sources, and the exact count/distinctness of ferries/yachts. The composition and color are polished and atmospheric, with strong premium appeal despite a few simplified or less detailed areas.

Google: The piece captures the requested dusk lighting and successfully includes the easter eggs. However, a major coordinate error leaves the bottom third of the canvas entirely blank, severely disrupting the composition. Additionally, the landmark geometries are rudimentary, with the Opera House relying on simple sharp triangles.

OpenAI · gpt-5.5-pro
High reasoning
Composite 58.3% · Judge 3.3/5
Judge panel: Anthropic 3.3/5 · OpenAI 5.0/5 · Google 2.5/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.3/5

Anthropic: The core scene reads instantly as Sydney Harbour at dusk: a believable indigo-to-amber sky, sun glow low-right, moon low-left, a parabolic bridge arch with deck/pylons, Opera House sails left-of-centre, a lit skyline, many yachts, a ferry, birds, foreground rocks with a tiny figure, and reflections plus a glitter column - so most brief elements are present. Craft is only moderate: the sails are closer to plain symmetric triangles than the called-for interlocking nested shells, the bridge lattice and cross-bracing are sparse, the reflections and far-headland haze are shallow, and the warm-right

OpenAI: The scene visibly includes the required Sydney Harbour landmarks, golden-hour sky, moon, skyline, bridge structure, Opera House shells, water glitter/reflections, boats, birds, foreground figure, and the hidden details such as Southern Cross, dolphin fin, green-and-gold sail, and balloon. The composition is atmospheric and polished, with strong depth, cohesive warm/cool lighting, and premium illustrative detail throughout.

Google: The illustration successfully includes all requested elements and Easter eggs, showing good adherence to the prompt. However, the composition is severely compromised by a massive black void dominating the bottom half of the canvas, which diminishes the overall visual appeal and impact.

DeepSeek · deepseek-v4-pro
default reasoning
Composite 57.8% · Judge 3.3/5
Judge panel: Anthropic 3.3/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 3.3/5 → leave-one-family-out 3.3/5

Anthropic: The scene captures the dusk mood well - a coherent warm gradient sky, a low-right sun glow, a left moon and balloon, a city skyline, water reflections and a jetty path - and the composition is balanced and atmospheric. But it misses two of the brief's core requirements: the Sydney Opera House sails are essentially absent (the left cluster reads as angular yachts, not nested shells), and the bridge reads as a flat truss/cable hybrid rather than the iconic single parabolic arch. Ferries, the scale figure, and the easter eggs are not discernible, leaving a pleasant but generic harbour illustratio

OpenAI: Most required elements are visible, including the Harbour Bridge, Opera House, moon, boats, birds, reflections, foreground rocks, and several easter eggs, though the sun disc itself is not clearly shown on the right and the Opera House reads more like sharp triangular sails than nested shell forms. The composition is attractive and atmospheric with strong golden-hour colour and reflections, but some landmark detailing and depth cues are simplified, keeping it just short of excellent.

Google: Most required elements and easter eggs are included, but the execution uses overly simplistic, jagged geometric primitives that fail to capture the iconic landmark shapes. The visible sun disc is missing, and the rigid lighting, basic gradients, and lack of atmospheric depth result in an unpolished visual.

Anthropic · claude-sonnet-4-6
High reasoning
Composite 56.3% · Judge 3.3/5
Judge panel: Anthropic 3.0/5 · OpenAI 3.3/5 · Google 2.0/5
single-judge (Claude) 3.0/5 → leave-one-family-out 3.3/5

Anthropic: Most brief elements are present and identifiable: dusk sky gradient, sun bloom low-right, moon low-left, an arched bridge with deck/hangers, an Opera House left-of-centre, a skyline, multiple ferries and yachts, birds, foreground rocks, plus a seaplane and what appears to be an Australian green-gold sail. However the composition is flat and sparse, with a large empty central water plane that undercuts the dense, layered, atmospheric premium look the brief demanded. Craft is uneven: the Opera House shells use a literal rainbow gradient rather than the warm-lit/cool-shadow treatment and read lik

OpenAI: The scene is instantly recognizable with the Opera House, harbour traffic, moon, skyline, birds, dolphin fin, green-and-gold sail, and aircraft, but several core brief points are only partial: the bridge reads as a double arch without clear pylons/cross-bracing, and the sun glitter/reflections are sparse or missing. The palette and composition are attractive and atmospheric, though the simplified geometry, weak water reflection work, and large dark foreground keep it from feeling fully premium.

Google: Although many requested elements and easter eggs are included, the core landmarks severely violate the detailed brief constraints, with the bridge rendered as two intersecting arcs and the Opera House as a floating dome. Additionally, the composition is broken by a massive, empty black void filling the bottom quarter of the image.

Google · gemini-3.1-pro-preview
High reasoning
Composite 54.2% · Judge 3.2/5
Judge panel: Anthropic 2.8/5 · OpenAI 3.8/5 · Google 2.0/5
single-judge (Claude) 2.8/5 → leave-one-family-out 3.2/5

Anthropic: Many brief elements are present and legible: the dusk sky gradient to a gold horizon band, sun glow low-right, Southern Cross upper-left, an Opera House cluster of warm-lit nested sails, a skyline, water glitter and reflections, several yachts including the green-and-gold one, ferries with lit window strips, and a foreground rock with a tiny human figure. But the build fails the fullscreen/edge-to-edge requirement badly: roughly the bottom fifth is dead black empty space, leaving the scene floating in the upper frame. The Harbour Bridge reads as a solid dark dome rather than a clean parabolic

OpenAI: The scene includes most required elements: gradient dusk sky, Sydney Harbour Bridge, Opera House, skyline, water glitter/reflections, boats, yachts, birds, foreground rocks/person, and several easter eggs. Fidelity is weakened by the moon reading like an orange object on a tower, only two clear pylons, a somewhat minimal sun bloom, and simplified/occasionally awkward reflections. Overall it is recognizable and atmospheric, but the large dark bridge mass and some janky details keep it from feeling fully premium.

Google: While incorporating most requested elements and easter eggs, the scene suffers from rudimentary shape construction, notably the solid bridge and rigidly triangular Opera House sails. A large unrendered black space dominates the lower section, significantly detracting from the edge-to-edge requirement and overall polish.

Anthropic · claude-haiku-4-5
High reasoning
Composite 54.2% · Judge 3.2/5
Judge panel: Anthropic 2.3/5 · OpenAI 3.0/5 · Google 2.0/5
single-judge (Claude) 2.3/5 → leave-one-family-out 3.2/5

Anthropic: Many brief elements are attempted and recognisable: the dusk gradient sky, a right-side sun glow, a left moon, Southern Cross stars, a seaplane easter egg, skyline silhouettes, a water plane with faint reflections, scattered small boats/yacht markers, birds, and foreground rocks with a tiny figure. But the two hero landmarks fail: the central form reads as a single tent-like arch fused with the skyline rather than a parabolic Bridge spanning the right two-thirds with pylons and a deck, and the "Opera House" is a cluster of cream leaf/cypress shapes, not nested asymmetric shells. The aspect han

OpenAI: The scene includes many requested motifs—Sydney Harbour Bridge, Opera House, sun, moon, boats, birds, constellation, foreshore, and warm dusk gradients—but several requirements are simplified or incomplete, especially the bridge pylons/cross-bracing, dense reflections, sun-glitter slivers, ferries, yacht count, and headland detail. The composition is atmospheric and readable, but the rendering feels sparse and somewhat flat compared with the premium, highly layered brief.

Google: The composition includes the requested elements but renders them as primitive shapes, completely missing the iconic forms of the Bridge and Opera House. While basic gradients are present, poor detailing and execution severely limit the overall visual appeal.

Anthropic · claude-opus-4-8
High reasoning
Composite 54.2% · Judge 3.2/5
Judge panel: Anthropic 2.8/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 2.8/5 → leave-one-family-out 3.2/5

Anthropic: The sky gradient, right-side sun and bloom, left moon, bridge with arch/deck/pylons, sun-glitter column, ferries, a 6+ yacht fleet (including the green-and-gold sail), the hot-air balloon and foreground rocks with a figure are all present and atmospheric, giving an instantly readable harbour-at-dusk mood. However two brief-critical failures hurt it: the Opera House's nested shell sails are effectively absent (left-of-centre reads as a generic rocky mound, not the icon), and a large flat black band fills the bottom ~20% of the frame, breaking the edge-to-edge composition and swallowing the fore

OpenAI: Most required elements are visible: Sydney Harbour Bridge, Opera House, layered sky and water, reflections, ferries, yachts, birds, foreground rocks, and several easter eggs. Some brief-specific details are weaker or ambiguous, especially the actual sun disc on the right, the four bridge pylons, dense skyline detail, and a clearly visible human figure, but the scene reads well as Sydney Harbour at golden hour with strong atmosphere.

Google: While the composition includes almost all requested elements and easter eggs, the visual execution is clunky. Craft issues like the stray diagonal line extending from the bridge, simplistic overlapping shapes for landmarks, and a massive empty black foreground detract from the overall quality.

Anthropic · claude-sonnet-5 · High reasoning
claude-sonnet-5 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 53.1%, judge 3.1/5
Composite 53.1% · Judge 3.1/5
Judge panel: Anthropic 2.8/5 · OpenAI 4.0/5 · Google 2.3/5
single-judge (Claude) 2.8/5 → leave-one-family-out 3.1/5

Anthropic: Many brief elements are present and legible: a warm right-sun with a glitter column, a left-rising moon, a graded dusk sky, multiple staggered ferries, 6+ yachts (including the green-and-gold Australian sail easter egg), wheeling birds, a hot-air balloon and faint upper-left stars. But the Sydney Opera House - one of the two core landmarks - is essentially absent; the left cluster reads as clouds/pebbles, not nested shells, and the skyline floats oddly in mid-harbour under the span. Craft is dragged down by a large flat-black dead band filling the bottom quarter (looks cropped/unfinished), a g

OpenAI: The scene includes most requested elements: golden-hour gradient sky, sun and moon, Opera House, Harbour Bridge with skyline, water glitter/reflections, boats, birds, foreground rocks, and several easter eggs such as the balloon and green sail. Some specifics are simplified or imperfect, especially the Opera House shell complexity, bridge pylons, far headland, and the count/definition of ferries and reflections, but the overall composition reads clearly as Sydney Harbour at dusk with polished atmosphere.

Google: While the composition includes most requested elements and Easter eggs, the structural execution of the landmarks is highly inaccurate, with skyscrapers resting on the bridge deck and the Opera House depicted as rounded blobs. Additionally, the massive, untextured black foreground severely unbalances the visual hierarchy.

OpenAI · gpt-5.4-mini · High reasoning
gpt-5.4-mini rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 52.1%, judge 3.1/5
Composite 52.1% · Judge 3.1/5
Judge panel: Anthropic 3.0/5 · OpenAI 4.0/5 · Google 2.5/5
single-judge (Claude) 3.0/5 → leave-one-family-out 3.1/5

Anthropic: The core elements read instantly as Sydney Harbour: a graded dusk sky, left-side moon, nested Opera House sails, an arched Bridge with deck and pylons, ferries, yachts, birds and a foreshore figure for scale. But the light logic is broken - the warm sun disc on the right is essentially missing and the gold horizon band sits centre-left, so reflections and sail shading don't agree with a right-hand sun; the glitter column is thin and misplaced, and easter eggs (Southern Cross, dolphin, green-gold sail) are not discernible. The bottom third is a large flat black void that feels unfinished, and t

OpenAI: The scene reads clearly as Sydney Harbour at golden hour with the Opera House, Harbour Bridge, moon, skyline, boats, birds, water glitter, foreground rocks, and several easter eggs visible. Some required details are incomplete or understated, such as the Southern Cross, a clearly placed sun disc, fuller bridge pylons/lattice/reflections, and the required count/detail of ferries. Overall it is coherent, atmospheric, and polished, though somewhat simplified rather than richly dense or premium.

Google: Includes most required elements, but lacks key water reflections and the Southern Cross. The primitive shape execution, especially on the bridge and ferries, undermines the requested premium depth and atmosphere.

DeepSeek · deepseek-v4-flash · default reasoning
deepseek-v4-flash rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 51.6%, judge 3.1/5
Composite 51.6% · Judge 3.1/5
Judge panel: Anthropic 2.0/5 · OpenAI 4.0/5 · Google 2.5/5
single-judge (Claude) 2.0/5 → leave-one-family-out 3.1/5

Anthropic: Many elements are present (sky gradient, left moon, Southern Cross, skyline, yachts including the green-and-gold easter egg, water glitter and reflections, a few ferries and birds), but the two priority landmarks fail: the Opera House sails are essentially absent and the Bridge is a thin weak arc with no deck, hangers, or pylons. The light direction contradicts the brief - warmth and glow sit on the LEFT while the sun was specified low-right - and a large dead black band fills the bottom third, wasting the frame and dropping the foreshore figure, dolphin, and seaplane. The result reads as a ge

OpenAI: The scene reads clearly as Sydney Harbour at golden hour with the skyline, bridge, Opera House, moon, boats, yachts, birds, foreshore figure, and several easter eggs visible. It is atmospheric and well-composed, though the sun disc/bloom itself is not clearly visible, the Bridge pylons are weak, and some landmark detailing/reflections feel simplified rather than fully premium.

Google: The dusk color palette and required Easter eggs are included, establishing the correct mood. However, the composition is unbalanced by an overly dominant black foreground, and structural inaccuracies in the landmarks and reflections weaken the overall execution.

xAI · grok-4.20-reasoning · default reasoning
grok-4.20-reasoning rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 41.7%, judge 2.7/5
Composite 41.7% · Judge 2.7/5
Judge panel: Anthropic 2.8/5 · OpenAI 3.3/5 · Google 2.0/5
single-judge (Claude) 2.8/5 → leave-one-family-out 2.7/5

Anthropic: Most brief elements are present: dusk sky gradient, sun glow, lit skyline, a parabolic arch bridge, Opera-House-style sails, water with a sun-glitter column, 6+ yachts, birds, foreground rocks with a tiny figure, plus easter eggs (Southern Cross, green-and-gold sail). But execution is uneven - the sails read as soft rounded blobs rather than crisp interlocking shells, the bridge pylons and reflections are weak, ferries are indistinguishable, and the moon placement is ambiguous. The largest flaw is a flat black band filling the bottom third that cuts off the foreshore and breaks the edge-to-edg

OpenAI: Most requested Sydney Harbour elements are visible, including the bridge, Opera House, skyline, moon, birds, boats, yachts, dolphin fin, constellation and balloon, but the bridge lacks convincing cross-bracing/four pylons and the reflections/glitter are only partially developed. The colour palette and composition are attractive, though the large black empty band at the bottom breaks the fullscreen edge-to-edge requirement and makes the render feel unfinished.

Google: Features requested elements and easter eggs but suffers a severe layout failure with a massive bottom void. Architectural shapes are crude, and reflections incorrectly angle across the water.

xAI · grok-build-0.1 · default reasoning
grok-build-0.1 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 41.7%, judge 2.7/5
Composite 41.7% · Judge 2.7/5
Judge panel: Anthropic 2.8/5 · OpenAI 3.0/5 · Google 2.3/5
single-judge (Claude) 2.8/5 → leave-one-family-out 2.7/5

Anthropic: Most brief elements are present: the dusk sky gradient, sun glow low-right, moon low-left, a recognisable arched Harbour Bridge, scattered yachts, a few ferries, sun glitter, foreground foreshore with a tiny figure, a green easter-egg sail and the hot-air balloon. But the second key landmark fails: the Opera House reads as plain sharp white triangles/tents rather than the nested, interlocking curved shells the brief demands, which undercuts the instant "Sydney" read. Craft is uneven - the bridge structure looks segmented and slightly janky, reflections are weak, the skyline is sparse, and a mu

OpenAI: The scene includes many required motifs—Opera House, Harbour Bridge, sun/moon, boats, birds, foreground rocks, and several easter eggs—but several are simplified or only partially correct, especially the bridge pylons/lattice, skyline depth, headland haze, and richness of reflections/glitter. The composition and color gradient are coherent and readable as Sydney Harbour at dusk, but the flat shapes and sparse detailing keep it from feeling dense, premium, or highly polished.

Google: While the composition includes most requested elements and easter eggs, the execution relies heavily on crude geometric primitives. The architectural landmarks lack accurate form, and harsh gradients alongside awkward overlaps reduce the overall polish.

Anthropic · claude-haiku-4-5 · default reasoning
claude-haiku-4-5 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 41.7%, judge 2.7/5
Composite 41.7% · Judge 2.7/5
Judge panel: Anthropic 2.3/5 · OpenAI 2.3/5 · Google 2.3/5
single-judge (Claude) 2.3/5 → leave-one-family-out 2.7/5

Anthropic: Many brief elements are attempted - dusk gradient sky, moon left and sun glow right, an arched bridge, sail/Opera-House shapes, triangular yachts, ferry-like blocks, a green-and-gold easter-egg sail, a faint constellation upper-left, a balloon upper-right and a tiny scale figure bottom-left. But execution is badly compromised: a large dead black band fills the entire bottom third where the foreground foreshore and most of the water should be, killing the composition and most reflections. The Opera House reads as flat blades rather than nested shells, the bridge arch is thin and oddly proportio

OpenAI: The scene includes recognizable Sydney elements, the sun/moon, birds, yachts, rocks, and several easter eggs, but many brief-critical requirements are missing or very underdeveloped: no convincing ferry traffic, weak or absent bridge/opera/skyline reflections, minimal glitter column, no clear four pylons, and a large black unused area breaks the fullscreen edge-to-edge requirement. The composition and color palette are pleasant in a simplified poster style, but the execution is sparse and lacks the dense atmospheric depth, precision, and premium finish requested.

Google: While most checklist items and Easter eggs are technically present, the execution relies on crude, disconnected primitives. Severe structural issues, such as floating landmarks and a massive black void at the bottom, undermine the composition and overall polish.

xAI · grok-composer-2.5-fast · default reasoning
grok-composer-2.5-fast rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 41.7%, judge 2.7/5
Composite 41.7% · Judge 2.7/5
Judge panel: Anthropic 2.0/5 · OpenAI 3.5/5 · Google 2.5/5
single-judge (Claude) 2.0/5 → leave-one-family-out 2.7/5

Anthropic: The scene reads loosely as a dusk harbour (indigo-to-rose sky, left moon, a bridge arch, skyline, scattered white sails and a few birds/stars), but it misses or botches the brief's two anchor landmarks: there is no Opera House at all - a grey dome/disc stands in its place - and the bridge is a thin flat arc with no slung deck, hangers, or four pylons. The sun is not a clear radial bloom low-right, reflections are sparse, ferries lack the warm window strips, no foreground human figure reads for scale, and the bottom third is a dead near-black band that wastes the composition. Overall it is unfi

OpenAI: Most requested elements are visible, including the Harbour Bridge, Opera House, moon, skyline, boats/yachts, birds, foreground rocks, and several easter eggs, but the sun itself is not clearly defined, the Opera House podium is missing, and the Bridge pylons/detail are simplified. The palette and composition are atmospheric and readable as Sydney Harbour, though the scene feels somewhat sparse and the large dark foreground plus simplified reflections reduce the premium finish.

Google: The color palette effectively establishes the golden hour mood, and requested easter eggs are present. However, key landmarks lack structural accuracy, and the large unrendered black area severely compromises the composition.

xAI · grok-4.3 · default reasoning
grok-4.3 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 33.3%, judge 2.3/5
Composite 33.3% · Judge 2.3/5
Judge panel: Anthropic 2.0/5 · OpenAI 2.8/5 · Google 2.3/5
single-judge (Claude) 2.0/5 → leave-one-family-out 2.3/5

Anthropic: Many brief elements are missing or unreadable: there is no recognizable Opera House (the left shape reads as a box of buns/rocks), the Bridge is a thin floating arc with no deck, hangers, or pylons, and the "water" is a flat orange plane that looks like desert rather than an amber-to-indigo harbour with reflections. The sky gradient, moon, balloon easter egg, and a tentative Southern Cross are present, but the sun bloom, ferries, convincing yachts, and reflections are weak or absent. Composition fails edge-to-edge with roughly a quarter of the frame as dead black space at the bottom, and the w

OpenAI: The scene is recognizably Sydney Harbour with a bridge, Opera House, moon, skyline, boats, birds, foreground rocks, and several easter eggs, but key brief details are weak or missing, including a clear sun disc, four pylons, convincing nested Opera House shells, rich reflections, and distinct ferries. The palette and composition are coherent, but the rendering is sparse and simplified, with flat water, minimal depth, and many elements reading as generic icons rather than premium atmospheric detail.

Google: The composition includes most requested elements, including the easter eggs and lighting sources, but the execution is highly primitive. Key landmarks like the Opera House are unrecognizable, and the scene lacks the requested atmospheric depth and premium finish.

Google · gemini-3.1-flash-lite · default reasoning
gemini-3.1-flash-lite rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 14.6%, judge 1.6/5
Composite 14.6% · Judge 1.6/5
Judge panel: Anthropic 1.5/5 · OpenAI 1.8/5 · Google 2.0/5
single-judge (Claude) 1.5/5 → leave-one-family-out 1.6/5

Anthropic: The dusk sky gradient and a low-right sun bloom land reasonably, and a pale moon disc plus a few Southern-Cross-like stars sit upper-left, but almost everything else is broken or missing: the "Opera House" is a flat dark rectangle with random pale arcs, the "Bridge" is a thin semicircular hoop (not a parabolic arch with deck, hangers, and pylons), and the water is just flat dark blocks with no glitter column or reflections. Yachts read as three stray triangles (one green) floating on land above the waterline, there are no ferries, foreshore rocks, or human figure, and the whole lower half is e

OpenAI: The image includes a sunset gradient, moon, simplified Opera House, bridge arch, a ferry, a couple of sails, and a few easter-egg-like details, but it misses or severely under-delivers many required elements such as dense skyline depth, four pylons, lattice hangers, proper water reflections, three ferries, six yachts, bird flock, foreground figure, and overall busy edge-to-edge richness. The composition has a pleasant simple color mood, but the execution is very sparse and schematic rather than a premium, atmospheric Sydney Harbour scene.

Google: The build attempts the core scene but falls severely short on required detail, lacking complex structure for the landmarks and skyline. Lighting and depth are poorly realized, with reflections rendered as a crude solid block rather than atmospheric elements. Overall execution feels flat and unfinished.

Anthropic · claude-sonnet-4-6 · High reasoning
claude-sonnet-4-6 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 8.3%, judge 1.3/5
Composite 8.3% · Judge 1.3/5
Judge panel: Anthropic 1.0/5 · OpenAI 1.0/5 · Google 2.0/5
single-judge (Claude) 1.0/5 → leave-one-family-out 1.3/5

Anthropic: The screenshot is an essentially blank, uniform near-black canvas with no visible content whatsoever. None of the briefed elements render - no sky gradient, sun, moon, Harbour Bridge, Opera House, water, glitter, ferries, yachts, birds, foreground rocks, or any easter eggs. This is a failed render with zero discernible scene, so it cannot satisfy fidelity, design, craft, or impact.

OpenAI: The screenshot is essentially a blank dark page with none of the requested Sydney Harbour elements visible: no sky gradient, sun, moon, landmarks, water details, traffic, birds, foreground, or easter eggs. As a rendered build it does not demonstrate the required composition, depth, craft, or visual impact.

Google: The render failed entirely, displaying only a blank, dark canvas. None of the requested scene elements, lighting, or details are present.

Anthropic · claude-sonnet-5 · High reasoning
claude-sonnet-5 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 6.3%, judge 1.3/5
Composite 6.3% · Judge 1.3/5
Judge panel: Anthropic 1.0/5 · OpenAI 1.0/5 · Google 2.0/5
single-judge (Claude) 1.0/5 → leave-one-family-out 1.3/5

Anthropic: The screenshot is an essentially blank, uniform near-black/dark-indigo canvas with no discernible content. None of the required elements rendered: there is no sky gradient, sun glow, moon, Harbour Bridge, Opera House sails, water with glitter or reflections, ferries, yachts, birds, or foreground rocks. This is a broken/empty render that fails every criterion.

OpenAI: The screenshot appears essentially blank/dark, with none of the required Sydney Harbour elements visible: no sky gradient, sun/moon, bridge, Opera House, water reflections, traffic, birds, foreground, or easter eggs. As a result it reads as an unfinished or failed render rather than a dense atmospheric SVG scene.

Google: The provided image is entirely blank, indicating a total rendering failure. None of the requested elements from the brief are present.

Anthropic · claude-fable-5 · High reasoning
claude-fable-5 rendering of the Sydney Harbour at golden hour (SVG scene) benchmark - composite 6.3%, judge 1.3/5
Composite 6.3% · Judge 1.3/5
Judge panel: Anthropic 1.0/5 · OpenAI 1.0/5 · Google 2.0/5
single-judge (Claude) 1.0/5 → leave-one-family-out 1.3/5

Anthropic: The screenshot is an entirely blank, uniform near-black canvas with no visible content whatsoever. None of the required elements rendered - no sky gradient, sun, moon, Harbour Bridge, Opera House, water, glitter, ferries, yachts, birds, or foreground. This is a failed render that delivers nothing against the brief.

OpenAI: The screenshot is essentially a blank dark page with none of the requested Sydney Harbour elements visible: no sky gradient, sun/moon, bridge, Opera House, water reflections, boats, birds, foreground, or easter eggs. Because the scene does not render as an illustration, it fails the brief and has no appreciable composition, craft detail, or visual impact.

Google: The screenshot is completely blank, indicating a total rendering failure. No elements from the detailed brief are visible.
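
Each card above reports a "single-judge → leave-one-family-out" pair. The following is only a minimal sketch of how such a leave-one-family-out score could be computed, assuming it is a plain mean of the judges outside the model's own family. The function name, the dictionary layout, and the panel values are hypothetical and are not taken from any card or from Banja's actual scoring code.

```python
from statistics import mean


def leave_one_family_out(panel_scores, model_family):
    """Average the panel scores, skipping the judge from the model's own family.

    Assumption: the published score is a simple unweighted mean of the
    remaining judges; the page does not state the exact aggregation.
    """
    eligible = [score for family, score in panel_scores.items()
                if family != model_family]
    if not eligible:
        raise ValueError("no judge outside the model's own family")
    return mean(eligible)


# Hypothetical 1-to-5 panel scores, for illustration only.
panel = {"Anthropic": 2.0, "OpenAI": 4.0, "Google": 3.0, "xAI": 2.0}

# An Anthropic model is scored by the other three families only.
score = leave_one_family_out(panel, "Anthropic")
print(f"{score:.1f}/5")
```

With these made-up values the Anthropic judge's 2.0 is dropped and the remaining three scores are averaged, which is why a model's leave-one-family-out figure can sit above or below the score its own family's judge gave it.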
