3DMark'03 - A word of warning


Xavier

Guest

If you didn't know already, today at 6pm GMT FutureMark are releasing the latest of the 3DMark series - 3DMark'03...

But before you whisk off to futuremark.com and add to the obscene traffic created as thousands of users try to download code over 150MB in size, I'd like you to have a read of our reactions to the benchmark, having worked with it over the last couple of weeks...



First and foremost, the aim of any 'gamer's benchmark' is exactly that - to simulate a realistic load based on one or more generations of today's gaming APIs. The ideal game benchmark would be one which, rather than simulating said workload with bespoke code, instead uses real-world game engines in real-world situations.

Having tested 3DMark'03 for over a fortnight, and looked at its tests in great detail, it appears MadOnion/FutureMark have done quite the opposite with the latest generation of their synthetic benchmarking tool, making it less relevant than ever before.

So, onto those game tests in full...

Note: it's only the game tests themselves which contribute to the final score. The tests that follow them, in both 2001 and '03, are for the user's benefit only and are solely there to help identify bottlenecks and performance issues in their system.

Game 1 - 1940's WW2 midair battle.
The first of the 3DMark'03 tests is a DX7-based 'Battle of Britain' style conflict. At high altitude, well above cloud level, scores of planes engage and attack one another.

Ignoring the alarming factoid that the flight sim genre accounts for less than 2% of today's game market, few if any flight sims utilise such high-altitude scenes - in this case, multiple T'n'L objects with a software vertex shader moving through a single-textured cubemap scene.

Although the aircraft themselves are quad-textured objects, they are so few, and are rendered at such a distance, that this has little overall effect on the rendering throughput of the system.

This relegates the entire scene to becoming little more than a single-texture fillrate test - in essence the scene is actually simpler than any of the three DX7 games used in the first three tests of 3DMark2001. I'm not sure why FutureMark have taken such a huge step backward - no games since DX6 have solely relied on single-texturing throughput... you have to look four years back at least to find any games which correlate to this kind of workload.

Game 2/3 - First Person Shooter/Fantasy

As DirectX8 benchmarks, games 2 and 3 can be addressed together - they both appear to use identical render paths and feature sets. It's a tad annoying that we're being fed the same test twice in a different genre, but that's not even the half of it when you look under the hood.

Both tests try to duplicate the "Z-First" render style seen in Doom 3, amongst others. In some areas the similarities are obvious, but the way in which FutureMark is achieving these visual effects is quite different to any method a games developer would undertake... from sheer sense alone.

The most prominent of these faux pas is the shadow calculation used in both scenes. FutureMark attempts to use a technique seen in the Doom 3 previews (and leaked code) known as stencil shadow volumes. It's a multi-pass algorithm done for all objects in the scene. In 3DMark the passes look something like this:

Objects:
  Pass 1 (early z)
    Skin object in Vertex Shader
    Pixel Shader writes Z, RGB = ambient and Alpha = perspective Z

Lights:
  For every object:
    Pass 2 (Stencil Shadow Volume calculation)
      Set stencil to increment/decrement
      Skin Object in Vertex Shader
      Stencil extrusion calculation
      No Pixel Shader
    Pass 3 (Lighting)
      Skin Object in Vertex Shader
      Pixel Shader (lighting) write RGB = color

The 'Skin Object in Vertex Shader' step appears in every pass, which means the code repeats the same skinning calculation over and over for each object, every time producing the same result and unnecessarily adding to the workload of the chip. It's bugs like these which, if found in real-world game code, are removed in the alpha phase at the very latest.
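To put a rough number on that redundancy, here's a minimal sketch that just counts skinning runs per frame. It assumes the pass list above repeats passes 2 and 3 once per light, and that a skinned object could be computed once and reused; the function names and the example scene are made up for illustration and are not taken from 3DMark'03 or any real engine.

```python
def skinning_runs_reskin_every_pass(num_objects: int, num_lights: int) -> int:
    """Skinning runs per frame when every pass re-skins the object.

    Pass 1 (early z) skins each object once; passes 2 and 3 each skin it
    again for every light (assumption: both passes repeat per light).
    """
    passes_per_object = 1 + 2 * num_lights
    return num_objects * passes_per_object


def skinning_runs_skin_once(num_objects: int) -> int:
    """Skinning runs per frame if each object is skinned once and the result reused."""
    return num_objects


if __name__ == "__main__":
    objects, lights = 10, 3  # hypothetical scene, not measured from the benchmark
    repeated = skinning_runs_reskin_every_pass(objects, lights)
    cached = skinning_runs_skin_once(objects)
    print(f"re-skin every pass: {repeated} runs, skin once: {cached} runs")
```

Under those assumptions the redundant work grows with the number of lights, while the skin-once count stays flat.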

The unnecessary processing is seen further in their stencil extrusion calculations. Rather than use the widely available method utilised in Doom 3, 3DMark'03 uses a method which adds six times the number of vertices per extrusion, something which would never happen in commercial game code.

Together these produce a huge bottleneck in the vertex portion of the pipeline, and areas such as the pixel shaders, raster operations and texture units never really get tested. Vertex shading in one form or another has been with us since DX7 - and as none of the DX8 features are properly loaded you could almost consider both these tests 'DX7 benchmarking for DX8+ chips'.

More unfortunate is the fact that these two tests have rather large bugs in the same shadow code. Check out the limbs as the girls and trolls move in Game 3 and you'll see shadows popping out of lit surfaces. The glitches are only sporadic but they're there, and having witnessed them on more than one DX9 platform it's pretty clear the bug exists in their code rather than in hardware or drivers.

Before I get onto Game 4 there's one more thing to note. For developers DirectX8 was a bit of a mess - as a first encounter with pixel shading, the introduction of multiple pixel shader standards (PS1.1/PS1.3/PS1.4) and DX versions (8.0/8.1) complicated matters, but with a little common sense and application things did settle down eventually... Because different GPUs offered varying levels of shader support, most DX8 games offer fallback support from 1.4 first to 1.3 and then to 1.1 as and when necessary. 3DMark'03 fails to offer any such redundancy - when a GPU turns its nose up at PS1.4 the tests revert directly to 1.1, resulting in a massive performance hit on many GPUs. The Nature scene, one of the most impressive in 3DMark2001SE, is itself PS1.3 based - and to our knowledge there are no A-list games which follow the tangential logic of skipping 1.3 and going straight from 1.4 to 1.1. Why FutureMark feel this can in any way represent real-world gaming situations is a total mystery.
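The difference between the two fallback orders can be sketched in a few lines. This is only an illustration of the selection logic described above, not actual DirectX or 3DMark code: the function name, the preference tuples and the example GPU are all hypothetical.

```python
from typing import Optional, Sequence, Set

# Preference orders, highest version first (version numbers as used in the post).
FULL_FALLBACK = ("1.4", "1.3", "1.1")   # the chain most DX8 games are said to use
SKIP_13_FALLBACK = ("1.4", "1.1")       # the chain 3DMark'03 is said to use


def pick_pixel_shader(supported: Set[str], preference: Sequence[str]) -> Optional[str]:
    """Return the first version in `preference` that the GPU supports, else None."""
    for version in preference:
        if version in supported:
            return version
    return None


if __name__ == "__main__":
    gpu = {"1.1", "1.3"}  # hypothetical GPU with PS1.3 but no PS1.4
    print(pick_pixel_shader(gpu, FULL_FALLBACK))     # prints 1.3
    print(pick_pixel_shader(gpu, SKIP_13_FALLBACK))  # prints 1.1
```

A PS1.3-capable part gets its best path with the full chain, but is dropped all the way to 1.1 when 1.3 is missing from the list.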


And finally...

Game 4 - "Mother Nature"

As 3DMark'03's DX9 benchmark we expected a lot from Mother Nature, which could almost be considered "Nature Scene II". FutureMark claimed to have pulled out all the stops with this test and were promising to really wow gamers with DX9 shady goodness.

Excuse me for being confused, therefore, when I note that seven of the nine shaders in use in this scene are in fact only DirectX 8 level code (all PS1.4 with no PS1.3 fallback, as in Games 2 and 3). That leaves only two shaders, less than a third of the effects in the scene, to 'measure' ALL the DX9 performance of the ENTIRE benchmark (remember FutureMark list a DX9 board as a requirement of this benchmark) - so all DX9 hardware ends up doing is two measly chunks of code across four game tests.



At first we were chuffed with being able to sample the new code early and provide them with feedback - but now on the day of release, seeing what they've signed off as 'gold' code, I personally am stunned. In building systems based on the recommendations of what is appearing to be a seriously dubious synthetic benchmark, gamers will just end up cheated of the potential performance that their investment could have delivered.



So, before you join the rat race in trying to get at this mammoth download, consider what use, if any, it will be to you. Personally, we're going to stick to real-world game benchmarks, which over the last 12 months have come a long way - most games now have them, and those that don't can be measured in one way or another with a tool such as FRAPS. Overall these numbers will be far more representative of what we actually buy our PCs for - gaming...
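For anyone rolling their own numbers from a frame capture, the basic sum is simple enough. The sketch below assumes you have one timestamp per rendered frame in milliseconds; that input format and the function name are assumptions for illustration, not the output format of FRAPS or any particular tool.

```python
from typing import Sequence


def average_fps(timestamps_ms: Sequence[float]) -> float:
    """Average frames per second over a run, given one timestamp per frame in ms."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    elapsed_s = (timestamps_ms[-1] - timestamps_ms[0]) / 1000.0
    if elapsed_s <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps span N - 1 frame intervals.
    return (len(timestamps_ms) - 1) / elapsed_s


if __name__ == "__main__":
    # Hypothetical capture: six frames, 25 ms apart.
    frames = [0.0, 25.0, 50.0, 75.0, 100.0, 125.0]
    print(average_fps(frames))  # prints 40.0
```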
 

Mr. 47

Guest
UT2003 comes with a performance testing do da... that uses a level from the game and lotts-o-bots...
 

Xavier

Guest
Actually the botmatch isn't a fair benchmark, as the AI scales depending on the CPU load, which can vary from card to card and even between individual runs of the exact same test. The Flyby from the same benchmark, however, is bang on the money.
 

Sawtooth

Guest
Ran it last night as is and got 4500 ish, compared with 12558 on the old one.
 

Jonty

Guest
Originally appeared on GameSpot
... Nvidia has contacted us to say that it doesn't support the use of 3DMark 2003 as a primary benchmark in the evaluation of graphics cards, as the company believes the benchmark doesn't represent how current games are being designed. Specifically, Nvidia contends that the first test is an unrealistically simple scene that's primarily single-textured, that the stencil shadows in the second and third tests are rendered using an inefficient method that's extremely bottlenecked at the vertex engine, and that many of the pixel shaders use specific elements of DX8 that are promoted by ATI but aren't common in current games.
 

Xavier

Guest
Heh, the same has appeared on HardOCP - so if we're able to spot this so easily why can't FutureMark??
 

Scouse

Guest
Nvidia should have spotted this ages ago - they've been on the design team for 2 years - until this December that is...
 
