Today I read a review of AMD’s new Radeon RX470 on Tweakers.net, by Jelle Stuip. He used Time Spy as a benchmark, and added the following description:
There has recently been some controversy around 3DMark Time Spy, because it turns out that the benchmark does not use vendor-specific code paths, but a single generic code path for all hardware. Futuremark sees that as a plus: it does not matter which video card you test, because when running Time Spy they all follow the same code path, so you can make fair comparisons. That also means that there is no specific optimization for AMD GPUs, and that AMD’s implementation of asynchronous compute is not fully exploited. In games that do make use of it, the relationship between AMD and Nvidia GPUs may therefore be different from what the Time Spy benchmark represents.
Sorry Jelle, but you don’t get it.
Indeed, Time Spy does not use vendor specific code paths. However, ‘vendor’ is a misnomer here anyway. I mean, you can write a path specific to AMD’s or NVidia’s current GPU architecture, but that is no guarantee that it is going to be any good on architectures from the past, or architectures from the future. What people of the “vendor specific code path”-school of thought really mean is that you need to write architecture-specific paths. And in this case it is not just the microarchitecture itself: the actual configuration of the video card also has a direct effect on how async compute code performs (the balance between the number of shaders, shader performance, memory bandwidth and similar factors).
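To make concrete what a “vendor specific code path” usually boils down to in practice, here is a minimal sketch (my own illustration, not code from Time Spy or any real engine) that picks a path based on the PCI vendor ID reported by DXGI. Note how little this actually tells you: it identifies the vendor, but says nothing about the microarchitecture or the card’s configuration, which is exactly the limitation described above.

```cpp
// Hypothetical sketch: selecting a render path by PCI vendor ID via DXGI.
// This is NOT how Time Spy works; it only illustrates the "vendor specific path" idea.
#include <dxgi.h>
#pragma comment(lib, "dxgi.lib")

enum class RenderPath { Generic, PathAMD, PathNVIDIA };

RenderPath SelectRenderPath()
{
    IDXGIFactory1* factory = nullptr;
    if (FAILED(CreateDXGIFactory1(__uuidof(IDXGIFactory1), (void**)&factory)))
        return RenderPath::Generic;

    RenderPath path = RenderPath::Generic;
    IDXGIAdapter1* adapter = nullptr;
    if (SUCCEEDED(factory->EnumAdapters1(0, &adapter)))
    {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);

        // Well-known PCI vendor IDs. This says nothing about the actual
        // architecture, shader count or memory bandwidth of the card.
        if (desc.VendorId == 0x1002)      path = RenderPath::PathAMD;
        else if (desc.VendorId == 0x10DE) path = RenderPath::PathNVIDIA;

        adapter->Release();
    }
    factory->Release();
    return path;
}
```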
However, in practice that is not going to happen of course, because that means that games have to receive updates to their code indefinitely, for each new videocard that arrives, until the end of time. So in practice your shiny new videocard will not have any specific paths for yesterday’s games either.
But, more importantly… He completely misinterprets the results of Time Spy. Yes, there is less of a difference between Pascal and Polaris than in most games/benchmarks using Async Compute. However, the reason for that is obvious: Currently Time Spy is the only piece of software where Async Compute is enabled on nVidia devices *and* results in a performance gain. The performance gains on AMD hardware are as expected (around 10-15%). However, since nVidia hardware now benefits from this feature as well, the difference between AMD and nVidia hardware is smaller than in other async compute scenarios.
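To illustrate why the gap shrinks (the percentages here are purely hypothetical, just to show the mechanism): take two cards scoring 100 and 110 without async compute, where the first stands for a card that already benefited from async compute and the second for one that previously did not. If only the first card gains 12%, it pulls ahead; if the second card now also gains, say, 8%, most of that relative advantage disappears again:

$$\frac{100 \times 1.12}{110 \times 1.00} \approx 1.02 \qquad \text{vs.} \qquad \frac{100 \times 1.12}{110 \times 1.08} \approx 0.94$$

So a smaller AMD/nVidia gap in Time Spy does not mean AMD gains less from async compute; it means nVidia now gains something too.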
It is also important to note that both nVidia and AMD are part of FutureMark’s Benchmark Development Program: http://www.futuremark.com/business/benchmark-development-program
As such, both vendors have been actively involved in the development process, had access to the source code throughout the development of the benchmark, and have actively worked with FutureMark on designing the tests and optimizing the code for their hardware. If anything, Time Spy might not be representative of games because it is actually MORE fair than various games out there, which are skewed towards one vendor.
So not only does Time Spy exploit async compute very well on AMD hardware (as AMD themselves attest to here: http://radeon.com/radeon-wins-3dmark-dx12/), but Time Spy *also* exploits async compute well on nVidia hardware. Most other async compute games/benchmarks were optimized by/for AMD hardware alone, and as such do not represent how nVidia hardware would perform with this feature, since it is not even enabled in the first place. We will probably see more games that benefit as much as Time Spy does on nVidia hardware, once they start optimizing for the Pascal architecture as well. And once that happens, we can judge how well Time Spy has predicted the performance. Currently, DX12/Vulkan titles are still too much of a vendor-specific mess to draw any fair conclusions (eg. AoTS and Hitman are AMD-sponsored, ROTR is nVidia-sponsored, DOOM Vulkan doesn’t have async compute enabled for nVidia (yet?), and uses AMD-specific shader extensions).
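For readers who are not familiar with what ‘async compute’ actually means at the API level: in DX12 it comes down to submitting compute work on a separate compute queue, so that it can overlap with work on the graphics queue. Whether that overlap actually happens, and whether it yields a gain, is entirely up to the hardware and the driver. A minimal sketch (my own illustration, not Time Spy’s code):

```cpp
// Minimal sketch of "async compute" in D3D12: a dedicated compute queue whose
// command lists may execute concurrently with those on the graphics queue.
// Illustration only; this is not code from Time Spy.
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

struct Queues
{
    ComPtr<ID3D12CommandQueue> graphics;
    ComPtr<ID3D12CommandQueue> compute;
};

Queues CreateQueues(ID3D12Device* device)
{
    Queues q;

    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;      // graphics + compute + copy
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&q.graphics));

    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute + copy only
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&q.compute));

    // Work submitted to q.compute *may* overlap with work on q.graphics;
    // whether it does, and whether that helps, depends on the GPU and driver.
    // The two queues are synchronized with ID3D12Fence objects
    // (queue->Signal(fence, value) / queue->Wait(fence, value)).
    return q;
}
```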
Too bad, Jelle. Next time, please try to do some research on the topic, and get your facts straight.
The first time we get a non-optimized program, we get whining. Anything other than pro-AMD is ‘non-optimized’.
AMD’s worst enemy is AMD itself.
This is exactly the problem I have with how things are reported. It’s not “not optimized” for AMD, it’s optimized for *both* AMD and nVidia. And if you look at the gains that AMD hardware gets, you’ll see that they are equivalent to what eg DOOM gains from async compute: around 10-15%.
What I want to know: given that both Polaris 10 and Polaris 11 have launched, do they deliver on the original claim that “Polaris delivers 2.5x perf/watt over 28 nm GPUs”?
(Now re-reading that, it could mean any 28nm GPU product they delivered!)
Well, at least we can be sure they don’t deliver 2.5x perf/watt compared to their own previous generation, let alone compared to nVidia’s last 28nm generation (they barely break even with that one). And well, I can’t be bothered to check, but I believe they technically only had one real 28nm generation. They re-branded and re-configured their existing GCN architecture, but there were no significant jumps in perf/watt (except for using HBM, but they don’t use that on Polaris, so comparing those cards would actually work against them).
For the Polaris 10 at least, not even close. There were also some more specific claims such as “2.8x performance/Watt” when comparing the RX 470 to the old R9 270X. Unfortunately, “Watts” turned out to be the cards’ rated power, not what they actually consume in the tested games. The gaming consumption of the R9 270X can be much lower (by up to a third) than the rated power, whereas the RX 470 often consumes a bit more than the rated power in that slide (120 W instead of 110 W). So, the implied energy efficiency was about 50 % better than what was actually realized. In Techpowerup’s tests the situation was worse than this, with the RX 470 having only about 1.5 times the energy efficiency of the R9 270X (based on the results of the RX 470 and the RX 480 reviews; the RX 470 was not directly compared to the R9 270X).
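To make that gap concrete, here is a rough worked example. The 180 W rated power for the R9 270X and an actual gaming consumption of roughly 135 W (a quarter below its rating) are my own assumptions for illustration; the 110 W/120 W figures for the RX 470 come from the paragraph above:

$$\underbrace{\frac{\mathrm{perf}_{470}/110\,\mathrm{W}}{\mathrm{perf}_{270X}/180\,\mathrm{W}}}_{\text{claimed}\,\approx\,2.8} \;\longrightarrow\; \underbrace{\frac{\mathrm{perf}_{470}/120\,\mathrm{W}}{\mathrm{perf}_{270X}/135\,\mathrm{W}}}_{\text{realized}} \approx 2.8 \times \frac{110}{120} \times \frac{135}{180} \approx 1.9$$

With those numbers the claimed 2.8x is roughly 1.45x (about 50%) higher than what the actual consumption supports, and an even lower real-world consumption for the R9 270X pushes the realized figure down further, towards the roughly 1.5x that Techpowerup measured.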
Of course, the slide mentioned that the “perf/Watt” was based on performance and “Boardpower”, so one might argue that it wasn’t a lie, just a meaningless metric (something like performance measured in games, power consumption in Furmark).