AMD fanboys posing as developers

The other day I found that another blog was linking back to one of my blog posts. Since I did not agree with the point its author was trying to make by referring to my blog, I decided to comment. The discussion that ensued was rather sad.

This guy pretends to be knowledgeable about GPUs and APIs, and pretends to give a neutral, objective view of APIs… But when push comes to shove, it’s all GCN-this, GCN-that. He makes completely outrageous claims, without any kind of facts or technical arguments to back them up.

I mean, firstly, he tries to use the C64 as an example of why being able to hack hardware at a low level is good… But it is not a good example. Even though we know a lot more about the hardware now than we did at its release in 1982, the hardware is still very limited, and no match even for the Amiga hardware from 1985. Hacking only gets you so far.

He also tries to claim that GCN is relevant, and that the consoles were a huge win for AMD. But they weren’t. On the PC, nVidia still outsells AMD by about 2:1. Only about 12-17% of the installed base of DX11-capable GPUs is GCN-based.

He also makes claims of “orders of magnitude” of performance gains from knowing the architecture. Absolute nonsense! Yes, intricate knowledge of the performance characteristics of a given architecture can gain you some performance… But orders of magnitude? Not in this day and age.

As I said in the comments there: It doesn’t make sense to design a GPU that suddenly has completely different performance characteristics from everything that went before. That would also mean that all legacy applications would be unoptimized for this architecture. A PC is not a console, and should not be treated as such. A PC is a device that speaks x86 and D3D/OGL, and CPUs and GPUs should be designed to handle x86 and D3D/OGL code as efficiently as possible.

Because this is how things work in practice, ‘vanilla’ code will generally run very well out-of-the-box on all GPUs. You could gain something here and there by tweaking, but generally that would be on the order of 0-20%, certainly not ‘orders of magnitude’. In most cases, PC software has just a single x86 codepath and just a single D3D/OGL path (or at least, it has a number of detail levels, but only one implementation of each, rather than per-GPU optimized variations). Per-GPU optimizations are generally left to the IHVs, who can apply application-specific optimizations (such as shader replacements) in their drivers, at a level that the application programmer has no control over anyway.

It’s just sad that so many developers spread nonsense about AMD/GCN/Mantle these days. Don’t let them fool you: GCN and Mantle are niche products. And Mantle doesn’t even gain all that much on most systems, as discussed earlier. So if a fully Mantle-optimized game is only a few percent faster than a regular D3D one (where we know AMD’s D3D drivers aren’t as good as nVidia’s D3D drivers anyway), then ‘orders of magnitude’ of gains from GCN-specific optimizations is certainly a bit of a stretch. Especially since virtually all of the gains with Mantle come from the CPU side; they do not come from the GPU running more efficient shaders.


15 Responses to AMD fanboys posing as developers

  1. Thomas Bruckschlegel says:

    “GPUs should be designed to handle x86 and D3D/OGL code as efficiently as possible”

    Considering the age of those APIs and the amount of work it takes to build “good” drivers that work well with current GPUs, sometimes it’s better to throw away old code and concepts and make a fresh start. GPUs should not be crippled by outdated APIs. The same goes for other HW like SSDs.

    • Scali says:

      Considering the age of those APIs and the amount of work it takes to build “good” drivers that work well with current GPUs, sometimes it’s better to throw away old code and concepts and make a fresh start.

      As I’ve covered on my blog elsewhere… OpenGL and D3D are vastly different when it comes to this.
      I agree that the APIs should be kept up-to-date with the hardware, but D3D has been doing that since the beginning (and DX12 is just around the corner now, as you know).
      For OpenGL this is more of a problem: the API has only been extended over the years, never redesigned… However, in practice the performance difference between the outdated, single-threaded OpenGL and the multithreaded, reasonably up-to-date D3D11 is not that big, and even with D3D12 I don’t expect huge differences, except on very low-end CPUs combined with very high-end GPUs.

      D3D’s shared runtime and precompiled shader support also make it much easier to write good drivers.
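
      As an aside, this is roughly what the ‘multithreaded’ part of D3D11 looks like in practice: a minimal sketch of recording commands on a deferred context and replaying them on the immediate context. The device, render-target view and immediate context are assumed to exist already; the names are just for illustration.

      ```cpp
      #include <d3d11.h>

      // Record commands on a worker thread via a deferred context, then hand the
      // finished command list back for playback on the immediate context.
      // 'device' and 'rtv' are assumed to have been created elsewhere.
      ID3D11CommandList* RecordClearOnWorkerThread(ID3D11Device* device,
                                                   ID3D11RenderTargetView* rtv)
      {
          ID3D11DeviceContext* deferred = nullptr;
          if (FAILED(device->CreateDeferredContext(0, &deferred)))
              return nullptr;

          const float red[4] = { 1.0f, 0.0f, 0.0f, 1.0f };
          deferred->ClearRenderTargetView(rtv, red);    // any number of state changes/draws could follow

          ID3D11CommandList* cmdList = nullptr;
          deferred->FinishCommandList(FALSE, &cmdList); // bake the recorded commands
          deferred->Release();
          return cmdList;
      }

      // Later, on the rendering thread:
      //   immediateContext->ExecuteCommandList(cmdList, FALSE);
      //   cmdList->Release();
      ```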

      GPUs should not be crippled by outdated APIs.

      Not in an ideal world, no… But it’s not very practical to rewrite all existing software with new APIs every time a new GPU is launched. There’s always legacy software. Nobody will buy a GPU that is no good at running existing code.
      I’m just not sure why people aren’t getting this in an x86 world. I don’t see people like you complaining that x86 CPUs are still using the horribly outdated instruction set from 1978, rather than something radically redesigned and modernized.
      D3D strikes a good balance between being updated regularly and redesigned to fit new hardware at an early stage (developed together with the IHVs, often before the hardware is on the market), and being a stable platform that is easy for IHVs to support.

    • Klimax says:

      What do you consider an old API? BTW: the old fixed-function pipeline has already been translated into shaders since DirectX 9.
      There is no point in change for its own sake, because we won’t gain anything and we’d lose quite a few things. And you definitely won’t gain anything on the driver side, because no matter what you do, drivers will have to do a lot of work. That’s the nature of the field…

      And DX11 is more or less the end of the road for programmability. (It covers DirectCompute, which, like CUDA and OpenCL, allows full code to execute.) The primary limitation of GPUs is in branching and some non-parallelizable code. (Their cores are still very simple in nature, and their size comes from massive replication of those cores.)

      You can get some gains by adding features similar to tessellation, but that is all, until we can move to old-but-new technology like raytracing and such.

      So no, an ‘old’ API no longer limits or cripples anything, because we have already reached most of the goals. There is currently nothing in the near future that we would gain from a full restart.

      • Thomas Bruckschlegel says:

        I consider OpenGL pretty old and full of legacy code. This becomes pretty clear when you look at the OpenGL drivers of the big 3 IHVs.
        DX11 is better, but still a bit “old”.
        None of those APIs is capable of using multiple GPUs natively: for example, multi-GPU rendering algorithms, or using one GPU for rendering and one for compute work.

      • Scali says:

        None of those APIs is capable of using multiple GPUs natively: for example, multi-GPU rendering algorithms, or using one GPU for rendering and one for compute work.

        Uhh, yes they are.
        Direct3D especially is quite useful for that. The code I write for my company is very much multi-threaded and multi-GPU. I’ve even written a blog post on a problem I ran into a while ago: https://scalibq.wordpress.com/2012/05/25/the-story-of-multi-monitor-rendering/

      • Thomas Bruckschlegel says:

        I think you got me wrong. In DX11 & OpenGL you cannot assign workloads to different GPUs connected to a single screen. For compute there is only some very limited interop functionality between OpenCL and the rendering APIs.

      • Scali says:

        I don’t get you wrong; you ARE wrong. You don’t seem to know how the APIs work.
        In D3D you can create a device (with its own device context) for each GPU, which allows you to assign workloads specifically to each GPU. You can also share buffers between devices, and each GPU is capable of outputting to each screen, so multiple GPUs rendering to a single screen is certainly possible.
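
        To make that concrete, here is a minimal sketch (the function and names are mine, purely for illustration) of enumerating the adapters through DXGI and creating a separate D3D11 device, with its own immediate context, for each physical GPU; workloads can then be assigned per device.

        ```cpp
        #include <d3d11.h>
        #include <dxgi.h>
        #include <vector>

        struct GpuDevice
        {
            ID3D11Device*        device;   // one device per physical adapter (GPU)
            ID3D11DeviceContext* context;  // its immediate context
        };

        // Enumerate all adapters and create a D3D11 device for each of them.
        std::vector<GpuDevice> CreateDevicePerAdapter()
        {
            std::vector<GpuDevice> gpus;

            IDXGIFactory1* factory = nullptr;
            if (FAILED(CreateDXGIFactory1(__uuidof(IDXGIFactory1), (void**)&factory)))
                return gpus;

            IDXGIAdapter1* adapter = nullptr;
            for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
            {
                GpuDevice gpu = {};
                // With an explicit adapter, the driver type must be UNKNOWN.
                HRESULT hr = D3D11CreateDevice(adapter, D3D_DRIVER_TYPE_UNKNOWN, nullptr, 0,
                                               nullptr, 0, D3D11_SDK_VERSION,
                                               &gpu.device, nullptr, &gpu.context);
                if (SUCCEEDED(hr))
                    gpus.push_back(gpu);   // workloads can now be assigned per device
                adapter->Release();
            }
            factory->Release();
            return gpus;
        }
        ```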

      • Thomas Bruckschlegel says:

        Not efficiently. You cannot share resources directly; you can copy a resource via the CPU to each GPU (http://msdn.microsoft.com/en-us/library/windows/desktop/ff476259(v=vs.85).aspx), but not directly between GPU1 and GPU2. This kills any practical usage, and that’s one reason why there is not a single game out there that has native multi-GPU rendering without relying on driver profiles and driver tricks (CrossFire & SLI settings).

      • Scali says:

        As I say, you don’t seem to know how the APIs work. You’re looking at the wrong stuff. This is what you should be looking at: http://msdn.microsoft.com/en-us/library/windows/desktop/ee913554(v=vs.85).aspx
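
        The kind of sharing being discussed here boils down to something like this minimal sketch (deviceA and deviceB are assumed to be two already-created D3D11 devices; how the data actually moves between the GPUs is left to the driver, as comes up below):

        ```cpp
        #include <d3d11.h>

        // Create a shareable texture on deviceA and open the same surface on deviceB.
        // On success, *texA and *texB are two views of the same underlying surface.
        HRESULT ShareTexture(ID3D11Device* deviceA, ID3D11Device* deviceB,
                             ID3D11Texture2D** texA, ID3D11Texture2D** texB)
        {
            D3D11_TEXTURE2D_DESC desc = {};
            desc.Width            = 1024;
            desc.Height           = 1024;
            desc.MipLevels        = 1;
            desc.ArraySize        = 1;
            desc.Format           = DXGI_FORMAT_B8G8R8A8_UNORM;
            desc.SampleDesc.Count = 1;
            desc.Usage            = D3D11_USAGE_DEFAULT;
            desc.BindFlags        = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
            desc.MiscFlags        = D3D11_RESOURCE_MISC_SHARED;   // make the surface shareable

            HRESULT hr = deviceA->CreateTexture2D(&desc, nullptr, texA);
            if (FAILED(hr))
                return hr;

            // Ask DXGI for a handle that identifies the surface outside deviceA.
            IDXGIResource* dxgiRes = nullptr;
            (*texA)->QueryInterface(__uuidof(IDXGIResource), (void**)&dxgiRes);
            HANDLE sharedHandle = nullptr;
            dxgiRes->GetSharedHandle(&sharedHandle);
            dxgiRes->Release();

            // Open the same surface on the second device.
            return deviceB->OpenSharedResource(sharedHandle, __uuidof(ID3D11Texture2D), (void**)texB);
        }
        ```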

      • Thomas Bruckschlegel says:

        Interesting, I did not know that API. Does it perform well across multiple physical devices? Will it utilize the GPU’s DMA if supported, or is it just a CPU-driven solution?

      • Klimax says:

        @Thomas Bruckschlegel
        “Interesting, I did not know that API. Does it perform well across multiple physical devices? Will it utilize the GPU’s DMA if supported, or is it just a CPU-driven solution?”
        That would be up to the drivers, but based on the driver documentation, DMA should be used wherever possible:
        http://msdn.microsoft.com/en-us/library/windows/hardware/ff570591(v=vs.85).aspx

      • Scali says:

        Yup, it is core functionality in Windows 7 and newer: this is also used when you drag a window from one screen to the next, where each screen is connected to a different GPU.
        This has always worked in Windows, but there was a huge improvement in performance from Windows XP to Vista and newer. Vista still had the limitation that you could only use a single driver, though, so your GPUs had to be from the same vendor at the very least (and not one running on some kind of ‘legacy’ driver). Back then there probably was a proprietary solution for it.
        In Windows 7 there is a vendor-agnostic interface in the driver model (which is also used for things like nVidia’s Optimus), so any two GPUs can generally share data efficiently. I’ve had systems with an onboard Intel GPU and a discrete card; rendering on the discrete GPU and then dragging the window onto a screen driven by the Intel GPU was very efficient.

  2. lordmarcus says:

    “A PC is a device that speaks x86 and D3D/OGL, and CPUs and GPUs should be designed to handle x86 and D3D/OGL code as efficiently as possible.”

    My question is: is it possible to get into the “assembly language” of a GPU? This always eludes me from a technical point of view, and Google is not helping.

    AFAIK we always need “fine-tuned drivers” before you can tap into a GPU’s full potential, as the generic VGA drivers included in the OS won’t be enough. Those drivers implement both the D3D and OGL APIs, and there is a severe penalty if the driver (made by Intel, AMD, Nvidia) is not as good as it should be.

    Why do we even need drivers at all? Can’t we see GPUs as just co-processors or extensions to the x86 instruction set? Am I getting something wrong here?

    • Scali says:

      As far as I know, there are no tools to write code directly in a GPU’s native assembly language. Not even low-level solutions such as CUDA and Mantle go that far.
      We need drivers because not all GPUs speak the same language, just as not all CPUs are x86-compatible. It’s just that over time, x86 has become so dominant that other CPUs aren’t relevant anymore, at least not in certain markets. The instruction set has been ‘frozen in time’.
      x86 still has a ‘driver’ in a way: modern x86 CPUs have a post-RISC backend, and translate x86 instructions to their native format.
      With GPUs, this is handled in software, which allows for more flexibility and cheaper GPU designs.
      Especially in the early days of GPUs this was very important, because each new generation would have a very different architecture from what went before it. Because of drivers, they could still run existing software without a problem.

      These days, some people claim that GPUs have reached their final form, so the instruction set can be ‘frozen in time’ like with x86. But I personally don’t think this is the case. I think a well-designed driver system has very little overhead, and the added flexibility and compatibility are of greater value than the few percent of extra performance you might gain.
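
      As a small illustration of where the hand-off to the driver happens: the closest you normally get to ‘GPU assembly’ on the PC is the intermediate shader bytecode. Here is a minimal, standalone sketch (not tied to anything above) that compiles a trivial HLSL pixel shader and prints the portable shader-model assembly; the translation from that bytecode to the GPU’s actual native instruction set happens inside the driver.

      ```cpp
      #include <d3dcompiler.h>
      #include <cstdio>
      #include <cstring>
      #pragma comment(lib, "d3dcompiler.lib")

      int main()
      {
          // A trivial pixel shader: output solid red.
          const char* src = "float4 main() : SV_Target { return float4(1, 0, 0, 1); }";

          // Compile HLSL to D3D bytecode (the format the runtime hands to drivers).
          ID3DBlob* code = nullptr;
          ID3DBlob* errors = nullptr;
          HRESULT hr = D3DCompile(src, strlen(src), nullptr, nullptr, nullptr,
                                  "main", "ps_5_0", 0, 0, &code, &errors);
          if (FAILED(hr))
          {
              if (errors) printf("%s\n", (const char*)errors->GetBufferPointer());
              return 1;
          }

          // Disassemble: this prints shader-model assembly (dcl_output, mov, ...),
          // not the native instruction set of any particular GPU.
          ID3DBlob* listing = nullptr;
          if (SUCCEEDED(D3DDisassemble(code->GetBufferPointer(), code->GetBufferSize(),
                                       0, nullptr, &listing)))
          {
              printf("%s\n", (const char*)listing->GetBufferPointer());
              listing->Release();
          }
          code->Release();
          return 0;
      }
      ```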

  3. Pingback: DirectX 12 is out, let’s review | Scali's OpenBlog™
