The myth of HBM

It’s amazing, but AMD has done it again… They have managed to trick their customer base into believing yet another bit of nonsense about AMD’s hardware.

This time it is about HBM. As we all know, AMD has traded in GDDR5 memory for HBM on their high-end cards, delivering more bandwidth. The downside is that, with the current state of the technology, it is not feasible to put more than 4 GB of HBM on a card. This, while even AMD’s own GDDR5 cards already have 8 GB on board, and the competing nVidia cards are available with 6 or even 12 GB of memory.

So far, so good. Now, the problem is that AMD somehow brainwashed their followers into believing that more bandwidth can compensate for less capacity. So everywhere on the forums you read people arguing that 4 GB is not a problem because it’s HBM, not GDDR5.

The video memory on a video card acts mostly as a texture and geometry cache. For the GPU to reach its expected level of performance, it needs to be able to access its textures, geometry and other data from the high-speed memory on the video card, rather than from the much slower system memory.

As long as your data fits inside video memory, bandwidth determines your performance. However, as soon as you run into the capacity limit of your video memory, you need to start paging in data from system memory. And since system memory is generally an order of magnitude slower than video memory, the speed of the video memory becomes irrelevant at that point: the rate at which data can be brought into video memory is bound entirely by the speed of system memory and the bus it travels over.
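
To put rough numbers on that, here is a back-of-envelope sketch (the bandwidth figures are ballpark spec-sheet values, not measurements):

```python
# Back-of-envelope: time to move 1 GiB of texture data.
# Bandwidth figures are ballpark spec values, not measurements.
GIB = 1024 ** 3

sources = {
    "HBM (Fury X)":         512e9,  # ~512 GB/s
    "GDDR5 (980 Ti)":       336e9,  # ~336 GB/s
    "system RAM over PCIe":  16e9,  # PCIe 3.0 x16, ~16 GB/s at best
}

for name, bandwidth in sources.items():
    print(f"{name:22s} 1 GiB in {GIB / bandwidth * 1000:6.2f} ms")

# HBM (Fury X)           1 GiB in   2.10 ms
# GDDR5 (980 Ti)         1 GiB in   3.20 ms
# system RAM over PCIe   1 GiB in  67.11 ms
```

The moment data has to come in over the PCIe bus, both cards are stuck waiting on the same ~67 ms, and it makes no difference whatsoever what kind of memory sits on the other end.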

So HBM in no way makes paging data in and out of system memory faster than any other memory technology would. The only performance problem we’re dealing with here, therefore, is the point at which paging becomes necessary, and that point depends solely on capacity. A card with 4 GB will hit it sooner than a card with 6, 8 or 12 GB. And once that point is hit, performance will become erratic, because your game will have to page textures in periodically, resulting in frame drops. That should be pretty easy to understand for anyone who bothers to think it through for a few moments.

The one thing you can say is that because the initial performance is higher, the card can ‘absorb’ a frame drop slightly better, at least if you only look at average framerates. If you look at frame times, you’ll still see nasty jitter, and the overall experience will be far from smooth: you will experience a stutter every time the system has to wait for a new texture or other data to be loaded.
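
A toy example, with invented numbers, shows how an average framerate can hide exactly this kind of stutter:

```python
# Toy example: roughly one second of gameplay in which 3 frames
# stall on a texture upload. All numbers are invented.
frame_times_ms = [10.0] * 57 + [70.0] * 3

avg_fps = 1000 * len(frame_times_ms) / sum(frame_times_ms)
print(f"average:     {avg_fps:.1f} fps")             # ~76.9 fps, looks healthy
print(f"worst frame: {max(frame_times_ms):.0f} ms")  # a 70 ms hitch, clearly visible
```

On paper that still reads as a healthy ‘77 fps average’; on screen it is several visible hitches every second.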


22 Responses to The myth of HBM

  1. Luciano says:

    I recall reading somewhere that even nVidia is going to use HBM in their Pascal microarchitecture. Do you think their 1000 series will also be capped at 4 gigs of vram?

    • Scali says:

      I don’t think so. nVidia will be using HBM2, which should be a more mature technology. I think by then 6 GB or perhaps even 8 GB should be possible.

      • k1net1cs says:

        IIRC the 4GB limitation on HBM that AMD currently uses is due to its prohibitive cost.
        It isn’t a technical limitation per se (I think the spec suggests 32GB is the max), but going over 4GB wouldn’t be cost-effective.

        That said, yeah, the bandwidth is irrelevant when games start to hit that 4GB ceiling.
        I still recall the days when AMD fanboys were laughing because nVidia’s top end was still using 4GB while AMD’s had 6GB, with the argument of ‘HD gaming needs extra VRAM’…which is true.
        Now that the situation is reversed (AMD having the lower amount), they’re still laughing, only now because nVidia has the ‘slower’ one.

        *sigh*

        Pretty sure at some point in the near future those people will argue that AMD’s HBM is like Mantle: it sets a precedent for something better.
        At least they’d like to believe so.

      • Scali says:

        Well, that’s what I meant by ‘feasible’. I didn’t say it was technically impossible, it just wouldn’t be very cost-effective with the current state of the technology.

        And yes, HBM is a joint project of AMD, Hynix, UMC and Amkor/ASE. AMD is already taking all the credit.

  2. Jim says:

    Scali, great post. NVDA was wise to wait for HBM2, which is a much better solution overall and will also have Samsung making it. AMD’s HBM hype on Fury X was clearly driven by marketing and, as you say, AMD tried to do a snow job on the unsuspecting. Fury X clearly did not live up to the hype. Neither did Nano, as it was grossly mispriced. Nvidia must be very happy with AMD stumbling.

    Keep up the great work.

    • Scali says:

      Well, credit where credit’s due: someone had to be the first to take the gamble on a new memory technology, and AMD was.
      I simply don’t agree with how AMD (mainly Richard Huddy) spreads lies to try and cover up the 4 GB limitation they currently have.

    • semitope says:

      nvidia did not choose to wait for HBM2. They were in no position to use HBM1.

      • Scali says:

        Not literally, no, but they could have tried to make their own alternative technology. Intel did exactly that with Hybrid Memory Cube: http://blogs.intel.com/intellabs/2011/09/15/hmc/

      • semitope says:

        I doubt nvidia is actually capable of that. The only reason nvidia is even comparable to AMD is AMD’s failure on CPUs. In terms of technical competence they are not in the same league. Nvidia is still just a GPU company; AMD is a fallen tech giant.

        Intel did that for high-end CPUs, servers etc., iirc. Intel still supports HBM. They will use whatever benefits their consumer CPUs, or god forbid they start making discrete GPUs.

      • Scali says:

        Why wouldn’t nVidia be capable of it, when AMD is? nVidia is a larger and richer company than AMD is at this point. Also, to call nVidia ‘just a GPU company’ is rather clueless. nVidia also makes SoCs. Besides, ‘just GPUs’? The complexity of current GPUs has arguably surpassed that of CPUs.

  3. qwerty says:

    AMD’s Joe Macri claimed (at the time of Fury’s release) that AMD had recently developed a revolutionary on-die delta color compression method that would apparently negate HBM’s VRAM limitation.

    • Scali says:

      Just more lies from AMD. Firstly, this color compression was already part of GCN 1.2, even on the GDDR5-based models. Secondly, it is performed by the ROPs, and has no effect on the data that has to be transferred between system memory and video memory. Lastly, nVidia has introduced a next-generation delta compression scheme on Maxwell as well, so AMD probably will not have much of an advantage over the competition, if any.
      “Negating HBM’s VRAM limitation”? Hardly.

    • mm says:

      I don’t think that Macri said that. He did claim that 4 GB would be enough but the part about compression being the solution seems to have been added by other people and does not appear in actual quotes. Indeed, the purpose of the framebuffer compression is probably related to bandwidth (more important for Tonga than for Fiji), not space, since compressibility is not guaranteed in lossless compression.
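
      A toy delta encoder makes the point (an illustration only, not GCN’s actual DCC implementation): a smooth gradient tile compresses to almost nothing, while a noise tile does not compress at all:

      ```python
      # Toy delta compression: savings depend entirely on the data, so a
      # lossless scheme can never guarantee a smaller memory footprint.
      # (Illustration only; not GCN's actual implementation.)
      import random
      import zlib

      def delta_encode(pixels):
          # store each pixel as the difference from its left neighbour
          return bytes((pixels[i] - pixels[i - 1]) % 256 for i in range(1, len(pixels)))

      gradient = list(range(256)) * 16                      # smooth tile: constant deltas
      noise = [random.randrange(256) for _ in range(4096)]  # worst case: random deltas

      for name, tile in (("gradient", gradient), ("noise", noise)):
          packed = zlib.compress(delta_encode(tile))
          print(f"{name:8s} {len(tile)} pixels -> {len(packed)} bytes")
      ```

      The gradient shrinks to a few dozen bytes, while the noise does not shrink at all; that is why such schemes can save bandwidth on average, but memory still has to be allocated for the uncompressed worst case.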

  4. semitope says:

    More AMD hate. Even though the Fury X with its 4GB of HBM is strong competition, often surpassing the 980 Ti, you still want to bash it because AMD!!! Instead of “OMG they lied”, why don’t you try making a post about why the hell it’s actually acting the way he said it would at higher resolutions? Sheesh.

    Where’s the post about nvidia’s lies at CES? Since this is all objective.

    • Scali says:

      Please pay attention: I am not talking about performance here. I’m explaining why it is simply not true that memory bandwidth can compensate for memory capacity.

      As for “nVidia’s lies at CES”… I have no idea what that is even supposed to mean.

      • semitope says:

        It’s ultimately performance you are talking about. It’s pointless to say people are being tricked into believing 4GB of HBM is enough when it is enough. Are you tricking people into believing the truth? If it were a problem that showed up in performance, there would be a point to your post.

        In the end, all you are doing is interpreting a couple of sentences as being about swapping from system RAM to VRAM. He never said it’s the reason 4GB of HBM would be enough; he said that it would not get in the way of the GPU.

        If you want to look for lies, then check this more technical answer:

        http://arstechnica.com/information-technology/2015/05/the-tech-behind-hbm-why-amds-high-bandwidth-memory-matters/2/

        nvidia lies that seem to fit your kind of posts against AMD:

        claiming pascal was on their PX2
        claiming their setup was 150 (?) times faster than macbook pro

      • Scali says:

        No, it is not performance. Performance is always about a certain context: a certain game, using certain settings, running with a certain driver.
        Sure, you can always make games behave properly with 4 GB, you can even limit texture resolution/quality in the driver (Quack much?).
        But none of that proves in any way that having more bandwidth is a substitute for having more memory capacity (which is an absolute statement, so no context of any kind is required. It doesn’t depend on which game, which settings or which drivers you use).

        Do you also spot the bitter irony that Mantle/DX12/Vulkan actually push memory management out of the driver/API and into the hands of the game engine? This means that AMD is losing control over memory management in the near future.

        Also, I’m not sure why you’re talking to me as if you’re trying to explain things to me.
        I’m not the one with a deficit in knowledge or understanding here. Just because I don’t agree with you/AMD doesn’t mean I don’t understand it. On the contrary: I do understand it, and I understand that what they’re saying is nonsense, and that they can only cheat their way out of this one (which they fully intend to do, no doubt).

    • qwerty says:

      You could put some sort of a rocket booster on a Mazda Miata and make it run nearly as fast as a Bugatti Veyron on select tracks, but that doesn’t put the two vehicles in the same class.

      A Fury X is essentially an overclocked Nano with a water block. When you put water cooling on a 980 Ti and then overclock it as far as it will go, the Ti leaves the Fury X in the dust by a significant margin.

      • semitope says:

        The Nano is an underclocked Fury X; who cares how it’s put.

        The 980 Ti overclocks just the same with air cooling. It’s going to use a ton more power, but everybody knows it overclocks.

  5. Pingback: The damage that AMD marketing does | Scali's OpenBlog™

  6. Ron Peacetree says:

    HBM is no more and no less than a new take on caching. If your working set fits into 4GB (HBM) or 8GB (HBM2), then you will get a huge performance improvement out of HBM compared to previous memory technologies. If your working set does not fit into the big cache of HBM, then the question becomes how cache friendly is your memory access pattern and how effective is the cache management policy of the HBM being used. Like any other form of caching, it is possible to create a data set and memory access pattern that is pessimal for it, but realistically developers will strive to do the exact opposite and wring every bit of performance out of this HW just as they have every other HW improvement that has been invented. https://en.wikipedia.org/wiki/High_Bandwidth_Memory
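
    Back-of-envelope, with hit ratios invented for illustration, this is the classic average-memory-access-time arithmetic: even a few percent of misses that fall through to system memory dominates the effective bandwidth.

    ```python
    # Rough cache model: a 512 GB/s cache (HBM) backed by a ~16 GB/s
    # path to system memory. Hit ratios are invented for illustration.
    vram_bw = 512e9
    pcie_bw = 16e9

    for hit in (1.0, 0.99, 0.95, 0.90):
        t_per_byte = hit / vram_bw + (1 - hit) / pcie_bw  # average time per byte
        print(f"hit ratio {hit:.2f}: ~{1 / t_per_byte / 1e9:5.1f} GB/s effective")

    # hit ratio 1.00: ~512.0 GB/s effective
    # hit ratio 0.99: ~390.8 GB/s effective
    # hit ratio 0.95: ~200.8 GB/s effective
    # hit ratio 0.90: ~124.9 GB/s effective
    ```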

    There is no myth here. But there is also no miracle here.

    • Scali says:

      There is a myth: see the interview with AMD’s Richard Huddy, linked in the article, where he makes the claim that HBM’s extra bandwidth can compensate for the lower capacity. That is obviously a false claim, hence they are creating a myth around HBM.
