At the recent Intel Developer Forum, Intel disclosed new details about the upcoming Haswell architecture. For full coverage, you can turn to the various tech sites, such as AnandTech. As usual, I will just pick out a few things that I think are worth commenting on.
I think the most relevant bit of news is that Intel continues to aim for lower power envelopes. Ever since Nehalem, Intel has been bringing the TDP down while maintaining or even increasing performance. Lynnfield took the TDP from 130W to 95W with virtually the same performance. Sandy Bridge then held on to the 95W TDP while improving performance. Ivy Bridge dropped the TDP to 77W while improving performance slightly once again. And now Haswell is aiming to reduce TDP yet again. In fact, AnandTech suspects that Intel wants to use Haswell derivatives in tablets and similar devices, rather than the Atoms it has used for this purpose so far. Anand even speculates that Intel may drop Atom altogether and use a single architecture for anything from smartphones up to desktops. Not yet with Haswell, but perhaps with its successor (as always, the high-end architectures of today are the low-end architectures of tomorrow). This would mean a head-on attack on ARM.
Yes, I’ve said it: ARM. That seems to be Intel’s target for the future, not AMD. As I discussed earlier, AMD is still pursuing performance more than power consumption and efficiency. Since Haswell clearly focuses on efficiency, AMD will have its work cut out with Steamroller.
Does this mean that Haswell is only about lower power consumption? No, not at all. Intel is also introducing some instruction set extensions, such as the Transactional Synchronization Extensions (TSX), which I already mentioned in the Steamroller article earlier. Intel is also adding the second version of the Advanced Vector Extensions (AVX2), which should improve both floating point and integer performance.
Another thing that I found remarkable is that Intel is adding 2 new execution ports to the architecture, for the first time in years. The Pentium Pro started with 5 ports, Conroe added a 6th, and Haswell will take it up to 8. For years I've been wondering just how much extra instruction-level parallelism (ILP) could be extracted from x86 code. And every time, Intel managed to surprise me by improving its instructions-per-cycle (IPC), which was at least partly the result of better ILP (whereas AMD has been stuck on more or less the same level of IPC since the Athlon64, and actually reduced IPC in its Bulldozer architecture). However, since the introduction of HyperThreading, the question has also lost some of its relevance. Even if the extra execution ports do little for a single thread, there will be more resources to share between two threads, so performance in multithreaded workloads may improve.
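To make that argument a bit more concrete, here is a minimal sketch of a toy scheduling model in Python. It is not a model of Haswell: it assumes every instruction takes one cycle, any port can execute any instruction, and the front-end can always keep the ports fed, none of which holds on real hardware. The chain counts and lengths are made-up numbers, chosen only to show the effect.

```python
def cycles_needed(chain_lengths, ports):
    """Cycles to finish all instructions on a toy machine.

    Each entry in chain_lengths is one dependency chain: its instructions
    must execute one after another, one per cycle. Different chains are
    independent of each other. At most `ports` instructions issue per cycle.
    """
    remaining = list(chain_lengths)
    cycles = 0
    while any(remaining):
        # Chains that still have work left, longest first.
        ready = sorted(
            (i for i, left in enumerate(remaining) if left > 0),
            key=lambda i: -remaining[i],
        )
        # Issue the next instruction of as many chains as there are ports.
        for i in ready[:ports]:
            remaining[i] -= 1
        cycles += 1
    return cycles


# Hypothetical workload: one thread with 4 independent chains of 12 instructions.
one_thread = [12] * 4
# A second, similar thread simply adds 4 more independent chains.
two_threads = one_thread * 2

for ports in (6, 8):
    print(ports, cycles_needed(one_thread, ports), cycles_needed(two_threads, ports))
```

In this toy model the single thread needs 12 cycles with either 6 or 8 ports, because it never has more than 4 instructions ready at once. With two threads there are 8 ready instructions per cycle, so going from 6 to 8 ports cuts the total from 16 cycles to 12.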
Intel has also steadily increased its out-of-order execution (OoO) instruction window over the years, and in Haswell it is growing yet again. As with the extra execution ports, I am not sure how much performance this will add for a single thread, if any, but again the larger OoO window will clearly benefit HyperThreading scenarios, since the window is partitioned equally between the two threads. In Haswell the OoO window will be so large that each logical core will have roughly the same number of instructions in flight as a single core in the Conroe architecture.
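The arithmetic behind that comparison can be sketched as follows. The numbers are an assumption on my part: they are the reorder buffer sizes as reported in public coverage of these architectures, and I am treating the reorder buffer as "the OoO window" and the split between threads as a plain division by two.

```python
# Reorder buffer entries per core, as reported in public coverage.
# These figures are assumptions for illustration, not taken from Intel's slides.
ROB_ENTRIES = {
    "Conroe": 96,
    "Nehalem": 128,
    "Sandy Bridge": 168,
    "Haswell": 192,
}


def per_thread_window(rob_entries, threads=2):
    """Entries each thread gets when the window is split equally."""
    return rob_entries // threads


haswell_share = per_thread_window(ROB_ENTRIES["Haswell"])
print("Haswell, per logical core:", haswell_share)
print("Conroe, whole core:", ROB_ENTRIES["Conroe"])
```

Under these assumptions, half of Haswell's window is the same size as Conroe's entire window.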
At any rate, it looks like Haswell has some interesting improvements in store. Lower power consumption and higher performance (by the looks of it, especially in multithreaded scenarios) is a great combination.