BenchPress
Senior member
- Nov 8, 2011
"That's the keyword. In practice we'll get nowhere near that, or do you think otherwise?"
I do. Haswell doesn't just add FMA support, but will also double the bandwidth, add gather support (replacing 18 legacy instructions with 1), add 256-bit integer support (replacing 3 instructions with 1), and add a whole bunch of other useful instructions. So in practice the effective throughput for parallel algorithms should easily double.
Between Nehalem and Haswell there will be a fourfold increase in peak throughput, but Sandy Bridge (which doubled the peak floating-point throughput) was severely held back by low bandwidth and a lack of many other 256-bit operations. So even though there won't be a fourfold increase in performance over Nehalem in practice, it's easy to see that since Haswell solves all of Sandy Bridge's shortcomings, it has lots of headroom to at least double the throughput in practice.
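The fourfold peak figure can be sanity-checked with some back-of-envelope arithmetic. The sketch below is only that: the vector widths and the two-units-per-core counts are assumptions about each microarchitecture that the post does not spell out, and `peak_flops_per_cycle` is a made-up helper name.

```python
# Rough peak single-precision FLOPs per cycle per core.
# Assumed figures (not stated in the post):
#   Nehalem:      128-bit SSE, one multiply unit + one add unit
#   Sandy Bridge: 256-bit AVX, one multiply unit + one add unit
#   Haswell:      256-bit AVX2, two FMA units (each FMA = multiply + add)

def peak_flops_per_cycle(vector_bits, units, flops_per_lane_op, element_bits=32):
    """Lanes per vector * vector units * FLOPs each unit does per lane."""
    lanes = vector_bits // element_bits
    return lanes * units * flops_per_lane_op

nehalem = peak_flops_per_cycle(128, units=2, flops_per_lane_op=1)
sandy_bridge = peak_flops_per_cycle(256, units=2, flops_per_lane_op=1)
haswell = peak_flops_per_cycle(256, units=2, flops_per_lane_op=2)

print("Nehalem:     ", nehalem)
print("Sandy Bridge:", sandy_bridge, "=", sandy_bridge / nehalem, "x Nehalem")
print("Haswell:     ", haswell, "=", haswell / nehalem, "x Nehalem")
```

These are peak numbers only. They say nothing about whether loads, stores, and gathers can keep the units fed, which is exactly the gap between peak and practice argued over above.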