Those implementations seem to be trivial to implement in BD. Power efficiency is going to be quite a bit in TSMC's and GlobalFoundries' court, yes? Of course AMD can greatly help by de-Pentium4ing their design and targeting more reasonable clocks. Remember that die shot that looked quite nice in terms of improving upon Bulldozer/Piledriver? Even more likely that was a peek at Excavator; it seemed to have dual or quad 256-bit FPUs,
which seem to have quite a versatile pipeline, but unfortunately
for such a design to materialize they need a smaller node than
28nm. I guess they are badly slowed by GF's inability, or possibly
unwillingness, to speed up their node transitions.