First of all, "fat" core is a reference to the design: it's a 4-way out-of-order processor with SMT. Each Gulftown/Nehalem/SB core is around 50% larger than each K10 core, and the gap is even bigger against each BD core (note: I'm talking about non-cache die area, the parts that do the actual computing: the integer/FP units with their front ends and back ends). This is the way Intel and AMD designed their CPUs. AMD went the "narrow" route and Intel went the "fat" core route. The latter means more IPC but also more complexity and less clock headroom. Intel combats the clocking issues with its superior process technology.
AMD's approach can now be considered a "narrow" 2-way integer core with no SMT. It can clock rather high, but since AMD depends on GloFo's crappy process performance they are stuck for now. The PD core will address this with some novel ideas and solutions to overcome the power/clock wall the current BD iteration has.
So we have this fat, very IPC-strong core (32nm, 2nd iteration of Nehalem) that is 15% faster in single-thread workloads and 16.4% faster in MT workloads than a 4M/8T Bulldozer chip, both at stock. Die size is Intel's advantage since they have always had better cache density than AMD (by a large factor). Since cache makes up most of the Bulldozer die, it's natural that it's 315mm^2 and yes, it's larger than Westmere. If AMD had access to Intel's process technology, do you all think this would be the case? No, of course not. It would be roughly at parity between the two.
On to the question of why I use Westmere. It's simple: it's a 12T chip and it's STILL faster than 8T IB/SB in MT workloads. Yet this 12T chip is only 16.4% faster than the FX8150 at a measly 3.6GHz (a slim, high-clocking design, not an IPC monster). Who is doing better now? AMD is doing awesome in real workloads with their measly-clocked FX chip against Westmere.
Lastly, how much faster is SB or IB than Westmere in ST workloads anyway? Let's find out:
i7 980X (ST Turbo @ 3.6GHz) - 9:32 or 572s
i7 2600K (ST Turbo @ 3.8GHz) - 8:11 or 491s
The 2600K takes 491/572 ~= 0.86 of the time, so it's about 14% faster than the 980X while clocking 5.5% higher. So essentially a 10% difference at similar clock. Not that much, is it? A "super duper" 3rd gen SB core and all you get is 10% more IPC. It's not small either, but this chip is not a Westmere crusher, not even close.
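The per-clock comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic in this post: the function and variable names are my own, and it assumes run time scales inversely with clock speed, which is a simplification.

```python
def per_clock_advantage(time_a_s, clock_a_ghz, time_b_s, clock_b_ghz):
    """Return (raw_speedup, per_clock_speedup) of chip B over chip A.

    Assumes run time scales inversely with clock speed (a simplification).
    """
    raw_speedup = time_a_s / time_b_s        # >1 means B finishes sooner
    clock_ratio = clock_b_ghz / clock_a_ghz  # how much higher B clocks
    return raw_speedup, raw_speedup / clock_ratio


# Times and single-thread Turbo clocks quoted above.
i7_980x_s = 9 * 60 + 32   # 9:32 -> 572 s
i7_2600k_s = 8 * 60 + 11  # 8:11 -> 491 s

raw, per_clock = per_clock_advantage(i7_980x_s, 3.6, i7_2600k_s, 3.8)
print(f"time ratio:        {i7_2600k_s / i7_980x_s:.2f}")
print(f"raw speedup:       {raw:.3f}")
print(f"per-clock speedup: {per_clock:.3f}")
```

With these inputs the per-clock speedup lands at roughly 1.10, in line with the ~10% figure above.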
To illustrate how small the difference in single-thread performance is, let's assume the numbers from THG's Trinity review are replicated in Vishera (I don't see why they wouldn't be; Vishera can only be faster due to L3 and some other core changes vs Trinity). Vishera will run at a 4GHz starting clock and we can assume it will have at least SOME single-thread Turbo uplift too. 4.5GHz for Turbo is reasonable, although I think we may see 4.6GHz for single-core Turbo. The THG article shows a ~10-15% IPC uplift in 3 different workloads (iTunes, LAME and 3D Studio Max). The average is therefore 12.5%. All summed up: 4.5GHz x 1.125 / 4.2GHz (the FX8150's Turbo clock) ~= 1.2, or 20%. In the THG chart this translates into:
FX8350 @ 4.5GHz Turbo: FX8150 score x 0.8 <=> 673 x 0.8 = 538s. Let's round it to 540s.
SB 2600K - 491s
FX8350 - 540s
The fat IPC monster core with a solid Turbo clock boost (close to 4GHz) comes in at 491/540 = 0.91, or 9% faster than the slim IPC weakling with a high Turbo clock. Pure integer/FP core sizes are just not comparable at the same node. Each Intel core takes 40-50% more die area than each tiny Bulldozer integer core (plus its half of the FlexFP unit). It's amazing AMD can compete and they deserve congrats. Smart engineering.
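The Vishera estimate above boils down to the following arithmetic, sketched here in Python. Every input is an assumption carried over from this post (the 4.5GHz Turbo, the 12.5% IPC uplift, the 4.2GHz FX8150 Turbo), not a measured result, and the variable names are mine. The post turns "20% faster" into 0.8x the run time; dividing the time by the speedup is the stricter conversion and gives a somewhat higher time, so the sketch prints both.

```python
# Assumptions taken from the post, not measured results.
fx8150_time_s = 673       # FX8150 score in the THG chart
fx8150_turbo_ghz = 4.2    # FX8150 Turbo clock used in the post
vishera_turbo_ghz = 4.5   # guessed single-core Turbo for Vishera
ipc_uplift = 1.125        # 12.5% IPC uplift assumed in the post
sb_2600k_time_s = 491     # 2600K time quoted above

# Projected single-thread speedup of Vishera over FX8150.
speedup = vishera_turbo_ghz * ipc_uplift / fx8150_turbo_ghz

# The post's shortcut: treat a ~20% speedup as 0.8x the time.
time_shortcut_s = fx8150_time_s * 0.8
# Stricter conversion: time scales as 1 / speedup.
time_strict_s = fx8150_time_s / speedup

print(f"projected speedup:        {speedup:.3f}")
print(f"time (0.8x shortcut):     {time_shortcut_s:.0f} s")
print(f"time (divide by speedup): {time_strict_s:.0f} s")
print(f"2600K / FX8350 (540 s):   {sb_2600k_time_s / 540:.2f}")
```

Swapping in a different Turbo guess (say 4.6GHz) or a different IPC uplift only requires changing the two assumed inputs at the top.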
And remember, these are NOT the workloads AMD targeted with their cores in the first place. These client workloads are actually not a priority for AMD.