One thing that blows me away when I see the pic of Trinity, or Llano for that matter, is just how little real estate is devoted to the core logic itself, and yet it is the performance characteristics of that core logic that almost entirely determine the selling price of the entire IC.
I mean just look at the ratio of CPU logic (difficult and complex to design) to the L2$ area (a dumb/easy copy-and-paste cell if there ever was one in design/layout).
Now obviously I am not saying this without consideration for all the obvious stuff (Pollack's Rule, Amdahl's Rule, Godwin's Rule, Hanlon's Rule, etc.), but I can't help but feel that if they would only throw their design/layout guys another 10mm^2/core of budget, they'd be able to buy themselves some serious IPC improvements without making an already rather large IC all that much larger.
Just look at the die area afforded to the HyperTransport IO, or the DDR3 mem controller. Those parts of the CPU occupy nearly as much area as an entire Bulldozer module, and yet it is the Bulldozer module that gets a CMT design that degrades performance by design, just so a few mm^2 can be saved on a die that is already well over 200mm^2.
Doesn't this seem just a bit "penny wise but pound foolish" here? Why not widen out those castrated CMT cores back to being CMP demons, make your die a, gasp, 250mm^2 chip instead of the 240mm^2 it currently is, and put yourself in a position to field some competitive IPC?
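
For anyone who wants to put rough numbers on that trade-off, here is a back-of-envelope sketch. The 240mm^2 and 250mm^2 die sizes and the extra 10mm^2 per core are the figures from my post. The 20mm^2 starting core size is purely a made-up placeholder (I haven't measured it off the die shot), and Pollack's Rule (per-core performance scaling roughly with the square root of core area) is only a rule of thumb, so treat the output as illustrative and nothing more.

```python
import math


def die_growth(old_die_mm2, new_die_mm2):
    """Fractional increase in total die area."""
    return (new_die_mm2 - old_die_mm2) / old_die_mm2


def pollack_speedup(old_core_mm2, new_core_mm2):
    """Pollack's Rule estimate: performance ~ sqrt(core area). Rule of thumb only."""
    return math.sqrt(new_core_mm2 / old_core_mm2)


# Die sizes as quoted in the post.
growth = die_growth(240.0, 250.0)

# ASSUMPTION: 20 mm^2 is a hypothetical starting core area, not a measured one.
# The extra 10 mm^2 per core is the budget suggested in the post.
speedup = pollack_speedup(20.0, 20.0 + 10.0)

print(f"die grows by {growth:.1%}")
print(f"Pollack's Rule estimate: {speedup:.2f}x per-core performance")
```

Plug in whatever core area you believe is right. The point is only that a few percent of extra die area can be a large relative increase in core area when the core is a small slice of the die.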