So what's left of CMT in this, shared L2? Branch predictor?
floating point unit and fetch also
also L1i
The only thing that has moved from shared to dedicated is decode. If AMD really wanted to, they could have expanded decode and still kept it shared. By beefing up the execution resources they actually show the value of CMT. Go look at Piledriver's die shot, then look at this: lots of resources are doubled, but the die size is nowhere near double.
Is Steamroller 970 chipset (AM3+) compatible? I haven't been keeping track of latest rumors.
Maybe the diagram is off, or my interpretation is, but it looks like fetch and L1I are separate here.
Makes business sense to go directly for the most single-thread performance. The 8 "core" approach didn't really make a big splash in the server market, and with less need to worry about die size (GF WSA), rolling back CMT is an R&D-light way of increasing ST performance.
In another thread I did some straightforward (perfect scaling) math: +15% IPC and +10% clocks on an FX 6300 would be just a bit behind a 3570K in ST but ~20% faster in MT. It stands to reason that a 2-module version would get pretty close to a stock 3570K in both ST and MT (a bit higher clocks than the 3-module part). http://forums.anandtech.com/showthread.php?t=2321195
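For anyone who wants to check the perfect-scaling arithmetic, here is a minimal sketch. It assumes the IPC and clock gains simply multiply, which is the optimistic best case, and the function name `projected_speedup` is made up for illustration. It only gives the relative uplift over the starting chip (about 26.5% here), not a comparison against the 3570K.

```python
def projected_speedup(ipc_gain: float, clock_gain: float) -> float:
    """Combined speedup when IPC and clock gains multiply (perfect scaling assumed)."""
    return (1.0 + ipc_gain) * (1.0 + clock_gain)


# +15% IPC and +10% clocks, the figures quoted in the post above.
speedup = projected_speedup(0.15, 0.10)
print(f"Projected uplift over the baseline chip: {(speedup - 1.0) * 100:.1f}%")
```

Real scaling would land below this, since higher clocks rarely translate 1:1 into performance once memory latency is in the picture.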
There are no Steamroller AM3+ chips on any roadmaps. It seems the socket is dead in favour of FM2+. Single socket servers will use FM2+ as well.
Compare the original Steamroller as presented at Hot Chips with this "new one". They don't look the same.
Yes, no idea if this is actually Steamroller but it does seem to have regressed more in CMT choices than the Hot Chips one.
Slowly but surely, AMD is undoing all of the mistakes they made with Bulldozer, and admitting that CMT has too much of a single threaded performance penalty for it to be worth it.
Wow. What an F U to AMD's current customers. That just seals the deal as far as my not going back to AMD for CPUs.
I don't see it as a bad thing, on the contrary. It's better for everyone and leads to better products.
But it doesn't have one. The only people who say that are people who can't separate what CMT is from what Bulldozer is. CMT's bottlenecks only occurred with multithreaded workloads, and even that was caused by design choices, not CMT itself.
Please name one restriction that CMT imposes on single thread performance.
Sorry, I worded that badly. Single threaded performance suffered because they stripped out some of its integer prowess and increased the pipeline length. Multi threaded performance suffered because of this resource sharing idea, which now even AMD admits was a mistake.
It looks like, to redeem FailDozer, AMD is moving away from CMT. Less and less is being shared.
And again, AMD is not "moving away" from CMT, they're moving to a different implementation. To have a properly inane comparison, you don't hear that Intel is moving away from x86 decode when they power down the frontend on micro op cache hits.
and the implementation is closer to full cores than CMT, compared to FailDozer.
Not if they can execute 4 threads within a single module.
This implementation could be like a Single Module, 2 Cores (CMT) with 4 Threads (SMT) or,
Single Module, 2 Cores 4 Threads (CMT) like BD/PD. I'm leaning towards this one.
Either way, they continue evolving the CMT design.
Please post a link supporting this idea that a single module will execute 4 threads.
The die pic shows 4 ALUs + 4 AGUs per integer core. I don't believe that they will use 8 pipes per thread. The utilization of all 8 pipes by a single thread would be very low, and the ratio of performance gain to die area would be even lower. This implementation is surely a 4-thread design.
I suspected this, thus my purchase of an 8350. However, in fairness, Intel does this too. It was 1366, then 1156, then 1155 and within a week 1150. Progress. And implementation of faster overall systems.
That would be contrary to everything AMD has said about Steamroller. I'll believe it when AnandTech writes a detailed article on it.