First Steamroller processor core exposure


vltra

Junior Member
Apr 18, 2013
6
0
66
http://translate.google.ro/translat...&u=http://diybbs.zol.com.cn/11/11_106489.html

Some alleged info on EX. The translation is very bad, but this is a recent article and there's stuff we didn't know (?).

Offtopic: HUH? God, these Chinese sites are like Area 51. I don't get it, how come the Chinese get info on this stuff? The 4770K, the 6800K... I won't be surprised if some obscure Chinese website posts Kaveri benchmarks by the end of the year.
 

inf64

Diamond Member
Mar 11, 2011
3,763
4,221
136
Uh, links to both resumes and the 15h manual?
I posted the link to the news about the new SOG manual update in the Kabini thread. It got swallowed up by spam and offtopic bickering, unfortunately. The manual itself is public and anyone can DL it.

Since I'm such a nice guy, here it is (news about it at p3dnow):
http://www.planet3dnow.de/cgi-bin/newspub/viewnews.cgi?category=1&id=1368122313

Old manual (prior to May 2013 update):


New manual (May update):


Direct link to the manual @ AMD's website.
You want page 591; it's the table above.

A simple Google search will get you the other information you seek:
http://ca.linkedin.com/pub/james-fry/49/216/a56

James Fry's Experience


SOC Director for Kaveri and Kaveri2.0 APUs

AMD


Public Company; 10,001+ employees; AMD; Semiconductors industry
September 2010 – Present (2 years 9 months)
Managed a large project team spanning the globe on a multi-year project with multi-billion dollars of expected revenue
 
Mar 10, 2006
11,715
2,012
126
LOL, Excavator is still on 28nm but they went ahead and doubled FPU resources?

What power envelope are they targeting? And they'd better be able to charge high prices for these, since the dies will be huge.
 

Abwx

Lifer
Apr 2, 2011
11,166
3,862
136
New manual (May update):



Piledriver can execute 256-bit AVX, but to do so it must first break each instruction into two 128-bit instructions and then handle the two sub-instructions.

It seems that SR can execute the 256-bit instruction in one go, without the need to break it up, which reduces the number of cycles needed. But this doesn't imply more FP units, just better use of the existing ones.
 

inf64

Diamond Member
Mar 11, 2011
3,763
4,221
136
LOL, Excavator is still on 28nm but they went ahead and doubled FPU resources?

What power envelope are they targeting? And they'd better be able to charge high prices for these, since the dies will be huge.

Please, you are talking about things you simply do not know (none of us do, except AMD themselves).

1st, we do not know if the screenshot is genuine or not.

2nd, what is its die size? It's not done on 32nm, so the only option is 28nm and below. We know AMD has the means to reduce die area and power by using specialized tools, plus the new core has the advantage of a smaller node (presumably 28nm).
The present module die area with L2 cache (2MB) is ~30.9mm^2. Given that AMD can reduce the size with the HDL library by up to 30% (their claim for an example done on the FP unit in BD), plus you get the advantage of smaller structures due to the 32->28nm shrink, it's possible the new module won't be larger (if that) than the BD/PD module. The cache will scale nicely with a shrink, so a 4C/2M part with no L3 and a (beefed up GCN) iGPU may end up being just somewhat bigger than Richland @ 32nm is. Something in the range of 265-280mm^2 for a 2M+512SP APU with 4MB of L2 cache (total).

3rd, if the shot is real, we do not know what that module in the OP is. Is it SR, SR+? An EX module?
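
For anyone who wants to play with the area numbers in the 2nd point, here is a rough sketch of the arithmetic. It takes the ~30.9mm^2 module figure and the "up to 30%" HDL claim from the post. It assumes, purely for illustration, that the 30% applies to the whole module and that area scales with the square of the node name (32 -> 28). Real processes rarely scale that cleanly, and the function names are made up for this example.

```python
# Back-of-the-envelope module area estimate.
# The scaling assumptions here are illustrative, not AMD figures.

BD_MODULE_AREA_MM2 = 30.9  # from the post: BD/PD module with 2MB L2 on 32nm
HDL_REDUCTION = 0.30       # "up to 30%" claim; assumed to apply to the whole module
OLD_NODE_NM = 32.0
NEW_NODE_NM = 28.0


def ideal_shrink_factor(old_nm: float, new_nm: float) -> float:
    """Area multiplier if area scaled with the square of the node name (idealized)."""
    return (new_nm / old_nm) ** 2


def shrunk_area(area_mm2: float, hdl_reduction: float,
                old_nm: float, new_nm: float) -> float:
    """Area of the same design after the HDL reduction and an ideal shrink."""
    return area_mm2 * (1.0 - hdl_reduction) * ideal_shrink_factor(old_nm, new_nm)


def growth_headroom(hdl_reduction: float, old_nm: float, new_nm: float) -> float:
    """How many times bigger the design could get and still fit the old area."""
    return 1.0 / ((1.0 - hdl_reduction) * ideal_shrink_factor(old_nm, new_nm))


if __name__ == "__main__":
    area = shrunk_area(BD_MODULE_AREA_MM2, HDL_REDUCTION, OLD_NODE_NM, NEW_NODE_NM)
    headroom = growth_headroom(HDL_REDUCTION, OLD_NODE_NM, NEW_NODE_NM)
    print(f"Same module, shrunk: {area:.1f} mm^2")
    print(f"Growth headroom before exceeding the old area: {headroom:.2f}x")
```

Under those optimistic assumptions the unchanged module would come out at about 16.6mm^2, leaving room for the design to grow by roughly 1.87x before it is bigger than today's module. With less ideal scaling the headroom shrinks accordingly.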
 
Mar 10, 2006
11,715
2,012
126
Piledriver can execute 256-bit AVX, but to do so it must first break each instruction into two 128-bit instructions and then handle the two sub-instructions.

It seems that SR can execute the 256-bit instruction in one go, without the need to break it up, which reduces the number of cycles needed. But this doesn't imply more FP units, just better use of the existing ones.

That's much more likely considering AMD has officially stated that Steamroller packs 2x128bit FMACs.
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
What power envelope are they targeting? And they'd better be able to charge high prices for these, since the dies will be huge.

Even if the die was 50% larger than BD/PD it would still have roughly 2/3 of the dies per wafer. That's ~460mm^2 at around 120 DPW.

The question is would you pay $300 for a CPU that will crush even an 8-thread Haswell (at an obvious power consumption penalty), because that would get AMD the same money as they are making on the $200 8350 now (assuming a 50% larger die).

AMD has a lot of wafers to use up at Global F(l)oundries, so no better way to do it than go large and take back the performance crown. We're just guessing, but if you were AMD what would you have done?
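
The dies-per-wafer figures above can be sanity-checked with the usual textbook approximation (gross dies minus an edge-loss term). This is only a sketch: it assumes a 300mm wafer, ignores yield, scribe lines and edge exclusion, and derives the baseline die size as 460/1.5 from the "50% larger" figure rather than from any official number.

```python
import math


def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> float:
    """Common approximation: wafer area / die area, minus an edge-loss term.

    Ignores yield, scribe lines and edge exclusion. The 300mm default is an
    assumption for this example.
    """
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return gross - edge_loss


BIG_DIE_MM2 = 460.0                # "~460mm2" from the post
BASE_DIE_MM2 = BIG_DIE_MM2 / 1.5   # implied baseline if the big die is 50% larger

if __name__ == "__main__":
    big = dies_per_wafer(BIG_DIE_MM2)
    base = dies_per_wafer(BASE_DIE_MM2)
    print(f"Big die:  {big:.0f} candidates per wafer")
    print(f"Baseline: {base:.0f} candidates per wafer")
    print(f"Ratio:    {big / base:.2f}")
```

With these inputs the approximation lands at roughly 123 candidate dies for the 460mm^2 case and a ratio of about 0.64 against the baseline, which is in line with the "around 120 DPW" and "roughly 2/3" figures. Yield is a separate question entirely.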
 
Mar 10, 2006
11,715
2,012
126
Even if the die was 50% larger than BD/PD it would still have roughly 2/3 of the dies per wafer. That's ~460mm^2 at around 120 DPW.

The question is would you pay $300 for a CPU that will crush even an 8-thread Haswell (at an obvious power consumption penalty), because that would get AMD the same money as they are making on the $200 8350 now (assuming a 50% larger die).

AMD has a lot of wafers to use up at Global F(l)oundries, so no better way to do it than go large and take back the performance crown. We're just guessing, but if you were AMD what would you have done?

You assume GF's yields on the 28nm process will be good. Remember how craptacular yields were on 32nm at first? So 50% larger die doesn't mean that cost per die scales up proportionally...it could be much worse.

I would much rather they move their GPUs and "Cat" cores to GloFo...not sure why they're having their stuff built at TSMC other than that GloFo is a bunch of hypsters that don't yet have a viable 28nm process.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
Not only that, what's the TDP of such a big chip? 200W?

It isn't that much bigger. Look at the L2 array on the right-hand edge and compare that to Bulldozer: relative to the L2 it's maybe 10-15% bigger. Factor in say 10% smaller from 28nm and it's almost the same size.


All the expert naysayers, explain how you go about faking something like this. You're obviously so across micro uarch that you can point out all the BS in this fake.

I think the question is what is it, not is it fake. It's not SR as detailed @ Hot Chips, that's for sure.


It seems that SR can execute the 256-bit instruction in one go, without the need to break it up, which reduces the number of cycles needed. But this doesn't imply more FP units, just better use of the existing ones.
To me that seems unlikely: it increases FPU scheduler complexity for only a 1-cycle gain, adds no MT benefit (could actually be a penalty) and almost no ST benefit (1 cycle in a 22-stage pipeline...).
 
Mar 10, 2006
11,715
2,012
126
It isn't that much bigger. Look at the L2 array on the right-hand edge and compare that to Bulldozer: relative to the L2 it's maybe 10-15% bigger. Factor in say 10% smaller from 28nm and it's almost the same size.


All the expert naysayers, explain how you go about faking something like this. You're obviously so across micro uarch that you can point out all the BS in this fake.

I think the question is what is it, not is it fake. It's not SR as detailed @ Hot Chips, that's for sure.


To me that seems unlikely: it increases FPU scheduler complexity for only a 1-cycle gain, adds no MT benefit (could actually be a penalty) and almost no ST benefit (1 cycle in a 22-stage pipeline...).

My guess is that this is Excavator.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
My guess is that this is Excavator.

Yet if they have Excavator in this state (a taped-out module), why even bother with Steamroller? This floor plan still looks "Bulldozer era". Excavator was supposed to bring a much more automated floor plan, thus units would be less symmetrical (look at Bobcat/Jaguar). Everything is still symmetric and the FPUs, ALUs and AGLUs all look like Bulldozer/Piledriver.

Maybe with the less aggressive node transitions this is a halfway point between what SR and EX were going to be.

If this is the module that's getting released as Kaveri then I have to buy one regardless of performance, because they have obviously given it a red hot go :biggrin:.
 
Mar 10, 2006
11,715
2,012
126
Yet if they have Excavator in this state (a taped-out module), why even bother with Steamroller? This floor plan still looks "Bulldozer era". Excavator was supposed to bring a much more automated floor plan, thus units would be less symmetrical (look at Bobcat/Jaguar). Everything is still symmetric and the FPUs, ALUs and AGLUs all look like Bulldozer/Piledriver.

Maybe with the less aggressive node transitions this is a halfway point between what SR and EX were going to be.

If this is the module that's getting released as Kaveri then I have to buy one regardless of performance, because they have obviously given it a red hot go :biggrin:.

Erm, if this is scheduled for late 2014/early 2015, then they had better have something taped out by now...
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
Erm, if this is scheduled for late 2014/early 2015, then they had better have something taped out by now...

No way! To get to this point they have already completely finalized the uarch, finalized a floor plan and sent it off to the fab, samples have come back, and then someone has decided to give out a die shot.

Show me one other die shot that came out 18-24 months ahead of the chip! Where is the Steamroller die shot, and where is the more automated design of Excavator? There hasn't even been a single detail of Excavator given, yet we have a complete die shot.

Sorry that doesn't add up.
 
Mar 10, 2006
11,715
2,012
126
No way! To get to this point they have already completely finalized the uarch, finalized a floor plan and sent it off to the fab, samples have come back, and then someone has decided to give out a die shot.

Show me one other die shot that came out 18-24 months ahead of the chip! Where is the Steamroller die shot, and where is the more automated design of Excavator? There hasn't even been a single detail of Excavator given, yet we have a complete die shot.

Sorry that doesn't add up.

Then what is this die shot of? Steamroller has already been detailed and the much wider FPU/VPU doesn't match the uarch description.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
The question is would you pay $300 for a CPU that will crush even an 8-thread Haswell (at an obvious power consumption penalty), because that would get AMD the same money as they are making on the $200 8350 now (assuming a 50% larger die).
No, it actually gets AMD more money. It may be the same amount of $ per mm^2 (assuming good yields), but when the actual variable cost of a die is measured in the dozens of dollars and you are getting paid $200+ vs $100+ for a variable item, it is better to get the $200 profit. Your fixed costs, the R&D and the salaries, are going to be the same regardless.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
Then what is this die shot of? Steamroller has already been detailed and the much wider FPU/VPU doesn't match the uarch description.

I know it doesn't. I don't know what it is for sure. If it is Excavator it looks like it's 28nm, not 22/20nm, but without some kind of external interface it's very hard to tell. I think the question to ask is what happened to the Steamroller core that was detailed at Hot Chips, and where is it? Could it have already been dead before that presentation was given :awe:.

Do you really expect AMD to be on 28nm in 2015?
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Then what is this die shot of? Steamroller has already been detailed and the much wider FPU/VPU doesn't match the uarch description.

Wouldn't be the first time AMD published a design detail that was totally wrong.

Like the correction moving Zambezi from 2b to 1.2b transistors.

If you ask me, the timing of the SOG update fits Steamroller better than Piledriver. Make that change too far in advance and it'll just confuse people. There were some Steamroller-related updates to GCC not that long ago.
 
Mar 10, 2006
11,715
2,012
126
Wouldn't be the first time AMD published a design detail that was totally wrong.

Like the correction moving Zambezi from 2b to 1.2b transistors.

If you ask me, the timing of the SOG update fits Steamroller better than Piledriver. Make that change too far in advance and it'll just confuse people. There were some Steamroller-related updates to GCC not that long ago.

So you think each SR module gets 2x256bit FMACs?
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
So you think each SR module gets 2x256bit FMACs?


It could be 4x128; they look very much like the existing ones (higher/lower order bits are split), there is just double the amount of them. There also looks to be double the amount of register/queue for the FPU, so maybe some kind of coarse-grain separation between them can take pressure off needing so many read/write ports on one register/queue with a 4x128bit design.
 
Mar 10, 2006
11,715
2,012
126
It could be 4x128; they look very much like the existing ones (higher/lower order bits are split), there is just double the amount of them. There also looks to be double the amount of register/queue for the FPU, so maybe some kind of coarse-grain separation between them can take pressure off needing so many read/write ports on one register/queue with a 4x128bit design.

Interesting. AMD may be trying to gain back share in HPC by beefing up the FPU. Should be interesting to see Steamroller vs. Haswell or, probably more likely, Steamroller vs. Broadwell.
 