First Steamroller processor core exposure


inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
That slide depicting Kaveri's Gflops numbers is for the whole package (CPU+GPU). Most of it comes from the iGPU, while the CPU contributes only a small fraction of the Gflops. It would be great if each module could execute 2x256bit ops natively in 256bit pipelines, but for now it's safe to say that won't be the case (unless we get some performance leaks telling us otherwise).
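For anyone who wants to sanity-check the CPU-side numbers, here is a rough sketch of the usual theoretical peak calculation: FMA units times SIMD lanes times 2 (a fused multiply-add counts as two floating-point operations) times clock. The function names, the 4.0 GHz clock and the two-module count are my own assumptions for illustration, not figures from any AMD slide, and real throughput will be lower than this peak.

```python
def peak_flops_per_cycle(fma_units: int, width_bits: int, element_bits: int = 32) -> int:
    """Theoretical peak FLOPs per cycle for one module.

    Each FMA unit processes width_bits // element_bits lanes per cycle,
    and each fused multiply-add counts as 2 floating-point operations.
    """
    lanes = width_bits // element_bits
    return fma_units * lanes * 2


def peak_gflops(fma_units: int, width_bits: int, clock_ghz: float,
                modules: int = 1, element_bits: int = 32) -> float:
    """Theoretical peak GFLOPS for `modules` modules at `clock_ghz`."""
    return peak_flops_per_cycle(fma_units, width_bits, element_bits) * clock_ghz * modules


if __name__ == "__main__":
    # FPU configurations discussed in the thread (single precision).
    for label, units, width in [("2 x 128-bit FMAC", 2, 128),
                                ("4 x 128-bit FMAC", 4, 128),
                                ("2 x 256-bit FMAC", 2, 256)]:
        per_cycle = peak_flops_per_cycle(units, width)
        # Hypothetical 2-module part at a hypothetical 4.0 GHz clock.
        total = peak_gflops(units, width, clock_ghz=4.0, modules=2)
        print(f"{label}: {per_cycle} FLOPs/cycle/module, {total:.0f} GFLOPS peak")
```

Under this counting, 4 x 128bit and 2 x 256bit give the same peak per module, twice that of 2 x 128bit.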
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
It kind of makes sense for the die shot seen earlier in this thread, which appeared to have 4 x 128bit FMACs in a single module. That sort of doubling up of the FP hardware could make SMT worthwhile. However, I still remain unconvinced that it is a die shot of a Steamroller module; there have been so many slides showing SR modules having 2 x 128bit FMAC. Three options:

1. It's a fake
2. It's Excavator
3. It's a massively overhauled version of Kaveri, with little resemblance to what it used to be
The FPU already uses SMT.

And I'm also still not fully convinced that this is not a fake. If it's not, I'd also opt for Excavator, since it goes in line with the increased parallelism mentioned before.
 

ZGR

Platinum Member
Oct 26, 2012
2,054
661
136
AMD is keeping their cards close to their chest. We keep speculating but I think all we want is a competitive product to keep Intel in line. We all believe in competition right? I love it since it brings out some cool products for good prices.

If AMD is able to push out a 20nm based CPU before 2014, I will be very impressed. I do expect Kaveri to be an awesome HTPC/Internet browsing beast, but I really hope Piledriver has significant single threaded gains. If AMD can fix that, then Intel will have to try a little harder.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
The FPU already uses SMT.

And I'm also still not fully convinced that this is not a fake. If it's not, I'd also opt for Excavator, since it goes in line with the increased parallelism mentioned before.
"Increased parallelism" terminology was used by AMD for the SR core on their roadmaps.


Excavator core has just "greater performance" and roughly 2x the flops of *that* SR according to the chart (if it's correct of course). Now if AMD decided to can SR and to implement Excavator in Kaveri2.0 then it all makes sense. Especially now in the light of server roadmap that has no 4M parts but just 2M APUs. If these perform like 4M/8T Piledrivers we have today with an added bonus of GPGPU power iGPU brings, it makes sense they have no 4M parts on their roadmap.
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
@virge @insertcarehere. I have found another analysis that also suggests that the module can deal with four threads in parallel.

As said before, only if Steamroller is going to be 2C/4T does it make sense to me that AMD would replace the 8C Opteron (PD) with the 4C Berlin (SR), while the earlier rumoured 6C Kaveri disappears from the new roadmap. Otherwise I understand neither the die nor the roadmaps.
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
That slide depicting Kaveri's Gflops numbers is for the whole package (CPU+GPU).

Yes. And the mentioned 848 GFLOP of the i7-4770k are also for CPU+GPU.

Now if AMD decided to can SR and to implement Excavator in Kaveri2.0 then it all makes sense. Especially now in the light of server roadmap that has no 4M parts but just 2M APUs. If these perform like 4M/8T Piledrivers we have today with an added bonus of GPGPU power iGPU brings, it makes sense they have no 4M parts on their roadmap.

This is my point also, except that I think that current Steamroller design must be somewhat between the previously planned Steamroller and the previous Excavator. Something like this:



I have seen several people on the internet claiming that 2 SR cores have about the same GFLOPs as 4 PD cores (clock for clock, of course).

Maybe the disappeared 6C Kaveri will reappear in some future roadmap to match 6C Haswells.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
"Increased parallelism" terminology was used by AMD for the SR core on their roadmaps.


Excavator core has just "greater performance" and roughly 2x the flops of *that* SR according to the chart (if it's correct of course). Now if AMD decided to can SR and to implement Excavator in Kaveri2.0 then it all makes sense. Especially now in the light of server roadmap that has no 4M parts but just 2M APUs. If these perform like 4M/8T Piledrivers we have today with an added bonus of GPGPU power iGPU brings, it makes sense they have no 4M parts on their roadmap.
Ah, you're right. But "greater parallelism" is vague enough to cover everything from wider IF to 4x512b FMA units.

I see multiple things causing that ping pong between whether it's SR, SR 2.0 or EC: announced parallelism, doubled decoders, 2x128b FMAC vs. doubled FMAC units, IEUs, bigger caches and - besides many smaller improvements - a die photo that early in the development process. But internally AMD might already have some EC silicon to conduct first tests.
 

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
I think part of the problem here is that the original SteamRoller "roadmap" was drawn up quite a while ago and there has been very little official change from AMD since then. Remember CEO Rory took over AFTER these maps were developed. I think he has changed the course. This ties in with a "souped-up" Piledriver core being released this summer to keep interest high while delaying what we will call SteamRoller for a changed design. It appears the socket FM2+ will be the first beneficiary of this change. Perhaps the "moar cores" mantra is being changed at AMD to "more efficient and powerful cores". If so, despite the likely socket change, AMD appears headed in the right direction.
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
Well Kaveri is how late already? If it was just down to GF's lagging 28nm then maybe AMD decided to push ahead to Excavator instead, or at least combine some elements with Steamroller.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
Ah, you're right. But "greater parallelism" is vague enough to cover everything from wider IF to 4x512b FMA units.

I see multiple things causing that ping pong between whether it's SR, SR 2.0 or EC: announced parallelism, doubled decoders, 2x128b FMAC vs. doubled FMAC units, IEUs, bigger caches and - besides many smaller improvements - a die photo that early in the development process. But internally AMD might already have some EC silicon to conduct first tests.
Well we already know this is Kaveri2.0 (from linkedin profiles of AMD engineers). Fudzilla called the SR core in Kaveri that will launch this Q4 a "Steamroller B" core. So something def. changed from HC's time when AMD presented the SR core.

Well Kaveri is how late already? If it was just down to GF's lagging 28nm then maybe AMD decided to push ahead to Excavator instead, or at least combine some elements with Steamroller.
Original Kaveri should have launched in Q1 this year. Instead of that it will effectively launch 1 year later (roughly) since shipping in Q4 usually means Q1 availability. The change in schedule could be due to GF but also due to AMD's shifted plans. It would be great if we would get EX core in Kaveri2.0, even on GF 28nm node.
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
Original Kaveri should have launched in Q1 this year. Instead of that it will effectively launch 1 year later (roughly) since shipping in Q4 usually means Q1 availability. The change in schedule could be due to GF but also due to AMD's shifted plans. It would be great if we would get EX core in Kaveri2.0, even on GF 28nm node.

If you tally up the rumours you could end up at that conclusion. It's late, there were rumours of cancellations, dropped hexa-cores, die shots etc etc.

Telling for me is that a quad core is replacing an octo-core in the server lineup and there appears to be no octo-core desktop part on the near horizon. I'm not about to predict a huge AMD comeback or that Kaveri will blow away Haswell or anything like that, but the upcoming core is likely to be a lot meatier than Piledriver.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
If you tally up the rumours you could end up at that conclusion. It's late, there were rumours of cancellations, dropped hexa-cores, die shots etc etc.

Telling for me is that a quad core is replacing an octo-core in the server lineup and there appears to be no octo-core desktop part on the near horizon. I'm not about to predict a huge AMD comeback or that Kaveri will blow away Haswell or anything like that, but the upcoming core is likely to be a lot meatier than Piledriver.
That's exactly what I think too :thumbsup:
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Sure it's fishy but it's not easy to make such a good fake and fool two guys like Hans and 3dilettante. I guess anything is possible though.
The reference to true FP256bit pipelines is true though, it's in AMD's latest 15h SOG manual.

That's correct. But there is no link between 256b FPU pipelines and SR. It's just an already defined feature bit. SR slides named 128b units.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
That's correct. But there is no link between 256b FPU pipelines and SR. It's just an already defined feature bit. SR slides named 128b units.
Indeed. But SR was supposed to launch in Q1 this year and "it" will end up in a product called Kaveri2.0 that is basically 1 year late. On top of this we have seen a purported leaked SR?/EX? module shot which does not resemble the SR AMD presented at last year's HC. Also we see no 4M 1P/2P SR based Opterons on AMD's server roadmap, just this 2M+iGPU APU one augmented with a PD refresh.

Something definitely changed; whether it's a core update or just a roadmap slip due to crappy GF execution is anyone's guess.
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
What is sure is that, given the 6-8 month delay, SR will be roughly on par with the numbers unveiled at HC 2012.

On second thought, why would the comparison be with Bulldozer when Piledriver was released three months before HC? I doubt the slides were edited months before the event.
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
A first Kaveri ES has been reported by a BOINC client (as usual):

http://citavia.blog.de/2013/07/02/amd-kaveri-engineering-sample-sighted-in-the-wild-16196102/

The original account was created in March. I haven't yet seen when the computer was added.

Great work as always !

"The measured performance numbers (BOINC Whetstone/Dhrystone) are rather low, which indicates a low clock frequency during that measurement run, possibly caused by missing power management or other CPU drivers."

What is the prior history of this being the case? Do you read anything else into the results?
 

Hitman928

Diamond Member
Apr 15, 2012
5,622
8,847
136
The numbers Dresdenboy posted are certainly curious. I'm guessing this particular chip / solution is bugged (improper BIOS support?). For instance, if we assume (just to have a base of reference), that the sample is running at 1 GHz, then compared to mobile Trinity from his post here, you would have kaveri with:

99% integer performance/GHz compared to trinity
67% fpu performance/GHz compared to trinity

So it would seem that either:

1. in this sample, Kaveri (or the system) is bugged and running at a very low frequency
and/or
2. Kaveri really is running at a low frequency but has massive integer performance versus Trinity.

My vote is option 1.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Great work as always !

"The measured performance numbers (BOINC Whetstone/Dhrystone) are rather low, which indicates a low clock frequency during that measurement run, possibly caused by missing power management or other CPU drivers."

What is the prior history of this being the case? Do you read anything else into the results?

It's actually an A0 stepping. So from the looks of it I think it's too early to conclude anything here.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
Good find Dresdenboy.
The only way the numbers make sense is that the ES part was running at around 600-700MHz (in order to have somewhat better perf./MHz than Trinity). That or SR is seriously borked lol.
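To show where a figure like that comes from, here is a quick sketch that rescales Hitman928's per-GHz ratios (99% integer, 67% FPU versus Trinity, both computed under his assumed 1 GHz clock) to other candidate clocks. The function names and the list of candidate clocks are my own, and the whole thing assumes the BOINC scores scale linearly with frequency, which is only an approximation.

```python
def per_clock_ratio(ratio_at_assumed: float, assumed_ghz: float, actual_ghz: float) -> float:
    """Rescale a per-clock performance ratio from an assumed clock to another clock.

    ratio_at_assumed is (Kaveri score / assumed_ghz) / (Trinity score / Trinity GHz).
    The raw score is fixed, so a lower actual clock raises the per-clock ratio.
    """
    return ratio_at_assumed * assumed_ghz / actual_ghz


def break_even_clock(ratio_at_assumed: float, assumed_ghz: float) -> float:
    """Clock at which the sample would exactly match Trinity per clock."""
    return ratio_at_assumed * assumed_ghz


if __name__ == "__main__":
    ASSUMED_GHZ = 1.0     # Hitman928's reference assumption
    INT_RATIO = 0.99      # integer perf/GHz vs Trinity at the assumed clock
    FPU_RATIO = 0.67      # FPU perf/GHz vs Trinity at the assumed clock

    print("FPU break-even clock (GHz):", break_even_clock(FPU_RATIO, ASSUMED_GHZ))
    for ghz in (0.6, 0.65, 0.7, 1.0):  # candidate ES clocks, hypothetical
        int_r = per_clock_ratio(INT_RATIO, ASSUMED_GHZ, ghz)
        fpu_r = per_clock_ratio(FPU_RATIO, ASSUMED_GHZ, ghz)
        print(f"{ghz:.2f} GHz -> integer {int_r:.2f}x, FPU {fpu_r:.2f}x Trinity per clock")
```

The FPU ratio only reaches parity with Trinity once the assumed clock drops to about 670MHz, and at that clock the integer ratio would be roughly 1.5x.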
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
The original account was created in March. I haven't yet seen when the computer was added.
2013-06-15 is when it did something, 2013-06-14 is when it accessed to do something.
It's actually an A0 stepping. So from the looks of it I think it's too early to conclude anything here.
The usual steppings of APUs that retail are A1 and B0.
The only way the numbers make sense is that the ES part was running at around 600-700MHz (in order to have somewhat better perf./MHz than Trinity). That or SR is seriously borked lol.
If a processor doesn't have the correct BIOS, would it run at idle clocks only?
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
Well we already know this is Kaveri2.0 (from linkedin profiles of AMD engineers). Fudzilla called the SR core in Kaveri that will launch this Q4 a "Steamroller B" core. So something def. changed from HC's time when AMD presented the SR core.

This is what makes the most sense. Lower clocks could kill any advantage in IPC, however.

I'm sure XV is waiting for 20nm. I just hope it's 20nm fd-soi to get the clocks back up. Even a 20% jump in IPC will leave AMD hopelessly noncompetitive (which won't be much of a change from Trinity or Kaveri). A 20% boost in IPC plus getting back to > 3.5 GHz clocks would be a much better story.
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
The numbers Dresdenboy posted are certainly curious. I'm guessing this particular chip / solution is bugged (improper BIOS support?). For instance, if we assume (just to have a base of reference), that the sample is running at 1 GHz, then compared to mobile Trinity from his post here, you would have kaveri with:

99% integer performance/GHz compared to trinity
67% fpu performance/GHz compared to trinity

So it would seem that either:

1. in this sample, Kaveri (or the system) is bugged and running at a very low frequency
and/or
2. Kaveri really is running at a low frequency but has massive integer performance versus Trinity.

My vote is option 1.

Options 1 and 2 are not mutually exclusive...
 