First Steamroller processor core exposure


Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
SOI doesn't affect clock rates or how much logic you can put down. SOI only affects the end outcomes: yields and lifespan.

OK, just going to point out the caveats here that should go along with that kind of statement lest some readers of this thread walk away with a completely wrong impression or understanding...

What you write about SOI vs Bulk-Si in your post is only true if you are comparing to completely comparable nodes that have been designed and engineered to deliver essentially identical drive currents, capacitance, and leakage such that removing the SOI element from the process flow solely imparts a change in electrical parameters that dictate yield and reliability.

In reality, the development costs associated with creating such an SOI node versus the development costs associated with creating the electrically equivalent node via bulk-Si are drastically (emphasis on drastically) different.

If you are developing a node which will see limited production volumes (AMD vs Intel when AMD still owned its fabs), then the development cost savings that come from using SOI pay off in the end despite the slightly higher production costs per wafer (because so few wafers are produced on the node).

But the numbers completely flip-flop when you start talking volume production the likes of an Intel or a TSMC or a Samsung.

With those production numbers it then makes sense to take the hit of having a higher node development cost to make an electrically equivalent Bulk-Si node because the amortized development costs are more than compensated with lower production costs per wafer over the lifetime of the node.
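To put rough numbers on that trade-off, here is a minimal Python sketch of the break-even point. All dollar figures are made-up placeholders, not real fab data, and the function names are hypothetical; the only thing taken from the argument above is the shape of the trade (bulk-Si costs more to develop but less per wafer).

```python
def total_cost(dev_cost, cost_per_wafer, wafers):
    """Lifetime cost of a node: one-time development plus per-wafer production."""
    return dev_cost + cost_per_wafer * wafers


def break_even_wafers(dev_soi, wafer_soi, dev_bulk, wafer_bulk):
    """Wafer volume above which the bulk-Si node becomes the cheaper option.

    Assumes bulk-Si has the higher development cost and the lower wafer cost.
    """
    extra_dev = dev_bulk - dev_soi             # extra R&D spent on the bulk node
    saving_per_wafer = wafer_soi - wafer_bulk  # saved on every bulk wafer
    if extra_dev <= 0 or saving_per_wafer <= 0:
        raise ValueError("sketch assumes bulk: higher dev cost, lower wafer cost")
    return extra_dev / saving_per_wafer


# Placeholder numbers for illustration only (not real fab data).
DEV_SOI, WAFER_SOI = 1_000_000_000, 4_000
DEV_BULK, WAFER_BULK = 1_500_000_000, 3_500

volume = break_even_wafers(DEV_SOI, WAFER_SOI, DEV_BULK, WAFER_BULK)
print(f"break-even at {volume:,.0f} wafers")
```

Below the break-even volume the SOI route is cheaper overall; above it the extra bulk-Si development money is paid back by the cheaper wafers.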

But you can't just say "SOI doesn't affect clockspeeds" without capturing the fact that in a limited R&D budget scenario (which all scenarios are in reality) going SOI most certainly does affect clockspeeds, because it gives the R&D team more budget to develop higher-Idrive xtors than they would otherwise have been able to develop had they been tasked with creating a bulk-Si process flow.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,688
1,222
136
You can't just say "SOI doesn't affect clockspeeds..."
I just did. PDSOI, which is the only SOI I can really comment on, does not affect clock speeds. You could have 65-nm/45-nm/32-nm/28-nm bulk for much less than SOI and have higher clocks, within the same R&D budget as for SOI.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
I just did. PDSOI, which is the only SOI I can really comment on, does not affect clock speeds. You could have 65-nm/45-nm/32-nm/28-nm bulk for much less than SOI and have higher clocks, within the same R&D budget as for SOI.

Not at the same operating voltage, leakage, power consumption and lifetime reliability...all of which cost you a LOT of R&D money.

Take your 32nm SOI piledriver at 5GHz and remove the SOI and your leakage currents and power consumption suddenly double (along with operating temperature).

You wouldn't get 5GHz on a Piledriver if it was bulk-Si; the thermal headroom wouldn't exist unless GloFo dumped shedloads of money into developing better insulating dielectrics to minimize the leakage that would be present once the SOI was gone.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
The argument about SR having lower clocks due to using bulk-Si vs PD-SOI is pointless, since PD and SR are manufactured on two different process nodes (32nm SOI vs 28nm bulk) with different electrical characteristics.
 

relztes

Junior Member
Apr 19, 2009
8
0
0
I did say it as a true statement. PDSOI does relatively nothing to what drives clock speeds, unlike Fully Depleted Bulk, FinFETs, FDSOI.
---
As long as you don't use memory/gpu intensive benchmarks the A6-5200 outperforms the A10-5750M.

Digest this not the SOI comment.

Do you mean to say that clock for clock, Jaguar sometimes outperforms Piledriver? Or are you actually saying an A6-5200 has higher performance than an A10-5750M? What benchmark do you have in mind, because I can't find any where that is true. But the fact that Kabini does seem to have better performance per MHz than Richland is really interesting. I don't know if that says something bad about Piledriver, or if it just takes a much bigger core to scale up to 4 GHz vs 2 GHz.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Do you mean to say that clock for clock, Jaguar sometimes outperforms Piledriver? Or are you actually saying an A6-5200 has higher performance than an A10-5750M? What benchmark do you have in mind, because I can't find any where that is true. But the fact that Kabini does seem to have better performance per MHz than Richland is really interesting. I don't know if that says something bad about Piledriver, or if it just takes a much bigger core to scale up to 4 GHz vs 2 GHz.

Piledriver pays a module-sharing penalty that Jaguar doesn't. If you limit it to tests where there are no more threads than there are modules, I think you'll get a different picture.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,688
1,222
136
When the A10-5750M is running multithreaded tasks or tasks that don't make use of FMA, the A6-5200 is faster. The desired outcome is that Kaveri has to be at least twice as fast as Kabini, while being on the same TSMC HP/HPM process.

2.5/2.8 GHz A10-5750M vs 2.0 GHz A6-5200. To clarify, I'm not talking about clock to clock.

I don't think Kaveri will be 5 GHz or 5.6 GHz within a 35 watt envelope. Which is why the "everything is the same but the decode/dispatch/MMX is different" ideology is wrong.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
The argument about SR having lower clocks due to using bulk-Si vs PD-SOI is pointless, since PD and SR are manufactured on two different process nodes (32nm SOI vs 28nm bulk) with different electrical characteristics.

I think everyone acknowledges this, or maybe I should say that at least I don't get the impression that this reality is lost on anyone participating in the thread. (I could be wrong about that though, it's a long thread.)

But I think the distinction is relevant and worthy of continued discussion (or at the very least a persistently referenced footnote) because while we all want to see SR clockspeeds come out to be at least that of Piledriver (if not even higher still), we do need to factor in a bit of reality based on what we know we should expect to come with an SOI-less 28nm node versus that of a 32nm SOI-based node lest we all end up completely out in left field with our SR clockspeed expectations.

Knowing what I know* about bulk versus SOI, my clockspeed expectation for 28nm bulk-Si SR is that it will be clocked markedly lower than Piledriver unless some special R&D money has been applied to make a unique "sub-node" of 28nm that is custom-tailored for AMD to enable 4+ GHz clockspeed silicon.


* I was a member of the engineering team responsible for assessing (generating the in-fab data) and providing the recommended decision to management regarding whether or not we went with SOI or bulk-Si at the 130nm, 90nm, and 65nm nodes during development at TI.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Knowing what I know* about bulk versus SOI, my clockspeed expectation for 28nm bulk-Si SR is that it will be clocked markedly lower than Piledriver unless some special R&D money has been applied to make a unique "sub-node" of 28nm that is custom-tailored for AMD to enable 4+ GHz clockspeed silicon.

This is what I was referring to with GF 28nm. It would certainly be an achievement if they actually got a clock speed increase compared to 32nm. Merely matching current 32nm clocks is as much as I am expecting, which still means Kaveri should slot well against i3s.
 

carop

Member
Jul 9, 2012
91
7
71
SOI doesn't affect clock rates or how much logic you can put down.

I know someone who is doing RF design on SOI substrates. Here is a link on SOI advantages for RF applications:

http://www.emmicroelectronic.com/webfiles/news/soi_rfdesign_oct03.pdf
Because the SOI structures are inherently insulated, they can also be packed closer together, offering up to a 15% increase in circuit density, reducing the circuit size.

In partially depleted processes, the thickness of the active silicon layer is greater than the depletion width under the gate, leaving a neutral region that extends down to the buried oxide insulation layer. This neutral region gives PD-SOI devices unique characteristics caused by the so-called floating body effects (FBE). These "parasitic" effects can be used to increase the operating frequency of the devices. On the other hand, substantial efforts are needed when transferring bulk CMOS designs to PD-SOI to allow for these effects.

In FD-SOI devices, the active silicon layer is thin enough that the depletion width extends completely to the underlying oxide layer. FD-SOI does not have the FBE of PD-SOI, which allows transferring designs from bulk processes in a more straightforward manner.

IBM is actually making use of the FBE "kinks" as well as strain engineering in its PD-SOI process to spin its top Power processors above 5 GHz.

If AMD moves the PD-SOI parts to bulk substrate, it will lose some clock frequency. On the one hand, there will be a lot of design drama that will go into porting their PD-SOI libraries to bulk substrate. On the other, it will not be able to spin its top parts above 4 GHz.
 

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
I know someone who is doing RF design on SOI substrates. Here is a link on SOI advantages for RF applications:

http://www.emmicroelectronic.com/webfiles/news/soi_rfdesign_oct03.pdf


IBM is actually making use of the FBE "kinks" as well as strain engineering in its PD-SOI process to spin its top Power processors above 5 GHz.

If AMD moves the PD-SOI parts to bulk substrate, it will lose some clock frequency. On the one hand, there will be a lot of design drama that will go into porting their PD-SOI libraries to bulk substrate. On the other, it will not be able to spin its top parts above 4 GHz.

The interesting part for me in that quote is that it is meant to be easier to port from bulk to FD-SOI than to PD-SOI. Given that GloFo is meant to be rolling out FD-SOI at 28nm and 20nm, that could help AMD a lot.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
But I think the distinction is relevant and worthy of continued discussion

By all means, I just wanted to point out that we aren't comparing 32nm SOI vs 32nm bulk.

Frequency is also affected by IC design; just look at Llano vs Trinity: both are on 32nm SOI, but Trinity has almost a 1GHz higher operating frequency.

Well, according to the data it seems that TSMC 28nm bulk has lower drive current than IBM 32nm SOI. If those numbers hold true, then we can speculate that if SR is made at TSMC it may have a lower frequency than PD. If that's the case and they still keep the 15% higher performance/watt they promised, we can again speculate that IPC will have to skyrocket in order to reach that goal.
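As a back-of-the-envelope check on that speculation, here is a small Python sketch using the usual approximation performance = IPC x frequency. The clock figures are placeholders, the function name is hypothetical, and the default assumption that power stays the same between the two chips is mine, not something from the roadmaps.

```python
def required_ipc_gain(freq_old_ghz, freq_new_ghz,
                      perf_per_watt_gain=1.15, power_ratio=1.0):
    """IPC ratio (new/old) needed to hit a perf/watt target at a new clock.

    Uses perf = IPC * frequency, so
    perf_ratio = perf_per_watt_gain * power_ratio = ipc_ratio * freq_ratio.
    power_ratio = 1.0 is an assumption (same power for both chips).
    """
    freq_ratio = freq_new_ghz / freq_old_ghz
    perf_ratio = perf_per_watt_gain * power_ratio
    return perf_ratio / freq_ratio


# Placeholder clocks: a 4.0 GHz part replaced by a 3.5 GHz part.
gain = required_ipc_gain(4.0, 3.5)
print(f"IPC would need to rise by about {100 * (gain - 1):.0f}%")
```

If the new part also draws less power (power_ratio below 1.0), the required IPC gain shrinks accordingly, so the "skyrocket" conclusion depends heavily on that unknown.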
 

Hitman928

Diamond Member
Apr 15, 2012
5,593
8,769
136
When the A10-5750M is running multithreaded tasks or tasks that don't make use of FMA, the A6-5200 is faster. The desired outcome is that Kaveri has to be at least twice as fast as Kabini, while being on the same TSMC HP/HPM process.

2.5/2.8 GHz A10-5750M vs 2.0 GHz A6-5200. To clarify, I'm not talking about clock to clock.

I don't think Kaveri will be 5 GHz or 5.6 GHz within a 35 watt envelope. Which is why the "everything is the same but the decode/dispatch/MMX is different" ideology is wrong.

Once again, what evidence have you seen of this? Every benchmark I've seen shows that Jaguar has nowhere near that kind of IPC lead over Piledriver. For example, you can look through this review of the A4-5000 (same chip as the A6-5200, just 500 MHz slower) and see how much faster the A10-5700 (just faster than the A10-5750M) is. Even if you add 33% performance to the A4-5000 (to give a theoretical best possible A6-5200) and take away ~10% from the 5700, you still don't get Jaguar outperforming Piledriver by anything close to that level. If you have any examples of what you're talking about, I'm open to them, but from everything I've seen, Jaguar has at best (having to make assumptions about what clock speed Richland is actually running during the bench) ~10% better IPC than Richland in workloads where Richland doesn't have the advantage of additional instruction sets.

Edit:
http://www.techspot.com/review/671-amd-a4-5000-kabini/page4.html
http://www.anandtech.com/show/7106/...t-1-the-apu-and-radeon-hd-8650g-performance/2
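For anyone who wants to redo that scaling exercise with real scores, here is a rough Python sketch. The scores below are placeholders, NOT numbers taken from the linked reviews, and the function names are hypothetical. It assumes perfect scaling with clock for Kabini (1.5 GHz to 2.0 GHz), which is the best case, and a flat ~10% haircut for the mobile Piledriver part.

```python
def best_case_2ghz_kabini(score_at_1_5ghz):
    """Scale a 1.5 GHz Kabini score to 2.0 GHz, assuming perfect clock scaling."""
    return score_at_1_5ghz * (2.0 / 1.5)   # about +33%


def estimated_mobile_piledriver(desktop_score, penalty=0.10):
    """Knock ~10% off the desktop Piledriver score to approximate the mobile part."""
    return desktop_score * (1.0 - penalty)


def kabini_vs_piledriver(kabini_score, desktop_pd_score):
    """Ratio above 1 would mean the scaled Kabini beats the estimated mobile part."""
    return (best_case_2ghz_kabini(kabini_score)
            / estimated_mobile_piledriver(desktop_pd_score))


# Placeholder scores (higher is better), for illustration only.
ratio = kabini_vs_piledriver(kabini_score=100.0, desktop_pd_score=220.0)
print(f"scaled Kabini / estimated mobile Piledriver = {ratio:.2f}")
```

Plug in the scores from any benchmark in the reviews; the ratio only goes above 1 if the 1.5 GHz Kabini already gets to roughly two thirds of the desktop Piledriver score.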
 

Ajay

Lifer
Jan 8, 2001
16,094
8,104
136
The interesting part for me in that quote is that it is meant to be easier to port from bulk to FD-SOI than to PD-SOI. Given that GloFo is meant to be rolling out FD-SOI at 28nm and 20nm, that could help AMD a lot.

It would really help AMD if GFL also was producing FD-SOI with a buried oxide. But the last SEM/TEMs I saw for 28nm FD-SOI** it didn't seem there was one (which would fit with an LP node). With the hidden gate AMD has the choice of low power or high performance, without - only low power.

======================================================

As far as the general point of discussion goes, I don't understand why anyone would be debating bulk vs SOI clock speeds. Firstly, this was beaten to death once again when Bulldozer was released, and so on; secondly, Intel's FinFET uses FD gates, so standard bulk sure doesn't look like the best way to go; and lastly, IDC has demonstrated his knowledge of practical process technology and theory quite sufficiently, IMHO, and he has pointed out why bulk will limit clock speed.

Personally, I have a secret hope that Kaveri A is being produced on bulk Si first just to get the process up and running. It would be awesome if they would switch over to Kaveri B on FD-SOI w/buried oxide in 2H14 to have some flexibility/survivability in the market. But I imagine the odds on that are small.




** Sorry can't find them, ATM. I should use some sort of bookmark tools as I have so many bookmarks in Chrome that they are useless.
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
SRoller...
"Berlin is cool, and it uses a new Steamroller core from us and delivers tremendous compute and power efficiency," says Feldman. "When you have a huge amount of compute in a single-socket part, this is ideal for workloads where performance per watt per dollar and compute density per dollar are paramount."

A hint :
AMD is saying very little about the Steamroller core at this point, except that it offers double the performance of the Jaguar core and will max out with twice the memory capacity, too.

This is very interesting. As shown in the PS4 thread*, 4 Jaguar cores have about the same performance as two Sandy Bridge cores with hyperthreading (aka an i3), and 8 Jaguar cores have about the same performance as four Sandy Bridge cores with hyperthreading (aka an i7).

Now, if Steamroller is twice as powerful as Jaguar, this means that 4 Steamroller cores will have about the same performance as four Sandy Bridge cores with hyperthreading (aka an i7), or about the same performance as eight Piledriver cores (aka an FX-8). :awe:
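The arithmetic behind that chain of equivalences can be written out as a tiny Python sketch. Everything is expressed in "Jaguar-core units"; the per-core values simply encode the rough equivalences claimed above and are speculation, not measurements. Clock speeds, module sharing and turbo are all ignored.

```python
# Per-core throughput in Jaguar-core units (speculative, from the equivalences above).
JAGUAR = 1.0
SANDY_BRIDGE_HT = 2.0   # 4 Jaguar ~ 2 SB cores with HT, so 1 SB core ~ 2 Jaguar
STEAMROLLER = 2.0       # the "double the performance of the Jaguar core" hint
PILEDRIVER = 1.0        # implied by 4 Steamroller ~ 8 Piledriver


def throughput(cores, per_core):
    """Total throughput assuming perfect scaling across cores."""
    return cores * per_core


chips = {
    "4x Jaguar": throughput(4, JAGUAR),
    "i3 (2C/4T Sandy Bridge)": throughput(2, SANDY_BRIDGE_HT),
    "8x Jaguar": throughput(8, JAGUAR),
    "i7 (4C/8T Sandy Bridge)": throughput(4, SANDY_BRIDGE_HT),
    "4x Steamroller": throughput(4, STEAMROLLER),
    "FX-8 (8x Piledriver)": throughput(8, PILEDRIVER),
}

for name, units in chips.items():
    print(f"{name:26s} {units:.0f} Jaguar-core units")
```

The whole chain hangs on the single "2x Jaguar" hint and on perfect multi-core scaling, so it is only as good as those two assumptions.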

This would explain the new server roadmaps.



The 8C Piledriver Opteron 3300s are replaced by the 4C Steamroller Berlin, and the Piledriver-based Warsaw comes only in 12C/16C configurations, with the old configurations dropped.

I think it is safe to speculate that the Warsaw chips will eventually be replaced by a future 6C/8C Berlin APU/CPU, or whatever it ends up being called.

This doubled performance of the new Steamroller would also explain why the desktop roadmaps for 2013 only consider 2C/4C Kaveri, with the earlier rumoured 6C being dropped. The 4C Kaveri would be competing with a Sandy Bridge i7 (on the CPU side of course; Kaveri graphics will be much more advanced).

*
http://forums.anandtech.com/showpost.php?p=35021461&postcount=1120
http://forums.anandtech.com/showpost.php?p=35028201&postcount=1193
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
This doubled performance of the new Steamroller would also explain why the desktop roadmaps for 2013 only consider 2C/4C Kaveri, with the earlier rumoured 6C being dropped. The 4C Kaveri would be competing with a Sandy Bridge i7 (on the CPU side of course; Kaveri graphics will be much more advanced).

That is strange if you think about it.
8-core consoles. AMD will get what they wanted and needed: multicore utilization in games, forced on game devs by the next-gen console design. Going back to a 2/4-core CPU now makes no sense, that is, if games will truly be multithreaded.

But I can see how they may be frustrated or even disappointed by games optimizations (even GE titles!), and don't want to deal with those problems again.
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
I also find that part difficult to swallow: going back to a 4-core setup when the consoles would appear to make a strong case for 8.
 

cytg111

Lifer
Mar 17, 2008
23,515
13,090
136
I also find that part difficult to swallow: going back to a 4-core setup when the consoles would appear to make a strong case for 8.

Why? If a single SR core would rival 3 Jaguars? I think it makes perfect sense; the mainstream world is not ready for 8 cores yet, and 4 is still kinda stretching it. Consoles may help pave the way, but it's still at least 3-5 years down the road percentage-wise, IMO.

I hope for a fatter 4 core rather than a skinnier 8 from AMD.
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
Well that's dependent on a single SR core rivalling 3 jaguars though isn't it, in which case sure 4 will do.

It's just a bit odd. Then again, I don't really see any good reason why there couldn't be an octocore SR down the line. The IGP of Kaveri will be approaching half the 100W TDP anyway, so 125W octocores without an IGP shouldn't be a huge problem.

Everything just seems a bit weird to me, maybe it's just jitters as it gets nearer release (assuming November) but the lack of leaks is bothering me.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Once again, what evidence have you seen of this? Every benchmark I've seen shows that jaguar has no where near that lead of an ipc over piledriver. For example, you can look through this review of the a4-5000 (same chip as the 5400, just 500 Mhz slower) and see how much faster the A10-5700 (just faster than 5750m) is. Even if you add 33% performance to the a4-5000 (to give a theoretical best possible a6-5400) and take away ~10% from the 5700, you still don't get the jaguar out performing piledriver to even close to that level. If you have any examples of what you're talking about, I'm open to them, but from everything I've seen, jaguar has at best (having to make assumptions on what clock speed richland is actually running during bench) ~10% better IPC than Richland in work loads that Richland doesn't have an advantage of additional instruction sets.

Edit:
http://www.techspot.com/review/671-amd-a4-5000-kabini/page4.html
http://www.anandtech.com/show/7106/...t-1-the-apu-and-radeon-hd-8650g-performance/2

Jaguar IPC is higher than Piledriver's at low frequencies of 1.5-2GHz. Also, Kabini doesn’t suffer from the CMT penalty: a quad-core 1.5GHz Kabini A4-5000 is faster than a quad-core 1.6GHz Piledriver A8-4550M.


But, I don’t expect a 2GHz Kabini to win over a 2.5GHz (3.5GHz turbo) Piledriver A10-5750M
 

cytg111

Lifer
Mar 17, 2008
23,515
13,090
136
@Silicon
I know. The way I see it, if they could produce a part that edges out haswell by 10% yet consuming 100%+ more power they'd still have a winner on their hands.
 

Hitman928

Diamond Member
Apr 15, 2012
5,593
8,769
136
Jaguar IPC is higher than Piledriver's at low frequencies of 1.5-2GHz. Also, Kabini doesn’t suffer from the CMT penalty: a quad-core 1.5GHz Kabini A4-5000 is faster than a quad-core 1.6GHz Piledriver A8-4550M.

It's kind of a given that we're talking about low frequencies, as Kabini doesn't currently scale beyond 2 GHz.

Given that, if this was the original argument, I wouldn't have doubted it (obviously it depends on work load as 4550m does win some, but not on average), in fact, I would have supported it.

But, I don’t expect a 2GHz Kabini to win over a 2.5GHz (3.5GHz turbo) Piledriver A10-5750M

This was the original argument which, apparently, neither you nor I support. That's not to take away from the Jaguar cores; I have actually been very impressed with them and I think they are probably AMD's strongest cores in years (relative to their market), but let's not make them out to be something that they are not.
 