Kabini Rumors


Abwx

Lifer
Apr 2, 2011
11,166
3,862
136
With all their GPUs and low-power APUs being fabbed at TSMC,
there aren't many wafers left to keep up with the deals
they signed with GF, which stipulate a minimum quantity
that can't be filled with server/FX CPUs alone, so we can confidently
assume that Kaveri and the server parts will be left to GF.

GLOBALFOUNDRIES ramped 32nm high-k Metal Gate (HKMG) Super High Performance (SHP) technology to high volume production at Fab 1 in early 2011


And since the 28nm technology is a direct shrink of 32nm, customers will benefit greatly from the high-volume ramp of our 32nm-SHP technology.

http://www.globalfoundries.com/technology/leading_edge_tech.aspx
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
I know that AMD has to use GF for something, I'm asking you for some official source confirming that they're using GF's 28nm FDSOI instead of or in addition to their 28nm bulk process.

Or maybe you think that FDSOI is all they will offer? It's more expensive you know..
 

Abwx

Lifer
Apr 2, 2011
11,166
3,862
136
The machinery to fab 28nm is the same as with 32nm,
and it is not that expensive if AMD used 315mm2 with BD
for the same transistor budget as Trinity's 246mm2.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
I'm not asking for confirmation that AMD will be using one of GF's 28nm processes. I don't think you understand that FDSOI is not mandatory with GF's 28nm. And it is going to be available substantially later: volume production is estimated for H1 2014, while you can already buy bulk 28nm products.
 

Abwx

Lifer
Apr 2, 2011
11,166
3,862
136
AMD clearly stated that Steamroller will stay a high frequency design
and given that it will be 20% faster than Piledriver there would be really
no point in being limited to 3GHz or even less by a leaky and slow bulk process
that would put Kaveri at a lower absolute perf than Richland in DT and even
in mobile given the expected lower turbo frequency of bulk-made CPUs.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Why do you think Steamroller would be limited to 3GHz or less on GF's 28nm bulk? Intel has achieved much higher clocks on their bulk processes going back several generations, using CPU uarchs that pursue frequency less aggressively than Bulldozer and its descendants. Why assume that GF's bulk 28nm process will be so much worse?
 

Abwx

Lifer
Apr 2, 2011
11,166
3,862
136
Transistors on SOI have a 30% higher transition frequency than
in bulk processes at equal size. The power consumption of AMD's current CPUs
is not fully under control, not because of the process but because of the floorplan layout,
which was largely automated and implied higher dynamic losses due
to higher total parasitic capacitance of the circuit. This will be corrected
with Steamroller.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
I'm confused, are you referring to GF's 28nm bulk process? Because Kabini is definitely on TSMC 28nm (http://www.techpowerup.com/img/13-02-19/95c.jpg)


Derp, my error! You can haz charburger now

Transistors on SOI have 30% higher frequency transition than in bulk processes at equal size...

Not just "equal size", you are forgetting to normalize a WHOLE bunch of other things about the transistor that determine electrical performance as well.

But it is one more reason why Finfet w/FD-SOI looks interesting enough that people are willing to pursue it (at least in development anyways).
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
Why do you think Steamroller would be limited to 3GHz or less on GF's 28nm bulk? Intel has achieved much higher clocks on their bulk processes going back several generations, using CPU uarchs that pursue frequency less aggressively than Bulldozer and its descendants. Why assume that GF's bulk 28nm process will be so much worse?

Switching speed is impacted by capacitance, which SOI can reduce. But switching speed is impacted by much more than simply capacitance, which is why, as you duly noted, other companies' bulk-Si processes are not fail.

GloFo's 28nm will suck though compared to TSMC's when it comes to clockspeeds because its drive currents are so much lower owing to IBM's insistence on going gate-first HKMG integration.

(gate-last integration invokes a huge Idrive improvement above-and-beyond that of gate-first integration because of the additional stress engineering that can be done to the channel)
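
To put rough numbers on the capacitance vs. drive current point, here is a minimal sketch using the usual first-order approximation that switching delay scales as C * V / Idrive. The function name and every value below are placeholders picked for illustration, not figures for any GF or TSMC process.

```python
def gate_delay_s(load_capacitance_f, supply_voltage_v, drive_current_a):
    """First-order switching delay estimate: t ~ C * V / I_drive."""
    return load_capacitance_f * supply_voltage_v / drive_current_a


# Placeholder values only, not measurements of any real process.
baseline = gate_delay_s(1.0e-15, 1.0, 1.0e-3)
lower_cap = gate_delay_s(0.8e-15, 1.0, 1.0e-3)      # less capacitance, e.g. from SOI
higher_drive = gate_delay_s(1.0e-15, 1.0, 1.25e-3)  # more drive current, e.g. from stress engineering

print(f"baseline:     {baseline:.2e} s")
print(f"lower C:      {lower_cap:.2e} s")
print(f"higher Idrive: {higher_drive:.2e} s")
```

In this approximation, cutting capacitance by 20% and raising drive current by 25% shorten the delay by the same amount, which is why capacitance alone does not decide which process clocks higher.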
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
GloFo's 28nm will suck though compared to TSMC's when it comes to clockspeeds because its drive currents are so much lower owing to IBM's insistence on going gate-first HKMG integration.

Which is kind of ironic given that I don't know of any high volume 28nm TSMC designs that clock anywhere close to what Piledriver does, but I could be out of the loop on that one.. What I know of are GPUs - that go wider and slower for best perf/W - and mobile SoCs that don't have the power budget for it, and FPGAs that AFAIK intrinsically can't support nearly as high clocks (but could be totally wrong on this).

The only design I know of that goes for pretty high clocks on TSMC 28nm is SPARC T5 @ 3.6GHz, hopefully Steamroller on GF 28nm bulk manages much higher than that.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
Switching speed is impacted by capacitance, which SOI can reduce. But switching speed is impacted by much more than simply capacitance, which is why, as you duly noted, other companies' bulk-Si processes are not fail.

GloFo's 28nm will suck though compared to TSMC's when it comes to clockspeeds because its drive currents are so much lower owing to IBM's insistence on going gate-first HKMG integration.

(gate-last integration invokes a huge Idrive improvement above-and-beyond that of gate-first integration because of the additional stress engineering that can be done to the channel)



but 32nm GF SOI relative to 28nm GF Bulk is the question. Not to TSMC.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
but 32nm GF SOI relative to 28nm GF Bulk is the question. Not to TSMC.

Whose question?

I was answering Exophase's question of why GF's 28nm would be limited to 3GHz when other companies' 32nm and 28nm bulk processes were not limited to 3GHz.

If there is another question afoot, my answer to Exophase isn't likely to apply since it's not the question I was addressing.

As for my general estimations of GF 28nm vs GF 32nm, I expect clockspeeds to be improved with GF 28nm over GF 32nm because the drive currents have improved, combined with the dimensional shrinking and transistor development that leads to better switching speeds.

I also expect dynamic power consumption to be improved (lower active consumption), but I expect static power consumption (leakage) to be comparable or possibly slightly worse than 32nm because of the removal of SOI (adding substrate leakage back into the equation).
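
The dynamic vs. static split can be sketched with the textbook first-order formulas: switching power is roughly activity * C * V^2 * f, and leakage power is roughly V * I_leak. This is only an illustration; the function names are made up and the inputs in the usage lines are arbitrary placeholders, not estimates for GF's 32nm or 28nm.

```python
def dynamic_power_w(activity_factor, switched_capacitance_f, supply_voltage_v, clock_hz):
    """Switching (active) power: alpha * C * V^2 * f."""
    return activity_factor * switched_capacitance_f * supply_voltage_v ** 2 * clock_hz


def static_power_w(supply_voltage_v, leakage_current_a):
    """Leakage (static) power: V * I_leak, burned even when nothing switches."""
    return supply_voltage_v * leakage_current_a


def total_power_w(activity_factor, switched_capacitance_f, supply_voltage_v,
                  clock_hz, leakage_current_a):
    """Total power is the sum of the switching and leakage terms."""
    return (dynamic_power_w(activity_factor, switched_capacitance_f,
                            supply_voltage_v, clock_hz)
            + static_power_w(supply_voltage_v, leakage_current_a))


# Arbitrary placeholder inputs, just to exercise the functions.
print(dynamic_power_w(0.5, 1e-9, 1.0, 1e9))
print(static_power_w(1.0, 0.25))
print(total_power_w(0.5, 1e-9, 1.0, 1e9, 0.25))
```

A shrink that lowers switched capacitance reduces the first term, while dropping SOI can raise the leakage current in the second term, so the two can move in opposite directions.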

Clockspeeds on 32nm SOI are very much design-dependent, as is true of any process node. Try getting a Llano to 8.67GHz like you can with piledriver, won't happen despite being the same xtors and wiring.
 

inf64

Diamond Member
Mar 11, 2011
3,763
4,221
136
SR stays the high frequency design. It "just" fixes major bottlenecks in BD, that's all.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
SR stays the high frequency design. It "just" fixes major bottlenecks in BD, that's all.

It decreases the degree of CMT, right? (trying to remember) Less things are shared between cores within the module, so the CMT tax will be less.

Unless something is really borked with GF's 28nm, I'd expect 5GHz air-cooled stock SKUs and possibly a smidgen lower TDP (say 110 or 115W)
 

SocketF

Senior member
Jun 2, 2006
236
0
71
It decreases the degree of CMT, right? (trying to remember) Less things are shared between cores within the module, so the CMT tax will be less.
Yes the decoders will be dedicated, only the L1I-Cache and Fetch & Branch prediction will still be shared, though the L1I-size will grow to 96 kB.
Unless something is really borked with GF's 28nm, I'd expect 5GHz air-cooled stock SKUs and possibly a smidgen lower TDP (say 110 or 115W)
Bulk or PD-SOI? If Bulk, how would it be @PD-SOI?
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
It decreases the degree of CMT, right? (trying to remember) Less things are shared between cores within the module, so the CMT tax will be less.
Basically yes. Bulldozer is decode-limited, especially with 256-bit ops. Their method to fix it was to have two decode blocks. They could have maintained a shared decoder if they increased its throughput and allowed ops to be issued to both threads in a clock, but I'm guessing separate blocks were way less risky.

Outside of that, most of the IPC improvements given by AMD are outside of the limitations imposed by CMT and are general core improvements.

Unless something is really borked with GF's 28nm, I'd expect 5GHz air-cooled stock SKUs and possibly a smidgen lower TDP (say 110 or 115W)

Maybe, I'm not sure. I have a layman's theory that Bulldozer spins its wheels, i.e. units are still burning power, but because of a lack of traction you don't get the performance while still burning the same amount of fuel. If these IPC improvements result in better traction then I could see it happening. If these IPC improvements cause additional power usage above Piledriver then I wouldn't expect to see near 5GHz.

Whose question?

It's the question because that's all that matters for SR. TSMC might have a way better 28nm bulk process, but so long as GF's 28nm process is better than its 32nm SOI, that's the important bit for SR.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
Bulk or PD-SOI? If Bulk, how would it be @PD-SOI?

Bulk. There is no PD-SOI @ 28nm, is there? Only FD-SOI as far as I recall.

For FD-SOI, if the PR is to be believed, then when operating in "dual-gate" mode with the substrate biasing, switching speeds are supposed to improve even further, right?

I forget the power-point claims but wasn't it supposed to add on another 20% clockspeed headroom or enable a further 30% reduction in power usage while keeping the same clockspeed? (i.e. the "same benefits as a node shrink provides" claim?)
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Seems close to the ~2.5watts or so, the Atom z2760 has (Acer w510 has a 10.1" screen).
The A4-1200 (Temash) inside the AMD "Larne" platform, was supposedly with a 13.3" screen.

Anand has tested the devices with the Start Screen. You can get a lot lower in idle with the Desktop. The Atom Z2760 tablet I have here is getting 1.6W in idle. I've seen it as low as 1.4W. Also, both the Llano and Sandy Bridge reference systems show that reference systems are usually best-case scenarios, quite the opposite of what they're perceived as.

Playing 1080p video in Youtube(Justin Timberlake's Suit & Tie on S&L) uses only 3.4W.

You can also see review of Hondo from Notebookcheck with the Fujitsu Stylistic Q572.

A Lenovo Thinkpad Tablet 2 (Atom Z2760) with a 30WHr battery gets 60% more battery life in web browsing, 89% better in load, and 25% better in idle compared to the Fujitsu Stylistic Q572, despite having a smaller battery (30WHr vs. 36WHr), lower display brightness throttling on battery, much higher contrast and higher overall and maximum brightness.
 

Gideon

Golden Member
Nov 27, 2007
1,709
3,927
136
A Lenovo Thinkpad Tablet 2 (Atom Z2760) with a 30WHr battery gets 60% more battery life in web browsing, 89% better in load, and 25% better in idle compared to the Fujitsu Stylistic Q572, despite having a smaller battery (30WHr vs. 36WHr), lower display brightness throttling on battery, much higher contrast and higher overall and maximum brightness.

Finally a Z-60 review, I thought it would never see the light of day.
The battery life is indeed appalling, but it was to be expected, especially as it has an always-on fan.

I'm surprised that a 1 GHz Bobcat was actually competitive with a 1.8 GHz Atom in application performance (a lot of help from a very fast SSD); I thought it would be a lot worse. They only tested PCMark though, which makes me suspicious, given it's very storage-sensitive.

Given all that, a fanless 1 GHz dual-core 64SP 3.6 TDP Temash could actually be a half-decent "traditional" tablet, at least during those months prior to Silvermont.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
The Z-60 has pretty much the same perf as a C-50 (found in many netbooks). As for the TDP difference, 9 vs 4.5W, maybe disabling parts of the chip is the reason.

Let me preface this by stating that I got these numbahs from notebookcheck.

I also noticed that the Fujitsu uses more power than the older C-50 based Acer W500. Maybe the relatively lower battery life is due to Fujitsu's choice of components and not so much AMD's chip...
 

SocketF

Senior member
Jun 2, 2006
236
0
71
Bulk. There is no PD-SOI @ 28nm, is there? Only FD-SOI as far as I recall.
GF showed a roadmap at the conference call in February, where they wrote 32/28SHP ... since then I wonder if there is also a PD-SOI process available at 28nm.

Given the strong relations between GF's 32 and 28nm lines, I guess that wouldn't be rocket science, would it?
For FD-SOI, if the PR is to be believed, then when operating in "dual-gate" mode with the substrate biasing, switching speeds are supposed to improve even further, right?
Yes ... but don't forget that FD-SOI is based on the LP libraries. With biasing and the LP libraries the performance is very close to HP bulk (if I remember an STE blog entry correctly). The advantage is "only" lots of power savings.

I forget the power-point claims but wasn't it supposed to add on another 20% clockspeed headroom or enable a further 30% reduction in power usage while keeping the same clockspeed? (i.e. the "same benefits as a node shrink provides" claim?)
Yes, something like that, it should be more or less equal to a node shrink.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
GF showed a roadmap at the conference call in February, where they wrote 32/28SHP ... since then I wonder if there is also a PD-SOI process available at 28nm.

I doubt it because if PD-SOI was on the plate for 28nm at GF then they'd have publicized that ages (years) ago.

Foundries are in the business of touting their offerings from on high, they don't keep that stuff hidden from the public eye or buried in obscure footnotes.

Now I'm hedging my position here because I do not know for fact that GF does not have PD-SOI at 28nm, and they do seem to operate with a bit of an irrational decision-tree matrix so it can't be 100% ruled out.

So while I can say that at this point it is completely unexpected and not anticipated, GloFo may pull a fast one and suddenly release a PD-SOI subnode in the 11th hour, similar to how they are pulling in FD-SOI at 28nm.

Given the strong relations between GF's 32 and 28nm lines, I guess that wouldn't be rocket science, would it?
It isn't rocket science to build a PD-SOI node, in fact it is easier (cheaper) to develop an SOI-based node than it is to design a bulk-Si node of comparable electrical properties.

It raises the production cost though, so there is a cost/benefits equation that you compute to determine which path makes the most sense to you as a fab/IDM.

If say, for example, developing an SOI-based node saves you $200m in R&D (a fixed one-time cost benefit) but raises the production expense of the manufactured wafer by say $200/wafer then you will make money (spend less money) going with SOI provided the total production volume for the lifetime of the SOI-based node is expected to be less than $200m/$200 = 1,000,000 (1m) wafers.

If you produce 1m wafers then your cost-savings from developing the SOI node are washed out by the sum total elevated manufacturing expenses associated with the SOI process.

Produce 2m wafers and suddenly your decision to save a penny has now made you a pound foolish as it costs you an additional $200m versus what you would have expended had you developed the bulk-Si node (at higher expense) and produced those 2m wafers on bulk at lower expense per wafer.
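
The back-of-envelope calc above can be written out as a short sketch. It only uses the example figures from this post ($200m saved in R&D, $200 extra per wafer); the function names are invented for illustration and the real assessment involved far more than this.

```python
def soi_break_even_wafers(rd_savings_usd, extra_cost_per_wafer_usd):
    """Lifetime wafer volume at which the R&D savings are exactly used up."""
    return rd_savings_usd / extra_cost_per_wafer_usd


def soi_net_savings_usd(rd_savings_usd, extra_cost_per_wafer_usd, lifetime_wafers):
    """Positive: SOI came out cheaper overall. Negative: bulk would have been cheaper."""
    return rd_savings_usd - extra_cost_per_wafer_usd * lifetime_wafers


RD_SAVINGS_USD = 200_000_000   # example figure from the post
EXTRA_PER_WAFER_USD = 200      # example figure from the post

print(soi_break_even_wafers(RD_SAVINGS_USD, EXTRA_PER_WAFER_USD))           # 1000000.0
print(soi_net_savings_usd(RD_SAVINGS_USD, EXTRA_PER_WAFER_USD, 1_000_000))  # 0
print(soi_net_savings_usd(RD_SAVINGS_USD, EXTRA_PER_WAFER_USD, 2_000_000))  # -200000000
```

Below 1m wafers the net figure is positive (SOI wins), at 1m it is a wash, and at 2m it is $200m in the red.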

And that is the basic fundamental reason why Intel went bulk, and stayed bulk, but AMD went SOI and stayed SOI until the foundry days began in earnest. (it is also why foundries like TSMC and Samsung are bulk).

I know all this because I was tasked with assessing the viability of converting from bulk-Si to SOI at TI, something we were interested in doing as a means of lowering our R&D costs. But our wafer volumes were simply too high (we did all the pilot R&D work anyways, due diligence and what not, it wasn't as simple as a back of envelope calc like I have portrayed here).

So the question of 28nm PD-SOI is one of "is it easy(ier) to develop if one starts with 32nm PD-SOI", that answer is "yes". The question is "has it been the POR (plan of record) long enough at GloFo for their customers to have started making plans to use it in their designs starting 2-3 yrs ago?"

If they only just recently added it as an option then it will be a few years before we see anyone producing chips with it.
 

grimpr

Golden Member
Aug 21, 2007
1,095
7
81
SR stays the high frequency design. It "just" fixes major bottlenecks in BD, that's all.

You really believe that the major fault in BD's design was the shared decoder of the module? Everybody was and is blaming the slow caches and the CMT philosophy overall without pinpointing exactly the great bottleneck. How much uplift in single-thread performance do you expect to see from the dedicated decoder on each integer core, excluding all other fixes and tunes?
 

Haserath

Senior member
Sep 12, 2010
793
1
81
Considering SOI seems to produce better products(where performance matters), if you potentially spend less in R&D and charge more for your product because it is better, you could get more back from SOI.

Intel doesn't need it simply because nobody is currently competing with them and tick-tock allows them to quickly ramp up performance anyway. I would bet customers would gladly pay for that extra performance though.
 

SocketF

Senior member
Jun 2, 2006
236
0
71
I doubt it because if PD-SOI was on the plate for 28nm at GF then they'd have publicized that ages (years) ago.
(...)
If they only just recently added it as an option then it will be a few years before we see anyone producing chips with it.
(...)
The question is "has it been the POR (plan of record) long enough at GloFo for their customers to have started making plans to use it in their designs starting 2-3 yrs ago?"

They first mentioned 28SHP at their GTC in 2011:



http://www.brightsideofnews.com/news/2011/9/11/rumors-14nm-node-and-450mm-wafers-by-2015.aspx

However, since then it more or less vanished, on GF's 28nm website:
http://www.globalfoundries.com/technology/28nm.aspx

.. it wasn't mentioned at all. So people thought it was gone, but then it suddenly re-appeared in February.

There is also the question if SHP == PD-SOI, but I don't see what else it could be. FD-SOI is no candidate, due to its LP-base.

-----

It raises the production cost though, so there is a cost/benefits equation that you compute to determine which path makes the most sense to you as a fab/IDM.

If say, for example, developing an SOI-based node saves you $200m in R&D (a fixed one-time cost benefit) but raises the production expense of the manufactured wafer by say $200/wafer then you will make money (spend less money) going with SOI provided the total production volume for the lifetime of the SOI-based node is expected to be less than $200m/$200 = 1,000,000 (1m) wafers.

If you produce 1m wafers then your cost-savings from developing the SOI node are washed out by the sum total elevated manufacturing expenses associated with the SOI process.

Produce 2m wafers and suddenly your decision to save a penny has now made you a pound foolish as it costs you an additional $200m versus what you would have expended had you developed the bulk-Si node (at higher expense) and produced those 2m wafers on bulk at lower expense per wafer.

And that is the basic fundamental reason why Intel went bulk, and stayed bulk, but AMD went SOI and stayed SOI until the foundry days began in earnest. (it is also why foundries like TSMC and Samsung are bulk).
Thanks, sounds reasonable, however, I am missing the potential of higher ASPs, because the SOI products will clock higher.

At least if you have high-priced products. Not sure what TI was fabbing back then, but I guess it was rather cheap stuff? The only thing I remember was Sun, they fabbed their chips at TI, didn't they?

-----

You really believe that the major fault in BDs design was the shared decoder of the module? Everybody was and is blaming the slow caches and the CMT philosophy overall without pinpointing exactly the great bottleneck,
I currently believe that the great bottleneck is the narrow 128bit connection of each module to the XBAR.

The reason for this is the performance benefits by overclocking the NB-clock. Performance in some single thread tasks or games are veery nice.

I also checked the decode rate and as soon as you go out of the L2 it is really baad ... less than 1 op per clock on average.

Seems AMD knows that too, if that BSN piece is for real:
To this end the communication facilities between x86 CPU cores and GPU cores have been extended considerably. The width of the internal interface called Onion which connects the GPU to the coherent request queues has been widened to 256-bit in each direction.
AFAIK Onion is dead since Trinity (only used in Llano), but who cares. As long as that 256bit info is correct, I am happy ^^

Just a pity that we probably won't see a Steamroller CPU with L3, if all those cancellation rumors are true.
 