[Sweclockers] AMD Zen coming in Q3 2016, will be on 14 nm


ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,426
8,388
126
There will probably be an iGPU version, but the GPU chip would be a 2.5D or 3D chip stacked module thanks to TSV.



Probably make the GPU silicon at TSMC, the CPU silicon at GF, and 2.5D interposer them to make their APU for consumer markets, no GPU chip for their server/high-end markets, etc. Best of all worlds.

intredasting
 

Shehriazad

Senior member
Nov 3, 2014
555
2
46
It's not so much whether they can fit the components, but whether they can make it economical to fit them. 14 nm will be more expensive than fully depreciated 28 nm.

Since AMD seems to be giving it their all for this chip and it looks like it will decide whether or not AMD will continue with the Desktop CPU market at all...you would expect them to actually try and make a competitive product, right?

I wouldn't see a point in trying to force 14nm so "early" (in AMD's situation this can be considered early...they barely started making 28nm CPUs) unless they either make a physically tiny chip so a wafer can fit a lot of them to stay somewhat cost effective...or they try to fit as much in there as they can for one last attempt at getting a foothold in the desktop CPU market.

While it would be cute to see a candy sized 8-core CPU...I don't see that being able to carry enough oomph, eh?

And like IDC just posted...that chip might be 2.5D/3D...that also plays a big role in cost efficiency.


But we literally got 0 info...just sweclockers saying "Moar coars"
 

III-V

Senior member
Oct 12, 2014
678
1
41
There will probably be an iGPU version, but the GPU chip would be a 2.5D or 3D chip stacked module thanks to TSV.



Probably make the GPU silicon at TSMC, the CPU silicon at GF, and 2.5D interposer them to make their APU for consumer markets, no GPU chip for their server/high-end markets, etc. Best of all worlds.
Sounds expensive. Have interposers actually become economical?
 

Idontcare

Elite Member
Oct 10, 1999
21,118
59
91
Sounds expensive. Have interposers actually become economical?

The point Bryan Black of AMD was trying to make is that everything gets expensive as you continue to progress to smaller design rule nodes, and it gets even more difficult to build a SoC because a SoC requires a process node that is capable of providing a variety of electrical components and attributes that are unfortunately getting even more expensive.

There comes a cost cross-over point in the near future, in his view, at which it makes more sense (economically) to design and fab 2 or 3 discrete chips on 2 or 3 different process nodes, each tailored in cost and capability for the specific purpose of the IC being produced on it (SRAM, MPU logic, analog, GPU logic, etc.), and then have those discrete chips re-integrated via interposer techniques.

The point of it isn't that it will yield a re-integrated SoC that is cheaper than last year's fully integrated SoC, but rather that it will enable next year's SoC to be much less costly than it otherwise would have been were one to attempt to create an equivalent monolithic SoC on, say, a 10nm node which contains all manner of crappy xtor parametric trade-offs in an effort to be a jack of all trades and master of none.

Of course the argument makes perfect logical sense, no arguing with Mr. Black in that regard. But the question on everyone's mind is "in what year or on what process node does the cost-crossover occur?"
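
To make the shape of the trade-off concrete, here's a toy cost model (all numbers are hypothetical stand-ins for wafer pricing, defect density, and assembly overhead; a simple Poisson yield model is assumed):

Code:
import math

def cost_per_good_die(wafer_cost, die_area_mm2, defect_density):
    """Wafer cost spread over gross dies, divided by yield (Poisson model)."""
    wafer_area = math.pi * (300 / 2) ** 2          # 300 mm wafer, edge loss ignored
    gross_dies = wafer_area / die_area_mm2
    good_frac = math.exp(-die_area_mm2 * defect_density)
    return wafer_cost / (gross_dies * good_frac)

# Monolithic SoC: everything on the expensive leading-edge node.
mono = cost_per_good_die(wafer_cost=8000, die_area_mm2=350, defect_density=0.002)

# Split: CPU logic stays on the leading edge, GPU moves to a cheaper
# mature node, plus a guessed fixed cost for interposer + 2.5D assembly.
cpu = cost_per_good_die(wafer_cost=8000, die_area_mm2=150, defect_density=0.002)
gpu = cost_per_good_die(wafer_cost=4000, die_area_mm2=220, defect_density=0.001)
assembly = 10.0

print(f"monolithic: ${mono:.0f}   split: ${cpu + gpu + assembly:.0f}")
# The crossover happens wherever the assembly overhead drops below the
# wafer-cost and yield savings from splitting -- which is exactly the
# open question.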
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,938
408
126
The point Bryan Black of AMD was trying to make is that everything gets expensive as you continue to progress to smaller design rule nodes, and it gets even more difficult to build a SoC because a SoC requires a process node that is capable of providing a variety of electrical components and attributes that are unfortunately getting even more expensive.

There comes a cost cross-over point in the near future, in his view, at which it makes more sense (economically) to design and fab 2 or 3 discrete chips on 2 or 3 different process nodes, each tailored in cost and capability for the specific purpose of the IC being produced on it (SRAM, MPU logic, analog, GPU logic, etc.), and then have those discrete chips re-integrated via interposer techniques.

The point of it isn't that it will yield a re-integrated SoC that is cheaper than last year's fully integrated SoC, but rather that it will enable next year's SoC to be much less costly than it otherwise would have been were one to attempt to create an equivalent monolithic SoC on, say, a 10nm node which contains all manner of crappy xtor parametric trade-offs in an effort to be a jack of all trades and master of none.

Of course the argument makes perfect logical sense, no arguing with Mr. Black in that regard. But the question on everyone's mind is "in what year or on what process node does the cost-crossover occur?"

Cool. Wouldn't this approach also have the benefit of improving yield? Compare that to making a single large monolithic die containing all blocks - then you might have to discard it if any single one of the blocks on the die fails in production. You can of course combat this to some degree, e.g. by selling dies with a failed iGPU as iGPU-less chips, but still.
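
To put rough numbers on that (simple Poisson yield model, made-up defect density):

Code:
import math

D = 0.002                                  # defects per mm^2, made-up
mono = math.exp(-350 * D)                  # one 350 mm^2 die: all blocks must be clean
cpu = math.exp(-150 * D)                   # split version: CPU die
gpu = math.exp(-200 * D)                   # split version: GPU die
print(f"monolithic: {mono:.0%} good")                  # ~50%
print(f"CPU die: {cpu:.0%}, GPU die: {gpu:.0%} good")  # ~74%, ~67%
# Testing each die before assembly means a defect only scraps the small
# die it lands on, not the whole 350 mm^2.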
 

Idontcare

Elite Member
Oct 10, 1999
21,118
59
91
Cool. Wouldn't this approach also have the benefit of improving yield? Compare that to making a single large monolithic die containing all blocks - then you might have to discard it if any single one of the blocks on the die fails in production. You can of course combat this to some degree, e.g. by selling dies with a failed iGPU as iGPU-less chips, but still.

Yes, absolutely. Although it seems with today's die-harvesting techniques there isn't much to be gained over what we already have in place.

The real advantage is that you stop trying to force your iGPU to be fabbed with a process node that was highly specialized to make high frequency CPUs, or you stop trying to make your high frequency CPU cores be fabbed on a process node that was optimized to produce high-density low-frequency mobile chips.

Fab each chip on its own process node, possibly at an entirely different foundry, and then re-integrate them at a packaging house such that the interposer-based MCM delivers maximum benefits.

Look at how Intel approaches Iris Pro: they don't force the advanced node to deliver the best integrated DRAM possible. Instead they let the advanced node do what it is best at, and they use a less expensive (and arguably better in terms of leakage control) node to do what it is good for in terms of DRAM.

http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested/3

The idea has always had merit, but historically it came with a performance tradeoff (at a minimum, which is why MCM'ed chips have such a bad reputation) and a cost trade-off (for the worse, traditionally speaking).

Both of those trade-offs are being engineered out of the equation. In a couple years it will make more sense to go 2.5D and 3D re-integration rather than monolithic SoC.
 

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
Sounds expensive. Have interposers actually become economical?

If you want HBM, you need to go to interposers- and if you've already made that jump, my guess is that it makes sense to break up your SoC into smaller dies on more optimized processes.
 

Sequences

Member
Nov 27, 2012
124
0
76
Is there even a complete 2015 roadmap out yet for AMD for all their businesses? I think it's premature to plan for late 2016 without knowing more about 2015. Unless, of course, there is no roadmap and they're just hoping to survive the next 18 months on fumes.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Is there even a complete 2015 roadmap out yet for AMD for all their businesses? I think it's premature to plan for late 2016 without knowing more about 2015. Unless, of course, there is no roadmap and they're just hoping to survive the next 18 months on fumes.

Don't think so. It gets reshuffled all the time.
 

Shehriazad

Senior member
Nov 3, 2014
555
2
46
Yea...AMD isn't exactly known for actually following through with their roadmaps...might be best if they stop giving those out.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Yea...AMD isn't exactly known for actually following through with their roadmaps...might be best if they stop giving those out.

There is a reason why both AMD and Intel include the following in every roadmap pdf

“subject to change without notice or obligation to notify of changes”
 

carop

Member
Jul 9, 2012
91
7
71
Interestingly enough, this was in a conversation with you.

Here's some hard data to back this up:
http://www-inst.eecs.berkeley.edu/~ee290d/fa13/LectureNotes/Lecture15.pdf
Look at slide 18. TSMC has roughly equivalent performance to Intel's 22nm... on a process that won't show up until next year at best. That lines up exactly with your 1st gen FinFET bit -- a 3 and a half year lag. And their PMOS performance is trailing considerably. And it's unlikely that things have moved much since then, since their targets would have already been dialed in, and they'd just be focusing on yields at this point.

Slide 18 in that lecture deck does not say anything about the TSMC 16nm FinFET process technology. In fact, the TSMC FinFET process technology that is described at IEDM 2010 is different from the 16nm FinFET process technology. The TSMC data on slide 18 are from a high-performance 22/20nm CMOS logic process using the FinFET transistor architecture, with an SRAM cell size of 0.100 μm². The TSMC 16nm FinFET technology has an SRAM cell size of 0.070 μm².

You can, however, do an apples to apples comparison between Intel 22nm FinFET and TSMC 16nm FinFET low power transistor characteristics.

The details of the Intel 22nm FinFET process are available on slide 19 in that lecture deck and elsewhere on the Internet:



Jyunichi Oshita at Nikkei BP Semiconductor Research covered the details of the TSMC 16nm FinFET process:

Transistor characteristics
In regard to the on-state current of the prototyped low leakage transistor (gate length: 34nm), it is 520μA/μm for the nMOS and 525μA/μm for the pMOS with a power supply voltage of 0.75V and an off leakage current of 30pA/μm.

http://techon.nikkeibp.co.jp/english/NEWS_EN/20131213/322503/

In both cases the supply voltage is 0.75V and off leakage current is 30pA/μm. The TSMC nMOS and pMOS appear to perform better. Furthermore, TSMC claims a 15% speed boost and 30% power reduction for its 16FF+ (FinFET Plus) technology.

It may also be a good idea to keep in mind that Intel (and following them TSMC) normalize the drive current (Ion) in their publications, and the Ion comes at the price of higher capacitance. Unfortunately, neither Intel nor TSMC is reporting the capacitance. In order to calculate the speed boost you have to divide the current or gm by capacitance.
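
In CV/I terms, the usual first-order gate-delay metric is:

\tau_{\text{gate}} \approx \frac{C\,V_{DD}}{I_{on}} \quad\Longrightarrow\quad f \propto \frac{I_{on}}{C\,V_{DD}}

so two processes with identical Ion at the same VDD can still differ substantially in speed if their capacitances differ.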
 

Abwx

Lifer
Apr 2, 2011
11,172
3,872
136
Slide 18 in that lecture deck does not say anything about the TSMC 16nm FinFET process technology. In fact, the TSMC FinFET process technology that is described at IEDM 2010 is different from the 16nm FinFET process technology. The TSMC data on slide 18 are from a high-performance 22/20nm CMOS logic process using the FinFET transistor architecture, with an SRAM cell size of 0.100 μm². The TSMC 16nm FinFET technology has an SRAM cell size of 0.070 μm².

You can, however, do an apples to apples comparison between Intel 22nm FinFET and TSMC 16nm FinFET low power transistor characteristics.

The details of the Intel 22nm FinFET process are available on slide 19 in that lecture deck and elsewhere on the Internet:



Jyunichi Oshita at Nikkei BP Semiconductor Research covered the details of the TSMC 16nm FinFET process:

http://techon.nikkeibp.co.jp/english/NEWS_EN/20131213/322503/

In both cases the supply voltage is 0.75V and off leakage current is 30pA/μm. The TSMC nMOS and pMOS appear to perform better. Furthermore, TSMC claims a 15% speed boost and 30% power reduction for its 16FF+ (FinFET Plus) technology.

It may also be a good idea to keep in mind that Intel (and following them TSMC) normalize the drive current (Ion) in their publications, and the Ion comes at the price of higher capacitance. Unfortunately, neither Intel nor TSMC is reporting the capacitance. In order to calculate the speed boost you have to divide the current or gm by capacitance.

Insightful post...

Seems to me that previous comparisons were between TSMC's 2009 FinFETs and Intel's 2012 FinFETs.

As you point out, without parasitic capacitances all those transconductance numbers are moot when estimating performance.
 
Mar 10, 2006
11,715
2,012
126
Slide 18 in that lecture deck does not say anything about the TSMC 16nm FinFET process technology. In fact, the TSMC FinFET process technology that is described at IEDM 2010 is different from the 16nm FinFET process technology. The TSMC data on slide 18 are from a high-performance 22/20nm CMOS logic process using the FinFET transistor architecture, with an SRAM cell size of 0.100 μm². The TSMC 16nm FinFET technology has an SRAM cell size of 0.070 μm².

You can, however, do an apples to apples comparison between Intel 22nm FinFET and TSMC 16nm FinFET low power transistor characteristics.

The details of the Intel 22nm FinFET process are available on slide 19 in that lecture deck and elsewhere on the Internet:



Jyunichi Oshita at Nikkei BP Semiconductor Research covered the details of the TSMC 16nm FinFET process:



http://techon.nikkeibp.co.jp/english/NEWS_EN/20131213/322503/

In both cases the supply voltage is 0.75V and off leakage current is 30pA/μm. The TSMC nMOS and pMOS appear to perform better. Furthermore, TSMC claims a 15% speed boost and 30% power reduction for its 16FF+ (FinFET Plus) technology.

It may also be a good idea to keep in mind that Intel (and following them TSMC) normalize the drive current (Ion) in their publications, and the Ion comes at the price of higher capacitance. Unfortunately, neither Intel nor TSMC is reporting the capacitance. In order to calculate the speed boost you have to divide the current or gm by capacitance.

The 16FF process has higher drive currents than the Intel 22nm process at very low leakages, but that delta narrows substantially above 1 nA/um leakage. Intel seems to be much better at tuning its processes for high performance than for very low leakage.

The 16FF+, from the paper published at IEDM this year, achieves its performance/power boost precisely through lower capacitance rather than any dramatic increase in drive current.
 
Last edited:

carop

Member
Jul 9, 2012
91
7
71
The 16FF process has higher drive currents than the Intel 22nm process at very low leakages, but that delta narrows substantially above 1 nA/um leakage. Intel seems to be much better at tuning its processes for high performance than for very low leakage.

The 16FF+, from the paper published at IEDM this year, achieves its performance/power boost precisely by lower capacitance rather than any dramatic increase in drive current.

I am not trying to dispute that Intel is better at tuning its process.

However, I could argue that Intel accepted a drop in drive current at their 14nm node in exchange for a reduction in FEOL capacitance as well.

The Intel 22nm node has a fin height of about 35nm and a fin pitch of 60nm. At 14nm, they increased the fin height to 42nm and dropped the fin pitch to 42nm. If everything remained the same (gate stack, junctions, parasitic resistance, etc...), they would have almost 70% higher current compared to their 22nm devices. However, the improvements they quote in their IEDM 2014 paper are well below that (15% for nMOS, 41% for pMOS). This means that the device performance was actually degraded. It seems that they accepted a drop in current in exchange for a reduction in the FEOL capacitance (all normalized per transistor width). So, FEOL capacitance is important for Intel as well.
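
For reference, that ~70% falls straight out of the fin geometry if effective width scales as fin perimeter over fin pitch (taking a fin width of roughly 8nm, which is my assumption):

\frac{I_{14}}{I_{22}} \approx \frac{(2 \times 42 + 8)/42}{(2 \times 35 + 8)/60} = \frac{2.19}{1.30} \approx 1.69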
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
If everything remained the same (gate stack, junctions, parasitic resistance, etc...), they would have almost 70% higher current compared to their 22nm devices.
Is that including quantum effects?
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
There will probably be an iGPU version, but the GPU chip would be a 2.5D or 3D chip stacked module thanks to TSV.



Probably make the GPU silicon at TSMC, the CPU silicon at GF, and 2.5D interposer them to make their APU for consumer markets, no GPU chip for their server/high-end markets, etc. Best of all worlds.

Being able to use an interposer (with a separate die for the iGPU) removes one of my complaints about large-iGPU APUs having such a large die size. A single large die is, of course, more expensive to produce than two smaller dies (even in some cases where the combined area of the two smaller dies is somewhat greater than that of the single large die).

However, I am still concerned about these two things:

1. The price enthusiasts (including myself) are willing to pay for large iGPUs, especially when the processor appears to be largely price sensitive. (It just seems that the culture of the enthusiast desktop is against large iGPUs for the most part. Folks like having not only the choice of the exact GPU they use (via video card), but also the ability to sell the GPU and upgrade it.)

2. How much CPU throttling will exist in stock applications (during iGPU load) if the FM3 socket is only rated 95 watts and the processors come with 95 watt coolers?
 

ctsoth

Member
Feb 6, 2011
148
0
0
2. How much CPU throttling will exist in stock applications (during iGPU load) if the FM3 socket is only rated 95 watts and the processors come with 95 watt coolers?

Equivalent products are already on the market, so I would think, if the product is well designed and validated, none.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Equivalent products are already on the market, so I would think, if the product is well designed and validated, none.

The current Kaveri products (A8 and both A10 K SKUs) do throttle below CPU base clocks when CPU and iGPU stress tests like Prime95/FurMark are being run at the same time.

Here is an example of the 65 watt A8-7600 downclocking to 2.4 GHz during Prime95/FurMark. (Other forum members here at AnandTech have also written that the A10-7700K will throttle to 2.8 GHz and the A10-7850K to 3 GHz.)

Now granted, folks have mentioned to me that if P-states are adjusted, the CPU throttling during a stress test plus iGPU load will go away, but none so far have been able to tell me whether this works when using the stock cooler.
 
Last edited:

ctsoth

Member
Feb 6, 2011
148
0
0
The current Kaveri products (A8 and both A10 K SKUs) do throttle below CPU base clocks when CPU and iGPU stress tests like Prime95/FurMark are being run at the same time.

Here is an example of the 65 watt A8-7600 downclocking to 2.4 GHz during Prime95/FurMark. (Other forum members here at AnandTech have also written that the A10-7700K will throttle to 2.8 GHz and the A10-7850K to 3 GHz.)

Now granted, folks have mentioned to me that if P-states are adjusted, the CPU throttling during a stress test plus iGPU load will go away, but none so far have been able to tell me whether this works when using the stock cooler.

I presume that it does, since I believe that people do this with laptops as well.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
I presume that it does, since I believe that people do this with laptops as well.

Laptops are a different story because they come with various types of custom coolers, and some laptops have worse throttling problems than others because of this.
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
The current Kaveri products (A8 and both A10 K SKUs) do throttle below CPU base clocks when CPU and iGPU stress tests like Prime95/FurMark are being run at the same time.

Here is an example of the 65 watt A8-7600 downclocking to 2.4 GHz during Prime95/FurMark. (Other forum members here at AnandTech have also written that the A10-7700K will throttle to 2.8 GHz and the A10-7850K to 3 GHz.)

Now granted, folks have mentioned to me that if P-states are adjusted, the CPU throttling during a stress test plus iGPU load will go away, but none so far have been able to tell me whether this works when using the stock cooler.

My 3635QM throttles to 2.35 GHz when running distributed computing on the CPU and the laptop dGPU at the same time. The GPU doesn't throttle when the CPU and GPU are 99% loaded simultaneously.

In fact, if I only run PrimeGrid/SETI@home on the CPU, I am often in the 3.19-3.3 GHz range, but as soon as I add the GPU into the mix for other projects, my CPU speed on all 8 threads falls below the 2.4 GHz base clock.

I never paid attention to Intel's base clock since on my desktop I always had sufficient cooling. For my next laptop upgrade, I will pay an extra $100-200 to get a chip with a much higher base clock. The CPU temperature does not exceed 94C, which is well below the 105C maximum the 3635QM is rated at.

In practice that means for my laptop usage a 2.8 GHz Intel i7 with a 4.0 GHz Turbo would not be just 18% faster than a 2.2 GHz i7 with a 3.4 GHz Turbo, but actually 27%!
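
The arithmetic:

\frac{4.0}{3.4} \approx 1.18 \;\text{(turbo vs. turbo)} \qquad \text{vs.} \qquad \frac{2.8}{2.2} \approx 1.27 \;\text{(base vs. base, i.e., sustained load)}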

The programs I run aren't even synthetic like FurMark/Prime95. If I tried those, it would be disastrous. I bet my 3635QM would drop to 1.8-2 GHz.

If you read Notebookcheck's max-load data on even the best Intel laptops, when loading the CPU and GPU to 99% simultaneously you will almost always get CPU or GPU throttling, or both. For example, the 970 SLI Aorus X7 will for sure throttle the i7 under such a scenario. The few laptops that might cope better are 9-12 lb bricks with 1.7-2.2 inch thickness and 1.5 hours or less of battery life. I don't consider those products 'laptops'.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
My 3635QM throttles to 2.35 GHz when running distributed computing on the CPU and the laptop dGPU at the same time. The GPU doesn't throttle when the CPU and GPU are 99% loaded simultaneously.

That is pretty good IMO. We are barely seeing any throttling on your CPU (2.4 GHz base clock throttled to 2.35 GHz) compared to the A8-7600's decrease (3.1 GHz base clock throttled to 2.4 GHz).

Of course, we still need to find out if that strong degree of throttling on the desktop APUs still exists with P-states adjusted when the 65 watt stock cooler is used.

EDIT: I see you have mentioned distributed computing, rather than Prime95/FurMark.

P.S. Check out these results for a Kaveri A10-7300 notebook (1.9 GHz base clock, 3.2 GHz turbo, and 384-SP iGPU @ 464 MHz base/533 MHz max frequency) when both CPU and GPU stress tests are run --> http://www.notebookcheck.net/Acer-Aspire-E5-551-T8X3-Kaveri-A10-7300-Notebook-Review.122063.0.html

The rates settled to 1.4 to 1.8 GHz when the stress test was additionally started via Prime95 - we have to speak of throttling here because the base clock is actually 1.9 GHz. The system completely faltered after adding the GPU stress test via FurMark: The CPU remained almost consistently at 1.1 GHz, and the GPU did not manage to surpass 282 MHz

That is a lot of CPU (1.9 GHz base clock --> 1.1 GHz) and GPU (464 MHz --> 282 MHz) throttling relative to the very small amount we are seeing from your Intel notebook.
 
Last edited: