[Techpowerup] AMD "Zen" CPU Prototypes Tested, "Meet all Expectations"


AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
I like to think that my posts are fairly analytical, but obviously as a judge of my own posts I am not exactly impartial

The bottom line is that the expectations around Zen and Zen-based products seem to be inflating with each and every day.

All we know for sure about Zen is that AMD has said that perf/clock goes up by around 40% relative to Excavator and that AMD expects Zen to make the company much more competitive than it has been over the last several years in high performance processors.

Anything beyond that is basically (informed?) speculation.

The original point that I was trying to make in the text that you quoted is that high IPC/high frequency x86 CPUs are really, really hard to design and build. The reason we see 5-10% IPC improvement per generation from Intel is because wringing out those gains is actually super tough and R&D intensive, especially since Intel needs to make these IPC boosts while keeping frequencies high and keeping power efficiency in check.

If it is proving tough for Intel to scale these new performance heights, I have to maintain a fairly high degree of skepticism about AMD's ability to deliver a design in basically one shot that effectively erases the perf/clock gap that has long existed between Intel's and AMD's top designs while at the same time delivering high clocks, and great power efficiency.

We will see if AMD ultimately delivers but from every bit of information available to me in the public domain, I can't exactly be enthusiastic that Zen will be what some of these posters hope. That's all.

If you only ported Excavator to 14nm LPP, you would get the same performance (let's say A10-7870K) at ~60% lower power and at ~half the die area.

Just think about it.
 
Mar 10, 2006
11,715
2,012
126
If you only ported Excavator to 14nm LPP, you would get the same performance (let's say A10-7870K) at ~60% lower power and at ~half the die area.

Just think about it.

I don't think it's that simple. Take Intel's 14nm for example. At very high frequencies (i.e. 4GHz+) Skylake is barely any more efficient than Haswell is (might even be less efficient?), but at lower frequencies (i.e. the ones we see in Ultrabooks and even "H" series laptop processors) the efficiency story favors 14nm over 22nm quite nicely.
 

Abwx

Lifer
Apr 2, 2011
11,433
4,196
136
Then why did Apple go with TSMC 16FF+ for the entirety of the upcoming A10 orders?

It's possible that LPE is slightly below TSMC's 16FF+, and/or it is because of manufacturing capacity, as Samsung uses it for themselves. Besides, it's better for Apple to wait for LPP, as it will bring 10% better performance or 20% lower power. That's not negligible given that Apple's segments require above-average performance.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
I don't think it's that simple. Take Intel's 14nm for example. At very high frequencies (i.e. 4GHz+) Skylake is barely any more efficient than Haswell is (might even be less efficient?), but at lower frequencies (i.e. the ones we see in Ultrabooks and even "H" series laptop processors) the efficiency story favors 14nm over 22nm quite nicely.

You will also have to factor in the CPU architecture. Excavator is a high clock design, it can reach close to 4GHz on the planar 28nm.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Higher IPC = harder to reach frequency and higher power draw.

It's not hard to make a CPU with twice the IPC of Skylake, for example. It's just not worth much if it only clocks at 500MHz while using 150W.

This is Phenom and Bulldozer all over again. We are already at 8C, 4GHz, 95W with Haswell or higher IPC.
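The trade-off above can be put in rough numbers using the usual first-order relation that single-thread throughput scales as IPC times clock. This is only a sketch: the 4GHz, 95W reference point is borrowed from the figures in the post and treating it as the baseline is an assumption, and the function names are hypothetical.

```python
def relative_performance(ipc_ratio: float, clock_ghz: float, base_clock_ghz: float) -> float:
    """First-order estimate: throughput scales with IPC x clock."""
    return ipc_ratio * clock_ghz / base_clock_ghz


def relative_perf_per_watt(rel_perf: float, power_w: float, base_power_w: float) -> float:
    """Relative performance divided by relative power."""
    return rel_perf / (power_w / base_power_w)


# Hypothetical wide design from the post: 2x IPC, 500MHz, 150W,
# compared against an assumed 4GHz, 95W reference chip.
wide_perf = relative_performance(2.0, 0.5, 4.0)
wide_eff = relative_perf_per_watt(wide_perf, 150.0, 95.0)

print(f"relative performance: {wide_perf:.2f}x")
print(f"relative perf/watt:   {wide_eff:.2f}x")
```

Under these assumptions the doubled IPC is more than cancelled by the eightfold clock deficit, which is the point the post is making.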
 
Mar 10, 2006
11,715
2,012
126
It's possible that LPE is slightly below TSMC's 16FF+, and/or it is because of manufacturing capacity, as Samsung uses it for themselves. Besides, it's better for Apple to wait for LPP, as it will bring 10% better performance or 20% lower power. That's not negligible given that Apple's segments require above-average performance.

The only reason I question that A9 is built on LPE and not LPP to be honest is because when I have asked around, I have been told that A9 is an LPP design not LPE, but the confidence level around these answers was good but not 100%.

That's why I'd appreciate any "hard" evidence one way or another because I am legitimately curious.
 

TechGod123

Member
Oct 30, 2015
94
1
0
The only reason I question that A9 is built on LPE and not LPP to be honest is because when I have asked around, I have been told that A9 is an LPP design not LPE, but the confidence level around these answers was good but not 100%.

That's why I'd appreciate any "hard" evidence one way or another because I am legitimately curious.

Right. Sorry for attacking you numerous times.

I'm thinking that because various sources indicate it uses the same process as the Exynos 7420, which used LPE. It would have made sense for the A9 to use LPE as well, as I don't think it would be feasible for Samsung to produce LPE chips for the Exynos AND be manufacturing LPP chips. Do you have a link for your sources?
 
Mar 10, 2006
11,715
2,012
126
Right. Sorry for attacking you numerous times.

It's OK. Although I'm not exactly out to please everybody, I always much prefer to make friends than enemies

I'm thinking that because various sources indicate it uses the same process as the Exynos 7420, which used LPE. It would have made sense for the A9 to use LPE as well, as I don't think it would be feasible for Samsung to produce LPE chips for the Exynos AND be manufacturing LPP chips. Do you have a link for your sources?

Remember that a lot of tech news sites basically parrot one another or they make assumptions without verifying facts, so that's why I am a bit skeptical.

The sources that I have are, unfortunately, not public. Otherwise I would link to them
 

Abwx

Lifer
Apr 2, 2011
11,433
4,196
136
The only reason I question that A9 is built on LPE and not LPP to be honest is because when I have asked around, I have been told that A9 is an LPP design not LPE, but the confidence level around these answers was good but not 100%.

That's why I'd appreciate any "hard" evidence one way or another because I am legitimately curious.

Because the layout and physical design are what require the most time, and before this phase the simulations of the electronic circuits require mathematical models of the transistors, measured on real silicon.

So the process has to be completed at the CPU design phase even if industrial production is still not possible. In this case it's likely that LPP wasn't finalized at the time of the design, nor would it have been ready in time.

That said, GF's iteration of Samsung LPP looks good. The LVT perf/watt has been displayed at 2.41GHz, which is undoubtedly a favourable case and hence well below the max frequency. The sLVT should bring the same frequency uplift relative to the LVT as the LVT does relative to the RVT, which points to 3.5GHz+ as being easily within its reach.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Why should we believe that they have reached 40% higher IPC than Excavator?
We don't even know what they mean by that.

Most likely, the 40% should be preceded by "up to".
You need some data to support a "most likely". With the available information from the GCC patch and no need to have clock frequencies (IPC!), it's possible to show multiple scenarios, where more than 40% could be achieved, even assuming, that they keep the XV bottlenecks in the cache subsystem.

From my POV they'd need to add pause cycles to keep an absolute max. 40% IPC gain.

So I opt for a simulated average value with the remaining question: what's the application mix?
 

Excessi0n

Member
Jul 25, 2014
140
36
101
It means that Samsung LPE is competitive with TSMC's 16FF+...

That seems to be true for the normal phone use case where the chip is mostly idle, but the iPhones with the Samsung chip have substantially lower battery life than the TSMC ones when under a heavy load (geekbench). I hope LPP is a lot better than LPE, because AMD will have a hard time competing on power use if that's not the case.
 

TechGod123

Member
Oct 30, 2015
94
1
0
You need some data to support a "most likely". With the available information from the GCC patch and no need to have clock frequencies (IPC!), it's possible to show multiple scenarios, where more than 40% could be achieved, even assuming, that they keep the XV bottlenecks in the cache subsystem.

From my POV they'd need to add pause cycles to keep an absolute max. 40% IPC gain.

So I opt for a simulated average value with the remaining question: what's the application mix?
Love your blog posts btw.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
100% more integer exe resources than an EXV core, 150% more FP exe resources than a whole module...

The 14nm LVT, the one whose perf/watt figures were published, would double the perf/watt of a chip like Kaveri, and there's a faster sLVT enhancement...

I have no doubt that you'll be deeply disappointed...

Hey, where have you been? I thought you only got 30 days, but it's been over two months.
 

Abwx

Lifer
Apr 2, 2011
11,433
4,196
136
Hey, where have you been? I thought you only got 30 days, but it's been over two months.

As I said, I voluntarily didn't post. You know that AT is not where the most insight is, but let's say that ultimately I had some mercy for some starving trolls...


That seems to be true for the normal phone use case where the chip is mostly idle, but the iPhones with the Samsung chip have substantially lower battery life than the TSMC ones when under a heavy load (geekbench). I hope LPP is a lot better than LPE, because AMD will have a hard time competing on power use if that's not the case.

LPP consumes 20% less than LPE at equal frequency.

The LPP LVT improvement over what AMD currently has has been published, and it is more than 2x better perf/watt. AMD will more likely use the sLVT, which leaks more but is faster.

The numbers for LPP LVT compared to 28nm HPP are 2.06x the speed at 1.46x the power consumption. Normalized to equal frequency, the improvement is (2.06^2)/1.46 = 2.9x better perf/watt, or 65% lower power drain.

That's of course at 2.41GHz, which has surely been picked because it's a favourable section of the LPP power curve, but still, a 2x improvement is in the realm of possibilities.
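The arithmetic above can be replayed in a few lines. This sketch only reproduces the formula as written in the post (speed ratio squared over power ratio); whether that normalization is physically justified is not checked here, and the function and variable names are hypothetical.

```python
def normalized_perf_per_watt(speed_ratio: float, power_ratio: float) -> float:
    """The post's normalization: speed_ratio**2 / power_ratio."""
    return speed_ratio ** 2 / power_ratio


# Figures quoted in the post for 14nm LPP LVT versus 28nm HPP.
gain = normalized_perf_per_watt(2.06, 1.46)

# Equivalent power reduction at equal performance, as a fraction.
power_reduction = 1.0 - 1.0 / gain

print(f"perf/watt gain:  {gain:.2f}x")
print(f"power reduction: {power_reduction:.1%}")
```

The result lands at roughly 2.9x and a bit over 65%, matching the figures quoted in the post.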
 

Abwx

Lifer
Apr 2, 2011
11,433
4,196
136
You need some data to support a "most likely". With the available information from the GCC patch and no need to have clock frequencies (IPC!), it's possible to show multiple scenarios, where more than 40% could be achieved, even assuming, that they keep the XV bottlenecks in the cache subsystem.

From my POV they'd need to add pause cycles to keep an absolute max. 40% IPC gain.

So I opt for a simulated average value with the remaining question: what's the application mix?

No need for data, or rather history speaks for itself, as AMD has historically published average improvements, not best case.

Or perhaps we should look at the 4% announced at ISSCC for Carrizo that ended up being 9-13% in Cinebench...
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
I like to think that my posts are fairly analytical, but obviously as a judge of my own posts I am not exactly impartial

The bottom line is that the expectations around Zen and Zen-based products seem to be inflating with each and every day.

All we know for sure about Zen is that AMD has said that perf/clock goes up by around 40% relative to Excavator and that AMD expects Zen to make the company much more competitive than it has been over the last several years in high performance processors.

Anything beyond that is basically (informed?) speculation.

The original point that I was trying to make in the text that you quoted is that high IPC/high frequency x86 CPUs are really, really hard to design and build. The reason we see 5-10% IPC improvement per generation from Intel is because wringing out those gains is actually super tough and R&D intensive, especially since Intel needs to make these IPC boosts while keeping frequencies high and keeping power efficiency in check.

If it is proving tough for Intel to scale these new performance heights, I have to maintain a fairly high degree of skepticism about AMD's ability to deliver a design in basically one shot that effectively erases the perf/clock gap that has long existed between Intel's and AMD's top designs while at the same time delivering high clocks, and great power efficiency.

We will see if AMD ultimately delivers but from every bit of information available to me in the public domain, I can't exactly be enthusiastic that Zen will be what some of these posters hope. That's all.

Intel has you convinced that it is difficult for them to give you more performance, that's just what they want you to think.

If Intel wanted to give you more, they'd give you more. The only factor is achievable profit margin. Intel could absolutely give bigger performance jumps, but it would be unusual for them to do so.

Since Penryn, they really haven't been that inconsistent:




Something to pay attention to here, as well, is that Excavator is pretty much dead even with Penryn for IPC. Add 40% and you land squarely on Haswell.
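For a sense of scale, a 40% gain can be set against the 5-10% per-generation figure quoted at the top of this post. The sketch below assumes per-generation gains compound multiplicatively, which is a simplification, and the function name is hypothetical.

```python
import math


def generations_needed(total_gain: float, per_gen_gain: float) -> int:
    """Smallest whole number of generations whose compounded
    per-generation gain reaches the total gain."""
    return math.ceil(math.log(1.0 + total_gain) / math.log(1.0 + per_gen_gain))


# 40% total, at the low and high ends of the quoted 5-10% range.
slow = generations_needed(0.40, 0.05)
fast = generations_needed(0.40, 0.10)

print(f"at 5% per generation:  {slow} generations")
print(f"at 10% per generation: {fast} generations")
```

Under that assumption, 40% corresponds to somewhere between four and seven generations of incremental gains.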

I'm really surprised anyone thinks Zen will reach Skylake at all.

Thinking it might be the next Bulldozer is at least somewhat reasonable (though the fact that AMD stumbled that badly... once... means nothing... I seem to remember Intel doing something quite similar ^_^).
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
More numbers...

[Table: GlobalFoundries FinFETs vs 28nm, comparing the GF processes 28 SLP, 28 HPP and 14 LPP; values not preserved]

Thanks for posting the numbers ;-)

Just goes to show what I've had to tell so many people, there's really no reason why Zen shouldn't be able to clock well on 14nm LPP - beyond AMD's lack of experience with it and the new IP involved (if anything's going to slow down the clock speeds, I think it'll be the caches).
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Using the recent past performance of a company to form expectations of what type of product they will release makes one a "shill"?

How do you figure?

Which is magical thinking? Expecting a company to perform similarly to how they have for the last 6 years or so, or expecting some massive improvement despite fewer resources?

Are you saying Intel can never recover from their Pentium 4 debacle?

AMD stumbles one time and you automatically assume they will stumble again when they have done so much that clearly shows they are on the right track? They even brought in a great deal of specialized talent (more than just Jim Keller, he's just the biggest name), including some who had FinFET experience.

With what we know about Zen (which is more than many seem to realize), it should be quite an interesting CPU. The FPU is the part whose performance we don't really know, though I've done some work towards estimating that:

http://looncraz.net/ZenAssignments.html
 
Aug 11, 2008
10,451
642
126
Intel has you convinced that it is difficult for them to give you more performance, that's just what they want you to think.

If Intel wanted to give you more, they'd give you more. The only factor is achievable profit margin. Intel could absolutely give bigger performance jumps, but it would be unusual for them to do so.

Since Penryn, they really haven't been that inconsistent:




Something to pay attention to here, as well, is that Excavator is pretty much dead even with Penryn for IPC. Add 40% and you land squarely on Haswell.

I'm really surprised anyone thinks Zen will reach Skylake at all.

Thinking it might be the next Bulldozer is at least somewhat reasonable (though the fact that AMD stumbled that badly... once... means nothing... I seem to remember Intel doing something quite similar ^_^).

So would you care to explain how giving less performance leads to bigger profits? Seems to me more performance gains would lead to more sales (new sales and upgrades), thus more profits. Do you really think Intel is purposely holding back performance, especially with the increasing competition from ARM and Mobile?

It is a simple fact that the easy gains have been made long ago and every incremental gain becomes increasingly difficult, especially when trying to do it in a power constrained envelope.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
So would you care to explain how giving less performance leads to bigger profits? Seems to me more performance gains would lead to more sales (new sales and upgrades), thus more profits. Do you really think Intel is purposely holding back performance, especially with the increasing competition from ARM and Mobile?

It is a simple fact that the easy gains have been made long ago and every incremental gain becomes increasingly difficult, especially when trying to do it in a power constrained envelope.

Sure. Intel doesn't have to worry about performance to drive sales, they have no competition. The user upgrade market is quite small compared to the OEM market as well, so they can make more money by releasing only small upgrades and charging those who make the jump each time (or many of the times) out the yang.

The real money is in servers. There, Intel is in the same situation. They have no competition, so they only advance as much as they must to keep sales up and margins high.

While x86 gains are certainly not extremely simple, if you look at the changes Intel has made to the Core design to get us what we have today it is quite clear that they have only been doing what it takes to provide a small boost that will keep people interested and give the OEMs a more expensive CPU to purchase.

Every change is incremental and simple (for the most part). These are things we said Intel could do when Core 2s were out and about to increase performance (full AGUs, another ALU and FPU port, more scheduler slots, improved branch prediction, etc.) together with a few things that are more unique (reduced misprediction penalties).

Almost everything Intel has done has been to give the illusion of struggling to create more performance. They keep a GPU on these dies that will often go completely unused in higher end systems... they could EASILY take that off and use that die space to provide more performance. Either with two more cores or with wider cores.

The biggest hurdle to overcome is the microcode itself. When you add an ALU you have to know what you will have it execute, and the same goes for an extra FPU pipe.

Want to know how to make it wildly faster? Easy:

8xALU/FPU pipes, 4x AGUs, 1x Store AGU, 1x Store Data, double the caches, double the instruction fetch, double the register pool, etc. Assign four logical cores to this core and use a single unified scheduler.

Now, each thread can potentially access 2x MUL, 2x DIV, 8x ADD, 8xLEA, 4x LOAD, 5xSTORE, 8x fop, etc... all at once.

You throw all of the instructions into the same cache, without any regard for which thread owns what, and you pull the instructions. When an instruction goes through decoding the register is chosen in the pool based on the thread to which the instruction belongs, and the data is loaded, and the instruction continues like nothing else is different.

This is completely doable, and would easily fit in the die space (it would actually be a bit smaller than two cores, but should perform just as well). With Intel's economies of scale this would probably even be affordable for those who *really* need high IPC.

Of course, this will NOT double IPC, as there's always a point in the code where you start working a bit too far out of the context and there are legacy concerns, but this is worth a good 50% single threaded boost.

You would then have Celerons and Pentiums with one super-fast core and two or more threads.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
You need some data to support a "most likely". With the available information from the GCC patch and no need to have clock frequencies (IPC!), it's possible to show multiple scenarios, where more than 40% could be achieved, even assuming, that they keep the XV bottlenecks in the cache subsystem.

From my POV they'd need to add pause cycles to keep an absolute max. 40% IPC gain.

So I opt for a simulated average value with the remaining question: what's the application mix?

This is exactly what I do. Every estimate I make based on the known core actually takes me over 40% rather frequently. Though some areas are no better at all.

http://looncraz.net/ZenAssignments.html

Here, I've tried to break down the gcc patch to determine the capabilities of each unit, mostly to figure out that darn FPU.

It makes sense now. I can almost see the wires
 