New Zen microarchitecture details


raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
I am hearing rumors that zen is on par with kaby lake performance, wow amd does not kid around any longer.

With AMD the hype and rumours always end up the exact opposite of reality as we saw with Fiji and Polaris. With Zen being confirmed by AMD as being built on a high density 14nm FINFET process there is no chance of it hitting even 4 Ghz. Maybe with Zen+ if AMD go with GF 14HP (IBM 14nm SOI FINFET) there is a chance that AMD can hit 4+ Ghz clocks. AMD's approach here seems to be exactly what they did with Carrizo and then Bristol Ridge. Carrizo using high density libraries launched with 3.5 Ghz clocks. Bristol Ridge has now launched with 4+ Ghz clocks. I think we will see something similar with Zen being limited to 3.5 Ghz. IBM's 14nm SOI FINFET aka GF 14HP is expected to go into volume production in H2 2017 . Power 9 and Zen+ will be able to use that process and we can expect them to ship in late 2017/early 2018.
 

superstition

Platinum Member
Feb 2, 2008
2,219
221
101
With AMD the hype and rumours always end up the exact opposite of reality as we saw with Fiji and Polaris. With Zen being confirmed by AMD as being built on a high density 14nm FINFET process there is no chance of it hitting even 4 Ghz. Maybe with Zen+ if AMD go with GF 14HP (IBM 14nm SOI FINFET) there is a chance that AMD can hit 4+ Ghz clocks. AMD's approach here seems to be exactly what they did with Carrizo and then Bristol Ridge. Carrizo using high density libraries launched with 3.5 Ghz clocks. Bristol Ridge has now launched with 4+ Ghz clocks. I think we will see something similar with Zen being limited to 3.5 Ghz. IBM's 14nm SOI FINFET aka GF 14HP is expected to go into volume production in H2 2017 . Power 9 and Zen+ will be able to use that process and we can expect them to ship in late 2017/early 2018.
What hype and rumor said Zen would be on a high-power node with a performance library? The Stilt said, many months ago, on overclock.net that it would be the low power process for Zen. He even said it might be better, for an enthusiast-oriented chip, to use 32nm SOI, in terms of outcome — which obviously doesn't suggest a great result for Zen in terms of high TDP high clock enthusiast performance.

However, as long as consoles remain AI weak due to weak CPUs and developers continue the very old trend of focusing mostly on graphics (remember how Cell was good at streaming and horrible for AI) Zen should probably be fine for gaming. Even the outdated Piledriver architecture can deliver adequate framerates with optimized code.
 

jpiniero

Lifer
Oct 1, 2010
15,172
5,707
136
Starship... Is that the 7nm part that was rumored? Come to think of it, the new AMD-GloFo WSA gives that rumor more credibility...

You mean at TSMC? That would be interesting... although I have no idea how long it will take for it to get cheap enough for AMD to use. I imagine probably 3 years from now.
 
Mar 10, 2006
11,715
2,012
126
You mean at TSMC? That would be interesting... although I have no idea how long it will take for it to get cheap enough for AMD to use. I imagine probably 3 years from now.

Maybe longer. Here's Lisa Su:

AMD is establishing a number of products on 14-nm FinFET in 2016, Su said. “But it will be a long node. It will last three, four, five years. But within that node we can do a lot in optimization, and within that node, we can do a lot on power... once you’re in the node, it’s all about architecture.”

http://www.pcworld.com/article/3020...will-survive-and-yes-even-thrive-in-2016.html
 

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
Now that Labor Day weekend is upon us, hopefully AMD will release more info on ZEN.

After Bulldozer, I've learned to be more of a realist concerning AMD cpu performance vs Intel.

I expect the first generation of Zen will be a decent bump up but not equal to Intel. However, over time perhaps the refinements can produce a polished and extremely competitive chip.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Remember that the chip will be useful for other things like Virtual Machines, giving an edge on that. It can go up 500 dollars at worst and 700 at best.

Even if Zeppelin would somehow match the IPC and the frequency of Broadwell-E and it would be priced based on the relative throughput (GFlops), it should cost exactly half of the price of an i7-6900K. However, since neither the IPC nor the frequencies will match Broadwell-E, and the number of memory channels and PCI-E lanes is smaller, the realistic price for the fastest 8C/16T Zeppelin is around 1/3 of the price of the i7-6900K ($1089). That's just my personal opinion, of course.
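
To make the arithmetic in this estimate concrete, here is a minimal Python sketch. The 0.5 throughput ratio and the one-third factor are taken from the post itself; the helper name `throughput_based_price` is hypothetical and only illustrates the reasoning, not any real pricing model.

```python
I7_6900K_PRICE_USD = 1089  # i7-6900K price quoted in the post


def throughput_based_price(reference_price, relative_factor):
    """Scale a reference price by a relative factor (hypothetical helper)."""
    return reference_price * relative_factor


# Case 1: IPC and frequency match Broadwell-E and price follows relative
# peak throughput (GFlops), which the post puts at one half.
throughput_case = throughput_based_price(I7_6900K_PRICE_USD, 0.5)

# Case 2: the post's "realistic" estimate of about one third of the price.
realistic_case = throughput_based_price(I7_6900K_PRICE_USD, 1 / 3)

print(f"throughput-based: ${throughput_case:.2f}")
print(f"realistic:        ${realistic_case:.2f}")
```

Both figures are only as good as the ratios fed in; changing the assumed throughput ratio moves the result proportionally.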
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
There is not such a confirmation for HDL as far as i know.

http://www.anandtech.com/show/10578...rs-micro-op-cache-memory-hierarchy-revealed/3

"The combination of FinFET with the fact that AMD confirmed that they will be using the density-optimised version of 14nm FinFET (which will allow for smaller die sizes and more reasonable efficiency points) also contributes to a shift of either higher performance at the same power or the same performance at lower power."

There is no high performance 14nm FINFET process available for volume production in 2016. The GF 14HP process which is based on IBM 14nm SOI FINFET is scheduled for volume production in H2 2017 with Power 9 servers hitting the market in late 2017.

I think its a fair argument that AMD will use the approach which they used for Carrizo and Bristol Ridge with Zen and Zen+. So I expect clocks to be around 3.5 Ghz max for 2017 and with Zen+ most likely to use GF 14HP we are likely to see 4+ Ghz clocks in 2018. It would be a good contest in 2018 for Kabylake based enthusiast 8 core CPUs vs Zen+ based 8 core CPUs built on IBM 14nm SOI FINFET aka GF 14HP.

It would be interesting to see how competitive IBM's 14nm SOI FINFET process is vs Intel 14nm +. Generally IBM processes are tuned for very high transistor performance with very high clocks of 5 Ghz.
 
Reactions: Phynaz

Abwx

Lifer
Apr 2, 2011
11,543
4,325
136
Even if Zeppelin would somehow match the IPC and the frequency of Broadwell-E and it would be priced based on the relative throughput (GFlops), it should cost exactly half of the price of an i7-6900K.

Lol, you are talking about peak throughput with AVX, which never occurs in any software?

Quite a "metric" at work here; so it could be better at just about anything, but hey, it's the peak GFlops that counts...


However, since neither the IPC nor the frequencies will match Broadwell-E, and the number of memory channels and PCI-E lanes is smaller, the realistic price for the fastest 8C/16T Zeppelin is around 1/3 of the price of the i7-6900K ($1089). That's just my personal opinion, of course.

More nonsense. Do you have numbers that suggest it won't match BDW's IPC, or is it, as usual, some kind of wishful thinking?

So far AMD's Blender demo contradicts you, and they likely used a generic version; they just used a different file than the one in that software's benchmark.

Anyway, your "estimations" are based on the fact that you got many aspects of Zen wrong; now you are just left making wild speculations in the hope that they'll be fulfilled.

Someone posted a link to a Bitsandchips article about an interview with AMD's engineers:
http://www.eweek.com/pc-hardware/amd-lays-out-the-argument-for-zen-at-hot-chips-show-2.html

The company demonstrated those capabilities during the event for analysts and journalists a week ago, and are expecting the numbers to follow through when the chips start appearing in systems later this year and as they ramp in 2017. Officials also said that tests found the performance of the Zen chips were competitive with—and at times outperformed—Intel's new "Broadwell-E" chips.

So apparently they are not saying that it outperforms it in only a single instance.


I think its a fair argument that AMD will use the approach which they used for Carrizo and Bristol Ridge with Zen and Zen+. So I expect clocks to be around 3.5 Ghz max for 2017

And Bristol Ridge tops out at 4.2GHz, so if they use the same approach, why should Zen be limited to 3.5GHz?
 

KTE

Senior member
May 26, 2016
478
130
76
AMD's approach here seems to be exactly what they did with Carrizo and then Bristol Ridge.
From Sam, Jim and Mike's interviews, it seems they went for something in between Bulldozer and Carrizo in terms of the frequency/power/performance triangle.

Because with logic libraries, high performance is typically opposite to low power.



Sent from HTC 10
(Opinions are own)
 

turtile

Senior member
Aug 19, 2014
622
299
136
http://www.anandtech.com/show/10578...rs-micro-op-cache-memory-hierarchy-revealed/3

"The combination of FinFET with the fact that AMD confirmed that they will be using the density-optimised version of 14nm FinFET (which will allow for smaller die sizes and more reasonable efficiency points) also contributes to a shift of either higher performance at the same power or the same performance at lower power."

There is no high performance 14nm FINFET process available for volume production in 2016. The GF 14HP process which is based on IBM 14nm SOI FINFET is scheduled for volume production in H2 2017 with Power 9 servers hitting the market in late 2017.

They haven't mentioned HDL anywhere (design not process). Although, I think it's something they might use - definitely in the APUs again.
 

Gideon

Golden Member
Nov 27, 2007
1,769
4,131
136
More nonsense. Do you have numbers that suggest it won't match BDW's IPC, or is it, as usual, some kind of wishful thinking?
It doesn't have to match BWE's performance per se to be competitive. It can also be slower on average while being more efficient, for instance. IMO, Ivy Bridge-level IPC on average would already be extremely good (that is actually pretty competitive with Broadwell), especially considering the probable TDP for an 8-core chip.

IMO, taking it for granted that Zen's first iteration will beat Broadwell's IPC, based on one internal benchmark, and considering some of the limits we know about in the architecture (victim L3 cache, etc.), looks to me like boarding the very same hype train that said the RX 480 would beat the GTX 980.

Abwx said:
And Bristol Ridge tops out at 4.2GHz, so if they use the same approach, why should Zen be limited to 3.5GHz?
Wait, what CPU? The only ones I'm aware of are 3.0/3.7GHz (though those are also 35W TDP mobile parts), and that's on a mature 28nm process node that should be significantly less mobile-focused.
 

Gideon

Golden Member
Nov 27, 2007
1,769
4,131
136
They haven't mentioned HDL anywhere (design not process). Although, I think it's something they might use - definitely in the APUs again.
Yes, but I highly doubt Ian Cutress just pulled it out of nowhere. After all he did also mention:

http://www.anandtech.com/show/10591...art-2-extracting-instructionlevel-parallelism
"While we were unable to attend the event in person, we managed to get some hands on time with information and put questions to Mike Clark, AMD Senior Fellow and design engineer."
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
http://www.anandtech.com/show/10578...rs-micro-op-cache-memory-hierarchy-revealed/3

"The combination of FinFET with the fact that AMD confirmed that they will be using the density-optimised version of 14nm FinFET (which will allow for smaller die sizes and more reasonable efficiency points) also contributes to a shift of either higher performance at the same power or the same performance at lower power."

There is no high performance 14nm FINFET process available for volume production in 2016. The GF 14HP process which is based on IBM 14nm SOI FINFET is scheduled for volume production in H2 2017 with Power 9 servers hitting the market in late 2017.

I think its a fair argument that AMD will use the approach which they used for Carrizo and Bristol Ridge with Zen and Zen+. So I expect clocks to be around 3.5 Ghz max for 2017 and with Zen+ most likely to use GF 14HP we are likely to see 4+ Ghz clocks in 2018. It would be a good contest in 2018 for Kabylake based enthusiast 8 core CPUs vs Zen+ based 8 core CPUs built on IBM 14nm SOI FINFET aka GF 14HP.

It would be interesting to see how competitive IBM's 14nm SOI FINFET process is vs Intel 14nm +. Generally IBM processes are tuned for very high transistor performance with very high clocks of 5 Ghz.

This only confirms they will use LPP and not LPE, and from what I understand from GloFo themselves, the 14nm LPP process combines all libraries (HDL, HP, etc.) into a single process.
That means there is no 14nm LPP HP or 14nm LPP HDL or 14nm LPP LP, etc. This time, because of FinFETs, they designed a single process (LPP) that has the highest density (M1 double patterning), the highest performance (increased fin height) and lower power due to fully depleted FinFETs.

Edit: This statement also confirms ZEN will use M1 Double Patterning for highest Density and highest Performance. It will be more expensive but it will have the highest density and performance the 14nm LPP can offer.
 

KTE

Senior member
May 26, 2016
478
130
76
Wouldn't the size/power/clocks be a function of the EDA tools available/used anyway?

Sent from HTC 10
(Opinions are own)
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Wouldn't the size/power/clocks be a function of the EDA tools available/used anyway?

Sent from HTC 10
(Opinions are own)

The EDA (IC Compilers and Validators) for the 14nm LPP includes all the libraries for the process, but it is the designers who choose whether or not to use M1 double patterning for the highest density and performance.
Using M1 double patterning increases the design cost (the design takes longer to finish because there is more work), doubles the mask cost (you need two or more masks for the same design) and increases the actual wafer manufacturing cost because of the extra process steps.

I believe that Polaris 10 and 11 use little or no M1 double patterning, to reduce cost, but this affects the density and performance of the IC design.
 

FlanK3r

Senior member
Sep 15, 2009
321
84
101
And Bristol Ridge tops out at 4.2GHz, so if they use the same approach, why should Zen be limited to 3.5GHz?

Don't you think power consumption is different with 2CU/4C vs 8C/16T? At higher clocks the CPU is more power hungry, of course...
 

Abwx

Lifer
Apr 2, 2011
11,543
4,325
136
It doesn't have to match BWE's performance per se to be competitive. It can also be slower on average while being more efficient, for instance. IMO, Ivy Bridge-level IPC on average would already be extremely good (that is actually pretty competitive with Broadwell), especially considering the probable TDP for an 8-core chip.


Ivy Bridge is a 3-ALU design; besides, the two FP execution units use the same ports as the first two ALUs. By comparison, Zen has 4 ALUs and 4 FP execution units that all have their own ports, so it's likely that its IPC will be higher than IB's.

For TDP they gave enough info: in FP it's apparently somewhat more efficient than BDW. We also know that a 14nm Zen core consumes the same power at a given frequency as a 28nm EXV core. Given that the uncore will also benefit from lower power consumption, an 8C Zen chip should consume less than a theoretical 8C EXV. We can extrapolate the latter's power drain from the Athlon 845; that's a reasonable scenario given that this chip is overvolted relative to regular production.


IMO, taking it for granted that Zen's first iteration will beat Broadwell's IPC, based on one internal benchmark

That's not an internal benchmark but legacy software, or is Blender AMD property?


Wait, what CPU? The only ones I'm aware of are 3.0/3.7GHz (though those are also 35W TDP mobile parts), and that's on a mature 28nm process node that should be significantly less mobile-focused.


http://i.imgur.com/vpvXsEV.png
 

Abwx

Lifer
Apr 2, 2011
11,543
4,325
136
Don't you think power consumption is different with 2CU/4C vs 8C/16T? At higher clocks the CPU is more power hungry, of course...

This has been answered by AMD: an EXV core consumes the same power (at the same frequency) as a Zen core.

The Athlon 845 uses 46W in Cinebench R15 and the power per core is about 8.5W, so with 8 cores this would amount to 80W. The uncore would be more stressed (due to the increased throughput), but it will also benefit from FinFETs and will be more economical (at equal throughput) than in Carrizo/Bristol Ridge.

At first glance the 95W is a worst-case figure with Prime95; effective power under regular multithreaded load should be 20% below this level. At which frequency is still unknown, but AMD said that they'll release chips at higher frequencies than the 2.8GHz ESs.
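
A rough sketch of that extrapolation in Python. The 46W, 8.5W, 95W and 20% figures are the ones quoted in this post; treating the Athlon 845 as a 4-core part and deriving the uncore share as "total minus four cores" is an assumption about how the 80W figure was reached, and the helper name is hypothetical.

```python
ATHLON_845_TOTAL_W = 46.0  # Cinebench R15 power quoted in the post
EXV_CORE_W = 8.5           # per-core power quoted in the post
ATHLON_845_CORES = 4       # assumption: quad-core Excavator part


def estimate_package_power(cores, core_w, uncore_w):
    """Cores times per-core power, plus a fixed uncore share (hypothetical helper)."""
    return cores * core_w + uncore_w


# Assumption: whatever is not core power on the Athlon 845 is uncore.
uncore_w = ATHLON_845_TOTAL_W - ATHLON_845_CORES * EXV_CORE_W

# Eight cores at the same per-core power, with the uncore held constant.
eight_core_w = estimate_package_power(8, EXV_CORE_W, uncore_w)

# The post treats 95W as worst case and typical load as 20% lower.
TDP_W = 95.0
typical_load_w = TDP_W * (1 - 0.20)

print(f"uncore: {uncore_w:.1f} W")
print(f"8-core estimate: {eight_core_w:.1f} W")
print(f"typical load under a 95 W TDP: {typical_load_w:.1f} W")
```

Holding the uncore constant is the simplification here; the post argues it would be stressed more by the extra cores but helped by FinFETs, so the real figure could land on either side.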
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
I see that Mark Papermaster reiterated the 40% IPC gain at IFA. That would put it around Ivy Bridge. That doesn't seem to me to be an unrealistic goal. And they have been very consistent about that 40% number. Unlike with the construction cores, the official word anyway has felt like an attempt to keep expectations at least somewhat grounded. The hype machine seems to exist in forums, and unofficial benchmarks with little or no credibility.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
40% is the average; it will be higher in some applications and lower in others. For example, it could have higher IPC in Blender and lower IPC in Cinebench.

What most users here are concerned about is how it will perform in gaming. Will it be close to Ivy or closer to Broadwell/Skylake? Also, will it OC beyond 4-4.5GHz or will it top out at lower clocks?
I don't believe anyone will really be concerned about power, even if Zen uses 10-20W more than its Intel counterparts, as long as it is close in perf.
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
40% is the average; it will be higher in some applications and lower in others. For example, it could have higher IPC in Blender and lower IPC in Cinebench.

What most users here are concerned about is how it will perform in gaming. Will it be close to Ivy or closer to Broadwell/Skylake? Also, will it OC beyond 4-4.5GHz or will it top out at lower clocks?
I don't believe anyone will really be concerned about power, even if Zen uses 10-20W more than its Intel counterparts, as long as it is close in perf.

Of course it'll be an average. That would only make sense. So far, AMD hasn't said anything about clockspeed so that's a 'wait and see' thing. Same with gaming performance. I'm looking forward to its release, and reviewers getting their hands on it. Hey, it's something new anyway. And regardless of the outcome it'll be interesting reading.
 
Reactions: Sweepr

Abwx

Lifer
Apr 2, 2011
11,543
4,325
136
The hype machine seems to exist in forums, and unofficial benchmarks with little or no credibility.

The lack of credibility lies rather in your tendency to downplay what they showed. Blender is not unknown software, and it surely has more credibility than Sysmark if we want to check a CPU's throughput.

That they used their own file, a Zen logo, won't change the results one iota; indeed, the test took the same time, within ±10%, as Blender's single-frame rendering benchmark, which points to an equivalent amount of computation for said frame.

As for IB's IPC, perhaps you should do some research, because it looks like you are granting it even higher IPC than Intel's current lineup. FTR, at the same frequency HW does 51% better in the Intel-favouring Cinebench 11.5 and 25% better in POV-Ray compared to EXV.

For example, it could have higher IPC in Blender and lower IPC in Cinebench.

That's unlikely, to the point that I wonder if we won't be granted an "updated" version of Cinebench for 2017, just like the R15 version that opportunistically replaced the 11.5 version while reducing the AMD scores in the same motion...
 
Reactions: sirmo

KTE

Senior member
May 26, 2016
478
130
76
The EDA (IC Compilers and Validators) for the 14nm LPP includes all the libraries for the process, but it is the designers who choose whether or not to use M1 double patterning for the highest density and performance.
You wouldn't be the first, but I'm seeing people here routinely confusing these concepts and treating them as if they were one and the same.

Design automation tools, and their capabilities or limitations, are something completely different from the base process. The tools are provided by the likes of Synopsys, Cadence, etc. They are for IC design, analysis and verification, like you state. As in, mostly automated software rather than manual work.

Chip architectural tweaks are also different from, and independent of, logic-level and circuit-level tweaks.

DP is something I'm not sure is even relevant in this context today. It's a litho RET technique, though. As in, it allows better optical focusing for small features and edges. Using it creates complexity, though: limitations on the chip design, performance and variability (like misalignment between critical layers).

In terms of libraries, High Density and High Performance don't coexist for an IC. There is no jack of all trades here. HD is rarely even used across the whole chip, just certain sections, except when the primary focus is an LP SoC.

It would be great to see how AMD ends up tackling this common compromise. Sam, Mike and Jim's words are clear: you usually have to make a choice, a balance, in these 3 main aspects of power, performance and size. They've said this time, they've tried the best of three approach, and they've said this incurs a huge risk no one wants to take. The risk being, trying for all 3 can end up with a chip that has multiple domain issues on all 3 fronts, or a serious issue on any one front.

With their advances and improvements in process tech, as a single design, we know AMD can:

1. Make XV+20% performance.
2. Make XV<20% power.
3. Make Bulldozer<20% frequencies.

PPF

With this chip, my understanding is that they've tried to hit a middle ground of all, with a focus on IPC.

“I think of Zen as balance,” Mike Clark, the senior fellow and lead architect at AMD who steered the design of the Zen core, explained in rolling out its design. “And I think of my job as an architect as trying to balance all of the competing forces. You are given a transistor allocation, and you want to use the transistors as best you can and build the best core you can. But there are the competing interests of clock frequency, how much work you can get done per clock, the power, the complexity of the ISA, which you have to get functionally correct and new instructions you might want to add. We were working on the Bulldozer line, and we were improving with the Excavator core, learning, but we could see that we needed to make some bigger changes. If you try to make a big change in the architecture, however, it really throws off the balance. You realize you need to rebalance everything and do a grounds-up core. We set a goal of getting 40 percent more instructions per clock while keeping the other forces at bay, and set a new point for the architecture that we could build on going forward.”

“Architects are pretty crazy, but we knew we had to take a different approach,” said Clark. “We have been working on the frequency and the performance for decades, and we have really good tools for this. We have really good tools for power as well. But we had never really intersected power at the beginning of a grounds up core into the microarchitecture, really looking at every feature we are adding and being able to really understand the power it was going to draw running real workloads and evaluate a feature trade off that early in the design process. We were usually applying the power analysis much later in the design flow, really at a point where the key architectural decisions had been made and there was way less flexibility to go attack any power problems. They might be fundamental in the architecture.”

My estimate still stands. I expect the performance to be good but frequencies limited at launch. So from the above possibilities, I would say they've skewed to a 55/28/17 PPF split

Sent from HTC 10
(Opinions are own)
 