New Zen microarchitecture details


Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Nice job ignoring the difference in power usage of HBM vs GDDR5

We also have to ignore the super-low clocked Nano entirely... and the Fury X has the advantage of a liquid cooler keeping temps, and leakage, low.

In fact, a stock Fury X only pulls some 220W during gaming tests.

http://www.tomshardware.co.uk/amd-radeon-r9-fury-x,review-33235-7.html

Add 30W for RAM chips, another 15W for GDDR5 controllers, and another 15W for 80C temps, and we see 1.6969x reduced power... while clocking about 15~20% higher on average (albeit on a smaller GPU).

Still with this nonsense about RAM?



AMD's footnotes specifically mention the 2.8x in comparison between the 470 and the 270X.

2.8X is performance/boardpower.

There is NO accounting for RAM differences, there is no accounting for shaders.

It is taken across 4 benchmarks: Firestrike, Hitman, AotS, and Overwatch.

You can make all the comparisons to Hawaii, Furmark, and RAM you want, that is irrelevant in terms of the 2.8x claim.

Interesting, using AMD's own claims you can work backward.



According to AMD, the 470 should use only 62% of the power of a 270X.

And it does. AMD is using the 2.8x in terms of performance/TDP, not perf/W, which looks like it will line up if the 470 uses around 110W of power.
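To make that back-of-the-envelope check concrete, here is a minimal sketch in Python. The 2.8x and 62% figures come from the post above; the 180W TDP for the 270X is an assumption that is not stated in this thread, and the function names are made up for illustration.

```python
# Working backward from a perf/TDP claim (illustrative sketch only).

PERF_PER_TDP_GAIN = 2.8   # claimed perf/TDP ratio, 470 vs 270X (from the thread)
POWER_FRACTION = 0.62     # 470 power as a fraction of the 270X (from the thread)
R9_270X_TDP_W = 180.0     # ASSUMPTION: 270X board TDP, not stated in the thread


def implied_tdp(baseline_tdp_w: float, power_fraction: float) -> float:
    """TDP of the new card if it draws `power_fraction` of the baseline."""
    return baseline_tdp_w * power_fraction


def implied_perf_gain(perf_per_tdp_gain: float, power_fraction: float) -> float:
    """Raw performance ratio implied by a perf/TDP gain and a power ratio."""
    return perf_per_tdp_gain * power_fraction


if __name__ == "__main__":
    tdp = implied_tdp(R9_270X_TDP_W, POWER_FRACTION)
    perf = implied_perf_gain(PERF_PER_TDP_GAIN, POWER_FRACTION)
    print(f"implied 470 TDP: {tdp:.1f} W")
    print(f"implied raw performance ratio: {perf:.2f}x")
```

Under those assumptions the TDP works out to roughly 112W, which is in line with the ~110W mentioned above, and the implied raw performance ratio is about 1.74x.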

So forget process and watts for RAM. Just use TDP.
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
That's what I've been saying. IMHO, that's what it is.

We are going around in circles due to being starved of any proper info.

The info is there; what is lacking is either the will or the ability to make an accurate estimate. As said ad nauseam, Zen has 33% more ALUs than Sandy Bridge or Ivy Bridge and 100% more than EXV.

Still, we have people telling us that AMD's 4-ALU design will barely reach SB/IB IPC even though those use only 3 ALUs. If anything, that would imply the design is inefficient at using its resources, in which case a 3-ALU design would have made more sense. The very fact that they use 4 is an indication that 3 wasn't enough for their plans.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Yes, when someone is comparing an HBM product which uses some ~50W less for its RAM and controllers, it's a very significant concern.

Did you read my post? Did you read AMD's notes?

Your concerns are all very well but have absolutely nothing to do with the 2.8x claim. That is very simply perf/TDP.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Did you read my post? Did you read AMD's notes?

Your concerns are all very well but have absolutely nothing to do with the 2.8x claim. That is very simply perf/TDP.

Did you bother to read the post to which I was replying which compared RX 480 with Fury?

You seem to not realize that I'm actually trying to be on topic.

I focus on the process-derived improvement (which is substantial - and apparently greater than AMD's own claims when all factors are considered), otherwise I am just pointing out inappropriate comparisons - such as a GDDR5 TDP vs an HBM TDP comparison.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
2,35% in graphics workloads is within the "up to" 70% interval, and in many tests (compute especially) it can be near the maximum.
Average is not peak.
Cards are not made only of a GPU.
Do you need some other self-evident statement, or do you want to continue trolling?

My advice -

Be careful of whom you call a troll, lest you be accused of being one yourself.

In other words, look up who The Stilt is before you accuse him of anything.

You're a new reg, be careful of the first impression you make.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
The info is there; what is lacking is either the will or the ability to make an accurate estimate. As said ad nauseam, Zen has 33% more ALUs than Sandy Bridge or Ivy Bridge and 100% more than EXV.

Still, we have people telling us that AMD's 4-ALU design will barely reach SB/IB IPC even though those use only 3 ALUs. If anything, that would imply the design is inefficient at using its resources, in which case a 3-ALU design would have made more sense. The very fact that they use 4 is an indication that 3 wasn't enough for their plans.

I'd say the Sandy / Ivy Bridge expectation is based purely on the quoted 40% IPC improvement over XV rather than on the actual µarch changes themselves. If you add 40% to XV's average performance in legacy FP workloads, you end up at similar performance to Sandy / Ivy.

It could well be that the 40% figure shrinks if it is in fact an average improvement and AMD included some AVX2 workloads. I don't think there is a real need to speculate, since I would guess we will have some leaks within a month or so. As soon as the samples hit Asia (and they already have), there is no way AMD can contain the leaks.
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
I'd say the Sandy / Ivy Bridge expectation is based purely on the quoted 40% IPC improvement over XV rather than on the actual µarch changes themselves. If you add 40% to XV's average performance in legacy FP workloads, you end up at similar performance to Sandy / Ivy.

It could well be that the 40% figure shrinks if it is in fact an average improvement and AMD included some AVX2 workloads. I don't think there is a real need to speculate, since I would guess we will have some leaks within a month or so. As soon as the samples hit Asia (and they already have), there is no way AMD can contain the leaks.

They quoted 40%, but we know the design width should allow for more: they have doubled the core's potential throughput. If they gain only 40%, it means the design is less efficient than EXV at an equal node.

Do you think that Zen's four ALUs will be no more efficient than SB/IB's 3 ALUs in terms of IPC?

Because that's what you are actually speculating.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,015
1,610
136
My advice -

Be careful of whom you call a troll, lest you be accused of being one yourself.

In other words, look up who The Stilt is before you accuse him of anything.

You're a new reg, be careful of the first impression you make.

Thanks for the advice, maybe you should also look at arguments and facts before warning someone who's posting facts and not feelings, despite the number of posts.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
They quoted 40%, but we know the design width should allow for more: they have doubled the core's potential throughput. If they gain only 40%, it means the design is less efficient than EXV at an equal node.

Do you think that Zen's four ALUs will be no more efficient than SB/IB's 3 ALUs in terms of IPC?

Because that's what you are actually speculating.


I'm not speculating with anything but the info AMD has provided, which is nothing but "40% IPC improvement over XV" (and which is also "up to", allegedly).
 

KTE

Senior member
May 26, 2016
478
130
76
The info is there; what is lacking is either the will or the ability to make an accurate estimate. As said ad nauseam, Zen has 33% more ALUs than Sandy Bridge or Ivy Bridge and 100% more than EXV.
Which translates into how much performance increase exactly?

Sent from HTC 10
(Opinions are own)
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
I'm not speculating with anything but the info AMD has provided, which is nothing but "40% IPC improvement over XV" (and which is also "up to", allegedly).

Because you think that they'll disclose the real numbers 18 months before the product is launched?

Or was 40% considered an easy target at AMD when they released the number, so safe that they knew there was no risk of being sued later once we get hard numbers?

I guess you noticed the wave of investors going long on AMD stock. Surely good management consists of risking multiple future lawsuits by announcing fairy-tale numbers.

On the technical side, what is known of the Zen uarch points to total throughput being comparable to an EXV module, so unless SMT brings 43% more throughput, which is unlikely, ST IPC should improve by more than 40% on average.
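The 43% figure can be reproduced with a quick sketch. It assumes, as the post does, that one Zen core with SMT matches the total throughput of a full EXV module, and additionally that a module delivers exactly twice the throughput of a single EXV thread (perfect CMT scaling, which is an idealisation not stated in the thread). The function name is hypothetical.

```python
def required_smt_uplift(target_throughput: float, st_ipc_gain: float) -> float:
    """SMT uplift needed so that ST throughput * (1 + uplift) == target.

    target_throughput: total throughput to match, in units of one EXV thread.
    st_ipc_gain: single-thread IPC gain over EXV (0.40 means +40%).
    """
    st_throughput = 1.0 + st_ipc_gain
    return target_throughput / st_throughput - 1.0


if __name__ == "__main__":
    # ASSUMPTION: an EXV module is worth 2.0x one EXV thread.
    uplift = required_smt_uplift(target_throughput=2.0, st_ipc_gain=0.40)
    print(f"required SMT uplift at +40% ST IPC: {uplift:.0%}")
```

With a +40% single-thread gain the required SMT uplift comes out at about 43%; a larger single-thread gain lowers the uplift SMT has to deliver, which is the point being argued.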

Which translates into how much performance increase exactly?

Sent from HTC 10
(Opinions are own)

Schematically, all ops are either computed by the ALUs or, if they are FP, completed by the ALUs, so throughput depends on the ALU count. That's why Piledriver could have lower IPC than a Phenom core in some cases that rely on brute-force availability.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Thanks for the advice, maybe you should also look at arguments and facts before warning someone who's posting facts and not feelings, despite the number of posts.

I never said anything about number of posts. You'll notice The Stilt's post count is relatively low compared to some.

Since you brought up post count, and called somebody a troll while you're in your single digits, I have to ask WWYBYWB? Or is this just an alt you're posting from?

Let us know when you bother to search who he is and what he knows, if you have the courage.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Stilt, a lot of us really appreciate your objective and technically competent posts. You are a credit to this board and I hope that you don't let a few vocal critics deter you from continuing to share your insights on these forums.

Agreed!

The Stilt, please put the trolls on ignore and don't let them chase you off! We need quality people such as yourself here.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,015
1,610
136
I never said anything about number of posts. You'll notice The Stilt's post count is relatively low compared to some.

Since you brought up post count, and called somebody a troll while you're in your single digits, I have to ask WWYBYWB? Or is this just an alt you're posting from?

Let us know when you bother to search who he is and what he knows, if you have the courage.

I won't bother. He can be the pope; if he says something wrong, I have the right to say it's wrong. And you are not much better: having nothing to reply to facts, you resort to personal attacks (you brought up the argument I just started here, BTW, considering not what I was posting but only who I was replying to). From now on you are on ignore.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Schematically, all ops are either computed by the ALUs or, if they are FP, completed by the ALUs, so throughput depends on the ALU count. That's why Piledriver could have lower IPC than a Phenom core in some cases that rely on brute-force availability.

The above is wrong, as Abwx is missing quite a few execution units. As far as Intel CPUs are concerned (I don't know about AMD's construction cores), they have dedicated units for logical operators, division, FMA, etc. that don't use the ALU/FPU execution units. These do, however, use the same execution ports as the ALU/FPU units.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
I won't bother. He can be the pope; if he says something wrong, I have the right to say it's wrong. And you are not much better: having nothing to reply to facts, you resort to personal attacks (you brought up the argument I just started here, BTW, considering not what I was posting but only who I was replying to). From now on you are on ignore.

RBM confirmed
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
You are saying he is talking about RX 470, instead of the product he is actually launching?

https://www.youtube.com/watch?v=0gN7oIubcVk&feature=youtu.be&t=715

I might be blind, but to my eyes it says: "RX 480 Built on 14nm FinFet, optimized by AMD". Do you disagree?

Also AMD specifically claims that 1.7x of the total 2.8x comes from the 14nm LPP process transformation alone. So Polaris 10 should have AT LEAST 1.7x the performance per watt of ANY 28nm (even Fiji) GPU for the claim to be true. Since RX 480 has 2.35% higher performance per watt than Fiji (R9 Fury) according to TPU (1080 - 2160)...

https://www.techpowerup.com/reviews/AMD/RX_480/25.html

It has even more.
R9 290 - 4.8 TFLOPs, at 260W of power consumption. 18 GFLOPs/watt.
RX 480 - 5.83 TFLOPs, at 160W of power consumption - 36 GFLOPs/watt.
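For anyone who wants to check those two lines, a minimal sketch follows. The TFLOPs and wattage figures are the ones quoted in the post; the helper name is made up, and peak FLOPs per watt is only a proxy for real-world efficiency.

```python
def gflops_per_watt(tflops: float, watts: float) -> float:
    """Peak compute efficiency in GFLOPs per watt."""
    return tflops * 1000.0 / watts


if __name__ == "__main__":
    # Figures as quoted in the post above.
    r9_290 = gflops_per_watt(4.8, 260.0)
    rx_480 = gflops_per_watt(5.83, 160.0)
    print(f"R9 290: {r9_290:.1f} GFLOPs/W")
    print(f"RX 480: {rx_480:.1f} GFLOPs/W")
    print(f"ratio:  {rx_480 / r9_290:.2f}x")
```

On the quoted numbers this gives about 18.5 and 36.4 GFLOPs/W, a ratio of roughly 1.97x.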
 

leoneazzurro

Golden Member
Jul 26, 2016
1,015
1,610
136
Really, this whole argument is funny. It is like comparing apples to oranges for estimating how bananas are.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
They quoted 40%, but we know the design width should allow for more: they have doubled the core's potential throughput. If they gain only 40%, it means the design is less efficient than EXV at an equal node.

Do you think that Zen's four ALUs will be no more efficient than SB/IB's 3 ALUs in terms of IPC?

Because that's what you are actually speculating.

40% scaling for going 100% wider isn't bad, it's just the reality of imperfect scaling when relying on instruction level parallelism... and non-equal ALUs... and memory constraints... and decoder limits... and... :hmm:

The first extra ALU would, maximally, offer 50% more integer IPC, but we all know that we'd be lucky to see half of that (aside from certain peak scenarios). So we'll call that a fairly safe 25%, provided all "supporting mods" are in place. Then tack on one more... with even more diminishing returns...

Haswell has 33% more ALUs than Ivy Bridge and doesn't reliably manage a 15% improvement in IPC, which tells us what that fourth ALU on Zen is worth: <= 15%.

That's 40%... from a doubling of ALUs... provided you have enough of the rest of the chip up to the task.
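The arithmetic behind that estimate can be sketched as below. The 25% and 15% per-ALU gains are the figures from the post; whether they should be added or compounded is not specified there, so both are shown, and the function name is purely illustrative.

```python
from typing import Iterable


def stacked_gain(step_gains: Iterable[float], compound: bool = True) -> float:
    """Total IPC gain from a sequence of per-ALU gains.

    compound=True multiplies the steps, (1 + g1) * (1 + g2) - 1.
    compound=False simply adds them, g1 + g2.
    """
    gains = list(step_gains)
    if not compound:
        return sum(gains)
    total = 1.0
    for g in gains:
        total *= 1.0 + g
    return total - 1.0


if __name__ == "__main__":
    gains = [0.25, 0.15]  # third ALU, then fourth ALU (upper bound), per the post
    print(f"additive:   {stacked_gain(gains, compound=False):.1%}")
    print(f"compounded: {stacked_gain(gains, compound=True):.1%}")
```

Added together the two steps give 40%; compounded they give just under 44%. Since the 15% is stated as an upper bound, either reading lands in the neighbourhood of the 40% figure.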
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,425
8,388
126
Really, this whole argument is funny. It is like comparing apples to oranges for estimating how bananas are.
No crap.

Polaris 10's power consumption is tangentially useful in figuring out improvement due to process alone for Zen when other variables are carefully controlled for. Instead of an on-topic discussion related to that (which we really can't have until actual 4GB 480s are in the wild, again, control for variables) we have a vast off topic discussion including absurd comparisons by otherwise great posters and members weirdly fawning over one another.

Might I remind you guys that a bunch of you were infracted a couple weeks back for nearly this same discussion?
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
Haswell has 33% more ALUs than Ivy Bridge and doesn't reliably manage a 15% improvement in IPC, which tells us what that fourth ALU on Zen is worth: <= 15%.

That's 40%... from a doubling of ALUs... provided you have enough of the rest of the chip up to the task.

Haswell's fourth ALU sits in a cluster that has few execution resources. Intel's clustered execution units, each with its own dedicated ALU, can't really be compared to AMD's wider design when it comes to mixed code execution.

But still, we have some people thinking that Zen's 4 ALUs will be barely as capable as the 3 clustered ALUs of Intel's previous generations.

No crap.

Polaris 10's power consumption is tangentially useful in figuring out improvement due to process alone for Zen when other variables are carefully controlled for.

Unless you can point out which flavor of 14nm LPP is used for the 480 and which one is used for Zen, that data is irrelevant. FTR, there are HVT, RVT, LVT, and sLVT, among others...

Even knowing the one used for Polaris would be useless if it's not the same as Zen's, and everything points to the transistors being different: Polaris likely uses HVT-RVT for lower leakage, while Zen will use either LVT or sLVT.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Unless you can point out which flavor of 14nm LPP is used for the 480 and which one is used for Zen, that data is irrelevant. FTR, there are HVT, RVT, LVT, and sLVT, among others...

Even knowing the one used for Polaris would be useless if it's not the same as Zen's, and everything points to the transistors being different: Polaris likely uses HVT-RVT for lower leakage, while Zen will use either LVT or sLVT.

I will also add that I'm expecting ZEN to use M1 layers that will improve density and energy characteristics over Polaris 10/11, but will also increase the cost of the wafer due to double patterning.
But since ZEN will be most probably sub 200mm2 and will also be targeting high-end CPUs thus high ASPs, it should be ok.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
22/20/14 nanometer Bulldozer improvements lost while Zen was implemented.

FP256; 256-bit [Single Macro-op] Decode, 256-bit Load/Store Width, 256-bit FP/INT Vector Unit Width.

Clustered Multithreading 2.0;
Power -> Enhanced PWR_MGMT via Integrated Buck Regulators that manage an enhanced Resonant Clock Mesh. Think about the inductors in the RCM and the per-core AVFS modules in Excavator. (Module(Front-end, L2, FPU) would use Processor VR, cores(Core/LSU) would use their own Buck Regulators.)
Performance -> Changes were to reduce latency, unspecified.

Full AGLU; Inclusion of Arithmetic Data Paths in AGLU pipe.
 