Zen hasn't taped out yet


Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
This plan sounds like a fast death sentence for AMD, in other words, BD no. 2. Why go with many small cores plus SMT when you can have similar MT performance but better ST performance if you go with a moderate number of wider cores + SMT?
"Small" is just about the absolute area. IPC will be different though.
 
Mar 10, 2006
11,715
2,012
126
So just a hypothetical example:

-Broadwell-E performs 10-20% better than Zen
-Zen costs $600
-Broadwell-E costs $1000

Which one would you pick?

What a ridiculous argument. 8 core Broadwell-E will probably slot nicely into the $600 price point, so it's not an "either or" between your two scenarios.

I get that you really want AMD to succeed, but you shouldn't assume that the people determining which SKUs to offer at Intel are complete morons.
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
Integer ST performance should be fine with Zen. Just ST FP stuff might not see a big jump from XV.

ST FP perf should get a much bigger uplift than 40%. Zen's FPU is a beefed-up Flex FPU, and the latter has throughput comparable to a HW core, so a Zen core should have FP throughput comparable to an XV module.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
If it has higher than 40% uplift over Excavator, why didn't AMD say so? It's not like they can afford to make any understatements...
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,926
404
126
What a ridiculous argument. 8 core Broadwell-E will probably slot nicely into the $600 price point, so it's not an "either or" between your two scenarios.
It was just an example, don't get stuck on exact prices.

The point is this: Assuming 8 core Broadwell-E is 10-20% faster than Zen, how much more will you be willing to spend to get that extra performance?

I think for a lot of people, when the difference starts to get as small as 10-20% and they can save a couple of hundred dollars, they'll opt for Zen. Lately AMD has been much further behind Intel performance-wise than that, so AMD has not really been an option for them even if AMD was cheaper. But with Zen, I think AMD can get close enough performance-wise to be a realistic option again for a lot of people.
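To put rough numbers on that trade-off, here is a minimal sketch using only the hypothetical figures from the example above ($600 vs $1000, 10-20% faster). None of these are real prices or benchmark results, and the function names are made up for illustration.

```python
# Hypothetical figures from the example in this thread, not real prices or benchmarks.
ZEN_PRICE = 600.0
BDWE_PRICE = 1000.0


def value_ratio(perf_a, price_a, perf_b, price_b):
    """How many times more performance per dollar chip A offers than chip B."""
    return (perf_a / price_a) / (perf_b / price_b)


def dollars_per_extra_percent(perf_gain_pct, cheap_price, expensive_price):
    """Extra money paid for each percentage point of additional performance."""
    return (expensive_price - cheap_price) / perf_gain_pct


for gain_pct in (10.0, 20.0):
    bdwe_perf = 1.0 + gain_pct / 100.0  # Zen is normalised to 1.0
    ratio = value_ratio(1.0, ZEN_PRICE, bdwe_perf, BDWE_PRICE)
    cost = dollars_per_extra_percent(gain_pct, ZEN_PRICE, BDWE_PRICE)
    print(f"+{gain_pct:.0f}% perf: Zen has {ratio:.2f}x the perf/$, "
          f"each extra percent costs ${cost:.0f}")
```

With these made-up prices the cheaper chip comes out at roughly 1.4x to 1.5x the performance per dollar, which is the gap the question is really about.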
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Zen might be a bit weak on the FP side, but if the small core hypothesis is correct, AMD might throw lots of cores on a die. SMT, core renaming (for power management), IVR, high integer throughput, and the promised cache subsystem should give nice performance there. It would be able to stay at higher average clocks with all cores+SMT active, as a small core footprint doesn't cause that much static power consumption (while active), making efficient use of available TDP. Size comparison: ~4-5 mm² (incl. L2) vs. 8 mm² Skylake.

L4 HBM2 cache + NVM (managed as one huge address space) seems likely and might help to be competitive.

Big Data might be bottlenecked by the 2 AGUs or cache misses, whatever comes first, as it has 70 mem operands per 100 instructions.

What's not so safe to assume yet (but found in patents and papers): stack cache (for power efficiency), uOp cache, checkpointing (low branch misprediction latency), SMT thread prioritization (not that useful for servers), near threshold computing, reliable computing, redundant computing, asynchronous logic.

So, back to Llano? 2x Llano cores were almost the same as a single Sandy Bridge. But this time "Llano" has a 40% increase in IPC, plus SMT.

Could they make such a wide core, with 4x ALUs, 2x AGUs and an FPU, in 4-5 mm² on 14nm FinFET?

And if a Zen core is only 4-5 mm², they could make an 8-core/16-thread die at 120-140 mm² or even smaller (depending on the L3 cache size and HyperTransport links or equivalent).
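A back-of-the-envelope sketch of that die budget, assuming only the speculative figures quoted in this thread (4-5 mm² per core including L2, 120-140 mm² for the die). The variable names are illustrative and nothing here is a confirmed Zen number.

```python
# Speculative figures from the thread, not confirmed Zen data.
CORES = 8
CORE_AREA_MM2 = (4.0, 5.0)    # per core incl. L2, low and high guess
DIE_AREA_MM2 = (120.0, 140.0)  # whole die, low and high guess

# Total area taken by the cores themselves.
core_total = tuple(CORES * area for area in CORE_AREA_MM2)

# What is left for L3, memory controller, I/O and links.
uncore_min = DIE_AREA_MM2[0] - core_total[1]
uncore_max = DIE_AREA_MM2[1] - core_total[0]

# Share of the die occupied by cores in the two extreme cases.
core_share_min = core_total[0] / DIE_AREA_MM2[1]
core_share_max = core_total[1] / DIE_AREA_MM2[0]

print(f"cores: {core_total[0]:.0f}-{core_total[1]:.0f} mm2")
print(f"left for L3/uncore: {uncore_min:.0f}-{uncore_max:.0f} mm2")
print(f"core share of die: {core_share_min:.0%}-{core_share_max:.0%}")
```

On these guesses the cores would take roughly a quarter to a third of the die, so the L3 size and the interconnect would dominate the final area.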
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
If it has higher than 40% uplift over Excavator, why didn't AMD say so? It's not like they can afford to make any understatements...

Because that 40% is an average that includes integer code, which does not need as much uplift to be on par with Intel.

A Zen core's FPU is more powerful than the one in an XV module, so with its four ALUs a Zen core should have, in SMT, at least the same throughput as an XV module.

For instance, an XV module at 3GHz scores 1.575 in CB 11.5; without the CMT penalty it would score 1.762. So this range is about what should be expected from Zen in a 1C/2T configuration. The proportion contributed by SMT in those scores is of course an unknown.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Zen might be a bit weak on the FP side, but if the small core hypothesis is correct, AMD might throw lots of cores on a die. SMT, core renaming (for power management), IVR, high integer throughput, and the promised cache subsystem should give nice performance there. It would be able to stay at higher average clocks with all cores+SMT active, as a small core footprint doesn't cause that much static power consumption (while active), making efficient use of available TDP. Size comparison: ~4-5 mm² (incl. L2) vs. 8 mm² Skylake.

So do you think AMD could cram enough hardware into 4.5 mm^2 to be competitive on integer performance and efficiency against an 8 mm^2 Skylake core, meaning that Intel is basically doubling the size of the core just to get better FP performance?
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Because that 40% is an average that includes integer code, which does not need as much uplift to be on par with Intel.

A Zen core's FPU is more powerful than the one in an XV module, so with its four ALUs a Zen core should have, in SMT, at least the same throughput as an XV module.

For instance, an XV module at 3GHz scores 1.575 in CB 11.5; without the CMT penalty it would score 1.762. So this range is about what should be expected from Zen in a 1C/2T configuration. The proportion contributed by SMT in those scores is of course an unknown.

A single Excavator CU running at 3.0GHz scores 1.54 in Cinebench R11.5. With two native cores enabled (CMT core disabled in each unit) it scores 1.68 (91.6% CMT yield).

If we expect AMD to match Intel in SMT yield (~26.5% in Cinebench), then the native Zen core would score 1.328 at 3.0GHz. That would be a 58% improvement in IPC for this type of workload. Not impossible, but I think matching Intel's 26.5% SMT yield is a slightly tall order, especially for their first SMT design.
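The arithmetic behind those figures can be reproduced with a short sketch. It assumes, as the numbers above imply, that a 1C/2T Zen core would match the 1.68 scored by two native Excavator cores, and that the per-core Excavator baseline is half of that. The variable names are illustrative.

```python
# Measured Excavator figures quoted above (Cinebench R11.5 at 3.0 GHz).
XV_MODULE_CMT = 1.54      # one compute unit, both CMT cores active
XV_TWO_NATIVE = 1.68      # two cores, one per unit, no CMT sharing

# Assumption from the post: Zen matches Intel's SMT yield in Cinebench.
SMT_YIELD = 0.265

# CMT yield: how much of the two-native-core score a shared module keeps.
cmt_yield = XV_MODULE_CMT / XV_TWO_NATIVE

# Assumption: a 1C/2T Zen core matches the two-native-core Excavator score,
# so its single-thread score is that total with the SMT gain removed.
zen_single_thread = XV_TWO_NATIVE / (1.0 + SMT_YIELD)

# Per-core Excavator baseline and the implied IPC uplift at equal clocks.
xv_single_thread = XV_TWO_NATIVE / 2.0
ipc_uplift = zen_single_thread / xv_single_thread - 1.0

print(f"CMT yield:           {cmt_yield:.1%}")
print(f"Zen 1T score needed: {zen_single_thread:.3f}")
print(f"Implied IPC uplift:  {ipc_uplift:.0%}")
```

The CMT yield works out to 91.67%, which the post quotes truncated as 91.6%. A lower assumed SMT yield raises the single-thread score Zen would need, which is why the 26.5% figure matters so much to the conclusion.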
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
A single Excavator CU running at 3.0GHz scores 1.54 in Cinebench R11.5. With two native cores enabled (CMT core disabled in each unit) it scores 1.68 (91.6% CMT yield).

If we expect AMD to match Intel in SMT yield (~26.5% in Cinebench), then the native Zen core would score 1.328 at 3.0GHz. That would be a 58% improvement in IPC for this type of workload. Not impossible, but I think matching Intel's 26.5% SMT yield is a slightly tall order, especially for their first SMT design.

A 4C/8T Zen wouldn't make sense if it couldn't match an 8C/8T XV in throughput/GHz, given that the latter would likely be more efficient on the same node, although at the expense of ST IPC...

As for CB scores/GHz:

http://www.planet3dnow.de/cms/18564...er-architekturen/subpage-rendering-cinebench/
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,866
3,418
136

So do you think AMD could cram enough hardware into 4.5 mm^2 to be competitive on integer performance and efficiency against an 8 mm^2 Skylake core, meaning that Intel is basically doubling the size of the core just to get better FP performance?

I'm sure Intel optimises the hell out of it, but in the "real world" (tm) anything above 64-bit data paths is trading max performance on complex code for throughput. You can see this from both a core size and a power perspective. The QUESTION is how big a tax Intel's 256-bit is vs AMD's 128-bit.

Most enterprise and consumer apps don't even use SIMD, and 256-bit SIMD is an even smaller subset of that.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
I'm sure Intel optimises the hell out of it, but in the "real world" (tm) anything above 64-bit data paths is trading max performance on complex code for throughput. You can see this from both a core size and a power perspective. The QUESTION is how big a tax Intel's 256-bit is vs AMD's 128-bit.

Most enterprise and consumer apps don't even use SIMD, and 256-bit SIMD is an even smaller subset of that.

Considering that Intel 14nm is better than Samsung 14nm that would have to be above 100%. Do you think it is remotely close to this number?
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
The Cinebench numbers I gave you are based on actual measurements (I do have the reference FX-8800P system), not on extrapolated figures.

I know that you have a platform, but your score is one among many others. As for extrapolating scores, this is a moot argument, as CB 11.5 scales perfectly with frequency, and any scaling error is way lower than the bench variability at a given frequency.

Indeed, your score is below the one I linked, which is 1.05 pts/GHz/4C with a penalty of 0.8936, higher than with Kaveri; ST and MT improve over the latter by 11.9% and 9% respectively.
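For reference, the 1.575 and 1.762 module scores mentioned earlier in the thread follow from these two figures. The sketch below assumes perfectly linear frequency scaling, as claimed above, and reads the 0.8936 penalty as the ratio of the CMT score to the penalty-free score. Variable names are illustrative.

```python
# Figures quoted in the post above (Cinebench R11.5, Excavator).
PTS_PER_GHZ_4C = 1.05   # score per GHz for 4 cores (2 modules)
CMT_PENALTY = 0.8936    # assumed: CMT score / score without the penalty
CLOCK_GHZ = 3.0
MODULES = 2             # 4 cores = 2 two-core modules

# Assumes linear scaling with clock speed, as the post claims.
score_4c = PTS_PER_GHZ_4C * CLOCK_GHZ
score_per_module = score_4c / MODULES

# Score the same module would reach with the CMT penalty removed.
score_without_penalty = score_per_module / CMT_PENALTY

print(f"4C at {CLOCK_GHZ} GHz:      {score_4c:.3f}")
print(f"per module (CMT):   {score_per_module:.3f}")
print(f"without CMT penalty: {score_without_penalty:.4f}")
```

The last figure lands a hair above 1.7625, which the thread quotes as 1.762.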

Considering that Intel 14nm is better than Samsung 14nm that would have to be above 100%. Do you think it is remotely close to this number?


Lol, yet another random and blanket statement by some specialist of the physics of semiconductors..

GF's 14nm LPP LVT is 20% more efficient than Intel's 14nm, and the 14nm LPP sLVT will bury it by another 10% in dynamic power consumption. Intel had better work on their process rather than rely on bullish marketers...
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
GF's 14nm LPP LVT is 20% more efficient than Intel's 14nm, and the 14nm LPP sLVT will bury it by another 10% in dynamic power consumption. Intel had better work on their process rather than rely on bullish marketers...

P1272 or P1273?
AFAIK Intel hasn't revealed any in-depth technical data about their processes, or do you have some available?
 

DrMrLordX

Lifer
Apr 27, 2000
21,807
11,161
136
If it has higher than 40% uplift over Excavator, why didn't AMD say so? It's not like they can afford to make any understatements...

AMD sort of undershot on XV's improvement over SR on those benches you ran for me on OCN. The only bench that showed 5% improvement was Cinebench R10. The other three were 11%, 11%, and 22%. Those last three were all SIMD-heavy fp workloads fwiw.

A single Excavator CU running at 3.0GHz scores 1.54 in Cinebench R11.5. With two native cores enabled (CMT core disabled in each unit) it scores 1.68 (91.6% CMT yield).

If we expect AMD to match Intel in SMT yield (~26.5% in Cinebench), then the native Zen core would score 1.328 at 3.0GHz. That would be a 58% improvement in IPC for this type of workload. Not impossible, but I think matching Intel's 26.5% SMT yield is a slightly tall order, especially for their first SMT design.

If Zen runs like that, we're in for some serious pain. That will not be a fast chip.
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
P1272 or P1273?
AFAIK Intel hasn't revealed any in-depth technical data about their processes, or do you have some available?

I'm not sure that I want to jump into the war over who has the best process technology - the term "best" is already a subjective definition...

But, if you want data to make a semi-informed decision, then the best place to look is the IEDM (International Electron Devices Meeting) conference papers.

For example, this paper by Intel on its 14nm process:
http://www.intel.com/content/dam/ww...foundry/intel-14nm-iedm-2014-presentation.pdf

Depending on your level of technical paper access, you can get similar papers for TSMC and IBM, and not so much for GF and Samsung.
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136

Proof? I already posted it, with the calculations and explanations, which could be understood by a 16-year-old student who hadn't skipped his maths courses..

P1272 or P1273?
AFAIK Intel hasn't revealed any in-depth technical data about their processes, or do you have some available?

I don't know what those numbers refer to, but I'm talking about the process used to fab both BDW and SKL CPUs, compared to GF's 14nm LPP LVT.

At 2.4GHz the latter needs 1.1x lower voltage than the former. This parameter alone means the perf/Watt advantage is automatically 1.1^2, for anyone who bothers to check some basic transistor rules.

When voltage must be increased by 10% to reach a given frequency, it can mean one of two things, separately or in combination.

The first, and most likely, is that Intel's transistors have 1.21x lower (trans)conductance than GF's. This can be straightforwardly compensated with a voltage higher by a ratio that is the square root of the missing conductance ratio. This is a direct consequence of a fundamental property of MOSFET/FET transistors, which are devices whose output/input characteristic is a square law.

The second is that Intel's transistors and circuit layout have 1.21x higher parasitic capacitance overall, which is unlikely given that the geometric design rules are the same for everybody, and there's no reason Intel's engineers would have screwed up this, if anything, better-mastered design phase.
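A quick sketch of the arithmetic in this post, taking the 10% voltage figure at face value and assuming the usual first-order model in which dynamic power scales as C * V^2 * f. The function names are made up for illustration, and nothing here verifies the claimed voltage difference itself.

```python
import math


def dynamic_power_ratio(voltage_ratio, capacitance_ratio=1.0, frequency_ratio=1.0):
    """Relative dynamic power under the first-order model P ~ C * V^2 * f."""
    return capacitance_ratio * voltage_ratio ** 2 * frequency_ratio


def compensating_voltage_ratio(conductance_deficit):
    """Voltage ratio needed to offset a conductance deficit,
    assuming a square-law device characteristic."""
    return math.sqrt(conductance_deficit)


# Same frequency and capacitance, 10% higher voltage (figure claimed in the post).
power_ratio = dynamic_power_ratio(1.10)

# A 1.21x conductance deficit, compensated through voltage alone.
voltage_ratio = compensating_voltage_ratio(1.21)

print(f"dynamic power at 1.10x voltage: {power_ratio:.2f}x")
print(f"voltage needed for 1.21x lower conductance: {voltage_ratio:.2f}x")
```

Under this model both readings of the post give the same 1.21x factor: a 10% voltage bump costs 21% more dynamic power at equal frequency, and a 1.21x conductance deficit is what a 10% voltage bump would compensate.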
 