Speculation: Ryzen 3000 series

Page 39

itsmydamnation

Platinum Member
Feb 6, 2011
2,920
3,544
136
It sure isn't due to 7nm is it? Oh, wait! Let's assume it is - so you have a chip that's running at 4.4Ghz @ 75W. What does that say about ipc? I haven't been paying a lot of attention to the tweaks AMD have made to Zen 2 but it's almost impossible to miss AMD's focus on "FP (heavy)" code across the board; an inevitability given the fact that the Zen 2 cores are spread across desktop and server all the way to the chiplets, even.

What you are saying doesn't match reality.

1. Know your workload (256-bit pipes == 0% perf improvement for CB).
2. AMD have said they are front-end limited, i.e. decode and issue.
3. Zen 2 makes lots of changes to prefetch, predict, decode and issue.


edit:autocorrect on phone fail
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,475
1,975
136
2. AMD have said they are front end limited,
Not just AMD. The premier source of x86 CPU analysis from the software side, Agner Fog, believes that Ryzen has more execution resources than the front end can use.
ie decode and issue
Also fetch. Ryzen is substantially faster from the uop cache than it is from the L1, because fetch from L1i seems to be hard-capped at an average throughput of about 16B/cycle, despite the L1 supposedly being able to deliver more than that. Fog identified this as the single largest issue with the core.

Fog on Ryzen said:
The instruction fetcher can fetch 32 aligned bytes of code per clock cycle from the level-1 code cache, according to AMD documents, but the maximum measured throughput is only slightly more than 16 bytes per clock, rarely exceeding 17.
Fog on Ryzen said:
Code that does not fit into the µop cache can have a throughput of four instructions or six µops or approximately 16 bytes of code per clock cycle, whichever is smaller. The 16 bytes fetch rate is a likely bottleneck for CPU intensive code with large loops

3. Zen2 makes lots of changes to prefetch, predict, decode and issue.
It also increases the size of the uop cache, which increases the proportion of hot loops that will fit in it, and:

AMD during New Horizons said:
Re-optimized instruction cache
Whatever that means, hopefully getting closer to the theoretical 32-byte fetch.
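
To put rough numbers on the Fog quotes above, here is a minimal sketch of the "whichever is smaller" rule for code that misses the uop cache. The function name and the average instruction lengths are made up for illustration; only the 4 instructions / 6 uops / ~16 bytes limits and the documented 32 bytes come from the quotes.

```python
def frontend_limit(avg_instr_bytes, uops_per_instr=1.0,
                   fetch_bytes=16.0, max_instr=4.0, max_uops=6.0):
    """Instructions per clock when code does not fit the uop cache.

    Takes the smallest of the three limits from the Fog quote:
    4 instructions, 6 uops, or ~16 bytes of code per clock.
    avg_instr_bytes and uops_per_instr are illustrative inputs.
    """
    by_fetch = fetch_bytes / avg_instr_bytes
    by_uops = max_uops / uops_per_instr
    return min(max_instr, by_uops, by_fetch)

# Short instructions: the 4-instruction limit binds first.
print(frontend_limit(3.0))                     # 4.0
# Long instructions: the ~16-byte fetch rate binds.
print(frontend_limit(8.0))                     # 2.0
# Same long instructions if fetch delivered the documented 32 bytes.
print(frontend_limit(8.0, fetch_bytes=32.0))   # 4.0
```

Under this simple model the measured 16-byte fetch only hurts once average instruction length goes above 4 bytes, which is consistent with Fog calling it a likely bottleneck for large loops that spill out of the uop cache.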
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
What you are saying doesn't match reality.

1. know your workload (256bit pipes == 0% perf improvement for CB)
2. AMD have said they are front end limited, ie decode and issue
3. Zen2 makes lots of changes to prefetch, predict, decode and issue.


edit:autocorrect on phone fail
Don't the doubled load/store width and doubled FMA width also increase 128-bit execution throughput?

Btw, if I recall, the fmax of Zen samples half a year before release was really, really low. Does anyone recall it more precisely?
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
And where did you get a clockspeed of 4.4 GHz?
If Zen 2 has no IPC increase in FP over Zen+, then it has to have been running at around 4.4GHz to score 2050+ in CBR15. You claim there are no FP tweaks, so increased frequency is the only logical answer here.
 

trollspotter

Member
Jan 4, 2011
28
35
91
Doesn't double size load store units and double FMA size increase 128 bit execution?

Btw if I recall the fmax if Zen samples half a year before release was really really low. Anyone recall more precise?
I seem to remember the Zen 8C ES being 2.8 base/3.2 boost and the QS being 3.3 base/3.7 boost, which matched up with the eventually released R7 1700 at a very similar ~70W power usage.

*Edit: found a link from around 7 months before the first-gen Zen launch.
https://wccftech.com/amd-zen-es-benchmarks-leak-out/
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,129
15,274
136
If Zen 2 has no ipc increase in fp over zen+ then it has to have been running near 4.4Ghz thereabouts to score 2050+ in CBR15. You claim there are no fp tweaks so increased frequency is the only logical answer here.

Zen 2 has a 16-17% IPC increase. It was in the video. Did you watch it?
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
Zen2 has 16-17% IPC increase. It was in the video. Did you watch it ?
Yes. It showed better power/performance. The IPC delta will only show in a clock-for-clock comparison. In other words, no one outside of AMD knows whether the performance delta stems from IPC or clocks.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,920
3,544
136
If Zen 2 has no ipc increase in fp over zen+ then it has to have been running near 4.4Ghz thereabouts to score 2050+ in CBR15. You claim there are no fp tweaks so increased frequency is the only logical answer here.
Man, are you deliberately being obtuse?

The improvements in Zen 2 that affect CB are general in nature, not specific to 128-bit SIMD.
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
Nope, it doesn't. Each of the load ports can only load from one address; it's 2 x up to 256-bit loads, not 4 x up to 128-bit, for a Zen 2 core.
OK, so front end. What's your guess on the frequency of the sample shown?
Perhaps we need some betting/poll here.

CB score and fmax for the highest-performing 8c/16c at release. Guys?
 

exquisitechar

Senior member
Apr 18, 2017
683
940
136
Yes. It showed better power/performance. The ipc delta will only show in a clock 2 clock comparison. In other words, no one outside of AMD knows whether the performance delta stems from ipc or clocks.
Definitely IPC, I doubt the sample was running at higher than 4GHz.

It would be great if the clocks were in that range. Then they just need to raise the frequency until release, and bin the heck out of those tiny 7nm chiplets.
CB is an FP app, but it doesn't use AVX/AVX2 or FMA instructions. It's true that it could benefit from more 64b execution units if the 256b units are segmented as 4 x 64b units, which is the principle of the FlexFPU used in Excavator and likely in Zen 1.

Going by appearances, the per-clock improvement in CB looks to be 15%. The improvement in scientific tasks should be at the same level unless AVX2 comes into play, in which case the numbers will be inflated; that should also be the case in x265 encoding, which makes use of AVX2.

That being said, I don't think the chip was running higher than 4GHz. The TSMC process power curve seems so steep that any deviation from isofrequency would push the power too far from the claimed numbers (0.5x the power at isofrequency and 1.25x the frequency at isopower). For instance, a ±10% frequency delta would result in power deltas of +35/-28%, hence the power in the demo was inevitably cornered within a narrow window close to an isofrequency comparison.
I agree.
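
For anyone wanting to check the +35/-28% figures in the quote, here is a rough sketch. It assumes power follows a simple power law in frequency along the new-process curve (power ~ f**n), which is my simplification and not something AMD or TSMC stated. The exponent is fitted to the two claimed points (0.5x power at the same clock, 1.25x clock at the same power).

```python
import math

# Claimed process numbers from the quoted post.
POWER_AT_ISOFREQ = 0.5    # 0.5x the power at the same frequency
FREQ_AT_ISOPOWER = 1.25   # 1.25x the frequency at the same power

# Assumption: power ~ f**n along the new-process curve. Going from
# (f, 0.5P) to (1.25f, 1.0P) doubles power for a 25% clock increase.
n = math.log(1.0 / POWER_AT_ISOFREQ) / math.log(FREQ_AT_ISOPOWER)

def power_delta(freq_delta):
    """Fractional power change for a fractional frequency change."""
    return (1.0 + freq_delta) ** n - 1.0

print(f"exponent n = {n:.2f}")
print(f"+10% clock -> {power_delta(+0.10):+.1%} power")
print(f"-10% clock -> {power_delta(-0.10):+.1%} power")
```

The fitted exponent comes out a little above 3, and a ±10% clock change lands at roughly +35% and -28% power, which matches the numbers in the quote.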
 
Reactions: spursindonesia

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
man are you deliberately being obtuse?

The improvements that are in Zen two that effect CB are general in nature not specific to 128bit SIMD.....
Nobody said it was specific. AMD, starting with Zen, has been a dominant force in FP code crunching. You think any "general" changes they make to the uarch aren't going to be aimed at boosting this strength? Did you expect a different demo than Cinebench? That's probably the first consideration on the list of tweaks going into the Zen 2 revision.
/s
 
Reactions: trollspotter

inf64

Diamond Member
Mar 11, 2011
3,864
4,546
136
General-purpose improvements to the Zen 2 core could easily yield a 10-15% improvement in CB15 score. We know that this test doesn't use AVX or FMA, so improvements to the front end and caches can easily amount to that kind of a jump (in this test). I expect around 12-15% on average in int code and up to 2x in AVX- and FMA-tuned code (as expected due to 2x FP resources vs Zen 1). My personal guess is that the Zen 2 sample in AMD's demo ran at around the all-core turbo of the 2700X, which should be ~4GHz.
 

coercitiv

Diamond Member
Jan 24, 2014
6,624
14,027
136
Cb score and fmax for highest perf 8c/16c at release. Guys ?
Personally I'm more inclined towards a mix of 10% IPC and 5% frequency uplift for the CES sample. I guess there were 1-2 other members who thought the sample was running at 4.2-4.3GHz.

I expect another 10% frequency increase at release, so in my book that goes to 4.7GHz. CB score should be unaffected by the IPC-fmax ratio, and should hover above 2250.

PS: it would actually be better if it was all IPC uplift and clocks were ~4GHz; that would almost guarantee a 10% clock increase at launch.
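
The arithmetic behind these guesses, as a quick sketch. It assumes the CB score scales as IPC x clock with perfect frequency scaling, which is a simplification; the helper names are only for illustration, and 2050 is the demo score discussed earlier in the thread.

```python
SAMPLE_SCORE = 2050   # CB R15 score discussed in the thread

def combined_uplift(ipc_gain, freq_gain):
    """Throughput uplift, assuming score ~ IPC x clock."""
    return (1.0 + ipc_gain) * (1.0 + freq_gain) - 1.0

def scale_score(score, freq_gain):
    """Score after a clock bump, assuming perfect frequency scaling."""
    return score * (1.0 + freq_gain)

# 10% IPC plus 5% clock for the CES sample.
print(f"{combined_uplift(0.10, 0.05):.1%}")       # 15.5%
# Another 10% clock at release lifts 2050 to about 2255.
print(round(scale_score(SAMPLE_SCORE, 0.10)))     # 2255
# The same 10% applied to the guessed 4.2-4.3 GHz sample clocks.
print([round(f * 1.10, 2) for f in (4.2, 4.3)])   # [4.62, 4.73]
```

So "above 2250" and "4.7GHz" both follow from a flat 10% on top of the sample, whichever way the IPC/clock split actually falls.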
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
Personally I'm more inclined towards a mix of 10% IPC and 5% frequency uplift for the CES sample, I guess there were 1-2 other members who thought the sample was running 4.2-4.3Ghz.

I expect another 10% frequency increase at release, so in my book that goes to 4.7Ghz. CB score should be unaffected by IPC-fmax ratio, and should hover above 2250.

PS: it would actually be better if it was all IPC uplift and clocks were ~4GHz, that would almost guarantee a 10% clock increse at launch.

I personally expect a max clock of 4.5GHz and I guess the sample was at 4.4GHz. It is like with the 1800X demo vs the 6900K back then. They said the same: it's not final silicon, etc.
I expect a CB score of 2200, which is, let's say, wow for an 8C desktop chip.
What troubles me is the gaming...
Back in the days of the Athlon 64, the Unreal Tournament botmatch was a good benchmark of CPU-intensive scenes in games. The P4 lost by 50% and it was perfectly in line with what gamers felt. Pretty much nobody was playing on the P4 except fanboys. And AMD did show the numbers back then as TEH WIN.
Here they didn't show a gaming win, just "yes we can game and our performance is ok".
I have a bad feeling about that.
 
Reactions: Kenmitch

itsmydamnation

Platinum Member
Feb 6, 2011
2,920
3,544
136
I personally expect max clock of 4,5GHz and I guess the sample was 4,4GHz. It is like with the 1800x demo vs 6900K back then. They said the same, its not final silicon etc...
I expect a CB score of 2200, which is lets say wow for a 8C desktop chip.
What troubles me is that gaming...
Back in the days of athlon64 Unreal Tournament botmach was a good benchmark of CPU intensive scenes in games. The p4 lost by 50% and it was perfectly in line with what gamers felt. Pretty much nobody was playing on the p4 except fanboys. And AMD did show the numbers back then as TEH WIN.
Here they didn't show a gaming win, just "yes we can game and our performance is ok".
I have a bad feeling about that.
You realize that AMD are doing the exact same thing they have done with the other Ryzen reveals?
AMD have even confirmed both systems were using DDR4-2666, so it's not like they are running uber memory to try and hide IF latency.
 

coercitiv

Diamond Member
Jan 24, 2014
6,624
14,027
136
I personally expect max clock of 4,5GHz and I guess the sample was 4,4GHz.
I expect a CB score of 2200, which is lets say wow for a 8C desktop chip.
So you expect a 7% score increase based on a 2% frequency increase in a benchmark that shows little memory scaling?

It is like with the 1800x demo vs 6900K back then. They said the same, its not final silicon etc...
The demo back then had Ryzen running at 3.4GHz with no boost, while final silicon ran at 3.7GHz all-core boost. That's a 9% frequency jump, which applied to your 4.4GHz yields 4.8GHz.

What troubles me is that gaming...
I have a bad feeling about that.
You're worried because you inflate frequency gains to downplay IPC gains while minimizing any further frequency gain until launch. You remember the original Ryzen demo in a far darker shade than it actually was, and you ignore L3 cache gains and AMD going on record that memory access will be better, not worse:
Mark Papermaster said:
the architecture is aimed at providing a generational improvement in overall latency to memory. The architecture with the central IO chip provides a more uniform latency and it is more predictable.

Every piece of information we have depicts a Zen 2 core that is bound to be at least slightly better per clock than Zen+ at gaming. Even if it's just a 10% performance increase, it means Zen 2 becomes more compelling for gaming, except maybe for powering $1200 GPUs.

Let's assume Zen 2 is 2-3% faster clock-for-clock than Zen+ at gaming and clocks are right at 4.5GHz where you predict. That's a 10% increase in clocks and likely an 8-10% increase in performance as we factor in frequency diminishing returns vs. the 2-3% IPC gain.
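
The percentages used in the replies above can be checked with a few lines. This only redoes the arithmetic on numbers already quoted in this post (2050 and 2200 scores, 4.4 and 4.5 GHz guesses, 3.4 and 3.7 GHz for the original Ryzen demo vs. retail); the helper name is just for illustration.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new / old - 1.0) * 100.0

# Predicted release vs. guessed sample: score 2050 -> 2200, clock 4.4 -> 4.5 GHz.
print(f"score: {pct_change(2050, 2200):+.1f}%")          # +7.3%
print(f"clock: {pct_change(4.4, 4.5):+.1f}%")            # +2.3%

# Original Ryzen: 3.4 GHz demo -> 3.7 GHz all-core boost on final silicon.
ryzen_jump = 3.7 / 3.4
print(f"demo-to-retail: {pct_change(3.4, 3.7):+.1f}%")   # +8.8%
print(f"4.4 GHz scaled: {4.4 * ryzen_jump:.1f} GHz")     # 4.8 GHz
```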
 

BigDaveX

Senior member
Jun 12, 2014
440
216
116
Here they didn't show a gaming win, just "yes we can game and our performance is ok".
I have a bad feeling about that.
Because when they tipped their hand too early in the past, it resulted in Intel pulling things like Northwood-C and the Extreme Edition out of the bag in an effort to blunt the impact of AMD's offerings. They've learned that it's best to underplay their hand - like how they said Zen was targeting an IPC uplift of 40% over Bulldozer, when they almost certainly knew full well that the actual uplift was nearer 55%.
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
Because when they tipped their hand too early in the past, it resulted in Intel pulling things like Northwood-C and the Extreme Edition out of the bag in an effort to blunt the impact of AMD's offerings. They've learned that it's best to underplay their hand - like how they said Zen was targeting an IPC uplift of 40% over Bulldozer, when they almost certainly knew full well that the actual uplift was nearer 55%.
Yep, better to under-promise and over-deliver than the other way around.
 
Reactions: DarthKyrie

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
If Zen 2 has no ipc increase in fp over zen+ then it has to have been running near 4.4Ghz thereabouts to score 2050+ in CBR15. You claim there are no fp tweaks so increased frequency is the only logical answer here.

You do understand that changes to the uncore affect IPC for both Int and FP operations?

So AMD could have left the FPU completely unchanged from Zen1, redone the uncore and obtained a sizeable IPC improvement as a result.
 

DrMrLordX

Lifer
Apr 27, 2000
22,027
11,606
136
Doesn't double size load store units and double FMA size increase 128 bit execution?

Going from 128-bit FMACs to 256-bit FMACs won't improve any performance except 256-bit SIMD.

If Zen 2 has no ipc increase in fp over zen+ then it has to have been running near 4.4Ghz thereabouts to score 2050+ in CBR15. You claim there are no fp tweaks so increased frequency is the only logical answer here.

Zen2 should have ~15% higher IPC in int and fp for reasons not related to SIMD. IPC increase in AVX2 workloads should be significantly higher - possibly in the range of 50-60%, maybe higher. Look at Ivy Bridge and Haswell for an example of what we could see (compare Ivy Bridge AVX128 to Haswell AVX2).
 

AAbattery

Member
Jan 11, 2019
27
56
91
I wonder why PC World was told they cannot show the backside of the new Ryzen package. Gordon mentions it about one minute into their video.
 
Reactions: amd6502