Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


yuri69

Senior member
Jul 16, 2013
437
717
136
Wouldn't we expect generational improvements to be in the range 20-30% single core?
Not really. Zen 3 is also a new gen based on a new architecture. The quoted IPC gain figure is 19%. Note that this <20% increase was measured against Zen 2, which is still a first-gen product in terms of CCX and L3 topology.
 

yuri69

Senior member
Jul 16, 2013
437
717
136
Oh, my bad.

But still, even 6.5GHz would mean only ~11% above the current 5.85GHz. Is such a figure viable on TSMC 4nm?

Beating the 6GHz mark means ~3%.
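Those percentages are plain frequency ratios. A minimal sketch, assuming the 5.85GHz baseline quoted above; the helper name `clock_uplift` is made up for illustration:

```python
def clock_uplift(new_ghz: float, old_ghz: float) -> float:
    """Return the relative frequency gain of new_ghz over old_ghz, in percent."""
    return (new_ghz / old_ghz - 1.0) * 100.0


BASELINE_GHZ = 5.85  # current peak clock quoted in the post

for target_ghz in (6.0, 6.5):
    gain = clock_uplift(target_ghz, BASELINE_GHZ)
    print(f"{target_ghz:.2f} GHz -> +{gain:.1f}% over {BASELINE_GHZ} GHz")
```

This gives roughly +2.6% for 6GHz and +11.1% for 6.5GHz, which matches the rounded ~3% and ~11% above.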
 

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,237
136
1800X --> 4GHz
1800X --> 3950X (15% IPC * 15% Clocks @ 4.6 GHz) --> 32% perf
3950X --> 5950X (19% IPC * 6% Clocks @ 4.9 GHz) --> 27% perf
5950X --> 7950X (13% IPC * 16% Clocks @ 5.7 GHz) --> 31% perf
Zen 3 just a bit behind in generational gain.

From above trends +30% perf seems possible but it is just a projection not guaranteed.
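The "IPC * Clocks" figures above compound multiplicatively rather than add. A rough sketch using only the rounded percentages from the list; `combined_gain` is a hypothetical helper, not anything AMD publishes:

```python
def combined_gain(ipc_pct: float, clock_pct: float) -> float:
    """Compound an IPC gain and a clock gain (both in percent) into one perf gain."""
    return ((1 + ipc_pct / 100) * (1 + clock_pct / 100) - 1) * 100


# (label, IPC gain %, clock gain %) as listed in the post
generations = [
    ("1800X -> 3950X", 15, 15),
    ("3950X -> 5950X", 19, 6),
    ("5950X -> 7950X", 13, 16),
]

for label, ipc, clock in generations:
    print(f"{label}: about {combined_gain(ipc, clock):.0f}% perf")
```

With the rounded inputs the middle step comes out nearer 26% than 27%; the difference is just rounding of the clock ratio (4.9 / 4.6 is closer to 6.5% than 6%).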

But still, even 6.5GHz would mean only ~11% above the current 5.85GHz. Is such a figure viable on TSMC 4nm?

Beating the 6GHz mark means ~3%.
I believe 4~5% at best.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
BREAKING: RISC-V conference held by Tenstorrent accidentally leaks Zen5 performance, and also includes NVIDIA Grace, which is still being projected


View attachment 79037
Note they also have a typo. "Xenon" instead of (which?) "Xeon". I think the far more likely explanation is that the presentation is just a bit shoddily put together, and the Zen 5 number is a projection, not insider knowledge.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
AMD was targeting a 40%+ IPC bump from the Zen 1 core. Excavator => Zen 1 was ~50%; Zen 1 => Zen 3 was ~41%; Zen 3 => Zen 5 could be ~40%, which puts it at around 23-25% higher IPC vs vanilla Zen 4.
This reads right. The AMD fan in me hopes they can hit a higher number. It's why I'm hesitant to buy a few 7950Xs, but I can't bear another summer with hot Intel systems, even with two sources of AC to cool things down.
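For anyone checking the 23-25% figure: it is what is left of a ~40% Zen 3 => Zen 5 target once Zen 4's share is divided out. A small sketch, assuming the 13% Zen 3 => Zen 4 IPC figure used elsewhere in this thread; `remaining_gain` is a made-up helper name:

```python
def remaining_gain(total_pct: float, achieved_pct: float) -> float:
    """Gain still needed (percent) to reach total_pct after achieved_pct is banked."""
    return ((1 + total_pct / 100) / (1 + achieved_pct / 100) - 1) * 100


ZEN3_TO_ZEN5_PCT = 40.0  # speculated target from the quoted post
ZEN3_TO_ZEN4_PCT = 13.0  # IPC figure used elsewhere in this thread (assumption)

zen4_to_zen5 = remaining_gain(ZEN3_TO_ZEN5_PCT, ZEN3_TO_ZEN4_PCT)
print(f"Zen 4 -> Zen 5 would need about {zen4_to_zen5:.1f}% IPC")
```

That lands just under 24%, inside the quoted range. Note the gains divide rather than subtract: 40 - 13 = 27 would overstate it.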
 

Joe NYC

Platinum Member
Jun 26, 2021
2,331
2,942
106
Note they also have a typo. "Xenon" instead of (which?) "Xeon". I think the far more likely explanation is that the presentation is just a bit shoddily put together, and the Zen 5 number is a projection, not insider knowledge.
Yes.

And I would also be curious about the numbers for Grace, which I also don't believe.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
Zen5 might be insider knowledge...
Personal information removed
They might have yoinked the Zen5 team from India.

The Zen5 numbers might be actual. Beating 6 ALU designs from ARM's/NVIDIA's V2/Grace and Tenstorrent's Ascalon in "Scalar Competition Landscape."

So, I assume it has a better front-end/load-store and at least 6 ALUs to get better "Scalar" scores than any architecture listed.
 

Attachments: tenszen5.png, tenszen5_2.png, tenszen5_3.png
Last edited by a moderator:

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,237
136
Zen5 might be insider knowledge...

They might have yoinked the Zen5 team from India.

The Zen5 numbers might be actual. Beating 6 ALU designs from ARM's/NVIDIA's V2/Grace and Tenstorrent's Ascalon in "Scalar Competition Landscape."
View attachment 79101
So, I assume it has a better front-end/load-store and at least 6 ALUs to get better "Scalar" scores than any architecture listed.
Just an anecdote to share: there is a high-level manager from AMD (I don't want to name names) working out of Bangalore, India, who went to Tenstorrent and took a few people from his team with him.

This is the state of the progressions of AMD Zen Cores (I have CPUs from all the Zen generations, 1700X, 3400G, 3900X, 5950X, 7950X and I know how they perform relative to each other.)
1800X --> 3950X (15% IPC * 15% Clocks @ 4.6 GHz) --> 32% perf
3950X --> 5950X (19% IPC * 6% Clocks @ 4.9 GHz) --> 27% perf
5950X --> 7950X (13% IPC * 16% Clocks @ 5.7 GHz) --> 31% perf

For server I have the 7571 (planning to go to Genoa), and below is what I calculate for EPYC:
7601 (3.2G) --> 7742 (3.4G)(15% IPC * 6% Clocks) --> 21% perf
7742 (3.4G) --> 7763 (3.5G)(19% IPC * 3% Clocks) --> 22% perf
7763 (3.5G) --> 9554 (3.75G)(13% IPC * 7% Clocks) --> 23% perf
These are standard SKUs, then there are the F SKUs which are clocked much higher.
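The EPYC steps can be redone from the listed clocks directly instead of the rounded clock percentages. A sketch under that assumption; `perf_gain` is a hypothetical helper and the IPC figures are the ones in the list:

```python
def perf_gain(ipc_pct: float, old_ghz: float, new_ghz: float) -> float:
    """Perf gain in percent from an IPC gain (percent) and a clock change (GHz)."""
    return ((1 + ipc_pct / 100) * (new_ghz / old_ghz) - 1) * 100


# (label, IPC gain %, old clock GHz, new clock GHz) as listed in the post
epyc_steps = [
    ("7601 -> 7742", 15, 3.2, 3.4),
    ("7742 -> 7763", 19, 3.4, 3.5),
    ("7763 -> 9554", 13, 3.5, 3.75),
]

for label, ipc, old_ghz, new_ghz in epyc_steps:
    print(f"{label}: about {perf_gain(ipc, old_ghz, new_ghz):.1f}% perf")
```

With these inputs the three steps come out around 22%, 22.5% and 21%, so the rounded figures in the list are within a point or two either way.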

However, from Tenstorrent slides
Zen 1/Naples (4.30) --> Zen 2/Rome (4.56) --> 6% Spec2017 Int perf?
Zen 2/Rome (4.56) --> Zen 3/Milan (5.91) --> 29 % Spec2017 Int perf
Zen 3/Milan (5.91) --> Zen 4/Genoa (6.8) --> 15% Spec2017 Int perf?

You can see numbers are all over the place.
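For reference, the slide percentages are just ratios of consecutive scores. A quick sketch using only the four numbers quoted above (variable names are mine):

```python
# Per-core SPEC2017 Int scores as read off the Tenstorrent slide in this thread
slide_scores = {
    "Zen 1/Naples": 4.30,
    "Zen 2/Rome": 4.56,
    "Zen 3/Milan": 5.91,
    "Zen 4/Genoa": 6.8,
}

names = list(slide_scores)
slide_gains = []
for prev, curr in zip(names, names[1:]):
    gain_pct = (slide_scores[curr] / slide_scores[prev] - 1) * 100
    slide_gains.append(gain_pct)
    print(f"{prev} -> {curr}: {gain_pct:.1f}%")
```

That gives roughly 6%, 30% and 15%, against a fairly steady 21-23% per generation from the IPC and clock estimate for the EPYC parts.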


While I am excited for RISC-V this slide looks like a big marketing nothing.
And like I mentioned before V2 does not come close to Milan nor is Genoa getting beaten by SPR.
And how come Zen 5 is not 'projected' performance but Grace performance is projected?
When NV themselves provided the SPECrate2017_int_base estimate for it already.


EPYC Milan 7763 with 64 Cores is much higher (>400) than NV Single Grace CPU with 72 Cores. (This result is with AOCC compiler, but many older AOCC optimizations made it to GCC/LLVM now)

I would not look too much into this disclosure.
 
Last edited:

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
I would not look too much into this disclosure.
The reason the results might differ is that this is a pure mid-core/"Scalar" benchmark rather than SIMD/Vector.

So, the benchmark given is only for front-end/mid-core/back-end, and not the FP/Vec units that can do packed integer, which ARM/Intel optimized for in GCC 12, etc. The benchmark is pure base instruction sets. Since RVV isn't at 2.0 and thus isn't fully frozen, they would probably be trounced in SIMD benchmarks.
2x Vec256 (Ascalon)
vs
2x Vec512(3x Vec256) (GoldencoveX)
vs
6x Vec256(dual-pump Vec512) (Zen4)

Contextual clues with "Scalar" in the title clearly imply that they aren't using SIMD in this specint_rate bench.
And how come Zen 5 is not 'projected' performance but Grace performance is projected?
Tenstorrent have architects from Zen5 at Tenstorrent, but they didn't bother testing Graviton3(Neoverse V1)/Grace(Neoverse V2), both of which are projected.
 
Last edited:

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,237
136
2x Vec256 (Ascalon)
vs
2x Vec512(3x Vec256) (GoldencoveX)
vs
6x Vec256(dual-pump Vec512) (Zen4)
Are you sure SPEC CPU®2017 Integer has a wide vector test otherwise (outside of this slide)? If that were the case, AVX CPUs like SPR/Genoa would decimate everything else.
The other incoherent thing is that Genoa is stronger in int than in float vs SPR, but apparently not in this slide.
 
Last edited:
Reactions: Tlh97 and Joe NYC

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
Are you sure SPEC CPU®2017 Integer has a wide vector test otherwise (outside of this slide)? If that were the case, AVX CPUs like SPR/Genoa would decimate everything else.
The other incoherent thing is that Genoa is stronger in int than in float vs SPR, but apparently not in this slide.
https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/gcc-12 (Rate=1)
https://www.intel.com/content/www/u...ng-innovation-and-performance-with-gcc12.html (not Rate=1)

Biggest improvement of better SIMD/vectorization support shows up in:

For Tenstorrent to get the SIMD Auto-vec stuff they would need this patch and/or GCC 14(too late for GCC 13) => https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613260.html

The only way to get 1:1 benchmarks is to turn off SIMD.

Which is why this doesn't include Vector(SIMD), hence focusing on Scalar(Superscalar) performance instead.

x86-64 has auto-vec, ARM has auto-vec, RISC-V does not have auto-vec. They most likely can't run packed SIMD within RVV since the ISA for it isn't frozen. It doesn't seem wise to project SIMD performance until there is some basis for measuring it, which isn't here yet. Given "Scalar" in the title, it makes the most sense that the benchmark only uses the integer units in the mid-core and nothing in the FPU for packed int on each architecture. The highest scores are the ones with more ALUs, not more SIMD units.

Zen2(17h v2) -> Zen3(19h v1) ~~ 1.296x <==
Zen1(17h v1) -> Zen3(19h v1) ~~ 1.374x <--
Zen4(19h v2) -> Zen5(1Ah v1) ~~ 1.3x <==
Zen3(19h v1) -> Zen5(1Ah v1) ~~ 1.496x <--
It is well within the margin of prior improved core[V2 => V1]. It however exceeds the gen1 to gen1[V1 -> V1].
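A quick consistency check on those multipliers, assuming the Milan (5.91) and Genoa (6.8) slide scores quoted earlier in the thread. The implied Zen 5 score below is derived from the multipliers, not read off the slide:

```python
MILAN_SCORE = 5.91  # Zen 3 slide score quoted earlier in the thread
GENOA_SCORE = 6.8   # Zen 4 slide score quoted earlier in the thread

ZEN4_TO_ZEN5 = 1.3    # multiplier from the list above
ZEN3_TO_ZEN5 = 1.496  # multiplier from the list above

# Two ways to back out the same Zen 5 score; they should agree if the list is consistent
implied_from_genoa = GENOA_SCORE * ZEN4_TO_ZEN5
implied_from_milan = MILAN_SCORE * ZEN3_TO_ZEN5

# Chaining Zen 3 -> Zen 4 -> Zen 5 should reproduce the Zen 3 -> Zen 5 multiplier
chained = (GENOA_SCORE / MILAN_SCORE) * ZEN4_TO_ZEN5

print(f"Implied Zen 5 score: {implied_from_genoa:.2f} vs {implied_from_milan:.2f}")
print(f"Chained Zen 3 -> Zen 5: {chained:.3f}x")
```

Both routes give an implied score of about 8.84, and the chained multiplier is about 1.496x, so the four ratios are at least internally consistent.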

Edit: Maybe, it does use SIMDs... https://www.anandtech.com/show/16778/amd-epyc-milan-review-part-2/6
Xeon score is the same as 8380
Milan score is the same as 7763
Rome score is the same as 7742
Graviton2 doesn't seem to be based on Anandtech.

¯\_(ツ)_/¯​

 
Last edited:

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,237
136




Is this Strix's LLC/IFC?
Quite intriguing that they designed the LLC with a CCS interface. It can plug right into an SDP instead of a UMC, or it can plug into a GMI interface.
Really interesting as well to see if console APUs go for this.
Could be an adaptation of MCD?

20230105709 : CACHE ALLOCATION POLICY
 

Doug S

Platinum Member
Feb 8, 2020
2,486
4,049
136
However, from Tenstorrent slides
Zen 1/Naples (4.30) --> Zen 2/Rome (4.56) --> 6% Spec2017 Int perf?
Zen 2/Rome (4.56) --> Zen 3/Milan (5.91) --> 29 % Spec2017 Int perf
Zen 3/Milan (5.91) --> Zen 4/Genoa (6.8) --> 15% Spec2017 Int perf?

You can see numbers are all over the place.


Typically companies use the latest and greatest compiler (their own or whatever their standard is) to produce SPEC benchmarks. However, they typically don't re-run the benchmarks on old hardware with newer compilers. So the benchmarks tend to include compiler advances along with hardware improvements. Most of the time this will affect FP and SIMD more than generic integer code.

The amount of effort they put forth to find optimal settings matters too, at least for peak results (I always ignore those and look at base), so the results might not be as good as they could be if they went all out. Back in the RISC days vendors used to consider SPEC the gold standard benchmark and put a lot of work into making the results look as good as possible. Nowadays I don't think Intel & AMD care all that much about SPEC; they put most of their effort into other benchmarks.

Some will say that benchmarks using the same code for everything, like Geekbench (at least within versions like 5.0, 6.0, etc.), are superior to SPEC for this reason. But when there are major changes in an architecture (like P4 to Core, Bulldozer to Zen, x86-32 to x86-64) you often need to update the compiler to fully exploit them, and you just won't see the full effect with fixed code. Though to be fair, you also won't see the full effect until the applications you care about have been updated.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
View attachment 79110
View attachment 79111


Is this Strix's LLC/IFC?
Quite intriguing that they designed the LLC with a CCS interface. It can plug right into an SDP instead of a UMC, or it can plug into a GMI interface.
Really interesting as well to see if console APUs go for this.
Could be an adaptation of MCD?

20230105709 : CACHE ALLOCATION POLICY
That looks much different to what I would have expected. So am I seeing this right that only one of the MCs leads to/from the LLC? What implications does this have? What is this about?
 

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,237
136
That looks much different to what I would have expected. So am I seeing this right that only one of the MCs leads to/from the LLC? What implications does this have? What is this about?
Just a typical patent thing, trying to cover as many scenarios as possible. But as stated somewhere within, the other memory interfaces may or may not be present.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
As I have said elsewhere, that guy is the epitome of failing upwards.
I was hesitant posting that news this morning. Raja, for a long time, has been a figure that either garners hatred or praise, and sometimes indifference. He's done his own good and bad. I would argue that RDNA was his brainchild due to how long it takes for a GPU or CPU to come to fruition. Without having access to internal documents it's difficult to say what he had his fingers in. Jim Keller to me seems like a no BS type of guy and he wouldn't have had Raja on his board if he was a tool or useless individual like so many claim he is. The infamy surrounding Raja travels with him heavy like the scent of a moonflower in the summer evening intoxicating all around it with its simplicity through uniqueness.
 
Reactions: BorisTheBlade82

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
Hasan's trash heap has an interesting article pointing at Jim Keller as giving some insight into Zen 5 performance estimates. It's in line with the estimates here of 25% IPC over vanilla Zen 4, hopefully a little more. It would be like Zen 2 to Zen 3, but far greater, placing yet another boot on Intel's throat and not allowing them breathing room. Going by what I remember from the Intel rumors, Zen 5 should leave Intel's server, workstation, mainstream and mobile lineups dead in the water unless they can pull off one big surprise.
 