Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
We'll have our preview with STX Halo soon enough I guess.
The uncore is just pure cope at this point.
Discussing the bits and pieces of architectural weaknesses and how to overcome them is not coping. I don't know why this word is very much used in the shillicon twitter universe.

I am saying this based on the visible improvements in SPEC int, at fixed clock, when Z4 is equipped with 3D V-Cache. I would think removing the uncore bottlenecks which 3D V-Cache attempts to work around would improve the situation, until the next bottleneck at least.
Also Z4 in MI300A benefits from the LLC prefetching as per AMD themselves.
 

CouncilorIrissa

Senior member
Jul 28, 2023
520
1,995
96
Discussing the bits and pieces of architectural weaknesses and how to overcome them is not coping. I don't know why this word is very much used in the shillicon twitter universe.

I am saying this based on the visible improvements in SPEC int, at fixed clock, when Z4 is equipped with 3D V-Cache. I would think removing the uncore bottlenecks which 3D V-Cache attempts to work around would improve the situation, until the next bottleneck at least.
Also Z4 in MI300A benefits from the LLC prefetching as per AMD themselves.
I don't mean that discussing it is cope; it isn't. I meant that the uncore is just poor. It's downright funny that a CCD is unable to use all of the memory bandwidth because of a single GMI3 link.
 

moinmoin

Diamond Member
Jun 1, 2017
5,063
8,025
136
Well, it is not exactly stellar; saying it is a mild improvement is being too generous considering the time frame involved.

I am mostly looking at Alexander Yee's blog to make this statement.

Other than AVX512 there is not much improvement.
Yee did point out huge improvements in scalar integer though. It's everything in-between that's stagnating.

The bigger issue is that AMD continues to slip on its cadence, which was supposed to be under 18 months. But competition in DC appears not to be strong enough for AMD to keep that up, causing it to lag more and more in mobile (and desktop, as far as Apple and ARM can be considered competition there already). It wouldn't matter as much if there were a realistic chance of the cadence catching up, but it seems to get worse instead.
 

inquiss

Member
Oct 13, 2010
179
261
136
So this is another typical AMD launch. A couple of users overhype the product. Others fall for this hype. When the product is actually released, everyone feels disappointed. For me the performance meets the expectation from the architectural perspective.
It is well known that it is very difficult to increase integer IPC. The number of general purpose registers is a bottleneck. More read/write ports will help, but that may also increase power usage. As I said before, we need to wait for the APX instruction set implementation before we see a huge IPC increase.
Having said that, there is still lots of potential left in AVX. With AVX512 they can probably go over 16 execution units.

My real disappointment is there is no 24/32 core AM5 Zen5 CPU.
How would you feed these cores in AM5?
 

gdansk

Platinum Member
Feb 8, 2011
2,839
4,221
136
Wouldn't matter as much if there were a realistic chance of the cadence catching up, but it seems to get worse instead.
It's about the same time-between-releases as Zen 4. But for this length of time people expect bigger gains (even if the process uplift was smaller).
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
Yee did point out huge improvements in scalar integer though
But it is constrained by memory bandwidth; that is, they have the throughput only as long as there is no data to be fetched from somewhere lower in the memory hierarchy.
But they kept the L2 at 1MiB and kept the L2 to L3 at 32B/cycle. So no respite there either.
 

moinmoin

Diamond Member
Jun 1, 2017
5,063
8,025
136
It's about the same time-between-releases as Zen 4.
Which was known to be delayed to account for CXL. So Zen 5 spending the same time means it is actually doubly delayed instead of catching up with the intended cadence.

But it is constrained by memory bandwidth; that is, they have the throughput only as long as there is no data to be fetched from somewhere lower in the memory hierarchy.
But they kept the L2 at 1MiB and kept the L2 to L3 at 32B/cycle. So no respite there either.
That was to be expected though considering we already knew bigger uncore/IO changes would only happen with Zen 6 going by previous gens.
 
Reactions: Tlh97 and Joe NYC

gdansk

Platinum Member
Feb 8, 2011
2,839
4,221
136
Which was known to be delayed to account for CXL. So Zen 5 spending the same time means it is actually doubly delayed instead of catching up with the intended cadence.
Let's just put it this way. Only once did a Zen land on time and that was Zen 3. And Zen 5 is right on average. If all Zens but one are delayed then well, what's the exception? It isn't Zen 5.
 

DrMrLordX

Lifer
Apr 27, 2000
22,000
11,560
136
Maybe you need to know your stuff to interpret what any one particular result means, but if you're just looking at the average it really doesn't matter. When it comes to distilling performance to a single number, regression to the mean is real, and sample size is king.

When you have 50 or more individual benchmarks with bizarre performance profiles that don't match anything you do irl, it taints the geomean. Phoronix has been like this for years.

Looking at Phoronix results today, it seems like Turin-D is going to be really, really good for cloud customers. It's a shame these kinds of improvements didn't translate well to the consumer side (excluding a couple of areas like browsers).

No surprises on server side, but also consider how many office monkeys spend all day running bloated javascript/electron crap. It will be the GOAT for those people. And there are a LOT of them. AMD knew exactly what they were doing. Once again, AMD produces products that are not "for us".
 

inquiss

Member
Oct 13, 2010
179
261
136
For Zen 3, DDR4 4000 was the sweet spot; for Zen 5 it's 6000. So that's 50% more bandwidth. The cores are also higher performance, so they need more memory bandwidth. You can't have more than 16 cores without increasing the memory channels, which taxes everyone on the platform. If you need more cores or bandwidth you go to TR.
 

JustViewing

Senior member
Aug 17, 2022
216
382
106
For Zen 3, DDR4 4000 was the sweet spot; for Zen 5 it's 6000. So that's 50% more bandwidth. The cores are also higher performance, so they need more memory bandwidth. You can't have more than 16 cores without increasing the memory channels, which taxes everyone on the platform. If you need more cores or bandwidth you go to TR.
I am using DDR4 3200 with a 5950X, so DDR5 6400 should be enough for 24, if not 32, Zen 5 cores. Remember L2 was increased in Zen4, so it will relieve some of the memory pressure. Sure, more memory bandwidth will help, but what matters is whether 24/32 cores will outperform 16 cores in multi-threaded applications. If, even with limited bandwidth, a 24-core Zen5 beats a 16-core Zen5 within the same power/bandwidth envelope, it is a win for the user.
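To put rough numbers on the per-core bandwidth argument, here is a minimal back-of-the-envelope sketch. It assumes a dual-channel platform with 8 bytes (64 data bits) per channel per transfer, and it assumes bandwidth is split evenly across cores; the function names are made up for illustration. These are theoretical peaks, not measurements, and they ignore cache effects and any fabric link limits.

```python
# Theoretical peak DRAM bandwidth per core (illustrative only, not measured).
# Assumptions: 2 memory channels, 8 bytes per channel per transfer,
# bandwidth shared evenly between all cores.

def peak_bandwidth_gbs(mt_per_s: int, channels: int = 2,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for a transfer rate in MT/s."""
    return mt_per_s * bytes_per_transfer * channels / 1000


def per_core_gbs(mt_per_s: int, cores: int) -> float:
    """Peak bandwidth in GB/s available to each core under an even split."""
    return peak_bandwidth_gbs(mt_per_s) / cores


if __name__ == "__main__":
    # (label, transfer rate in MT/s, core count) for the cases in the thread
    configs = [
        ("DDR4-3200, 16 cores", 3200, 16),
        ("DDR5-6400, 24 cores", 6400, 24),
        ("DDR5-6400, 32 cores", 6400, 32),
    ]
    for label, rate, cores in configs:
        print(f"{label}: {peak_bandwidth_gbs(rate):.1f} GB/s total, "
              f"{per_core_gbs(rate, cores):.2f} GB/s per core")

    # Ratio behind the "50% more bandwidth" claim (DDR4-4000 vs DDR5-6000)
    ratio = peak_bandwidth_gbs(6000) / peak_bandwidth_gbs(4000)
    print(f"DDR5-6000 vs DDR4-4000: {ratio:.2f}x")
```

Under these assumptions, 32 cores on DDR5 6400 get the same theoretical peak per core as 16 cores on DDR4 3200 (3.2 GB/s), and 24 cores get somewhat more. Whether that is enough depends on how much more bandwidth each newer core actually demands, which this sketch does not model.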
 
Reactions: Timmah!

Timmah!

Golden Member
Jul 24, 2010
1,510
824
136
For Zen 3, DDR4 4000 was the sweet spot; for Zen 5 it's 6000. So that's 50% more bandwidth. The cores are also higher performance, so they need more memory bandwidth. You can't have more than 16 cores without increasing the memory channels, which taxes everyone on the platform. If you need more cores or bandwidth you go to TR.
You know you don't have to buy a hypothetical 24-core product if you feel it's too constrained by memory bandwidth, right?
 
Reactions: JustViewing

inquiss

Member
Oct 13, 2010
179
261
136
I am using DDR4 3200 with a 5950X, so DDR5 6400 should be enough for 24, if not 32, Zen 5 cores. Remember L2 was increased in Zen4, so it will relieve some of the memory pressure. Sure, more memory bandwidth will help, but what matters is whether 24/32 cores will outperform 16 cores in multi-threaded applications. If, even with limited bandwidth, a 24-core Zen5 beats a 16-core Zen5 within the same power/bandwidth envelope, it is a win for the user.
It would be a pointless product that nobody would buy. Partly because no one buys the 16 core chips anyway, but additionally because they would be severely bandwidth constrained. If you've read this thread you can see that the current chips seem bandwidth constrained as they are already.
 

inquiss

Member
Oct 13, 2010
179
261
136
You know you don't have to buy a hypothetical 24-core product if you feel it's too constrained by memory bandwidth, right?
I wouldn't, and I'm not going to, because AMD isn't releasing it. How many people are going to be interested in an underperforming 24 or 32 core chip that's so memory constrained? Either you want the chip to work (get EPYC or Threadripper) or you don't. Not many people are in the "I want to buy a high core count processor and would buy it even if it's memory starved" camp.
 

Timmah!

Golden Member
Jul 24, 2010
1,510
824
136
I wouldn't, and I'm not going to, because AMD isn't releasing it. How many people are going to be interested in an underperforming 24 or 32 core chip that's so memory constrained? Either you want the chip to work (get EPYC or Threadripper) or you don't. Not many people are in the "I want to buy a high core count processor and would buy it even if it's memory starved" camp.
Aren't you constrained by memory bandwidth only when you saturate the entire available memory?
 

Timmah!

Golden Member
Jul 24, 2010
1,510
824
136
No? You're memory bandwidth constrained when you can't get things out of the memory fast enough.
All right then. There are still tasks, like 3D rendering, that do not benefit significantly from faster RAM but would benefit immensely from additional cores. RAM speed perhaps becomes an important factor when you run out of RAM and data needs to be fetched from the drive, but that's better resolved by more RAM anyway.
 

CakeMonster

Golden Member
Nov 22, 2012
1,492
653
136
Back during Z3 I was all aboard the MOAR CORES train, as it seemed even games were rapidly using 50% of threads, suggesting they were maxing the main cores and reaching into SMT. But the later generations have proved that better cores do make up for a lot of those scenarios. I think we'll be perfectly fine with 16c/32t for the duration of Z6 (up to 44~48 months from now). However, if the Z6 release slips, or it does not improve much IPC-wise, I could turn out to be wrong. I'm much more worried about the IPC race and cache now after the initial Z5 results; hopefully we'll get 12c and 16c X3D models without the heterogeneous cores and the mess that is thread prioritization of those now.
 
Reactions: Tlh97 and marees