Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Page 785

Josh128

Senior member
Oct 14, 2022
273
394
96
For one, we have no idea of the % load of these 2, so it's virtually meaningless. I have a 64 core Turin on the way, it will be here Friday, so I will update my Turin build thread at that time. (Well, after a couple of days to put it in.)
How is that even possible considering it hasn't even been announced yet? lol Serious question, what are the requirements for getting a sample like that?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,049
15,191
136

eek2121

Diamond Member
Aug 2, 2005
3,098
4,386
136
@Timmah! probably as designed, but it really doesn't matter as much as folks claim. Zen 5 is faster than Zen 4 (how much depends on your workloads: gamers are not happy... yet, AI guys are super happy, and everyone else is between those two groups). The chip is generally an upgrade in many scenarios, albeit a minor one.

Honestly, EPYC and Threadripper are going to be exciting for non-gamers/non casual users. For everyone that doesn't need an upgrade now? Just wait.

Also, (sorry, offtopic) side note: I guess SarahKerrigan is gone?
 

Josh128

Senior member
Oct 14, 2022
273
394
96
So the higher power consumption of Zen 5 carries to EPYC. 9654 Genoa was 2.4GHz base, 360W TDP. 9655 Turin is 2.6GHz base, but 400W(!) TDP. That's 11% more power. I'd be very curious to see the average uplift; it probably won't be much more than 11%, so the perf/power will likely be very similar to last gen.
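For anyone who wants to check the arithmetic, here is a minimal Python sketch of the percentages in the post above. The TDP and base clock figures are the ones quoted there. The helper names and the 20% uplift used at the end are hypothetical, chosen only to show how perf/W would move, and are not a measured result.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def perf_per_watt_change(perf_uplift_pct: float, power_increase_pct: float) -> float:
    """Change in perf/W (percent) for a given performance uplift and power increase."""
    ratio = (1 + perf_uplift_pct / 100.0) / (1 + power_increase_pct / 100.0)
    return (ratio - 1) * 100.0


# Figures quoted in the post: EPYC 9654 (Genoa) vs EPYC 9655 (Turin).
tdp_increase = percent_change(360, 400)         # default TDP in watts
base_clock_increase = percent_change(2.4, 2.6)  # base clock in GHz

print(f"TDP: +{tdp_increase:.1f}%")
print(f"Base clock: +{base_clock_increase:.1f}%")

# Hypothetical uplift for illustration only, not a benchmark result.
HYPOTHETICAL_UPLIFT_PCT = 20.0
print(f"perf/W change at +{HYPOTHETICAL_UPLIFT_PCT:.0f}% perf: "
      f"{perf_per_watt_change(HYPOTHETICAL_UPLIFT_PCT, tdp_increase):+.1f}%")
```

If the uplift only matches the TDP increase, perf/W comes out flat, which is the scenario the post describes. Anything above that is a net efficiency gain, assuming both chips actually draw their rated TDP.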

What's going to be really interesting is what 128 core 3nm Zen 5C (if they make such a SKU) can do vs 128 core 4nm Zen 4C. I expect it to be the most interesting comparison of this whole generation.
 

marees

Senior member
Apr 28, 2024
331
401
96
"While I only dug into additional performance monitoring events on a few cores, I hope AMD will increase the L1 instruction cache’s size later on. Going to 64 KB probably isn’t enough. Apple and Qualcomm have gone for very large 192 KB L1 instruction caches, and with good reason. But even that won’t be enough to contain the instruction footprint on certain workloads."

 

eek2121

Diamond Member
Aug 2, 2005
3,098
4,386
136
So the higher power consumption of Zen 5 carries to EPYC. 9654 Genoa was 2.4GHz base, 360W TDP. 9655 Turin is 2.6GHz base, but 400W(!) TDP. That's 11% more power. I'd be very curious to see the average uplift; it probably won't be much more than 11%, so the perf/power will likely be very similar to last gen.

What's going to be really interesting is what 128 core 3nm Zen 5C (if they make such a SKU) can do vs 128 core 4nm Zen 4C. I expect it to be the most interesting comparison of this whole generation.
See earlier posts. Much greater than 11%.
 
Reactions: lightmanek

itsmydamnation

Platinum Member
Feb 6, 2011
2,907
3,517
136
So the higher power consumption of Zen 5 carries to EPYC. 9654 Genoa was 2.4GHz base, 360W TDP. 9655 Turin is 2.6GHz base, but 400W(!) TDP. That's 11% more power. I'd be very curious to see the average uplift; it probably won't be much more than 11%, so the perf/power will likely be very similar to last gen.

What's going to be really interesting is what 128 core 3nm Zen 5C (if they make such a SKU) can do vs 128 core 4nm Zen 4C. I expect it to be the most interesting comparison of this whole generation.
TDP and base clock have to account for the AVX-512 implementation, which is substantially bigger than Zen 4's.
 
Reactions: lightmanek

Thunder 57

Platinum Member
Aug 19, 2007
2,945
4,467
136
"While I only dug into additional performance monitoring events on a few cores, I hope AMD will increase the L1 instruction cache’s size later on. Going to 64 KB probably isn’t enough. Apple and Qualcomm have gone for very large 192 KB L1 instruction caches, and with good reason. But even that won’t be enough to contain the instruction footprint on certain workloads."


I found that curious. Those chips lack uop caches, right? Hence the need for a larger L1i. Zen originally had a 64KB L1i, but AMD shrank it to 32KB in Zen 2 to make room for a bigger uop cache.
 

Hotrod2go

Senior member
Nov 17, 2021
349
232
86
These caches are indeed faster, and L1d has become bigger. (And it doesn't stop at these caches, as e.g. BTB is bigger, ITLB is bigger, µop cache is dual ported, ROB is larger...) *However*, a bigger L1 and faster L1/2/3 mostly benefit workloads which hit L1/2/3 a lot, whereas the returns diminish in workloads which have a sizable amount of cache misses to begin with. This goes without saying, but I am mentioning it because you brought up games. Many video games empirically benefit from an increase in L3 cache size and from a decrease in main memory latency. So they are more akin to the latter type of workload.
Yes, I get your point, but I don't play "many games". I'm a Bethesda game fan, and with their latest release, Starfield, and its Creation Engine 2, I notice an increase in fps with my 9700X using the same memory profile as before with Zen 4. So whether the increased cache efficiency at all levels is noticeable depends on the engine.
 

Hotrod2go

Senior member
Nov 17, 2021
349
232
86
You can load from L1 at 128B/c with AVX-512, 64B/c with AVX2 and 32B/c with pure scalar code. For Zen 4 it was 64B/c, 64B/c and 24B/c, respectively.
Now about AIDA: it's testing aggregate bandwidth, so comparing 6 cores to 8 cores will lead you to false conclusions, as the 8 core will most likely show a greater score simply due to having more cores.
Games won't see such a dramatic increase in BW as AIDA might lead you to believe, as they rarely use SIMD and none of them use AVX-512. For games, latency improvements are more welcome.
The latest non-beta version of AIDA64 warns the end user that it is not optimised for Zen 5 yet. So take my claims as a guide only at this point in time. I'm well aware synthetic benchmarks have some detachment from real world scenarios.
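To put the quoted bytes-per-cycle figures into more familiar units, here is a rough Python sketch that converts them to a theoretical per-core peak in GB/s. The B/c numbers are the ones from the quoted post. The 5.0 GHz clock and the function names are assumptions for illustration, and real code will not sustain these peaks.

```python
# Peak L1D load width in bytes per cycle per core, as quoted in the post above.
L1_LOAD_BYTES_PER_CYCLE = {
    "Zen 4": {"AVX-512": 64, "AVX2": 64, "scalar": 24},
    "Zen 5": {"AVX-512": 128, "AVX2": 64, "scalar": 32},
}

# Assumption for illustration only; not a measured or official clock.
ASSUMED_CLOCK_GHZ = 5.0


def peak_l1_load_gb_per_s(bytes_per_cycle: int, clock_ghz: float) -> float:
    """Theoretical per-core peak: bytes/cycle times 1e9 cycles/s gives GB/s."""
    return bytes_per_cycle * clock_ghz


def zen5_over_zen4(code_type: str) -> float:
    """Ratio of Zen 5 to Zen 4 peak load width for one code type."""
    table = L1_LOAD_BYTES_PER_CYCLE
    return table["Zen 5"][code_type] / table["Zen 4"][code_type]


for code_type in ("AVX-512", "AVX2", "scalar"):
    zen5_peak = peak_l1_load_gb_per_s(
        L1_LOAD_BYTES_PER_CYCLE["Zen 5"][code_type], ASSUMED_CLOCK_GHZ
    )
    print(f"{code_type}: Zen 5 peak {zen5_peak:.0f} GB/s per core, "
          f"{zen5_over_zen4(code_type):.2f}x Zen 4")
```

Only AVX-512 code sees the full doubling. AVX2 is unchanged and scalar code gains about a third, which fits the point that games are unlikely to see the bandwidth jump a synthetic test reports.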
 

StefanR5R

Elite Member
Dec 10, 2016
5,885
8,746
136
@StefanR5R , what NUMA settings do you recommend in the server BIOS for getting best performance out of consumer workloads/benchmarks, as opposed to NUMA-aware server workloads?
Impossible for me to say, because I have no experience of my own with Windows on NUMA machines, or even on machines with split last-level caches.

So Turin will use 12 channels of "sweet spot" DDR5-6000???

SWEEEEEEEEEEEETTTTTTTTTTTTT!!!!!!!
Nice, I thought the rumors said 5600 or something like that.
But timings of reg-ECC kits will probably be a shock to gamers to see. ;-) They should wait for Turin-X anyway. :-D

On the screenshot you can see that it is a 2 socket configuration, with each socket having 128 cores with SMT disabled. So this frequency doesn't have to be anything special.
View attachment 105843
The thing with xwitter references is, one might be unable to follow such links, or the web browser at hand might not show anything.

So the higher power consumption of Zen 5 carries to EPYC.
Higher power consumption compared to what? (Rhetorical question. Always look at work gotten done for the power.)

9654 Genoa was 2.4Ghz base, 360W TDP. 9655 Turin is 2.6GHz base, but 400W(!) TDP.
Gasp! (Not really.)
If you put a vectorized workload onto Genoa, then even the 64 core SKU still runs very efficiently when set to its cTDP_high of 400 W. Guess what AMD's guides have always been recommending WRT the choice between cTDP_low/default TDP/cTDP_high of all the EPYC generations when you use them in HPC and similar scenarios.

Personally I am interested to see which cTDP_high the 64 core Turin will have. IMO, 500 W could be reasonable given the considerably wider core and the core throughput to be expected compared to Genoa. Edit: Running the IMCs at a higher clock rate will certainly demand a decent amount of additional power too. Sometimes it pays off to configure a lower RAM clock rate than the maximum supported.
 
Reactions: lightmanek

Josh128

Senior member
Oct 14, 2022
273
394
96
Higher power consumption compared to what? (Rhetorical question. Always look at work gotten done for the power.)
Zen 4. It's already established that low clock / TDP Zen 5 is scarcely better than Zen 4 at all in perf/W (7700 vs 9700X @65W), and can actually be worse at low enough TDP. Since these server chips' core frequencies are way down compared to desktop, way lower than what desktop can do at 65W, they are very possibly dropping near that "worse than" range.
 

CouncilorIrissa

Senior member
Jul 28, 2023
518
1,982
96

StefanR5R

Elite Member
Dec 10, 2016
5,885
8,746
136
Zen 4. It's already established that low clock / TDP Zen 5 is scarcely better than Zen 4 at all in perf/W (7700 vs 9700X @65W), and can actually be worse at low enough TDP.
Oh, you were referring to the sketchy scaling of GNR (and STX I think) to low per-core power levels. Yes, it will be interesting to see how Turin will manage. At the top end, there is a core count increase, the IOD will support faster RAM, the top end has more fabric links active... and at the same time the cores are that much wider. Quite some places to spend additional socket power at. Potentially one out of several debatable points why Turin gets a newer CCD stepping.
 

Josh128

Senior member
Oct 14, 2022
273
394
96
If you follow the discussion, the first comment was specifically about server workloads.
Actually, my comment was about general performance increases from Genoa to Turin, not performance in server workloads. Yes, indeed these are server chips, but there are plenty of scientific applications that might be run on these chips that are not server specific.
 
Jul 27, 2020
19,613
13,459
146



Wonder if AEMP is EXPO 2.0 or just some ASUS marketing term.

@Det0x, tell us more
 
Reactions: lightmanek

eek2121

Diamond Member
Aug 2, 2005
3,098
4,386
136
Performance? Outside of AVX-512 acceleration, core for core, no way.
Boost clocks for EPYC alone increase by more than 11%. 3.7 -> 4.5 GHz is a huge jump. I am expecting Threadripper to have a peak of at least 5.5 GHz and will be sad if that is not the case.

Just because client is boring right now doesn’t mean all segments are.

Side note: Postgres added AVX-512 support to some areas and managed more than a 6x performance increase in some cases. SQL databases can benefit significantly from AVX-512.
 