Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


SarahKerrigan

Senior member
Oct 12, 2014
735
2,035
136
The thing with these desktop designs is that they are extremely bandwidth starved. If Zen 5 had the kind of membw that M4 has, its IPC would really fly.

On single-thread? No it wouldn't. SPECint is barely sensitive to DRAM bandwidth at all. Essentially no single-threaded load that isn't a vector microbenchmark is going to saturate a modern CPU's memory bandwidth.

This is crap you just hallucinated.
 
Jul 27, 2020
19,613
13,481
146
I am not sure why it is like that, perhaps for historical reasons, but people generally treat SIMD performance as FP performance. There are INT SIMD instructions too, and they are executed by the "FP" part of the core, so INT code could also use these 512b-wide registers on Granite Ridge. The thing is that legacy software, or most of the software we have today, is not written with SIMD in mind. For a long time the dominant programming model has been object-oriented programming, and it doesn't lend itself well to SIMD (IMO at least, but this isn't the place for that discussion). So what matters more, because you don't have to rewrite or recompile code to use it, is the scalar part of the core and the front-end.
Yes! That's the argument I needed to support my idea that rewritten software would benefit from Zen 5 more!
 
Jul 27, 2020
19,613
13,481
146
On single-thread? No it wouldn't. SPECint is barely sensitive to DRAM bandwidth at all. Essentially no single-threaded load that isn't a vector microbenchmark is going to saturate a modern CPU's memory bandwidth.

This is crap you just hallucinated.
I didn't say ST anywhere in my post. Of course I was referring to MT. The combined IPC of all the Zen 5 cores with access to same membw as M4 would be a lot higher.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,035
136
I didn't say ST anywhere in my post. Of course I was referring to MT. The combined IPC of all the Zen 5 cores with access to same membw as M4 would be a lot higher.

"16 cores DESTROY four heavy and four light Apple cores!"

A truly stirring defense of x86.

(Nobody in the history of ever has measured "IPC across all cores.")
 

DavidC1

Senior member
Dec 29, 2023
782
1,241
96
So what matters more, because you don't have to rewrite or recompile code to use it, is the scalar part of the core and the front-end.
These are called general purpose chips for a reason.

uarch gains have always been scalar code. Improving scalar code has the side benefit of improving FP, and AIeee!! as well.
 
Reactions: podspi

MS_AT

Senior member
Jul 15, 2024
210
507
96
Yes! That's the argument I needed to support my idea that rewritten software would benefit from Zen 5 more!
It wasn't meant to be this way. It's just as true for Intel. The x64 crowd for some reason is going towards SIMD, while ARM (including Apple) in comparison has a rather weak SIMD execution side. Might be because x64 guys want to extract as much perf as they can per instruction, while ARM guys care less because ARM decoders are easier to implement, so they spam decoders and scalar execution units. But as Apple shows, for general purpose software this seems to be a better strategy. [I mean no disrespect to Apple engineers when I say they are spamming something, it's just an observation that they have considerably more resources on that side]. To boot, the x64 side is penalized with only 16 GPRs, so it will spill to cache more often vs 32 GPRs on ARM, and since they are clocking higher each spill costs relatively more [I mean access latency is usually > 4 cycles on the x64 side and I think it's less on the ARM side, but I haven't checked the docs, I might be wrong]
 

DavidC1

Senior member
Dec 29, 2023
782
1,241
96
It wasn't meant to be this way. It's just as true for Intel. The x64 crowd for some reason is going towards SIMD, while ARM (including Apple) in comparison has a rather weak SIMD execution side. Might be because x64 guys want to extract as much perf as they can per instruction, while ARM guys care less because ARM decoders are easier to implement, so they spam decoders and scalar execution units.
That's why the saner side for x86 is the Intel E core team ignoring AVX-512 and straight up doubling the number of vector units like ARM has been doing.
 

DavidC1

Senior member
Dec 29, 2023
782
1,241
96
By the time Apple gives people the same number of cores, AMD's design and core counts will have scaled to new heights.
Yes, we must laud AMD with all the praise for a 5.7GHz core consuming 30W+ at lower performance level than a 4.4GHz one using less than 8W.

A monster truck with a V16 engine and fuel consumption measured in gallons per mile performs the same in off-road conditions as a V4 subcompact hybrid. It's simply awesome!
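For anyone who wants the arithmetic behind that comparison spelled out, here is a minimal sketch. The clock and power numbers are taken straight from the post above and are the poster's claims, not measurements. The function names and the equal-performance assumption are mine, purely for illustration; since the post says the 4.4GHz core is actually faster and the power figures are "30W+" and "less than 8W", both results are floors rather than exact ratios.

```python
# Figures as stated in the post; treat them as the poster's claims.
HIGH_CLOCK_GHZ, HIGH_CLOCK_POWER_W = 5.7, 30.0  # "5.7GHz core consuming 30W+"
LOW_CLOCK_GHZ, LOW_CLOCK_POWER_W = 4.4, 8.0     # "4.4GHz one using less than 8W"


def per_clock_advantage(perf_ratio, clock_a_ghz, clock_b_ghz):
    """Per-clock performance of core A relative to core B.

    perf_ratio is A's performance divided by B's performance.
    """
    return perf_ratio * clock_b_ghz / clock_a_ghz


def perf_per_watt_advantage(perf_ratio, power_a_w, power_b_w):
    """Performance per watt of core A relative to core B."""
    return perf_ratio * power_b_w / power_a_w


# Assumption (hypothetical): both cores deliver equal performance.
EQUAL_PERF = 1.0

ipc_floor = per_clock_advantage(EQUAL_PERF, LOW_CLOCK_GHZ, HIGH_CLOCK_GHZ)
ppw_floor = perf_per_watt_advantage(EQUAL_PERF, LOW_CLOCK_POWER_W, HIGH_CLOCK_POWER_W)

print(f"per-clock advantage of the 4.4GHz core: at least {ipc_floor:.2f}x")
print(f"perf/W advantage of the 4.4GHz core:    at least {ppw_floor:.2f}x")
```

Plugging in a measured performance ratio instead of `EQUAL_PERF` scales both numbers linearly.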
 

naukkis

Senior member
Jun 5, 2002
878
757
136
It wasn't meant to be this way. It's just as true for Intel. The x64 crowd for some reason is going towards SIMD, while ARM (including Apple) in comparison has a rather weak SIMD execution side. Might be because x64 guys want to extract as much perf as they can per instruction, while ARM guys care less because ARM decoders are easier to implement, so they spam decoders and scalar execution units. But as Apple shows, for general purpose software this seems to be a better strategy. [I mean no disrespect to Apple engineers when I say they are spamming something, it's just an observation that they have considerably more resources on that side]. To boot, the x64 side is penalized with only 16 GPRs, so it will spill to cache more often vs 32 GPRs on ARM, and since they are clocking higher each spill costs relatively more [I mean access latency is usually > 4 cycles on the x64 side and I think it's less on the ARM side, but I haven't checked the docs, I might be wrong]

x86 is moving towards wider SIMD units. The ARM side has more 128-bit execution pipes and load/store ports. Apple, and Cortex-X925 with its six 128-bit NEON pipes, are anything but weak in SIMD execution. x86 targets high clocks, and many SIMD units and load/store ports are hard to make work together, but Intel E-cores take the same ARM approach (more 128-bit execution ability), and the performance seems to be there for x86 too.
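To put the "more narrow pipes vs fewer wide pipes" point in numbers, here is a rough sketch of peak vector width per cycle. The pipe counts and widths are the figures quoted in this thread (six 128-bit NEON pipes, and the "4x 128 bit vs 4x 512 bit" comparison), not verified specifications, and the helper names are made up for illustration. It only counts raw register width per cycle; clocks, load/store ports, and which operations each pipe can actually execute are ignored.

```python
# Pipe configurations as quoted in the thread: (number of pipes, width in bits).
# These are the posters' figures, not verified specifications.
CONFIGS = {
    "4 x 128-bit": (4, 128),
    "6 x 128-bit": (6, 128),
    "4 x 512-bit": (4, 512),
}

INT32_BITS = 32


def peak_bits_per_cycle(pipes, width_bits):
    """Raw vector width available per cycle if every pipe is busy."""
    return pipes * width_bits


def int32_lanes_per_cycle(pipes, width_bits):
    """How many 32-bit integer lanes that width corresponds to."""
    return peak_bits_per_cycle(pipes, width_bits) // INT32_BITS


for name, (pipes, width) in CONFIGS.items():
    bits = peak_bits_per_cycle(pipes, width)
    lanes = int32_lanes_per_cycle(pipes, width)
    print(f"{name}: {bits} bits/cycle, {lanes} int32 lanes/cycle")
```

The same width counts for integer lanes as for FP lanes, which is the point made earlier in the thread about INT SIMD running on the "FP" side of the core.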
 

DavidC1

Senior member
Dec 29, 2023
782
1,241
96
By the time Apple gives people the same number of cores, AMD's design and core counts will have scaled to new heights.
The reason that we don't need to compare to Bulldozer is because, compared to the best cores, the x86 cores kinda do feel like Bulldozer. The reason people can't make the connection is because x86 cores are insulated by the ISA bubble.

Someone said Intel had a 50% 1T lead over Bulldozer right?

How much faster is M4 over Zen 5 per clock again? Oh right, 59% in Int.
 
Jul 27, 2020
19,613
13,481
146
Yes, we must laud AMD with all the praise for a 5.7GHz core consuming 30W+ at lower performance level than a 4.4GHz one using less than 8W.
That monster "electricity eating" core opens way more computing possibilities for the world than Apple's closed everything ecosystem. Apple's core might be general purpose but it's not "general user". Restricted to the elite.

I would have no issue declaring Apple the winner if people could straight away ditch x86 and move to ARM, without taking a hit to their pockets and bank accounts.
 

DavidC1

Senior member
Dec 29, 2023
782
1,241
96
I would have no issue declaring Apple the winner if people could straight away ditch x86 and move to ARM, without taking a hit to their pockets and bank accounts.
Hence...
because x86 cores are insulated by the ISA bubble.
Might I say the government consistently giving a win to Intel vs outsiders has something to do with it?

What if, let's say, Nvidia had been allowed an x86 license with Denver?
 

CouncilorIrissa

Senior member
Jul 28, 2023
521
2,002
96
That monster "electricity eating" core opens way more computing possibilities for the world than Apple's closed everything ecosystem. Apple's core might be general purpose but it's not "general user". Restricted to the elite.

I would have no issue declaring Apple the winner if people could straight away ditch x86 and move to ARM, without taking a hit to their pockets and bank accounts.
Aight, I'm done, that's it for me for today, have fun, folks.
 

DavidC1

Senior member
Dec 29, 2023
782
1,241
96
Try that again in FP/SIMD and M4 dies a horrible death 🤣

4x 128 bit vs 4x 512 bit isn't even a competition.

SME improves some specific things for them in SIMD - but not everything.
You do know FP is much easier to boost right?

If it was so important why is Sapphire Rapids regarded as almost a worthless chip?
 