Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Page 689

bakyt115

Member
Nov 21, 2016
92
158
106
Maybe if someone with a 370 in hand can disable SMT and the E-cores and run some tests, then we can compare the results with a 4-core Zen 4.

It would also be interesting to see a core-to-core comparison with SMT on and off.
 

MangoX

Senior member
Feb 13, 2001
591
108
106
I'm not sure it was mentioned, but why did AMD split stx into 2 ccx's? I think that totally killed the core to core latency. I mean it can't be a technical problem. Because after all, the 8500G has 2x z4 and 4x z4c sharing the same ccx without the latency problem. Why AMD? Why?
 
Reactions: igor_kavinski

bakyt115

Member
Nov 21, 2016
92
158
106
I'm not sure it was mentioned, but why did AMD split stx into 2 ccx's? I think that totally killed the core to core latency. I mean it can't be a technical problem. Because after all, the 8500G has 2x z4 and 4x z4c sharing the same ccx without the latency problem. Why AMD? Why?
to save area
 

MS_AT

Senior member
Jul 15, 2024
207
497
96
The encoding tests are also all over the place (looking at AV1), with there being a large gap between Phoenix and Strix and then the difference gets swapped in the same test but at a different resolution? Feels like something doesn't quite work in the scheduling or in clock management. Could also be SVT having an awful threading model perhaps. AV1 encoding via Handbrake seemed to suck in the ComputerBase review too, perhaps the encoder sucks on big.little (although in massively threaded up, it should effectively stop being big.little except for the caches). Maybe the encoder is prone to some problem with threading or dispatching SIMD on Zen 5 / Strix for some reason.
That is another thing that would be nice if AT investigated. I mean, it would be nice if they could comment on both the core performance in isolation (pin the workload to a single core, double-check it stays there and measure whatever needs measuring) and MT/SoC performance, where scheduling issues could be confirmed and pointed out. Now we get a visual representation that something was not exactly OK [but was that the test procedure, the CPU itself or Windows?] but we are left guessing what. Of course I understand those things take time, but they could be highlighted in the review and it could be mentioned that they will be investigated in another piece [bonus points for keeping the promise and doing the piece].
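For what "pin the workload to a single core and double-check it stays there" could look like in practice, here is a minimal sketch. It assumes Linux, where Python exposes `os.sched_setaffinity` and `os.sched_getaffinity`; the helper name `run_pinned` and the toy workload are hypothetical, and a real test would run an actual benchmark and also log clocks.

```python
import os
import time

def run_pinned(workload, core):
    """Pin this process to one logical CPU, run the workload, then restore affinity.

    Returns (result, elapsed_seconds, stayed), where `stayed` reports whether the
    affinity mask still contained only `core` after the workload finished.
    """
    if not hasattr(os, "sched_setaffinity"):
        # The affinity calls used here are only available on some platforms (e.g. Linux).
        raise OSError("CPU affinity API not available on this platform")
    original = os.sched_getaffinity(0)       # remember the current mask
    os.sched_setaffinity(0, {core})          # restrict to a single logical CPU
    try:
        start = time.perf_counter()
        result = workload()
        elapsed = time.perf_counter() - start
        stayed = os.sched_getaffinity(0) == {core}   # double-check the pin held
    finally:
        os.sched_setaffinity(0, original)    # always restore the original mask
    return result, elapsed, stayed

if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    core = min(os.sched_getaffinity(0))      # pick any CPU we are allowed to use
    _, seconds, stayed = run_pinned(lambda: sum(range(1_000_000)), core)
    print(f"core {core}: {seconds:.4f} s, affinity held: {stayed}")
```

Pinning only controls where the thread runs; it says nothing about whether the core was throttled, so the clock would still need to be recorded separately.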
 
Reactions: yuri69

RTX2080

Senior member
Jul 2, 2018
324
511
136
which of zen5 products will be on 3nm?
Maybe Strix halo too?

AMD is waiting for N3(E) for Strix Halo, and it would be expensive for sure.

Certainly PBO or a manual OC though. If it achieved it at 230W, it's so odd that AMD chose to require PBO to reach that perf instead of letting the proc handle it itself. Only thing I can think of is it's a reliability/culpability play, and the voltage and current requirements to hit this ~45K are just not something they are comfortable having to warranty.

Two reasons, I guess: 1. there's no competition currently; 2. AMD is also afraid of silicon degradation, as happened with Raptor Lake.
 

Josh128

Senior member
Oct 14, 2022
290
403
96
Interesting post by David Huang:


"The consequence of Zen 5's initial release to most media outlets for testing on ultra-thin notebooks is that you can't even find a few Cinebench tests where a single core ran at full frequency without being throttled..."

No wonder AT couldn't measure any ST IPC increase in SPECint while David measured around a 10% jump vs. the Zen 4 mobile part.

Another comment (spicy language):


edit;
One more


"I suggest you wait until I finish running SPEC and GB under Linux in a few days before drawing any conclusions. In addition, if you have read my previous analysis of performance bottlenecks, you will know that even for a 6-wide 4ALU x86 processor, the performance bottleneck is mostly not in the decoding width or the number of ALUs."

Problem is, from leaks we've seen in R23 ST, even desktop silicon is also not holding its full single core boost freqs.
 

bakyt115

Member
Nov 21, 2016
92
158
106
There was a paper about the power consumption of x86 (in comparison with ARM), and it stated that decoding consumes nearly 20% of the power on old, tiny Atom cores (I may be wrong on the number; I can't find the article itself).

Maybe the second decoder is now an issue for the Zen 5 core holding max frequency in 1T workloads.
 

MS_AT

Senior member
Jul 15, 2024
207
497
96
Problem is, from leaks we've seen in R23 ST, even desktop silicon is also not holding its full single core boost freqs.
Have you seen the clocks next to the score? Or do you base it on the fact that score is less than expected? These two things don't need to go hand in hand [although it would be better if boost was not reached for the leaked scores, then there would be a chance something can be tuned in BIOS to boost the clock to advertised values]
 

Philste

Senior member
Oct 13, 2023
248
442
96
The performance at the same wattage is 17-34% higher. Notebookcheck compared against the Z1 Extreme and 8945HS:
But this also shows that ZEN5 needs more juice compared to ZEN4. The advantage at low wattages is pretty meh for 50% more threads. It only starts getting decent at power levels above ZEN4's sweet spot. I'm 80% sure we will see that every ZEN5 desktop SKU is slower than its predecessor at low wattages. Similar to Igor's leaks, where the ZEN5 ES was beaten by the 7950X pretty much everywhere below 100W. ZEN5 needs juice to run properly.
 

Josh128

Senior member
Oct 14, 2022
290
403
96
Have you seen the clocks next to the score? Or do you base it on the fact that score is less than expected? These two things don't need to go hand in hand [although it would be better if boost was not reached for the leaked scores, then there would be a chance something can be tuned in BIOS to boost the clock to advertised values]
It's because the scores are not equaling +17% vs known Zen 4 SKUs at known clocks. We are seeing ~9%-14% for the 9600X and 9700X vs their corresponding Zen 4 SKUs, depending on whether PBO is on or off.
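The reasoning here can be put as rough arithmetic. This is a sketch under a simplifying assumption that is not from any measurement: that the score scales as per-clock performance times clock, and that the +17% is the expected per-clock gain. The function name `implied_clock_ratio` is hypothetical.

```python
def implied_clock_ratio(score_uplift, per_clock_uplift):
    """Clock ratio (new / old) needed to explain an observed score uplift.

    Assumes score = per-clock performance * clock, which ignores memory
    effects and anything else that does not scale with frequency.
    """
    return (1.0 + score_uplift) / (1.0 + per_clock_uplift)

# Inputs taken from the post: +17% expected, ~9% to ~14% observed.
for observed in (0.09, 0.14):
    ratio = implied_clock_ratio(observed, 0.17)
    print(f"observed +{observed:.0%}: implied clock = {ratio:.3f}x the Zen 4 clock")
```

The same shortfall is explained just as well by a smaller per-clock gain at full boost, which is why a logged clock next to each score would settle it.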
 

LightningZ71

Golden Member
Mar 10, 2017
1,783
2,137
136
Thinking about the Zen5c CCX on Strix Point, that's the least amount of L3 cache per thread since Lucienne. Yes, the L2 is twice as large and still exclusive, so the apparent L2/L3 cache is 33% larger, but Zen5c, even restricted to ~3.6 GHz, is much higher performance than Zen 2 at up to 4-4.5 GHz. This becomes an even bigger issue with AVX-512 code, as that can quickly become memory throughput bound.
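The 33% figure can be checked with a quick calculation. The cache sizes below are assumptions taken from public specs rather than from this post: Lucienne (Zen 2) with 512 KB L2 per core and 4 MB L3 per 4-core CCX, and the Strix Point Zen5c CCX with 1 MB L2 per core and 8 MB L3 shared by 8 cores. The helper name is hypothetical.

```python
def cache_per_core_mb(l2_per_core_mb, l3_shared_mb, cores):
    """Effective L2 + L3 capacity per core, treating L2 and L3 as exclusive."""
    return l2_per_core_mb + l3_shared_mb / cores

# Assumed cache configurations (see the note above).
lucienne = cache_per_core_mb(l2_per_core_mb=0.5, l3_shared_mb=4, cores=4)
strix_zen5c = cache_per_core_mb(l2_per_core_mb=1.0, l3_shared_mb=8, cores=8)

increase = strix_zen5c / lucienne - 1
print(f"Lucienne: {lucienne} MB/core, Zen5c CCX: {strix_zen5c} MB/core, "
      f"increase: {increase:.0%}")
```

Both designs run two threads per core, so the per-thread ratio is the same as the per-core one; the L3 share per core is unchanged at 1 MB, and only the L2 accounts for the difference.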

All of that is to say that the Zen5c CCX is very memory bandwidth starved and will be hitting the memory controller and snooping the other CCX a lot. This seems like a very hamstrung design unless the memory controller has gobs of bandwidth available at low latency. A MALL cache would have made this make sense. The current design just exacerbates existing problems with limited bandwidth.

This leads me to believe that AMD changed course hard after they had frozen critical aspects of Strix Point's design, like the rumored NPU expansion that removed the MALL. They wanted to hit a marketing goal of having a processor that looked like Alder Lake/Raptor Lake, but things went sideways along the way and they didn't course correct well.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,035
136
So ideally, there should be 4 decoders, to handle the case of two branching pathways and the instruction to be executed after the branch is entered.

What matters is basic blocks. The two-ahead predictor allows for the frontend to make forward progress on the BB from the next taken branch, then the BB from the taken branch after that.

I'm not sure how you got "it needs four decoders" from that.
 
Reactions: Mopetar and Bigos