Discussion Zen 5 Architecture & Technical discussion


DisEnchantment

Golden Member
Mar 3, 2017
1,749
6,614
136
Agreed. But it is interesting how he mentions that essentially the problem is the shift from the 6 wide decode to 4 wide, and also the former 4 ALUs. He isn't asked about FP or INT or anything, just "why's the IPC only that?" and the answer jumps straight to decode width and ALU count, which IIRC only went from 4 to 6 in INT. So my little ear got attentive there. Seems like even he knows what the crux of the complaints is. And he claims that software will grow into it enough over time (which apparently doesn't mean only Z5 but possibly 6/7 as well).

Problem is influencers gave "celebrity status" to core uarch guys as if the rest don't make a difference.
Mike Clark, the uarch chief architect, leader of the core roadmap, will absolutely say the Zen 5 core is great or in simulation it is great etc.
Nobody bothered to ask Sam Naffziger, the fabric and chiplet lead about what is up with Infinity fabric, or the chiplet tech in Zen 5
Nobody was seeking the SoC guys, or the product guys why their chip performed the way it did.
Mahesh Subramony was a lead SoC guy for Strix but nobody asked him anything, everybody asked only Mike Clark.
The product guys assemble all the IPs together to make the final purchasable product, so they definitely are responsible for the final performance of the product, not just the uarch folks
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,505
2,080
136
Faster network I/O could mean reduced latency in multiplayer games, amirite?
The effect is completely negligible in that use. It matters for high-throughput web servers and such, as it reduces the overhead of handling a packet. I think it requires special support from both hardware and the OS, and is unlikely to be supported outside specialist server-grade stuff and Linux.
 
Jul 27, 2020
20,921
14,496
146
The effect is completely negligible in that use. It matters for high-throughput web servers and such, as it reduces the overhead of handling a packet. I think it requires special support from both hardware and the OS, and is unlikely to be supported outside specialist server-grade stuff and Linux.
I don't know. Seems like something the Killer NICs could leverage, if Intel still owns them and they are looking for another marketing bullet point to help sell their NICs to mobo makers. Would be weird for them to support an AMD feature but they just might do it, considering they badly need sales at the moment.
 

Mahboi

Golden Member
Apr 4, 2024
1,035
1,900
96
The effect is completely negligible in that use. It matters for high-throughput web servers and such, as it reduces the overhead of handling a packet. I think it requires special support from both hardware and the OS, and is unlikely to be supported outside specialist server-grade stuff and Linux.
Yeah games typically acquire through TCP then stream UDP and there's a regular check to see if you're still on.
UDP is basically zoom zoom: get data, handle data, send response. There is no computation overhead. I'm willing to bet a huge number of even very big corpos' game servers run on 2015 Skylakes and are just dandy with it.
 
Reactions: lightmanek
Jul 27, 2020
20,921
14,496
146
There is no computation overhead
Not computation but latency can be reduced further since packet data won't need to be copied to RAM first. It will remain in cache always. It could increase the number of players serviced by a single server.
 

lightmanek

Senior member
Feb 19, 2017
476
1,092
136
Not computation but latency can be reduced further since packet data won't need to be copied to RAM first. It will remain in cache always. It could increase the number of players serviced by a single server.
I don't think cutting 100-200 nanoseconds from a typical game server latency of 1-150 milliseconds will make you hit that headshot more often ...
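To put the numbers above side by side, here is a minimal sketch of the arithmetic. The function name and the sample round-trip times are only illustrative; the 100-200 ns and 1-150 ms figures are the ones quoted in this post.

```python
# Rough arithmetic: what share of a game's round-trip latency would a
# saving of a few hundred nanoseconds represent?

NS_PER_MS = 1_000_000


def saving_fraction(saved_ns: float, total_ms: float) -> float:
    """Fraction of the total latency removed by the saving."""
    return saved_ns / (total_ms * NS_PER_MS)


# Illustrative round-trip times within the 1-150 ms range mentioned above.
for total_ms in (1, 30, 150):
    share = saving_fraction(200, total_ms)
    print(f"{total_ms:>3} ms round trip: a 200 ns saving is {share:.6%} of it")
```

Even in the most favourable case here (the larger 200 ns saving against a 1 ms round trip), the saving is 0.02% of the total.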
 

Mahboi

Golden Member
Apr 4, 2024
1,035
1,900
96
Not when EU players still go to NA servers for fun and have a latency leap from 30ms to 120ms and find that "bad, but bearable".
Yet again, gaming has few actual hard requirements anymore.
 
Reactions: lightmanek
Jul 27, 2020
20,921
14,496
146
Not when EU players still go to NA servers for fun and have a latency leap from 30ms to 120ms and find that "bad, but bearable".
Because there aren't enough EU servers or players???

Or is it because they love hearing and chuckling at the immature expletives uttered by NA players?
 
Reactions: lightmanek
Jul 27, 2020
20,921
14,496
146
Can someone here come up with an easy to run Windows benchmark (even just console based) and release it on Github so people here can run it on their 7950X and 9950X PCs and their results analyzed to figure out what's holding back Zen 5? Something like a staggered workload that starts with low core count and low data requirements and then progresses to higher and higher core counts and becomes increasingly memory bound?
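A very rough sketch of what such a staggered workload could look like, assuming a pointer-chasing loop as the memory-bound kernel. Everything here (function names, stage sizes, step counts) is made up for illustration, and Python's interpreter overhead will swamp most of the microarchitectural signal, so a usable version would need to be written in C, C++ or Rust.

```python
import random
import time
from array import array
from concurrent.futures import ProcessPoolExecutor


def make_cycle(n: int, seed: int = 0) -> array:
    """Build a random single-cycle permutation: next_idx[i] is the slot visited after i."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    next_idx = array("q", [0]) * n
    for a, b in zip(order, order[1:] + order[:1]):
        next_idx[a] = b
    return next_idx


def chase(n: int, steps: int) -> float:
    """Walk the cycle for `steps` dependent loads; return nanoseconds per step."""
    next_idx = make_cycle(n)
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = next_idx[i]
    return (time.perf_counter() - start) / steps * 1e9


def run_stage(workers: int, n: int, steps: int) -> float:
    """Run `workers` independent chases in parallel; return the mean ns per step."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(chase, [n] * workers, [steps] * workers))
    return sum(results) / len(results)


if __name__ == "__main__":
    # Hypothetical stages: both the worker count and the per-worker
    # working set grow from one stage to the next.
    for workers, n in [(1, 1 << 12), (2, 1 << 16), (4, 1 << 20), (8, 1 << 22)]:
        ns = run_stage(workers, n, steps=1_000_000)
        print(f"{workers} workers, {n * 8 // 1024} KiB each: {ns:.1f} ns/step")
```

The random single-cycle permutation makes every load depend on the previous one, so once the working set outgrows a cache level the time per step should rise with it. The `__main__` guard is needed for process pools on Windows.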
 

MS_AT

Senior member
Jul 15, 2024
365
798
96
Can someone here come up with an easy to run Windows benchmark (even just console based) and release it on Github so people here can run it on their 7950X and 9950X PCs and their results analyzed to figure out what's holding back Zen 5? Something like a staggered workload that starts with low core count and low data requirements and then progresses to higher and higher core counts and becomes increasingly memory bound?
I think it would be more meaningful to hook up a profiler and profile the software you are concerned with than to write another synthetic program that will try its best to pretend it's an average workload of type A, or B, or C. So something like what C&C is doing, but this takes time, and you also need time to get to know the documentation so you know what the performance counters mean. And since there isn't one available for Zen 5 yet, you need to read the older ones and hope the counters are still doing the same thing. But it's not guaranteed...
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,933
96
I wonder if all the smart architecture guys have been assigned to the NPU team? Even Jim Keller is at Tenstorrent.
 

del42sa

Member
May 28, 2013
119
159
116
Problem is influencers gave "celebrity status" to core uarch guys as if the rest don't make a difference.
Mike Clark, the uarch chief architect, leader of the core roadmap, will absolutely say the Zen 5 core is great or in simulation it is great etc.
Nobody bothered to ask Sam Naffziger, the fabric and chiplet lead about what is up with Infinity fabric, or the chiplet tech in Zen 5
Nobody was seeking the SoC guys, or the product guys why their chip performed the way it did.
Mahesh Subramony was a lead SoC guy for Strix but nobody asked him anything, everybody asked only Mike Clark.
The product guys assemble all the IPs together to make the final purchasable product, so they definitely are responsible for the final performance of the product, not just the uarch folks
Nobody asked MLID for his opinion on the Zen 5 failure, yet he gave it, and perhaps it's a good lesson for AMD's CPU department:

So it's the most frontend-limited design around

I know precisely where the problem is: it's broken
 
Reactions: igor_kavinski

CouncilorIrissa

Senior member
Jul 28, 2023
575
2,256
96
So it's the most frontend-limited design around
Yeah, the growth in backend resources outpaced the front-end. Clam pretty much said this in the GNR review:
Widening the core may have been premature too. Much of the potential throughput offered by Zen 5’s wider pipeline is lost to latency, either with backend memory accesses or frontend delays.
 

MS_AT

Senior member
Jul 15, 2024
365
798
96
So it's the most frontend-limited design around
I will let myself quote the original article:
libx264 is backend bound (...) Zen 5 loses more throughput, but that’s because it has a wider pipeline and thus more potential throughput to lose. Zen 5 is still the leader here, but that’s because of its increased reordering capacity and better frontend rather than core width.
Kernel compilation is a very frontend bound workload (...) Again Zen 5 loses the most potential throughput to frontend reasons, despite having arguably the most advanced frontend of all CPUs here. Feeding a 8-wide core is hard when branches are everywhere. Despite not looking so good in this graph, it’s important to remember that Zen 5 outperforms every other core architecture here, even when limited to four cores at 3 GHz
Still, the fundamental limiters for CPU performance have remained the same over the past decade and more. (...) I’ll have fun watching engineers try their best to tackle those challenges. It’s like watching a new player learn to play Dark Souls. Tragedy is inevitable. With that in mind, I wish them the best.
Memory latency and spaghetti code will claim your performance. Resistance is futile.
In other words, try as hard as you can, software engineers will manage to nullify whatever performance improvements you have come up with.
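The quoted point that a wider pipeline has "more potential throughput to lose" can be shown with simple slot accounting. This is a minimal sketch with hypothetical widths and IPC figures, not measurements of any real core:

```python
def lost_fraction(width: int, ipc: float) -> float:
    """Share of pipeline slots that retired nothing, given the sustained IPC."""
    return 1 - ipc / width


# Hypothetical numbers for illustration only, NOT measured values.
narrow = lost_fraction(width=6, ipc=3.0)
wide = lost_fraction(width=8, ipc=3.5)

print(f"6-wide core at IPC 3.0 leaves {narrow:.1%} of its slots unused")
print(f"8-wide core at IPC 3.5 leaves {wide:.1%} of its slots unused")
```

With these made-up figures the wider core is faster in absolute terms (IPC 3.5 vs 3.0) and still shows the larger share of lost slots (about 56% vs 50%), which is how a core can look worst on a "lost throughput" graph while leading on performance.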
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,505
2,080
136
Zen5 is exploring entirely new thresholds of being front-end bound.

"All" that Zen 6 needs to do to be a much better core is to have a better frontend. There are clearly a lot of spare execution resources there to make use of. The scare quotes are there because that is in fact a very tall order.
 