Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


GTracing

Member
Aug 6, 2021
78
191
76
The performance cores are in a separate cache domain from the efficiency cores. It's pretty much the same approach as the Meteor Lake LP E-cores: they are only useful for saving power when the performance cores are completely shut off.
That's utterly ridiculous and completely untrue.

AMD's 12- and 16-core desktop CPUs also have multiple CCXs, and the cores on different CCXs aren't useless.
 

ryanjagtap

Member
Sep 25, 2021
132
153
96
The performance cores are in a separate cache domain from the efficiency cores. It's pretty much the same approach as the Meteor Lake LP E-cores: they are only useful for saving power when the performance cores are completely shut off.
I think the inverse is true. The Zen 5c CCX is the most active one, and the 4-core Zen 5 CCX turns on for bursty workloads. I think there is a video of Wendell from Level1Techs interviewing an AMD fellow who said this.


The video has no timestamps. You can transcribe the video if you don't want to watch it.
 

naukkis

Senior member
Jun 5, 2002
868
731
136
That's utterly ridiculous and completely untrue.

AMD's 12- and 16-core desktop CPUs also have multiple CCXs, and the cores on different CCXs aren't useless.

Dual-CCX CPUs aren't as useful as single-cache-domain CPUs with a similar core count. But that's not the same situation as with Strix Point: on desktop, both CCXs have fast cores and an equal amount of cache. In Strix Point there's a slow CCX and a fast CCX with more cache but only 4 cores. AMD could make a desktop CPU with one normal and one dense CCD to create a similar situation, but they probably wouldn't even consider shipping such a dog and shaming their desktop performance reputation. But they did it for mobile, after years of pretty flawless execution. I have to wonder if they are losing their mojo.
 

GTracing

Member
Aug 6, 2021
78
191
76
Dual-CCX CPUs aren't as useful as single-cache-domain CPUs with a similar core count. But that's not the same situation as with Strix Point: on desktop, both CCXs have fast cores and an equal amount of cache. In Strix Point there's a slow CCX and a fast CCX with more cache but only 4 cores. AMD could make a desktop CPU with one normal and one dense CCD to create a similar situation, but they probably wouldn't even consider shipping such a dog and shaming their desktop performance reputation. But they did it for mobile, after years of pretty flawless execution. I have to wonder if they are losing their mojo.

The cross-CCX latency is not nearly as big of an issue as some people make it out to be.

The lower clocks on the dense core cluster barely matter because the chip isn't able to run all cores at high clock speeds, and single threaded workloads will run on the fast cores anyways.
 

HurleyBird

Platinum Member
Apr 22, 2003
2,759
1,455
136
The fact that a single CCX can only support up to 8 cores seems to be a problem for AMD. More cores would require another CCX and a separate L3 block. Perhaps they should work on larger CCXs, or even rework their core cluster hierarchy?

Turin dense is a 16-core CCX already. There's nothing fundamental that stopped AMD from putting everything on one CCX in Strix.
 

ryanjagtap

Member
Sep 25, 2021
132
153
96
Turin dense is a 16-core CCX already. There's nothing fundamental that stopped AMD from putting everything on one CCX in Strix.
Maybe they had to make two CCXs instead of a unified CCX in order to power gate the Zen 5 cores? I don't think they could turn off the Zen 5 cores when idling without putting them on a different CCX. (Just speculation.)
 

naukkis

Senior member
Jun 5, 2002
868
731
136
The cross-CCX latency is not nearly as big of an issue as some people make it out to be.

The lower clocks on the dense core cluster barely matter because the chip isn't able to run all cores at high clock speeds, and single threaded workloads will run on the fast cores anyways.

Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I really, really have to wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least one 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.
 

GTracing

Member
Aug 6, 2021
78
191
76
Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I really, really have to wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least one 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.

Can you point to benchmarks from any reputable reviewer to show that this is an actual problem?
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,033
136
So Zen5 = Bulldozer 2.0?

Fascinating.

Not in the sense of being a completely uncompetitive product. It looks like AMD made some odd choices, tried some new tricks, and performance is kind of underwhelming, but it's not meaningfully worse than Zen4 on any axis and is better on some others.

Folks making the Dozer comparison need to remember that Intel was doing 50%+ more iso-clock ST int against it.
 

naukkis

Senior member
Jun 5, 2002
868
731
136
Can you point to benchmarks from any reputable reviewer to show that this is an actual problem?

So with their CCX arrangement, an MT job has to choose between 8 cores with 8MB of L3 or 4 cores with 16MB of L3. First, that is a big problem for the scheduler: where should a job be scheduled for best performance? Why did they choose that CCX arrangement instead of putting 8 cores with a big L3 in the big-core cluster and leaving 4 small cores in a CCX with a small amount of cache, if their intention is to have a low-power CCX for power saving (to mimic what Intel is doing)? The CCX arrangement they chose doesn't make any sense.
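
For anyone who wants to see which cores share an L3 on their own machine, here is a rough Python sketch. It assumes a Linux system that exposes the usual sysfs cache files (`level` and `shared_cpu_list` under `/sys/devices/system/cpu/cpuN/cache/indexM/`). The function names are made up for illustration, and the CPU numbering in the usage note is hypothetical, not taken from a real Strix Point system.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a Linux cpulist string such as "0-3,8-11" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_l3(shared_lists):
    """Return the distinct L3 domains as sorted tuples of logical CPU numbers.

    shared_lists maps a logical CPU number to the cpulist string of the
    CPUs it shares its L3 with (the content of shared_cpu_list).
    """
    return sorted({tuple(parse_cpu_list(s)) for s in shared_lists.values()})


def read_l3_shared_lists(root="/sys/devices/system/cpu"):
    """Read each CPU's L3 sharing list from sysfs (Linux only).

    The cache index that holds the L3 is found by checking each index's
    "level" file rather than assuming it is always index3.
    """
    result = {}
    for level_file in Path(root).glob("cpu[0-9]*/cache/index*/level"):
        if level_file.read_text().strip() != "3":
            continue
        # Path ends in .../cpuN/cache/indexM/level, so parts[-4] is "cpuN".
        cpu = int(level_file.parts[-4][3:])
        result[cpu] = (level_file.parent / "shared_cpu_list").read_text()
    return result


if __name__ == "__main__":
    for domain in group_by_l3(read_l3_shared_lists()):
        print(f"L3 domain with {len(domain)} logical CPUs: {domain}")
```

With a made-up layout where logical CPUs 0-3 share one L3 and 4-11 share another (SMT ignored), `group_by_l3` returns two domains, which is the 4-core versus 8-core split the scheduler has to pick between.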
 
Reactions: FlameTail

Rheingold

Member
Aug 17, 2022
55
150
76
I think the inverse is true.
Thank you. I also still had those statements in mind but couldn't remember which interview it was. The interesting part about Zen 5c starts at 8m20s. Mahesh even envisions the separation of CCXs, and thus cache contexts, as beneficial for applications with different performance requirements. It just remains to be seen whether these potential advantages will be leveraged in real usage.
 

Tup3x

Golden Member
Dec 31, 2016
1,069
1,062
136
A somewhat lukewarm release. Some performance numbers are a bit odd and make me wonder how well it actually does in some workloads.
 

Geddagod

Golden Member
Dec 28, 2021
1,295
1,368
106
Cross-CCX latency isn't the biggest problem; being in different cache domains is the main problem. I really, really have to wonder why AMD selected that CCX arrangement, as it's the worst possible. They could have done a 4+4 CCX with an additional 4-core low-power CCX, to have at least one 8-core CCX with enough cache and fast cores for MT scalability. But they chose this, and the result seems to be not that great.
Perhaps they really wanted to ensure that perf/watt at the very start of the curve is competitive, so they increased the "c" core count. I agree though, I'm confused about the P+E core arrangement too, and I'm really surprised that in the couple of interviews with AMD employees we had, no one really asked them about this...
 