Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


StefanR5R

Elite Member
Dec 10, 2016
6,341
9,760
136
What would be better
The answer is trivial: The operating system and the userland gets to deal with sixteen homogeneous cores.¹
Or, what @Win2012R2 said while I was distracted.

Even on regular dual CCD CPU's one chiplet is binned better and runs at a higher frequency.
Not in EPYCs AFAICT.

________
¹) although split into two last-level cache domains, a problem for which at least EPYC 7000 and 9000 BIOSes offer an optional NUMA setting; don't know about EPYC 4000 BIOSes
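For anyone who wants to see those last-level cache domains on a Linux box, here is a minimal sketch in Python. It reads the standard sysfs `shared_cpu_list` files; the helper names `parse_cpu_list` and `l3_domains` are made up for illustration, and it assumes `index3` is the L3 on the machine in question (the sibling `level` file can confirm that).

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux cpulist string such as '0-7,16-23' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def l3_domains(sysfs_root="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs that share an L3 cache.

    Assumption: cache/index3 is the L3 on this machine.
    """
    domains = set()
    pattern = "cpu[0-9]*/cache/index3/shared_cpu_list"
    for path in Path(sysfs_root).glob(pattern):
        domains.add(tuple(parse_cpu_list(path.read_text())))
    return sorted(list(d) for d in domains)


if __name__ == "__main__":
    for i, cpus in enumerate(l3_domains()):
        print(f"L3 domain {i}: {cpus}")
```

On a dual-CCD part this should list two groups, one per CCD, regardless of whether the BIOS exposes them as NUMA nodes.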
 
Reactions: Win2012R2

Win2012R2

Senior member
Dec 5, 2024
792
795
96
The first quote was referring to if BOTH CCD's had Vcache since currently by design one CCD is binned better.

My assertion is that all chiplets must be the same - cache wise and frequency wise, that's what binning is for, this is entirely possible and is actually done for EPYCs, but we should not be giving a pass to AMD for not doing it for top end consumer SKU that makes them very good money.
 
Reactions: igor_kavinski

LightningZ71

Platinum Member
Mar 10, 2017
2,077
2,525
136
If you are doing HPC development or even doing workstation work using those same applications, a dual 3D cache EPYC 4XXX processor would be a notable improvement. CFD with OpenFOAM and Embree ray tracing are two notable areas that see improvements. That's not to say that the regular 3D cache product won't see SOME improvement on a few threads because of having at least one 3D cache CCD. But if you're looking for maximum performance per socket, having both would help.

If all you want to do every day is run passmark, then it certainly won't help you, and even the single cache CCD part would be a waste.
 

MS_AT

Senior member
Jul 15, 2024
555
1,168
96
Determinism.

Chiplets should be exactly the same - including max frequencies, cache levels etc.
Do you have a use case where it does make a difference? As was noted above the properly written HPC software is one, is that what you are using?
The answer is trivial: The operating system and the userland gets to deal with sixteen homogeneous cores
They are not homogeneous, as you have yourself noted. The only thing you gain is that for sufficiently low-threaded workloads it does not matter where the workload lands. For all other use cases you would need software that is CCD-aware to make the best use of X3D cache on both CCDs [the problem is that the best allocation depends on the application]. But that is the same problem a normal 9950X would face.

All I am trying to say is that putting 2 X3D chiplets on a 9950X would most likely not result in meaningful improvement in typical client workloads (including gaming), which would lead to poor reviews and bad sales. Although if given the option I would probably try to buy one if they made these.
 

Win2012R2

Senior member
Dec 5, 2024
792
795
96
Do you have a use case where it does make a difference?
Bad schedulers like in Windows?

I had to disable half of my 7950X3D to get it all working in Win 11. Unless AMD puts V-cache on both chiplets next time, it will be an obvious no-buy from me; 12 cores with extra cache will do just fine then.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,616
2,375
136
vcache on both chiplets does not help with the scheduler issues!

The problem is that the round-trip from one CCD to another takes forever, not just that only one of the CCDs has vcache. If they both had vcache, you still would take horrible penalty if game threads got scheduled across the split.
 

StefanR5R

Elite Member
Dec 10, 2016
6,341
9,760
136
typical client workloads (including gaming)
Since we are in an AMD Zen 5 discussion: Some Zen 5 products are already available of which the tech specs and price points were focused elsewhere than on typical client workloads (including gaming). :-)

Bad schedulers like in Windows?
Even very good schedulers do not have enough information to automagically optimize for asymmetric processors like this.

Though at least the asymmetry has been reduced a bit from 7950X3D to 9950X3D, thanks to almost same f_max across all cores of the latter.
 

StefanR5R

Elite Member
Dec 10, 2016
6,341
9,760
136
vcache on both chiplets does not help with the scheduler issues!
Wrong.

The problem is that the round-trip from one CCD to another takes forever, not just that only one of the CCDs has vcache. If they both had vcache, you still would take horrible penalty if game threads got scheduled across the split.
"The problem" which you are stating only exists if threads share large hot data.
 
Reactions: MadRat

MS_AT

Senior member
Jul 15, 2024
555
1,168
96
Since we are in an AMD Zen 5 discussion: Some Zen 5 products are already available of which the tech specs and price points were focused elsewhere than on typical client workloads (including gaming). :-)

Sorry I thought we were discussing why there is no client halo part with 2 x3d chiplets.

If your workload fits in one CCD, then the problem used to be ensuring it gets scheduled to the appropriate CCD. Since with Zen 5 the frequency difference is small enough, the X3D chiplet is usually the better choice, which solves the Zen 4 problem. Perfect scheduling is not possible unless the app itself informs the OS what it needs.

If your workload spans multiple CCDs, then the scheduler is not an oracle, as you have yourself noted, and the app would need to help it. This will be true regardless of CCD setup, but in some circumstances 2 X3D chiplets would give better performance; still, this does not depend on the scheduler itself afaik.

I am of course happy to learn in which circumstances homogeneous X3D CCDs would help the scheduler.
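To make "the app would need to help it" a bit more concrete, here is a rough Python sketch of an application keeping neighbouring workers on the same CCD and pinning itself there. Everything named here is illustrative: `plan_worker_affinity` and `pin_self` are made-up helpers, and the example layout (logical CPUs 0-7 on one CCD, 8-15 on the other) is an assumption; real numbering depends on SMT and the platform. `os.sched_setaffinity` is the real Python call, but it only exists on Linux.

```python
import os


def plan_worker_affinity(n_workers, ccd_cpus):
    """Assign workers to CCDs in contiguous blocks.

    ccd_cpus is a list with one collection of logical CPU ids per CCD.
    Workers with neighbouring indices (assumed to share data) land on
    the same CCD. Returns one CPU set per worker.
    """
    if not ccd_cpus:
        raise ValueError("need at least one CCD")
    per_ccd = -(-n_workers // len(ccd_cpus))  # ceiling division
    return [set(ccd_cpus[i // per_ccd]) for i in range(n_workers)]


def pin_self(cpus):
    """Restrict the caller to the given CPUs; returns False where unsupported."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # pid 0 means the caller
        return True
    return False


if __name__ == "__main__":
    # Hypothetical dual-CCD layout, SMT ignored for brevity.
    ccds = [range(0, 8), range(8, 16)]
    for worker, cpus in enumerate(plan_worker_affinity(4, ccds)):
        print(f"worker {worker} -> CPUs {sorted(cpus)}")
```

Whether block assignment is the right policy depends on the application, which is the point made above: only the app knows which threads share hot data.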
 

Win2012R2

Senior member
Dec 5, 2024
792
795
96
Even very good schedulers do not have enough information to automagically optimize for asymmetric processors like this.
Yes sure, but we can make their work harder or easier, and getting same spec cores makes it easier - plus NUMA can and should be used. There is only one reason AMD uses lower grade 2nd chiplet - to save a few bucks on a top SKU, that's not acceptable.
 

desrever

Senior member
Nov 6, 2021
289
766
106
So what do you guys think we are seeing next? Shimada Peak or Zen 5 desktop APUs?
AMD has been pretty quiet on both of those. My guess is they would want to launch Shimada Peak but it takes longer, especially as they wait for production to overtake Epyc demand.

Desktop APUs are so low priority that I don't know if AMD will even bother launching them. There's almost no reason to buy them since all Ryzen CPUs now come with iGPUs.
 

inquiss

Senior member
Oct 13, 2010
352
527
136
My assertion is that all chiplets must be the same - cache wise and frequency wise, that's what binning is for, this is entirely possible and is actually done for EPYCs, but we should not be giving a pass to AMD for not doing it for top end consumer SKU that makes them very good money.
The non-vcache die is faster than the vcache die. This is better. One fast. One with vcache. Are you arguing to slow the non vcache die down?

Lol why
 

Win2012R2

Senior member
Dec 5, 2024
792
795
96
The non-vcache die is faster than the vcache die. This is better. One fast. One with vcache. Are you arguing to slow the non vcache die down?
It's very marginally faster (frequency-wise) in Zen 5 - and no, that's not better, because the large discrepancy in cache size makes those cores not the same, therefore presenting a problem for the scheduler.

I am arguing for two chiplets with 3D cache, rated to the same frequency and as close to identical as possible to reduce scheduling issues, plus a NUMA mode that can be activated in BIOS so that the OS knows they are different cache domains.
 