Determinism.
Chiplets should be exactly the same - including max frequencies, cache levels etc.
> What would be better
The answer is trivial: The operating system and the userland gets to deal with sixteen homogeneous cores.¹
> Even on regular dual CCD CPUs one chiplet is binned better and runs at a higher frequency.
Not in EPYCs AFAICT.
> That's not possible. Even on regular dual CCD CPUs one chiplet is binned better and runs at a higher frequency.
It's possible, obviously, and it must be done - it is totally unacceptable to have different chiplets, especially in the top SKU.
Please explain
> What's to explain?
Explain how it is not possible to have two identical chiplets and how exactly this "market disagreement" is responsible for it.
> That's not possible.
> Of course it's possible.
So which one is it then?
The first quote was referring to the case where BOTH CCDs had Vcache, since currently by design one CCD is binned better.
> Determinism.
> Chiplets should be exactly the same - including max frequencies, cache levels etc.
Do you have a use case where it does make a difference? As was noted above, properly written HPC software is one; is that what you are using?
> The answer is trivial: The operating system and the userland gets to deal with sixteen homogeneous cores
They are not homogeneous, as you yourself have noted. The only thing you gain is that for sufficiently low-threaded workloads it does not matter where the workload lands. For all other use cases you would need software that is CCD-aware to make best use of the X3D cache on both CCDs (the problem is that the best allocation depends on the application). But it is the same problem that a normal 9950X would face.
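For what "CCD-aware" could mean in practice, here is a minimal Python sketch for Linux only. It assumes the kernel exposes the L3 sharing map under `/sys/devices/system/cpu/cpuN/cache/index3/shared_cpu_list` and that `index3` really is the L3 on the machine in question (worth checking the `level` file next to it). The helper names `parse_cpu_list`, `group_by_l3` and `read_l3_domains` are made up for illustration. On a dual-CCD part each L3 domain should correspond to one CCD.

```python
import glob
import os
import re


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-7,16-23" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_l3(shared_lists):
    """Collapse {cpu: shared_cpu_list string} into the distinct L3 domains."""
    domains = {tuple(parse_cpu_list(text)) for text in shared_lists.values()}
    return [list(domain) for domain in sorted(domains)]


def read_l3_domains(sysfs_root="/sys/devices/system/cpu"):
    """Read the L3 sharing map from sysfs (assumes index3 is the L3 cache)."""
    shared = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cache", "index3", "shared_cpu_list")
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)", path[len(sysfs_root):]).group(1))
        with open(path) as f:
            shared[cpu] = f.read()
    return group_by_l3(shared)


if __name__ == "__main__":
    # Prints one list of logical CPUs per L3 domain; empty list if sysfs is absent.
    print(read_l3_domains())
```

This only tells the application where the cache boundaries are. Which domain a given workload should prefer is still the application-dependent part.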
> Do you have a use case where it does make a difference?
Bad schedulers like in Windows?
> vcache on both chiplets does not help with the scheduler issues!
Having same-spec chiplets does - with or without extra cache they should be the same spec (frequency, cache, ISA), not a lower-grade second chiplet like it is now.
> typical client workloads (including gaming)
Since we are in an AMD Zen 5 discussion: Some Zen 5 products are already available whose tech specs and price points were focused elsewhere than on typical client workloads (including gaming). :-)
> Bad schedulers like in Windows?
Even very good schedulers do not have enough information to automagically optimize for asymmetric processors like this.
> vcache on both chiplets does not help with the scheduler issues!
Wrong.
> The problem is that the round-trip from one CCD to another takes forever, not just that only one of the CCDs has vcache. If they both had vcache, you would still take a horrible penalty if game threads got scheduled across the split.
"The problem" which you are stating only exists if threads share large hot data.
> "The problem" which you are stating only exists if threads share large hot data.
This describes all games.
> Wrong
If your workload fits in one CCD, then the problem used to be ensuring it gets scheduled to the appropriate CCD. Since with Zen 5 the frequency difference is small enough, the X3D chiplet is usually the better choice, which solves the Zen 4 problem. Perfect scheduling is not possible unless the app itself informs the OS what it needs.
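As a rough illustration of an app telling the OS what it needs, a process on Linux can restrict itself to one CCD's logical CPUs with `os.sched_setaffinity`. This is only a sketch: `pin_to_cpus` is a made-up helper, the call is not available on Windows or macOS, and which CPU numbers belong to the V-cache CCD is an assumption you would have to verify on the actual machine.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict `pid` (0 = this process) to `cpus`, clipped to what it may already use."""
    allowed = os.sched_getaffinity(pid)
    target = allowed & set(cpus)
    if not target:
        raise ValueError("none of the requested CPUs are available to this process")
    os.sched_setaffinity(pid, target)
    return target
```

Hypothetical usage, assuming logical CPUs 0-7 sit on the CCD you want: `pin_to_cpus(range(0, 8))`. Threads created afterwards inherit the mask, so a workload that fits in one CCD stays off the cross-CCD path regardless of what the scheduler would have guessed.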
> Even very good schedulers do not have enough information to automagically optimize for asymmetric processors like this.
Yes sure, but we can make their work harder or easier, and getting same-spec cores makes it easier - plus NUMA can and should be used. There is only one reason AMD uses a lower-grade second chiplet - to save a few bucks on a top SKU, and that's not acceptable.
> So what do you guys think we are seeing next? Shimada Peak or Zen 5 desktop APUs?
AMD has been pretty quiet on both of those. My guess is they would want to launch Shimada Peak but it takes longer, especially as they wait for production to overtake Epyc demand.
> My assertion is that all chiplets must be the same - cache-wise and frequency-wise. That's what binning is for; this is entirely possible and is actually done for EPYCs, but we should not be giving a pass to AMD for not doing it for a top-end consumer SKU that makes them very good money.
The non-vcache die is faster than the vcache die. This is better. One fast. One with vcache. Are you arguing to slow the non-vcache die down?
> The non-vcache die is faster than the vcache die. This is better. One fast. One with vcache. Are you arguing to slow the non-vcache die down?
It's very marginally faster (frequency-wise) in Zen 5 - and no, that's not better, because the large discrepancy in cache size makes those cores not the same, therefore presenting a problem for the scheduler.