> How many times does this idea come up? It clearly doesn't work. To go to higher core counts you need quad-channel RAM. That's too expensive for a mainstream platform.

Besides, I bet that many who keep bringing this up simply don't realize how little parallelism there is in the vast majority of client workloads.
> Besides, I bet that many who keep bringing this up simply don't realize how little parallelism there is in the vast majority of client workloads.

Hey, you underestimate the number of people who run Cinebench R23 continuously on cold winter days and nights, especially to stay warm!
> The least they can do is give dissimilar CCDs to the 9900X, as in one 8-core CCD + one 4-core CCD, so at least it wouldn't get ignored so much by gamers. There is only one local retailer in the UAE who got Ryzen 9000 CPUs, and they are already out of the 9700X. The 9600X and 9900X are still in stock.

I don't think that works. How does that work with the binning strategy? How many chips need to be binned down to 4 cores? It also doesn't help with cross-CCD talk; you still have that same issue.
I'm sure AMD knew how mediocre Zen 5 was when they were deciding TDPs. Why did they choose 65W for these parts? To cripple them even more? They lost a lot of benchmarks to the 7600X/7700X because of that low TDP. I have a couple of ideas why:
1. They wanted at least something good out of this gen, so leaned on "power efficiency" story (forgetting that 65W Zen4 parts exist).
2. They wanted X3D to appear a lot more powerful than regular chips.
3. OEMs and SIs asked for 65W chips to be right there at launch, so they can put them in crappy B840 motherboards.
4. They drank their own Kool-Aid and decided TDPs before knowing performance?
Most likely the first one, but it was interesting to think about.
> TBH we already know the design rationale. They wanted a new forward-looking base design: >4-wide decode, 6 ALUs, 8-wide dispatch, and 512b-compatible bandwidth.
> That's about it. A new base with a lot of cut corners and odd features.

You are being too generous with the kind of questions being asked by the "tech" journalists who got access to Mike Clark, Mahesh Subramony et al. What a waste.
> Depending on the workflow, even if a piece of software is not very well threaded you can just run multiple instances of it, or you can have a rendering/encoding task going while working on some other task. Having more cores allows you to dedicate more resources to those background tasks while your primary app might be single-threaded.

Networked PCs support this type of application inexpensively.
> It also doesn't help with cross-CCD talk; you still have that same issue.

It may help, because most games won't go above 16 threads, which ensures no thread needs to go outside the 8-core CCD. On a 6-core CCD the number of threads proves insufficient and some game/GPU-driver threads need to be moved to the other CCD.
> I don't think that works. How does that work with the binning strategy? How many chips need to be binned down to 4 cores?

It could work if they at least created some new 9905X SKU with dissimilar CCDs and looked at the sales data after a year to find out what consumers prefer. This is now the 3rd generation where the x900X SKU seems to be disliked and always has to be gotten rid of through fire sales. They should just stop with this stupid SKU.
> It may help, because most games won't go above 16 threads, which ensures no thread needs to go outside the 8-core CCD. On a 6-core CCD the number of threads proves insufficient and some game/GPU-driver threads need to be moved to the other CCD.
> Like look here: https://www.techspot.com/review/2811-cpu-cache-vs-cores/

That depends on the nature of the communication. What you advocate would help if the threads are talking to each other heavily, but if they are just workers spawned to do something and die, then CCD-to-CCD overhead doesn't matter that much, and 6C x 6C could be better due to its more balanced nature [same amount of LLC per core, where 8+4 would require more complex scheduling logic to extract max performance]. After all, there are some gaming benchmarks where the 7900X is leading the 7700X.
I remember when Zen3 launched there was speculation that 5900X might perform better in some workloads than 5950X because it would have more cache per thread. I don't think that proved to be true, but theoretically I can understand how you could imagine that.
> It may help, because most games won't go above 16 threads, which ensures no thread needs to go outside the 8-core CCD.

And who assigns CPU affinities to the game threads to ensure that they don't wander off to another cache domain?
This test was done without locking the core clocks.
> That depends on the nature of the communication. What you advocate would help if the threads are talking to each other heavily, but if they are just workers spawned to do something and die, then CCD-to-CCD overhead doesn't matter that much, and 6C x 6C could be better due to its more balanced nature [same amount of LLC per core, where 8+4 would require more complex scheduling logic to extract max performance]. After all, there are some gaming benchmarks where the 7900X is leading the 7700X.

Incidentally, the Linux kernel's process scheduler ignores last-level-cache domains in its scheduling decisions for threads of the same process. This is in contrast to NUMA nodes, for which the kernel's default policy is to try to keep all threads of a process on the same node. I don't know why such a policy is not applied with respect to cache domains, but I suspect it is because the kernel cannot predict the optimum for performance. I don't know what Windows' process scheduler does in this regard.
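For what it's worth, working around that LLC-blindness from user space on Linux only takes narrowing the process's CPU affinity mask, which is roughly what a CCD-aware launcher would do. A minimal sketch follows; note the "first half of the CPUs = one cache domain" mapping is purely an assumption for illustration, as the real CPU-to-LLC mapping has to be read from `/sys/devices/system/cpu/cpu*/cache/index3/shared_cpu_list`.

```python
import os

# Default mask: the kernel is free to place threads on any of these CPUs,
# including cores in different last-level-cache (CCD) domains.
default_cpus = sorted(os.sched_getaffinity(0))  # 0 = the calling process
print("default affinity:", default_cpus)

# Assumption for illustration only: treat the first half of the visible
# CPUs as one cache domain. A real tool would parse
# /sys/devices/system/cpu/cpu*/cache/index3/shared_cpu_list instead.
half = max(1, len(default_cpus) // 2)
one_domain = set(default_cpus[:half])

os.sched_setaffinity(0, one_domain)  # confine the process to one "domain"
print("restricted affinity:", sorted(os.sched_getaffinity(0)))
```

`os.sched_setaffinity` is Linux-only; the mask is inherited by threads and children created afterwards, which is why launchers apply it before starting the game.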
> It may help, because most games won't go above 16 threads, which ensures no thread needs to go outside the 8-core CCD. On a 6-core CCD the number of threads proves insufficient and some game/GPU-driver threads need to be moved to the other CCD.
> Like look here: https://www.techspot.com/review/2811-cpu-cache-vs-cores/

Not sure what you're saying here. Yes, some games require more than 6 threads, but how does that help when Windows will put threads on both CCDs? You still get cross-talk in 8+4, plus that doesn't address the point about binning. It doesn't make commercial sense, which is backed up by the fact that we haven't seen this config.
> It could work if they at least created some new 9905X SKU with dissimilar CCDs and looked at the sales data after a year to find out what consumers prefer. This is now the 3rd generation where the x900X SKU seems to be disliked and always has to be gotten rid of through fire sales. They should just stop with this stupid SKU.
> They also have the option of creating a 9930X or something like that, with one 8-core CCD and one 6-core CCD, in between the 9900X and 9950X for 28 threads, and I bet it would sell better than the 9900X.

12-core CPUs don't sell well, agreed. More cores is such a niche anyway. AMD has a killer product with 8 cores and cache; if you're into games, you buy that. If you really want more cores on a consumer platform, then 16C is the way to go. 12C is sort of unloved for good reason: it isn't core-maxing, it isn't gaming-maxing. An esoteric config within the 12C space that costs money to assemble and explain to people doesn't make business sense, and we haven't seen AMD do it for precisely that reason.
> There's a lack of love for Strix Halo here. No one understands how important this SoC is to the whole computer market.
> 3N with RX 6750 XT performance.
> GPUs becoming part of the CPU is the future, and the M1 Ultra is here to prove how much an ultimate APU can do.
> Zero attention for the STXH tablet/notebook console that Lenovo is doing.

Plenty of people are interested in STXH; it's just that we're far away from its release. The hot topic is the Granite Ridge dumpster fire.
> There's a lack of love for Strix Halo here. No one understands how important this SoC is to the whole computer market.
> 3N with RX 6750 XT performance.
> GPUs becoming part of the CPU is the future, and the M1 Ultra is here to prove how much an ultimate APU can do.
> Zero attention for the STXH tablet/notebook console that Lenovo is doing.

Strix Halo I really want to be successful. This is what AMD can do when the naysayers at OEMs stop holding them back from doing something truly fun. Combining RAM between CPU and GPU makes so much sense.
> I feel personally attacked

I've got a smol 6c CPU. I feel jealous.
> And who assigns CPU affinities to the game threads to ensure that they don't wander off to another cache domain?

I would assume the game developer, as they best know the nature of their own workload. Look up Cyberpunk's CPU settings; they have a very basic option for this exposed to the user. Generally there are Windows APIs that let you outright pin workloads to specific cores (though they are generally not recommended, as you might shoot yourself in the foot) or suggest to the kernel that it should keep specific threads together on specific sets of cores. It's even mentioned in publicly available resources from AMD, but I don't have them at hand and can't look them up right now.
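On Windows the hard-pin API referred to above is `SetThreadAffinityMask` (with `SetThreadIdealProcessor` as the softer hint). As a hedged sketch of the same idea, here is the Linux analogue: applying `sched_setaffinity` to an individual thread's kernel id, which is what keeping a game thread inside one CCD's cache domain boils down to. Pinning to CPU 0 is an arbitrary choice for illustration.

```python
import os
import threading

def game_thread(out):
    # A stand-in for a game worker thread that pins itself.
    tid = threading.get_native_id()   # kernel thread id on Linux
    os.sched_setaffinity(tid, {0})    # hard-pin this thread to CPU 0
    out["cpus"] = os.sched_getaffinity(tid)

out = {}
t = threading.Thread(target=game_thread, args=(out,))
t.start()
t.join()
print(out["cpus"])  # {0}
```

Per-thread masks like this are exactly the kind of foot-gun the post warns about: pin too aggressively and the scheduler can no longer balance load when the pinned cores are busy.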