With the release of Alder Lake less than a week away, and the "Lakes" thread having turned into a nightmare to navigate, I thought it might be a good time to start a discussion thread solely for Alder Lake.
HT on the small cores would be dead last in thread priority, though (P > E > P-HT > E-HT). My knowledge of architecture is very limited, but just from that I would imagine it would have a very limited impact except in heavily multithreaded scenarios, which are pretty rare in day-to-day usage for now. Maybe it would nudge a few percent more out of encoding scenarios, but for gaming or similar I can't imagine it would make a difference, since Intel already provides more than enough threads.
> Still hoping someone can do some game streaming on their Alder Lake to let us know if OBS gets moved solely to the E cores!

From what Hulk posted about his workload using all cores when priority is at Normal or higher, it should run on more than just the E-cores. OBS starts the encoding task at Normal priority by default, which should make Thread Director utilize all the cores for it whenever possible.
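For what that means concretely, here is a minimal Win32 sketch of a process checking and raising its own priority class; the APIs are standard, but presenting this as how OBS handles it internally is purely my assumption:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* A process created normally runs at NORMAL_PRIORITY_CLASS, which is
       what lets the scheduler spread its threads across P- and E-cores.
       Raising the class should bias the scheduler toward keeping its
       threads on the P-cores rather than demoting them. */
    if (!SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS)) {
        fprintf(stderr, "SetPriorityClass failed: %lu\n", GetLastError());
        return 1;
    }
    printf("Priority class: 0x%lX\n", GetPriorityClass(GetCurrentProcess()));
    return 0;
}
```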
Also, it's extremely hard to figure out which cores, and how many, a piece of software runs on if you have more than one thing running at a time. How is anybody going to tell whether the load on core X is due to OBS or the game, or how much of that load is due to either?
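One partial workaround, if you control the code: a thread can at least sample which logical processor it is currently running on. A minimal sketch (this only works from inside your own process; external tools still can't attribute another process's per-core load this way):

```c
#include <windows.h>
#include <stdio.h>

/* Report which logical processor the calling thread is on at the moment
   of the call; sampling this periodically shows where the scheduler is
   actually placing your own threads. */
static void log_current_core(const char *tag)
{
    PROCESSOR_NUMBER pn;
    GetCurrentProcessorNumberEx(&pn);
    printf("%s: group %u, logical CPU %u\n", tag, pn.Group, pn.Number);
}
```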
> Streaming puts a pretty heavy load on your system unless you offload streaming duties to a dedicated box. You will find out in a hurry if your system can't keep up with the bitrate and quality level you've selected.

Yes, but how will that help you determine whether streaming stays on the E-cores alone if all P- and E-cores are at or close to 100%?
> I think it would be "smarter" to keep the P's on the video work while the E's handle my web browsing!

Until you decide the foreground app is Photoshop, at which point running PS on the E-cores will be quite the i5-6600 experience. This needs to get fixed properly, not by guessing which workload should go where.
Over time it will be fixed properly. The programmers simply need to specify which thread gets which core. Visual Studio already lets programmers specify priority levels when threads are created, and now it also lets you specify which core to put a thread on. Programmers know best whether a thread needs P-cores or E-cores. Old software of course won't have this, but new and updated software should gain it over time.
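The Windows-native way to express that today is CPU sets. A minimal sketch (my own example, not anything Visual Studio generates) that pins the calling thread to the most performant class of cores:

```c
#include <windows.h>
#include <stdlib.h>

/* Pin the calling thread to the most performant cores. On hybrid parts
   Windows reports a higher EfficiencyClass for P-cores than for E-cores. */
static BOOL pin_current_thread_to_p_cores(void)
{
    ULONG len = 0;
    GetSystemCpuSetInformation(NULL, 0, &len, GetCurrentProcess(), 0);

    PSYSTEM_CPU_SET_INFORMATION info = malloc(len);
    if (!info)
        return FALSE;
    if (!GetSystemCpuSetInformation(info, len, &len, GetCurrentProcess(), 0)) {
        free(info);
        return FALSE;
    }

    /* Pass 1: find the highest efficiency class present in the system. */
    BYTE best = 0;
    for (BYTE *p = (BYTE *)info; p < (BYTE *)info + len;) {
        PSYSTEM_CPU_SET_INFORMATION e = (PSYSTEM_CPU_SET_INFORMATION)p;
        if (e->CpuSet.EfficiencyClass > best)
            best = e->CpuSet.EfficiencyClass;
        p += e->Size;
    }

    /* Pass 2: collect the CPU-set IDs belonging to that class. */
    ULONG ids[64], count = 0;
    for (BYTE *p = (BYTE *)info; p < (BYTE *)info + len;) {
        PSYSTEM_CPU_SET_INFORMATION e = (PSYSTEM_CPU_SET_INFORMATION)p;
        if (e->CpuSet.EfficiencyClass == best && count < 64)
            ids[count++] = e->CpuSet.Id;
        p += e->Size;
    }

    BOOL ok = SetThreadSelectedCpuSets(GetCurrentThread(), ids, count);
    free(info);
    return ok;
}
```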
We're not talking about optimization (as in getting the best possible results with hints from the app dev); we're talking about Thread Director currently intervening over the "default" thread allocation and forcing threads over to the E-core complex when the app is no longer in the foreground.
I fully realize what you were talking about. Yes, the default Thread Director behavior is to put background programs onto the E-cores even if the user wants them on the P-cores. The long-term solution is for programmers to state that a specific thread is for the P-cores only; that overrides Thread Director and makes your point irrelevant. It will just take quite a lot of time for software to add these flags.
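For completeness, the attribute such a flag maps to on current Windows is the per-thread power-throttling (QoS) hint. A minimal sketch; whether this fully overrides Thread Director's background demotion in every case is my assumption:

```c
#include <windows.h>

/* Opt a thread out of "EcoQoS": with the control bit set and the state
   bit cleared, Windows is told never to power-throttle this thread,
   which also biases the scheduler away from parking it on the E-cores.
   (Setting StateMask to the same flag instead would request EcoQoS,
   i.e. mark the thread as efficiency-class work.) */
static BOOL mark_thread_high_qos(HANDLE thread)
{
    THREAD_POWER_THROTTLING_STATE s = {0};
    s.Version     = THREAD_POWER_THROTTLING_CURRENT_VERSION;
    s.ControlMask = THREAD_POWER_THROTTLING_EXECUTION_SPEED;
    s.StateMask   = 0;  /* throttling explicitly OFF */
    return SetThreadInformation(thread, ThreadPowerThrottling, &s, sizeof(s));
}
```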
Apple has been doing this for some time now. Things like system services are fenced in to the E-cores only, for example, and flags for devs to pin work to either the P- or E-cores, or to let it float, are in the toolchain now.
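On the Apple side this surfaces as QoS classes rather than literal core pinning (my characterization); a minimal sketch in C:

```c
#include <pthread.h>
#include <pthread/qos.h>

/* On macOS, BACKGROUND-class work is eligible to be confined to the
   E-cores, while USER_INTERACTIVE work is kept on the P-cores.
   A thread tags itself like this: */
static void run_as_background_work(void)
{
    pthread_set_qos_class_self_np(QOS_CLASS_BACKGROUND, 0);
    /* ... do the low-urgency work here ... */
}
```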
I already gave a pretty clear example of what happens in a prior post; all screenshots are the same Handbrake video conversion with a different worker thread priority selected from inside the software. You can see the scheduler is already perfectly capable of filling the E-cores with low-priority threads and keeping those threads away from the P-cores if it wants to. The problem is that it does this way too easily / too often; it's almost as if someone tuned it for a 2+8 low-power CPU...
[Attachment 54660: Handbrake screenshots at different worker thread priorities]
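Roughly what that in-app priority setting amounts to; a hedged sketch (encode_worker is a hypothetical stand-in, not Handbrake's actual code):

```c
#include <windows.h>

/* Hypothetical stand-in for an encoder's worker loop. */
static DWORD WINAPI encode_worker(LPVOID arg)
{
    (void)arg;
    /* ... transcode frames ... */
    return 0;
}

/* Create the worker suspended, drop its priority, then let it run.
   At BELOW_NORMAL or LOWEST the Windows 11 scheduler readily keeps the
   thread on the E-cores, as the screenshots above show. */
static HANDLE start_low_priority_worker(void)
{
    HANDLE h = CreateThread(NULL, 0, encode_worker, NULL,
                            CREATE_SUSPENDED, NULL);
    if (h != NULL) {
        SetThreadPriority(h, THREAD_PRIORITY_BELOW_NORMAL);
        ResumeThread(h);
    }
    return h;
}
```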
> 8P+HT graph makes the E-cores look useless from a performance perspective but paradoxically, 8P+HT+8E delivers a bit more performance for a lot less power consumption, possibly due to P-cores not needing to turbo boost as high. 7P+12E should have been an option.

I'm not sure that "paradoxically" is the word I would use. That is the exact purpose of the efficiency cores: more performance at lower power. In this case, 4.96% more performance at 79.4% of the average power.

If I were to make the chip myself, I would have preferred 6P + 16E in about the same die space.
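Spelling out the efficiency those two figures imply (my own back-of-the-envelope, using only the numbers quoted above):

```latex
\frac{\text{relative performance}}{\text{relative power}} = \frac{1.0496}{0.794} \approx 1.32
```

So in this test the 8P+HT+8E configuration delivers roughly 32% more performance per watt than 8P+HT alone.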
It seems paradoxical if you just view the first graph in isolation: adding 8E to the mix of 8P+HT should increase performance a whole lot, but that would totally blow the power budget and probably melt the CPU. I do agree that 6P+16E seems very attractive. It just seems curious, like they had more confidence in getting defect-free P-cores than E-cores.
> Yes, well stated. This is exactly the behavior we are talking about. Foreground apps work properly. It's the distribution of loading among foreground and background apps that is the problem. I don't think it is worked out on a per-application basis. Windows needs to utilize all compute resources more effectively.

Here's the thing, though: these are desktop CPUs, not server or render boxes. Windows and Thread Director are tuned to give users who don't want to mess with anything the best possible user experience, not the best possible performance. That means that if a program states that it's background, it gets shifted to the E-cores so that the full power of the CPU (remember that the E-cores are added on top of a normal last-gen core count) stays available to the user. It's a lot less efficient, but it guarantees the most responsive system for the user. "Hey look, I'm rendering a video and still get zero frame drops" and the like; how people see a product is much more important than how that product actually is.

If said user is a power user who does want to mess with things, they know how to make the software use all cores.
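What "a program states that it's background" can look like in code; a minimal sketch using the standard Win32 background mode:

```c
#include <windows.h>

/* The process opts itself into background mode (reduced CPU, I/O and
   memory priority); on Alder Lake the scheduler then favors the
   E-cores for it. Only valid on the calling process. */
static void do_background_work(void)
{
    SetPriorityClass(GetCurrentProcess(), PROCESS_MODE_BACKGROUND_BEGIN);
    /* ... long-running, low-urgency work ... */
    SetPriorityClass(GetCurrentProcess(), PROCESS_MODE_BACKGROUND_END);
}
```

A power user can also force the issue from outside, e.g. launching a program with an explicit affinity mask (`start /affinity 0xFF app.exe` restricts it to the first eight logical CPUs).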
So I think we are talking about CPU inter-application performance, as opposed to CPU intra-application performance, which was the one we were more concerned with before ADL came out. Put differently: the core distribution within a program is fine; it's the distribution among applications that is the problem.
This kind of reminds me of space exploration. Before the mission everyone has a question on their minds. When the probe gets there it's actually something else that no one thought would be puzzling that turns out to be the real interesting thing. Like when New Horizons found geologic activity on Pluto.