Are OSes generally supposed to use the PCore first and then thread onward to the HT/LCore, or is it just random and dependent on the programming/programmer?
That's a good question. With HT, each physical core has 2 threads**, which the OS sees as logical CPUs. Both threads are equal, and performance-wise it shouldn't matter which of the two you run on. A performance hit can, however, come from running on both threads of the same physical core at the same time, since they share that core's execution resources.
Here are some results from an i7-860 with HT enabled, running W7HP64 (Windows 7 Home Premium 64-bit) on the Balanced power plan.
4 physical cores ( Core 0 to Core 3 ) each with 2 threads.
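To make that layout concrete, here is a rough sketch that lists the 8 logical CPUs and the physical core each one sits on. It assumes the two threads of a core are numbered consecutively (logical CPUs 0 and 1 on Core 0, 2 and 3 on Core 1, and so on). That is a common enumeration on Windows but not a guarantee, so check your own system. The function name is made up for illustration.

```python
def logical_cpu_layout(physical_cores=4, threads_per_core=2):
    """Map each logical CPU index to (physical core, thread on that core).

    ASSUMPTION: sibling threads are numbered consecutively. Not every
    OS or BIOS enumerates logical CPUs this way.
    """
    layout = {}
    for cpu in range(physical_cores * threads_per_core):
        layout[cpu] = (cpu // threads_per_core, cpu % threads_per_core)
    return layout


# Print the layout for a 4-core, 2-threads-per-core CPU such as the i7-860.
for cpu, (core, thread) in logical_cpu_layout().items():
    print(f"logical CPU {cpu}: Core {core}, thread {thread}")
```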
1) This is the usual LinX run using 8 threads: 33 GFlops. Core parking is enabled, and the defaults mean at least one thread per physical core should stay active. This has the effect that the first 4 software threads are usually assigned to different physical cores, unless the logical cores are considered overloaded.
2) LinX is set to use 4 threads, and because affinity is set to "all", the OS gets to choose which logical CPUs they run on. The software threads may move between logical CPUs when context switches occur. In my case the scheduling didn't go too well: the software threads ran mainly on just physical cores 0 and 1, with a little on core 3. Result: 24-28 GFlops.
3) Still using 4 threads, but with affinity set to use 1 thread per physical core. This gives a much better result: 37 GFlops.
4) Now, there seems to be a bug in the core parking algorithm. Setting affinity to the 4 parked logical cores results in only one thread being unparked and used, which gives terrible performance. BTW, I did not disable turbo for these tests, so that 1 core was probably running at a 26x multiplier against 22x when 4 cores are in use. I should redo it with a fixed multiplier. Naah, I'm not that keen.
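For anyone wanting to try the "1 thread per physical core" setting from test 3, here is a minimal sketch of how the affinity mask works out. It assumes sibling threads are numbered consecutively (logical CPUs 0 and 1 share Core 0, and so on), which may not match every system. `affinity_mask` is a hypothetical helper, not part of LinX or Windows. Bit n set means logical CPU n is allowed.

```python
def affinity_mask(physical_cores=4, threads_per_core=2, thread=0):
    """Bitmask selecting one logical CPU (the given thread) on every core.

    ASSUMPTION: logical CPU index = core * threads_per_core + thread.
    """
    if not 0 <= thread < threads_per_core:
        raise ValueError("thread index out of range")
    mask = 0
    for core in range(physical_cores):
        mask |= 1 << (core * threads_per_core + thread)
    return mask


first = affinity_mask(thread=0)   # first thread of each physical core
second = affinity_mask(thread=1)  # second (sibling) thread of each core
print(f"first thread of each core:  {first:#04x} ({first:08b})")
print(f"second thread of each core: {second:#04x} ({second:08b})")
```

Under that numbering the first mask comes out as 0x55 and the second as 0xaa. The same hex value can be given to cmd's `start /affinity` if you would rather not tick boxes in Task Manager.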
So in my case I guess the answer to your question would be errm... yes, to everything lol !
** Possibly more than 2 threads per physical core may become available in the future.