What part of 'no' didn't you understand? Windows will schedule threads across modules before it puts two threads on the same module.
On an eight-core BD (four modules), each module will get one running thread until the thread count gets to five; only then will a module get a second thread.
Putting more than one thread on a module when it's not necessary will result in a performance decrease, so why would it get scheduled that way?
It's exactly the same as Hyperthreading: Windows will schedule threads so only one thread lands on a core, until you are out of physical cores.
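To make the claim concrete, here is a rough sketch of that "spread first" placement. It is only a toy model of the policy as described in this post, not the actual Windows scheduler; the function name `spread_first` and the 4-module, 2-cores-per-module layout are assumptions for illustration.

```python
def spread_first(num_threads, num_modules=4, cores_per_module=2):
    """Toy model: each new thread goes to the least-loaded module.

    Returns a list with the number of threads placed on each module.
    Hypothetical helper, not a real scheduler API.
    """
    load = [0] * num_modules
    for _ in range(num_threads):
        # Least-loaded module wins; ties go to the lowest module index.
        target = min(range(num_modules), key=lambda m: load[m])
        if load[target] >= cores_per_module:
            break  # every core is already busy
        load[target] += 1
    return load


if __name__ == "__main__":
    for n in (2, 4, 5, 8):
        print(n, "threads ->", spread_first(n))
```

With four threads every module holds exactly one; the fifth thread is the first one that has to share a module.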
It's not the same as hyperthreading, and that's why you misled yourself.
HT mandates that each core is first used for a single thread, since
this provides the best IPC efficiency.
When the thread count is more than the physical core count,
the virtual cores are then used as a last resort, with little gain in performance;
that's why it's scheduled the way you're talking about.
With BD, if two threads are to be scheduled, they are sent to a single
module, which allows the three inactive modules to be gated off and
the saved TDP to be used to boost the active module's frequency even more.
If four threads are to be executed, same thing: two modules are used
in lieu of four, allowing two complete modules to be shut off and
the two active ones' frequencies to be increased.
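A quick sketch of the packing behaviour I'm describing, just to show which modules end up idle and could be gated. This is a toy model of the argument in this post, not real scheduler code; `pack_modules` and the 4-module, 2-cores-per-module layout are assumptions for illustration, and no boost frequencies are modelled.

```python
def pack_modules(num_threads, num_modules=4, cores_per_module=2):
    """Toy model: fill each module completely before using the next one.

    Returns (load, idle): threads per module, and the indices of modules
    left empty (the candidates for gating). Hypothetical helper.
    """
    # Threads beyond the total core count are simply not placed.
    remaining = min(num_threads, num_modules * cores_per_module)
    load = []
    for _ in range(num_modules):
        placed = min(remaining, cores_per_module)
        load.append(placed)
        remaining -= placed
    idle = [m for m, n in enumerate(load) if n == 0]
    return load, idle


if __name__ == "__main__":
    for n in (2, 4):
        load, idle = pack_modules(n)
        print(n, "threads ->", load, "idle modules:", idle)
```

Two threads leave three modules idle and four threads leave two idle, which is the case the frequency-boost argument relies on.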