Does your brilliant discovery also apply to Ryzen processors? If a Ryzen can run P95 stable at 3.9GHz, is it then able to run any other workload at 4.3GHz?
I guess it also means that any Skylake-X that runs P95 stable at 4.4GHz is capable of 4.8GHz stable? Or does the magic frequency increase only apply to processors running P95 at 4.2-4.3GHz?
I'm just trying to gain some of that great knowledge you have about processors, LOL!
This is a bit of an oversimplification of what is going on, but I think it will help you understand. A Ryzen CPU, for example, doesn't need an AVX offset for two major reasons. Ryzen hits its limits mostly in the silicon itself: without exotic cooling it simply won't run much higher than the 3.8-4.1 GHz most people are seeing on overclocks. Most of the AVX clocking intricacies of SL-X, on the other hand, are power and thermal limitations, meaning that if you had the power delivery and the cooling, you could push past those limits.
So, to get an idea of how SIMD instructions work: basically, they are shortcuts. A CPU might have to go through, let's say, 11 stages in a pipeline for normal x86 instructions; you send a piece of code and it takes 11 clock cycles for the core to process it. MMX, 3DNow!, SSE, and AVX basically exist as a single (or near-single) stage pipeline built specifically for that code, with the "answer sheet" baked right into the core. You send a piece of code that, instead of saying "do this calculation", says "pull up answer 1A from the answer sheet". Assuming you understand that, let's move on. With MMX, 3DNow!, SSE and the like, the pipeline and the answer sheet were basically in line with the width of the CPU's normal pipeline. AVX changes that. Without changing much else in the layout of the CPU, the SSE, AVX/AVX2, and AVX-512 pipelines are 128-bit, 256-bit, and 512-bit wide respectively. In the past, whatever clock speed and power budget was OK on the normal pipeline also worked just as well on, say, SSE. That meant that outside the increased die size, these SIMD instructions were essentially free.
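To make the width numbers concrete, here's a toy sketch (nothing hardware-specific, just the arithmetic) of how many same-sized elements, or "lanes", one SIMD instruction operates on at each register width:

```python
# Toy illustration of SIMD register width: how many equal-sized
# elements ("lanes") a single instruction processes at once.
def lanes(register_bits: int, element_bits: int) -> int:
    return register_bits // element_bits

# 32-bit floats per instruction at each pipeline width:
for name, width in [("SSE", 128), ("AVX/AVX2", 256), ("AVX-512", 512)]:
    print(f"{name}: {lanes(width, 32)} floats per instruction")
# SSE: 4, AVX/AVX2: 8, AVX-512: 16
```

Each doubling of width doubles the work done per instruction, which is exactly why the wider pipelines cost so much more power and die area.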
So here is where it changes with AVX. Since these are much wider pipelines handling much wider chunks of data, the clock speed ceiling is lower, and at the same clock speed they draw much more power. Just to give you an example, there was a statement at one point that the AVX-512 "answer sheet" alone was the size of an Atom core within SL-X, and that AVX-512 support accounts for almost 12% of the die size. AMD doesn't really have this issue because Ryzen only uses 128-bit pipelines and combines two of them to execute AVX2 (256-bit) code, so per clock it has half the AVX2 throughput of a native 256-bit design. This keeps the SIMD pipelines closer in line with the rest of the CPU design, so power and clock speeds don't have to change. With native 256-bit and 512-bit pipelines, the gap gets wider and wider, causing the core to run hotter and hotter.
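The AMD-vs-Intel difference above boils down to simple division: if the execution units are narrower than the instruction, the core has to crack the instruction into several micro-ops. A minimal sketch of that arithmetic (the function name is mine, not any real API):

```python
import math

# Sketch: a vector_bits-wide instruction running on pipeline_bits-wide
# execution units must be split into this many micro-ops.
def micro_ops(vector_bits: int, pipeline_bits: int) -> int:
    return math.ceil(vector_bits / pipeline_bits)

print(micro_ops(256, 128))  # Zen-style: 256-bit AVX2 op cracked into 2 x 128-bit µops
print(micro_ops(256, 256))  # SL-X-style: one native 256-bit op
```

Two µops per instruction is the "half as productive per clock" figure, but it's also why the narrow units don't blow the core's power budget the way a native wide pipeline does.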
So Intel, to make sure the chip runs as fast as possible on regular x86 instructions, created an offset that we can now change. When the core is running plain x86 instructions, it runs at the speed you set. When it starts to process AVX code, it clocks down by whatever you have the offset set to. This means that if you can run a burn-in test without AVX at 5GHz, but it always crashes at 4.6GHz or higher in an AVX burn-in test, you know that the AVX pipeline is the limit, and you can set an offset of -500MHz. You will then run at 5GHz on x86 code, and as soon as a core starts executing AVX code, it will clock down to 4.5GHz.
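The offset arithmetic itself is trivial; as a sketch, assuming the offset is entered in BIOS as multiplier "bins" of a 100 MHz base clock (common on these boards, but check your firmware):

```python
# Sketch of the AVX offset math: effective AVX clock given an all-core
# target and a negative offset expressed in 100 MHz multiplier bins.
def avx_clock_mhz(all_core_mhz: int, offset_bins: int, bclk_mhz: int = 100) -> int:
    return all_core_mhz - offset_bins * bclk_mhz

print(avx_clock_mhz(5000, 5))  # 5.0 GHz with an AVX offset of 5 -> 4500 MHz
```

So the 5GHz/-500MHz example from above is simply an offset of 5 bins.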