Core benchmarks like Geekbench scale almost linearly over a large frequency range. Only at the top frequencies may you observe some non-linear behavior due to limitations in the memory subsystem.
I am hardly a Geekbench expert, but I know x86 uarchs quite well. That doesn't really align. On an int workload:
you're not DRAM throughput bound.
you're not L1-L2-L3 clock limited (L3 runs at the fastest core speed)
you're not going to be any more limited by outstanding memory requests (cache subsystem runs at core clock)
you're not any more contention limited (only limited reads/writes out of a CCX per cycle)
Zen has a really large OoOE window
The only thing is IF latency, and that disappears as a measurable bottleneck at around 2933 MHz DDR memory speed.
So what is the bottleneck?
As such, you are over-estimating performance numbers heavily when assuming an integer score for Ryzen of 3500@2.9GHz. In fact, even if I go with your metric that the score only degrades by 222 points every 600MHz (which, as I mentioned, is unrealistic at lower clocks), the estimate would be 3651-222=3429@2.9GHz.
Yet at times it scales just like that, when other CPU benchmarks show more linear performance. That said, I haven't looked at the sub-tests.
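For anyone who wants to play with the numbers, here is a rough sketch of the two ways of extrapolating being argued about: the fixed "222 points per 600MHz" step versus plain linear scaling with clock. The function names are made up for illustration, and the 3.5GHz reference clock for the 3651 score is my assumption (2.9GHz plus one 600MHz step); it is not stated anywhere above.

```python
def fixed_step_estimate(score, ref_ghz, target_ghz,
                        drop_per_step=222.0, step_ghz=0.6):
    """Score drops by a fixed number of points for every step_ghz of clock lost."""
    steps = (ref_ghz - target_ghz) / step_ghz
    return score - drop_per_step * steps


def linear_estimate(score, ref_ghz, target_ghz):
    """Score scales in direct proportion to clock frequency."""
    return score * target_ghz / ref_ghz


if __name__ == "__main__":
    ref_score = 3651.0  # integer score quoted in the thread
    ref_ghz = 3.5       # ASSUMPTION: inferred as 2.9 GHz + 0.6 GHz
    target_ghz = 2.9

    print("fixed step:", round(fixed_step_estimate(ref_score, ref_ghz, target_ghz)))
    print("linear:    ", round(linear_estimate(ref_score, ref_ghz, target_ghz)))
```

The gap between the two outputs is basically the whole disagreement; neither model says anything about which sub-tests are responsible.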
My estimate would be something around 3100-3200 for Ryzen@2.9GHz - certainly not in the same ballpark as Exynos.
https://browser.geekbench.com/v4/cpu/search?dir=desc&q=AMD+Tambourine&sort=score
What does that do for you?
Besides, an x86-based design achieving similar efficiency to an ARM-based design in the same performance class, assuming both design teams know what they are doing, is close to impossible. That's essentially what was also mentioned by Jim Keller when he was working on Ryzen.
No, he didn't say anything like that, and Michael Clark didn't say anything like that either.
Jim said that for the same number of transistors we can have about a 10% bigger OoOE engine with ARM than with x86. Michael Clark said we can deliver Zen level of performance regardless of ISA.
If you search RWT you will find the wars about ISA covered very well. I would summarize the issue like this: at 4-wide decode x86 spends more transistors on the front end, but it doesn't cost you power; uop caches help with that limit and save power; going over 4-wide is a big problem. The ARM ISA has some nicer load operations.
At that point you're done. Everything else, like weak vs strong memory ordering, is just different trade-offs for different workloads.
I cannot follow this calculation. What absolute voltages are you assuming?
I'm not assuming a specific voltage; the voltage is dynamic, with guardbands etc. What I am talking about is relative change. The Stilt's data is hard-set minimums; I am suggesting that under normal operation a Zen core @2.9 will be using 200mV less than a core @3.6 (2200U boost). I am then using the per-core power data from AnandTech's review to give an estimate of per-core power usage @2.9.
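To show the kind of relative calculation I mean, here is a minimal sketch using the usual first-order rule that dynamic power goes with frequency times voltage squared. It ignores leakage and uncore power entirely. The 1.2V baseline in the example is a purely hypothetical placeholder, not a measured value, and the function name is made up.

```python
def relative_dynamic_power(f_low_ghz, f_high_ghz, v_high, delta_v):
    """Ratio P_low / P_high under the first-order model P ~ f * V^2.

    v_high  -- core voltage at the higher clock (HYPOTHETICAL input)
    delta_v -- how much lower the voltage is at the lower clock
    """
    v_low = v_high - delta_v
    return (f_low_ghz / f_high_ghz) * (v_low / v_high) ** 2


if __name__ == "__main__":
    # 2.9 GHz vs 3.6 GHz with a 200 mV drop, as described above.
    # ASSUMPTION: 1.2 V at 3.6 GHz is an illustrative placeholder only.
    ratio = relative_dynamic_power(2.9, 3.6, v_high=1.2, delta_v=0.2)
    print(f"core power at 2.9 GHz is about {ratio:.0%} of the 3.6 GHz figure")
```

Multiply that ratio by whatever per-core power figure you trust at 3.6 to get a ballpark for 2.9. The result is quite sensitive to the baseline voltage you plug in, which is why I only talk about relative change.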
At this point I'm more than happy to blame Microsoft's compiler.