You have to remember that Intel's max power consumption has to be measured on AVX/AVX2 workloads, as they have 256-bit FP units. Most common workloads like gaming or productivity rarely (if at all) use the 256-bit FP units. AMD's Zen has only 128-bit, not 256-bit, FP units. Intel will have higher performance than Zen in AVX/AVX2 workloads, but at a higher power cost. If we measure power consumption in AVX/AVX2 workloads, we are likely to see Intel get closer to its rated TDP.
Argh, this bugs me so much (it's important to get the semantics right, dammit!)
It's not AVX workloads, it's 256-bit ops, any 256-bit ops. AVX and AVX2 code can both target 128-bit vectors if the developer wants, for example to get the extra instructions in AVX/AVX2 vs SSE, or because their console code is tuned for 3-operand 128-bit AVX, etc. (First thing that bugs me out of the way.)
Also, it's not really the width of the units that's the limiting factor for Zen, because it has more FP units than Skylake. It's the load/store bandwidth in and out of the cores. Zen has 256 bits of load and 128 bits of store per cycle vs 512/256 for Haswell and later.
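To put rough numbers on that, here is a back-of-the-envelope sketch using only the load/store widths quoted above. The function name `min_cycles` and the example kernel (a[i] = b[i] + c[i], i.e. two loads and one store per vector) are my own illustration, not a benchmark. It only looks at raw bits per cycle for 256-bit vectors and ignores everything else (caches, port counts, clocks), so it says nothing useful about narrower vectors, where the number of load/store ports is the limit rather than the width.

```python
# Per-cycle load/store widths in bits, as quoted in the post above.
WIDTHS = {
    "Zen": {"load": 256, "store": 128},
    "Haswell+": {"load": 512, "store": 256},
}

def min_cycles(core, loads, stores, vector_bits=256):
    """Lower bound on cycles per iteration from load/store width alone.

    Hypothetical helper: assumes the kernel is limited only by how many
    bits per cycle can move in and out of the core.
    """
    w = WIDTHS[core]
    load_cycles = loads * vector_bits / w["load"]
    store_cycles = stores * vector_bits / w["store"]
    return max(load_cycles, store_cycles)

# a[i] = b[i] + c[i] on 256-bit vectors: 2 loads, 1 store per iteration.
for core in WIDTHS:
    print(core, min_cycles(core, loads=2, stores=1))
```

Under those assumptions Zen needs at least two cycles per 256-bit iteration of that kernel and Haswell and later need one, regardless of how many FP units sit behind the load/store path.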
The point being, AMD's AVX and AVX2 performance is fine; there isn't some magical thing making AMD crap at those instruction sets (BD and PD had real 256-bit instruction issues). It's that Intel has an advantage on anything that's a 256-bit operation. At the same time, AMD/Zen has an advantage on 128-bit operations because it has more units.
If I were AMD I wouldn't go chasing 256- or 512-bit AVX performance or SMT4*; I would use the massive die area and power budget those things cost to increase clocks and IPC. If you look at AMD's GPUs (or NV's), they are becoming much better at being CPU-like. The more flexible and general GPU compute capacity becomes, the more a 512-bit CPU becomes a jack of all trades, master of none. If you have a master of both, you can just eat them from both sides.
The general server base doesn't care about really wide vectors and, thanks to Intel's own segmentation, neither does the consumer market.
*Those rumors from Fottemberg aren't worth the bits in the database they are stored on, unless AMD plans on basically copying a POWER9-style methodology for core design, which is really a more unified version of CMT.