First, my take. Intel needed a way to be competitive with AMD in multicore benchmarks and performance scenarios but was limited on space and power by foundry limitations, so it created E-cores.
AMD wanted more cores with less power, so they reduced the clock speed to fit more cores in a small space, along with a few other changes, but essentially kept all the same capabilities, like AVX-512.
My opinion is that AMD's solution is much easier to implement and much more sane and capable in today's world. Please give your thoughts; this discussion is more about methods than companies.
Wow. I am really late to this thread, but I love the subject.
In order to achieve higher PERFORMANCE (IPC x Clock) you need a design that has a longer pipeline (to avoid misalignment at the end of each stage in ILP execution in superscalar designs), which makes the core bigger. Additionally, the type of transistor you use to achieve high clock speeds is different from the type you use to achieve power efficiency. Many of us remember the good ole Netburst years, when Intel got out over their skis trying to get the clock speeds up at the expense of IPC and power ... so there are definitely limits to this approach (and I think we have pretty much settled in now).
Ideally, you would have a longer, more complex P core that did this, with a shorter-pipeline core for the E core. I agree with others here that keeping the instruction set compatible between the two should DEFINITELY have been a design decision at Intel.
I think the above logic is how Intel got where it is today ... and I think they are wrong.
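To put rough numbers on the IPC x Clock tradeoff, here is a minimal Python sketch. The two designs and their IPC and clock figures are made up for illustration; they are not measurements of Netburst or any other real core.

```python
def performance(ipc: float, clock_ghz: float) -> float:
    """Instructions retired per nanosecond: IPC x clock (in GHz)."""
    return ipc * clock_ghz

# Hypothetical designs, illustrative numbers only.
deep_pipeline = performance(ipc=1.0, clock_ghz=3.8)  # chases clock speed
wide_core = performance(ipc=2.0, clock_ghz=2.4)      # chases IPC

print(f"deep pipeline: {deep_pipeline:.1f} instructions/ns")
print(f"wide core:     {wide_core:.1f} instructions/ns")
```

With these assumed numbers the lower-clocked design comes out ahead, which is the kind of outcome the Netburst comparison points at: clock speed gained by giving up IPC is not automatically a win.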
First, in order to compete in DC (data center), you have to have high performance per CORE. Many DC software packages are licensed PER CORE: not per thread, and not by some performance metric ... per CORE. These software licenses usually recur on a yearly basis, so the pain never goes away. This makes the cost of the hardware a minor portion of the total cost of ownership.
In a DC product, performance is limited by the power you can draw from a single socket. Turin's socket power limit is 700W, I believe. This brings up the 2nd big design decision that must be made. In DC, performance per watt is very important, since the design will ultimately be power bound (you can always just tack on more cores in a socket).
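A quick back-of-the-envelope sketch of the per-core licensing argument, in Python. Every figure here (hardware prices, license fee, per-core performance, throughput target) is a hypothetical placeholder, not real vendor pricing.

```python
import math

def cores_needed(target_throughput: float, perf_per_core: float) -> int:
    """Whole cores required to reach a throughput target."""
    return math.ceil(target_throughput / perf_per_core)

def total_cost(hardware_cost: float, cores: int,
               license_per_core_per_year: float, years: int) -> float:
    """Hardware is paid once; per-core licenses recur every year."""
    return hardware_cost + cores * license_per_core_per_year * years

TARGET = 640      # arbitrary throughput units (assumed)
LICENSE = 2000    # dollars per core per year (assumed)
YEARS = 3

# Fast cores: pricier hardware, fewer cores to license.
fast_cores = cores_needed(TARGET, perf_per_core=10)
fast_tco = total_cost(12_000, fast_cores, LICENSE, YEARS)

# Slow cores: cheaper hardware, twice as many cores to license.
slow_cores = cores_needed(TARGET, perf_per_core=5)
slow_tco = total_cost(8_000, slow_cores, LICENSE, YEARS)

print(f"fast: {fast_cores} cores, TCO ${fast_tco:,.0f}")
print(f"slow: {slow_cores} cores, TCO ${slow_tco:,.0f}")
print(f"hardware share of fast TCO: {12_000 / fast_tco:.1%}")
```

Under these assumptions the licenses dwarf the hardware, so the cheaper chip with slower cores ends up costing far more over three years.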
More advanced processors are superscalar (they can execute many instructions in parallel). To maximize peak performance, the number of execution units must be much bigger than the average need for execution units. This leaves a lot of performance just sitting around in an MT load. To maximize performance per core, performance per watt, and performance per area (the latter being a lesser concern in DC), you need SMT ... but SMT greatly complicates the core design from the load to the dispatch and everything in between. Still, it pays great dividends for the reasons I mentioned above AND the fact that in MT loads, SMT gets you 40% more performance (for AMD) for ~15% more transistors and a trivial amount of power (it is nearly free in power, I think).
Now, I would argue that if you have all these features in your E core (which, from my arguments above, you really do need), is it really an E core ... or is it just a thinned-down P core? This is what AMD has implemented.
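Here is the SMT arithmetic spelled out as a small Python sketch. The 40% and 15% figures are the ones quoted above; the 5% power cost is my own placeholder for "nearly free", and the comparison assumes MT performance would otherwise scale linearly with transistors spent on extra cores.

```python
smt_perf_gain = 0.40        # from the post: ~40% more MT performance (AMD)
smt_transistor_cost = 0.15  # from the post: ~15% more transistors
smt_power_cost = 0.05       # ASSUMPTION: stand-in for "nearly free" power

# Relative to the same core with SMT off (baseline = 1.0).
perf_per_transistor = (1 + smt_perf_gain) / (1 + smt_transistor_cost)
perf_per_watt = (1 + smt_perf_gain) / (1 + smt_power_cost)

# Alternative use of the same transistors: 15% more cores, which under
# the linear-scaling assumption gives at best 15% more MT performance.
more_cores_perf = 1 + smt_transistor_cost

print(f"perf per transistor with SMT: {perf_per_transistor:.2f}x")
print(f"perf per watt with SMT:       {perf_per_watt:.2f}x")
print(f"SMT vs. spending it on cores: {1 + smt_perf_gain:.2f}x vs {more_cores_perf:.2f}x")
```

So with these inputs SMT is worth roughly 1.22x in performance per transistor, and more than that in performance per watt if the power cost really is small.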
Now, I do believe there are some loads where a max core count makes sense, even with each core having lower peak performance ... thus Turin D. In these cases (where you can avoid the huge cost of annual per-core licensing) you can consider a much smaller and simpler core design ... then just pack a metric crap ton of them on a socket ... BUT you are still limited by the socket power, so these cores MUST still be very high performance per watt or it doesn't make sense.
Everyone here marvels at Skymont's performance per area; however, I am really curious to see how its performance per watt compares to Zen 5c ... because in DC, who cares how big the die is? They are more than willing to pay for it.
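To show why the small core still has to win on performance per watt, here is a rough Python sketch of a power-bound socket. It uses the 700W figure I guessed at earlier; the per-core watts and performance numbers are invented, and it ignores uncore, I/O, and memory power entirely.

```python
def socket_throughput(socket_power_w: float, watts_per_core: float,
                      perf_per_core: float) -> tuple[int, float]:
    """Cores that fit in the power budget, and their combined throughput."""
    cores = int(socket_power_w // watts_per_core)
    return cores, cores * perf_per_core

SOCKET_W = 700  # socket power limit quoted in the post (unverified)

# Hypothetical cores: (watts per core, performance per core).
designs = {
    "big core":             (5.0, 10.0),  # 2.0 perf/W
    "small, inefficient":   (2.5, 4.0),   # 1.6 perf/W
    "small, efficient":     (2.5, 6.0),   # 2.4 perf/W
}

results = {}
for name, (watts, perf) in designs.items():
    cores, total = socket_throughput(SOCKET_W, watts, perf)
    results[name] = (cores, total)
    print(f"{name:20s} {cores:4d} cores, throughput {total:.0f}")
```

With these made-up numbers, doubling the core count only pays off for the small core that also beats the big core on performance per watt; the inefficient one packs in more cores and still loses on total socket throughput.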
Finally, why have I fixated on DC? The simple answer is that this is where the highest growth and highest margins are (product management 101).
Can you just have different design teams for each market segment for processors? Sure, if you are the US government and have no need of profit. If you are in free market competition, it isn't enough to have the best product in every space if the cost of making the product exceeds the market price (and you thus go bankrupt).
There you are. My 20 cents (I wrote too much to consider it 2 cents' worth).