That's an interesting proposition, and I agree with it. Essentially, the set of use cases where increased CPU performance shows a benefit is getting smaller. What we see is diminishing returns from CPU performance in an ever-increasing number of usage scenarios.
Another point I would like to throw into the discussion is efficiency. An architecture that is potentially faster only because it throws immense transistor budgets at the problem is doomed to fail, because today we already see thermal and power densities as the ultimate limit on performance. Therefore I am more than sceptical about the prospects of dynarec etc. as a way to achieve x86 compatibility at the ISA level, because energy per instruction will go up. For the very same reasons I consider x86 a showstopper going forward.
I see potential solutions to this problem:
Break the CPU die (for an 8-core design like this on 14nm FinFET: ~600-900mm^2) up into much smaller dies. These smaller dies implement the same logic but are much cheaper to manufacture and have a much higher yield. Perhaps even move a lot of the DDR5 memory controller logic off the CPU and onto the motherboard/RAM: this logic will force-feed the CPU cache with data.
An external branch predictor/memory controller design??? (Atomicity and cache coherency will be major problems)
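To put a rough number on the yield argument for splitting the die, here is a minimal sketch using the simple Poisson yield model Y = exp(-A * D0). The defect density D0 = 0.2 defects/cm^2, the 800mm^2 midpoint of the range above, and the even eight-way split are all my own illustrative assumptions, not figures from any foundry.

```python
import math

def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of good dies under the simple Poisson model Y = exp(-A * D0)."""
    area_cm2 = area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    return math.exp(-area_cm2 * defects_per_cm2)

D0 = 0.2  # ASSUMED defect density in defects/cm^2, illustrative only

monolithic = poisson_yield(800.0, D0)  # one ~800 mm^2 die (assumed midpoint)
small_die = poisson_yield(100.0, D0)   # one of eight ~100 mm^2 dies (assumed split)

print(f"800 mm^2 die yield: {monolithic:.1%}")
print(f"100 mm^2 die yield: {small_die:.1%}")
```

With these assumed numbers the big die yields roughly 20% and each small die roughly 82%. A package still needs eight good small dies, but they can be tested and picked before assembly, so one defect costs 100mm^2 of silicon instead of 800mm^2.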
Do initial low-volume manufacture on small 200mm wafers to lower costs. Use a TwinScan 200mm with an ArF laser or, better yet, electron beam etchers and precision ion dopers.
Probe with an Electroglas 4090u+, TEL P-8XL, Accretech UF200, Cascade 200mm automatic or the various manual Micromanipulator, Teradyne, SPEA, Delta and Cascade probers at 200mm/8" or greater. Sell these early CPUs (with large registered ECC supporting motherboards, LGA 4600 or so pin sockets) to large server vendors in an aggressive bidding process for $500,000 apiece or more.
Then, get GlobalFoundries, STMicroelectronics, TSMC and/or UMC to fabricate the chips at high volume on 300mm wafers. Within a few years, prices drop to $300 or less for a top-end model. Mid-end models will be PGA 2400 or so (easier to repair pin damage vs LGA socket damage).
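For a feel of what the 200mm vs 300mm choice means in die count, here is a rough sketch using the common gross-dies-per-wafer approximation (wafer area over die area, minus an edge-loss term). The 800mm^2 and 100mm^2 die sizes are my assumptions taken from the range mentioned earlier. Scribe lines, edge exclusion and yield are ignored, so treat the results as upper bounds.

```python
import math

def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate die count: pi*(d/2)^2 / A  -  pi*d / sqrt(2*A)."""
    area_term = math.pi * (wafer_diameter_mm / 2.0) ** 2 / die_area_mm2
    edge_term = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return max(0, math.floor(area_term - edge_term))

# ASSUMED die sizes: one monolithic ~800 mm^2 die vs. a ~100 mm^2 small die
for wafer_mm in (200, 300):
    for die_mm2 in (800.0, 100.0):
        count = gross_dies_per_wafer(wafer_mm, die_mm2)
        print(f"{wafer_mm}mm wafer, {die_mm2:.0f} mm^2 die: ~{count} gross dies")
```

The point of the comparison is that a huge die gets only a couple of dozen candidates per 200mm wafer before yield is even considered, which is part of why the early parts would have to sell at server-vendor prices.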
CPUID maybe:
CertifiedCBR, CertifiedCTN, VIA VIA VIA , CentaurHauls, AuthenticAMD, ST ST ST ST .
801486-class or Am1486-class
8 core(s), 16 thread(s)
Model 0 Family 0 Stepping C0
Extensions supported:
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VIA 64 (or AMD64 for AMD), VT-x, AES, AVX, AVX2, AVX-512, CVT16, FMA3, FMA4
(more details in that 6502.org thread at the beginning of this post).
And at 1.4 GHz, would the power draw really be so bad? The Elbrus 2000 itself, with its 300 MHz clock speed (and sadly rather poor memory controller, cache, and large die size), only drew roughly 6 W of power.
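As a back-of-envelope check on that question, dynamic power scales roughly as P ~ C * V^2 * f. The sketch below scales the ~6 W at 300 MHz figure up to 1.4 GHz. It assumes the same switched capacitance and ignores leakage and any process shrink, and the `v_ratio` parameter is a hypothetical knob I added for the voltage change, so this is an estimate under stated assumptions and not a prediction.

```python
def scaled_dynamic_power(p_ref_w: float, f_ref_mhz: float, f_new_mhz: float,
                         v_ratio: float = 1.0) -> float:
    """Scale dynamic power by frequency and by the square of the voltage ratio.

    Assumes P ~ C * V^2 * f with unchanged capacitance C; leakage is ignored.
    """
    return p_ref_w * (f_new_mhz / f_ref_mhz) * v_ratio ** 2

# Reference point from the post: roughly 6 W at 300 MHz
same_voltage = scaled_dynamic_power(6.0, 300.0, 1400.0)
lower_voltage = scaled_dynamic_power(6.0, 300.0, 1400.0, v_ratio=0.8)  # ASSUMED 20% lower Vdd

print(f"1.4 GHz at same voltage: {same_voltage:.1f} W")
print(f"1.4 GHz at 0.8x voltage: {lower_voltage:.1f} W")
```

At unchanged voltage this comes to about 28 W for a single Elbrus-2000-like core, so the answer depends heavily on how many such cores are on the package and how far the supply voltage can be dropped.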
Aggressively push into photonics (OptiCORDIC??? on glass wafers/glass-doped silicon wafers/vacuum or argon cavity light chambers, with inter-core waveguides for light transmission) and room-temperature superconductors for 801586 and 801686 designs. 801586 will be 100x or more faster than the 1486 at AVX workloads or even others!
Intel goes bankrupt in the mid-2020s as investors desert them due to long lead times, poor yield of large dies and slow movement against external foundries and fabless/labless chipmakers.