There are actual metrics of the node itself, measured the same way across foundries, that you could use to compare Intel 4 and TSMC's nodes.
Ironically enough, despite Intel 4 having ~15%(?) lower density than TSMC 5nm, the 512KB data array in RWC is smaller than the one in Zen 4. Same story with Zen 3 vs ADL for the L2 SRAM. Design stuff, I guess.
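To put some rough numbers on that: here is a back-of-envelope sketch (my own illustration; the ~0.024 µm² Intel 4 and ~0.021 µm² TSMC N5 high-density 6T bit-cell sizes are approximate public figures, not from this thread) counting only the data bits of a 512KB array, with none of the redundancy, ECC, decoders or sense amps where the real design differences live:

```c
/* Back-of-envelope: cell-limited area of a 512KB data array.
 * Bit-cell sizes are approximate public figures; only raw data cells
 * are counted -- no redundancy, ECC, peripheral circuitry or banking. */
#include <stdio.h>

int main(void)
{
    const double intel4_cell_um2 = 0.024;   /* reported Intel 4 HD 6T cell, approx. */
    const double n5_cell_um2     = 0.021;   /* reported TSMC N5 HD 6T cell, approx. */
    const double bits = 512.0 * 1024 * 8;   /* 512KB of data bits, no ECC */

    printf("Intel 4, cells only: %.3f mm^2\n", bits * intel4_cell_um2 / 1e6);
    printf("TSMC N5, cells only: %.3f mm^2\n", bits * n5_cell_um2 / 1e6);
    printf("Cell-area gap: ~%.0f%%\n", 100.0 * (intel4_cell_um2 / n5_cell_um2 - 1.0));
    return 0;
}
```

The cells-only gap comes out around 14%, yet the shipping arrays can still end up the other way around, which is exactly the "design stuff" part.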
Integer performance: Integer performance is very difficult to increase, and improvement in Integer performance indicates the "quality" of the core. It is a complex combination of latency, frequency, balance of execution units, and branch prediction (and pipeline length), across all areas. You cannot make the L1 cache too big, as that increases latency, but it can't be too small either. You cannot just add branch targets either: latency comes into play, and so does the quality of the prediction algorithm. Decoders won't scale without beefing up the rest of the machine.
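To make the cache size/latency trade-off concrete, here is a minimal pointer-chase sketch (my own illustration, assuming Linux with gcc/glibc; the sizes and iteration count are arbitrary choices). Every load depends on the previous one, so the time per load roughly tracks the load-to-use latency of whichever cache level the working set fits in, and the latency cliff past L1 (and then L2) shows up immediately:

```c
/* Pointer-chase latency sketch (Linux/gcc assumed). Walks a randomly
 * permuted cycle of pointers so every load depends on the previous one;
 * average time per load approximates the latency of the cache level
 * that the working set fits in. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double chase_ns(size_t bytes, size_t iters)
{
    size_t n = bytes / sizeof(void *);
    void **buf = malloc(n * sizeof(void *));
    size_t *idx = malloc(n * sizeof(size_t));
    if (!buf || !idx) return -1.0;

    for (size_t i = 0; i < n; i++) idx[i] = i;
    for (size_t i = n - 1; i > 0; i--) {            /* Fisher-Yates shuffle */
        size_t j = rand() % (i + 1);
        size_t t = idx[i]; idx[i] = idx[j]; idx[j] = t;
    }
    for (size_t i = 0; i < n; i++)                  /* link cells into one big cycle */
        buf[idx[i]] = &buf[idx[(i + 1) % n]];

    void **p = &buf[idx[0]];
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < iters; i++)
        p = (void **)*p;                            /* serially dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    volatile void *sink = p; (void)sink;            /* keep the chase from being optimized out */
    free(buf); free(idx);
    return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}

int main(void)
{
    /* Sweep working sets from well inside L1 to well past a typical L2. */
    for (size_t kb = 16; kb <= 4096; kb *= 2)
        printf("%6zu KB : %.2f ns/load\n", kb, chase_ns(kb * 1024, 20 * 1000 * 1000));
    return 0;
}
```

Make the L1 bigger on paper and you either eat extra cycles of load-to-use latency on every hit or give up frequency; this little loop is exactly the kind of code that punishes either choice.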
How do you increase the top speed of a car from 200km/h to 300km/h? Just double the engine displacement and horsepower? No. Aerodynamics needs to be greatly improved, since wind resistance is the limit at high speeds. You need the transmission to keep up so it does not fail, and to shift fast and seamlessly. And you need to do all of that while making the car lighter. You also need a capable driver, because otherwise an accident might happen. Not a simple problem at all.
The 787 achieved a ~20% fuel-burn reduction by moving to composite materials rather than just aircraft aluminum for the airframe. Then they had to move the batteries to lighter lithium-ion technology. The engines were replaced with bigger, slower-turning, higher-bypass turbofans. And they had to do aerodynamic work using CFD as well.
That's why Integer performance is the most important aspect of a general-purpose CPU. It is said that to chase a 1% improvement in performance, CPU architects do what can only be called "heroic" work. It's amazing what they are doing now. The work pays off, though. Improving Integer performance, i.e. the uarch, benefits everything: deep learning, floating point, word processing, gaming, emulation, snappiness.
That's why it's absurd when mega-corporations mistreat employees, especially veteran ones. These kinds of decisions require very seasoned, experienced architects with 30+ years in the job. That's an entire lifetime doing nothing but being a CPU architect, and being at the top of the field at that!
Density: The L2 array is not the LLC (Last Level Cache) anymore; it is partway a core cache. It has more stringent voltage and power requirements, so the cells used aren't exactly the bog-standard ones used for L3.
They could have improved the density aspect relative to the competition in Redwood Cove, making it more favorable against chips such as Zen 4. This is me speculating based on what you said, though.
Also, there is a fundamental limit. A company that does a better job of scaling down will reach the limits faster than those that don't. You could argue the limits "democratize" compute and make the latest technologies available to smaller companies and those with fewer resources.
That's why DRAM has been on the 10nm-class node (10-19nm) for 5 years now: 10x, 10y, 10z, 10a, 10b, and they even talk about 10γ! They will reach a decade on it, since Micron just announced 10b (1β, "one-beta") availability as the most advanced node. 10x = 17-18nm depending on the manufacturer.
Very much a possibility the designations are as follows:
10x = 17-18nm
10y = 16-17nm
10z = 15-16nm
10a = 14-15nm
10b = 13-14nm
10γ = 12-13nm
1nm improvement per "generation".
Why? Because DRAM at the 10nm class is far, far denser than DRAM built on a logic process (3x the density of eDRAM), and that in turn is denser than logic cells. TSMC is showing almost no SRAM gains on the N3 node, meaning even SRAM is hitting hard limits. Logic isn't hitting them yet because it's less dense.
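For a rough sense of the gap, here's a back-of-envelope sketch (my own numbers, all approximate: the textbook 6F² DRAM cell with F taken as ~17nm for a 10x-class node, plus reported TSMC N5/N3 high-density SRAM bit cells), ignoring all peripheral and overhead area:

```c
/* Rough bit-density comparison: commodity DRAM cell vs. SRAM on a logic node.
 * All figures approximate/public; no peripheral circuitry or overhead counted. */
#include <stdio.h>

int main(void)
{
    const double F_nm = 17.0;                               /* assumed 10x-class feature size */
    const double dram_cell_um2 = 6.0 * F_nm * F_nm / 1e6;   /* textbook 6F^2 1T1C cell */
    const double sram_n5_cell_um2 = 0.021;                  /* reported TSMC N5 HD 6T cell */
    const double sram_n3_cell_um2 = 0.0199;                 /* reported TSMC N3 HD 6T cell */

    /* one bit per cell; bits per um^2 equals Mbit per mm^2 */
    printf("DRAM (10x-class): ~%.0f Mbit/mm^2\n", 1.0 / dram_cell_um2);
    printf("SRAM on N5:       ~%.0f Mbit/mm^2\n", 1.0 / sram_n5_cell_um2);
    printf("SRAM on N3:       ~%.0f Mbit/mm^2 (barely moved vs N5)\n", 1.0 / sram_n3_cell_um2);
    return 0;
}
```

Even with generous allowances for overhead, commodity DRAM sits an order of magnitude above anything built out of SRAM on a logic node, which is why it ran into the scaling wall first; SRAM is next, and logic still has some headroom.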