I think it would be quite a bit larger than that die size, and even with those specs there is a good chance it would lose to full AD102.
Just look at Navi 31 vs AD103 at the moment.
12288(6144) vs 9728(4864) Cores 26.3% advantage.
384 vs 304 TMU 26.3% advantage
192 vs 112 ROPS 71.4% advantage
960 vs 716GB/sec 34.1% advantage.
You would think Navi 31 would smash AD103 with that advantage, yet there is only a 2-5% advantage for Navi 31 vs AD103 in raster. Compare your imaginary N30 die with full AD102 and you will see the advantages for N30 are smaller than those of Navi 31 vs AD103 by quite a bit.
20480(10240) vs 18432(9216) Cores 11% advantage
640 vs 576 TMU 11% advantage
256 vs 192 ROPs 33% advantage
1280 vs 1084GB/sec 18% advantage.
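The percentages in the two lists above are simple ratios, so they are easy to recompute. Here is a minimal Python sketch, assuming the figures exactly as quoted in this post; the names `advantage_pct` and `compare` are just illustrative helpers.

```python
def advantage_pct(a, b):
    """Percent by which spec a exceeds spec b."""
    return (a / b - 1.0) * 100.0

# (AMD, Nvidia) figures as quoted in the post above
n31_vs_ad103 = {
    "Cores": (12288, 9728),
    "TMUs": (384, 304),
    "ROPs": (192, 112),
    "Bandwidth GB/s": (960, 716),
}
# Hypothetical N30 vs full AD102, also as quoted above
n30_vs_ad102 = {
    "Cores": (20480, 18432),
    "TMUs": (640, 576),
    "ROPs": (256, 192),
    "Bandwidth GB/s": (1280, 1084),
}

def compare(specs):
    """Return the percent advantage for every spec, rounded to one decimal."""
    return {name: round(advantage_pct(a, b), 1) for name, (a, b) in specs.items()}

for title, specs in (("N31 vs AD103", n31_vs_ad103), ("N30 vs AD102", n30_vs_ad102)):
    print(title)
    for name, pct in compare(specs).items():
        print(f"  {name}: +{pct}%")
```

Run side by side, the on-paper gap for the hypothetical N30 is smaller than Navi 31's gap over AD103 on every row.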
Why do you think it would be a lot larger than 428mm2? That is already 42% bigger than the N31 GCD.
12288(6144) vs 9728(4864) Cores 26.3% advantage is really only on paper.
The shader count increased by only 20% (96CU vs 80CU). The shaders are now capable of dual-issue, but honestly it does very little for performance.
N31 vs N21:
18% higher median clockspeed from TPU reviews (1, 2), 20% more shaders and 49% higher performance.
I got about 5% for that dual-issue (1.49/1.18/1.2=1.05).
On the other hand, moving from Turing to Ampere provided ~25% more performance (RTX 2080 vs RTX 3070).
So in reality it is 6144*1.05 vs 4864*1.25, about a 6% difference.
If shaders are the bottleneck, then it doesn't really matter how many more TMUs or ROPs or how much more BW N31 has over AD103; it won't increase performance, in my opinion.
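The estimate above can be written out step by step. This is only a sketch of the same back-of-the-envelope reasoning, assuming performance scales as the product of clock, shader count and a per-shader factor; `residual_gain` is a made-up helper name, and 4864 is simply 9728 / 2.

```python
def residual_gain(total_speedup, *known_factors):
    """Divide a measured speedup by the factors already accounted for."""
    gain = total_speedup
    for factor in known_factors:
        gain /= factor
    return gain

# N31 vs N21: 49% faster overall, 18% higher clocks, 20% more shaders
dual_issue_gain = residual_gain(1.49, 1.18, 1.20)

# Turing -> Ampere per-shader gain quoted in the post (RTX 2080 vs RTX 3070)
ampere_gain = 1.25

# "Real" shader counts with the doubled FP32 figure halved
n31_effective = 6144 * dual_issue_gain
ad103_effective = (9728 / 2) * ampere_gain

difference_pct = (n31_effective / ad103_effective - 1.0) * 100.0

print(f"dual-issue gain: {dual_issue_gain:.3f}")
print(f"N31 effective shaders:   {n31_effective:.0f}")
print(f"AD103 effective shaders: {ad103_effective:.0f}")
print(f"N31 advantage: {difference_pct:.1f}%")
```

The result lands at roughly 6%, which is much closer to the 2-5% raster gap seen in reviews than the on-paper core advantage.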
With this in mind, there is nothing to suggest your imaginary N30 would beat full AD102. If your only evidence is the poor scaling of AD103 vs AD102, there is nothing to suggest that Navi 30 would scale well that high, particularly since one of the biggest bottlenecks (bandwidth) would only increase 33% vs Navi 31. Also, as I mentioned, the RTX 4090 only has 12.5% more L2 cache than an RTX 4080. This, along with the modest increase in bandwidth vs last gen, is likely a bottleneck which full AD102 with 33% more cache won't have. Add in the reduction of clocks needed to accommodate that huge increase in specs for N30, plus the problems Navi 31 already has getting worse on a more complex chip, and I think you would likely get something that combines some of the worst parts of Vega and Fermi, and ultimately a loss.
This imaginary N30 wouldn't be limited by BW; both IC and BW would increase by 33% compared to N31.
You think that's not enough?
Imaginary 160CU N30 vs 80CU RX 6900XT would have a more capable IC of the same size and BW would be 2.5x higher. I think that's more than enough.
If the RTX 4090 had a memory bottleneck as you think, then it wouldn't make much sense for it to use slower GDDR6X chips than the RTX 4080 or to have only 12.5% more L2, in my opinion.
Even if there is a memory bottleneck, I don't think full AD102's performance would be more than 20% higher than the RTX 4090's, but in that case it should still be faster than N30. If the scaling is bad for N30, then I see N30 somewhere in the middle between the RTX 4090 and full AD102.
Only CUs saw a huge increase in specs (+66.7%) in my imaginary N30.
ROPs, BW and IC were increased by only 33%. I also reduced clocks by 10%, and high clocks are precisely a big problem for N31, so this will actually help.
RTX 4090 having +68% SM(Cuda, TMU), +57% ROPs, +12.5% L2, +41% BW, +50% Vram compared to RTX 4080 resulted in only 41% higher TBP.
Imaginary N30 has +66.7% SM(Shaders, TMUs), +33.3% ROPs, +33.3% L2, +33.3% BW, +33.3% Vram and -10% clockspeed, so 500W TBP should be enough at worst.
Full AD102 should also have 500W TBP in my opinion.
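The RTX 4090 vs RTX 4080 deltas quoted above can be reproduced from raw specs. A minimal sketch follows; the absolute numbers are taken from public spec sheets and are an assumption here, since the post only gives the percentages.

```python
def delta_pct(new, old):
    """Percent increase of new over old."""
    return (new / old - 1.0) * 100.0

# (RTX 4090, RTX 4080) values; assumed from public spec sheets, not quoted in the post
rtx4090_vs_rtx4080 = {
    "SMs": (128, 76),
    "ROPs": (176, 112),
    "L2 MB": (72, 64),
    "Bandwidth GB/s": (1008.0, 716.8),
    "VRAM GB": (24, 16),
    "TBP W": (450, 320),
}

deltas = {name: delta_pct(new, old) for name, (new, old) in rtx4090_vs_rtx4080.items()}

for name, pct in deltas.items():
    print(f"{name}: +{pct:.1f}%")
```

The point of the comparison is that the TBP increase stays well below the SM increase, which is the basis for the 500W guess for N30.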
So why would this imaginary N30 be a flop combining some of the worst parts of Vega and Fermi?
The only real disadvantage I see is the performance and perf/W with RT enabled, where the difference would be larger than 15%, but this could be mitigated by the competitive price I mentioned before.