You are not considering other factors, such as the larger L0/L1 caches and other architectural improvements. It also depends a lot on where the bottleneck is. The comparison above, for example, treats an RDNA3 CU as equal to an RDNA2 CU, but the RDNA3 CU has dual issue, giving a performance increase of 17% per CU according to AMD (and that is probably calculated on final FPS, not simply on FLOPs). Also, clocks on the 7900XT are nominally lower than those of the 6950XT. So I don't see why a reasonably clocked (2.7-2.8GHz) 7800XT cannot be on par with or slightly above the 6950XT. My bet is on around 5-10% more (20-25% below the 7900XT).
Those improvements you mentioned are part of the 17.4% increase, and they are likely talking about WGPs, not CUs, or maybe a WGP sees a higher increase, but they haven't mentioned it yet.
I don't see a problem with adding that 17.4% increase to FLOPs. It looked about right for N31.
If you exclude architectural improvements of 17.4% and clocks, then N32 based on specs is ~1.5x of N22.
6950XT is 69% faster than 6700XT.
If you combine specs and architecture, 1.5 * 1.174 = 1.761, so N32 would be about 4% faster than the RX6950XT at only 2583MHz.
The problem is that increasing specs by 50% doesn't translate to 50% higher performance, so the question is how much performance you lose. The performance lost to scaling needs to be compensated by clocks.
If the loss is within 10% then yes, N32 at 2.8GHz could be a few % faster.
What I want to point out is that 2.8GHz is just the shader clock; the frontend will be clocked higher. Based on N31, it should be 3050MHz.
I would like to see 3GHz for shaders and 3250-3300MHz for frontend.
Then I can see it being 10% faster or a bit more.
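For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic above. It uses only the figures from this post (1.5x specs, 17.4% architecture, 6950XT = 1.69x 6700XT, 2583MHz baseline). The function name, the `scaling_loss` parameter and the assumption that performance scales linearly with shader clock are mine, and the frontend clock is not modeled at all, so treat it as a back-of-envelope estimate, not a prediction.

```python
# Back-of-envelope estimate of N32 vs RX6950XT using the figures from the post.
SPEC_RATIO = 1.5            # N32 vs N22 on paper specs
ARCH_GAIN = 1.174           # AMD's stated 17.4% uplift
RX6950XT_VS_6700XT = 1.69   # 6950XT is 69% faster than 6700XT
BASE_CLOCK_MHZ = 2583       # clock at which the spec comparison is made


def n32_vs_6950xt(shader_clock_mhz=BASE_CLOCK_MHZ, scaling_loss=0.0):
    """Return estimated N32 performance relative to the 6950XT (1.0 = equal).

    Assumptions (not from AMD): performance scales linearly with shader
    clock, and scaling_loss is the fraction of the theoretical gain that
    is lost because 50% more hardware does not give 50% more performance.
    """
    n32_vs_6700xt = SPEC_RATIO * ARCH_GAIN * (1.0 - scaling_loss)
    clock_factor = shader_clock_mhz / BASE_CLOCK_MHZ
    return n32_vs_6700xt * clock_factor / RX6950XT_VS_6700XT


for clock, loss in [(2583, 0.0), (2800, 0.10), (3000, 0.10)]:
    ratio = n32_vs_6950xt(clock, loss)
    print(f"{clock} MHz, {loss:.0%} scaling loss: {ratio - 1:+.1%} vs 6950XT")
```

With no scaling loss at 2583MHz this gives roughly +4%. With a 10% loss it comes out around +2% at 2.8GHz and around +9% at 3GHz, before any benefit from the higher frontend clock.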