I still don't get why people insist on comparing Polaris to GP104. It's the equivalent of expecting 7850/7870 to beat 670/680.
The latest rumors have GP104 with only 2560 shaders, the same count as Polaris 10 is said to possess. If these rumors/leaks are true (and one or both of them may well be false), then I think GP104 and Polaris 10 are going to be a lot closer than a raw analysis of the die size would indicate. If both GP104 and Polaris 10 have 2560 shaders, then any advantage for the Nvidia GPU would have to come from higher shader IPC, higher clocks, or software trickery of some kind. Maxwell leveraged all three to some extent to edge out GCN. But Polaris is supposed to represent a major leap forward for AMD; we should see substantially higher shader IPC (Silverforce11 indicated that AMD even has a patent on shader hyperthreading, though we don't know whether that will make it into Polaris). We will also be seeing substantially higher clocks because of the FinFET process combined with improved power management capabilities; that said, it's likely that Pascal will still clock somewhat higher than 4th generation GCN. And Nvidia's luck on the software side is starting to run out; where they once had a substantial advantage with GameWorks and better DX11 drivers, AMD is starting to edge out victories with low-effort console ports that come pre-optimized for GCN.
Keep in mind that, on 28nm, AMD produced a 438mm^2 chip (Hawaii) that matched and in some cases beat a 561mm^2 chip from Nvidia (GK110). Then later in the 28nm cycle, Nvidia turned the tables, beating the first generation of Hawaii cards in most gaming applications with the 398mm^2 GM204 chip. Die size isn't everything. And while both AMD and Nvidia used the same TSMC 28nm process, we are seeing a divergence this time, with Nvidia going with TSMC 16FF+ while AMD will be using the Samsung/GloFo 14LPP process. The latter process is thought to be slightly denser - I've heard estimates of 2x 28nm transistor density for 16FF+ compared to 2.2x 28nm density for 14LPP.
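To put rough numbers on the density point, here is a quick back-of-the-envelope sketch. It uses only the 2x and 2.2x density estimates above and the rumored 232mm^2 Polaris 10 die size; the function names are made up for illustration, and the multipliers are hearsay rather than measured figures.

```python
# Density multipliers relative to 28nm. These are the rough estimates quoted
# in the post, not measured values.
DENSITY_VS_28NM = {"16FF+": 2.0, "14LPP": 2.2}

def equivalent_28nm_area(die_mm2, process):
    """28nm die area that would hold roughly the same transistor count."""
    return die_mm2 * DENSITY_VS_28NM[process]

def area_ratio_for_same_transistors(process_a, process_b):
    """How much larger a die on process_a must be to match the transistor
    budget of a die on process_b."""
    return DENSITY_VS_28NM[process_b] / DENSITY_VS_28NM[process_a]

# Rumored Polaris 10 die size (232 mm^2) on 14LPP.
polaris10_equiv = equivalent_28nm_area(232, "14LPP")
print(round(polaris10_equiv, 1))  # 510.4

# A 16FF+ die would need ~10% more area for the same transistor count.
print(round(area_ratio_for_same_transistors("16FF+", "14LPP"), 2))  # 1.1
```

If those estimates hold, a 232mm^2 chip on 14LPP carries a transistor budget comparable to a roughly 510mm^2 chip on 28nm, which is larger than Hawaii's 438mm^2.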
Unless Polaris 10 goes far above and beyond expectations, then based on the supposed 232mm^2 die size and what, 2560 shaders, that thing would be lucky to tie or barely beat a stock Fury. I just don't see it.
R9 390 (with 2560 shaders) needs about a 23% increase in performance to catch up to Fury X at 1080p, and about a 35% boost in performance to catch up at 4K. Between architectural improvements and the substantial increase in clocks enabled by FinFET, this doesn't seem like much of a stretch. Personally, I think the most likely outcome is for Polaris 10 to fall behind Fury X a bit at 4K, but to outshine it at 1080p. (This is because of the reduced memory bus, assuming that it will be standard GDDR5 and not GDDR5X. If it's GDDR5X, then it should beat Fury X across the board.)
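A rough way to see how that gap splits between clocks and architecture is sketched below. It assumes performance scales as shaders x clock x per-clock throughput with the shader count held at 2560, and it ignores memory bandwidth entirely. The 15% clock bump is a hypothetical input chosen for illustration, not a leaked spec.

```python
# Total gains the R9 390 needs to reach Fury X, as quoted in the post.
REQUIRED_GAIN = {"1080p": 1.23, "4K": 1.35}

def required_per_clock_gain(total_gain, clock_gain):
    """Per-clock (IPC) gain still needed after a given clock increase.

    Simplifying assumption: performance ~ shaders * clock * IPC, with the
    shader count unchanged and no memory bandwidth limit.
    """
    return total_gain / clock_gain

HYPOTHETICAL_CLOCK_GAIN = 1.15  # illustrative 15% clock bump, not a known figure

for resolution, gain in REQUIRED_GAIN.items():
    ipc = required_per_clock_gain(gain, HYPOTHETICAL_CLOCK_GAIN)
    print(f"{resolution}: {100 * (ipc - 1):.1f}% more per-clock throughput needed")
# 1080p: 7.0% more per-clock throughput needed
# 4K: 17.4% more per-clock throughput needed
```

Under those assumptions the 1080p gap closes with a fairly modest architectural gain, while the 4K gap asks for noticeably more, and that is before the narrower memory bus is taken into account.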
AMD set a perf/watt increase target of 2.5x over GCN 1.0-1.1, NOT over Fiji. It's on their roadmap. That means slightly faster than a 390X in, say, a 120W TDP.
Which roadmap slide are you interpreting as saying this? The one shown at Capsaicin (with 28nm -> Polaris -> Vega -> Navi) had the box for "28nm GPUs" located in late 2014. The only GPU that AMD released around that time was Tonga, which is GCN 1.2.
R9 390X performance at 120W would be rather disappointing for a full node shrink. GTX 980 can already match that in most titles at ~180W. The 390X is basically a factory overclocked part, and has one of the worst perf/watt ratings of any GPU. Pitcairn did much better on that score. Hawaii would have had excellent perf/watt if it had been run at 800-900 MHz with the memory controller at the 1250 MHz it was intended for, but we never got a SKU like that because it would have lost to GK110 and GM204 in raw performance.
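For reference, the "slightly faster than 390X at 120W" figure can be sanity-checked with a quick calculation like the one below. The 275W board power for the 390X is my assumption and does not come from the thread, and the result depends entirely on which card the 2.5x claim is actually measured against.

```python
def relative_performance(perf_per_watt_gain, new_power_w, baseline_power_w):
    """Performance of the new part relative to the baseline card, assuming the
    perf/watt gain applies to that baseline at its full board power."""
    return perf_per_watt_gain * new_power_w / baseline_power_w

# Assumption: R9 390X board power of 275 W (not stated in the thread).
R9_390X_POWER_W = 275

ratio = relative_performance(2.5, 120, R9_390X_POWER_W)
print(f"{ratio:.2f}x the 390X")  # 1.09x the 390X
```

If the 2.5x baseline is a more efficient card than the 390X, the same arithmetic gives a correspondingly higher result at 120W.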