The average user cares mostly about one thing when buying a new graphics card, and that is FPS/$. Of course, there are other things like power consumption, noise, etc. But if you offered anyone, say, a 1060 3GB and a 980 Ti at the same price, even people with cheap PSUs would take the 980 Ti and buy a new PSU.
But this is a forum for PC enthusiasts, and we like to talk about other stuff too. One should be very careful when analyzing a product or comparing different architectures, especially today, when chips are so complex and companies use different approaches to solve the same problems.
If we want to compare AMD’s and nVidia’s architectures, we should take chips with exactly the same configuration and clocks. But there are no such chips, so let’s see what we have.
The good thing on AMD’s side is that we have chips with 32 CUs in three generations (Tahiti, Tonga and Polaris 10 in its RX 470 form), which makes them much easier to compare, and there is also a test of Tonga vs Polaris at the same clock. From that we can see that the main reason for the better performance of the RX 470 is its higher clock.
On nVidia’s side it’s a little harder, since there are no GP and GM chips with exactly the same configuration, though we will soon have the GTX 1050 Ti to compare with the GTX 950. But even without that, when we account for the different number of SM units and the higher clocks, it seems the main reason for better Pascal performance is also the higher clock (compared to Maxwell cards).
So it looks like there are no significant performance improvements per CU/SM on either side compared to previous generations.
Now we need to compare apples and oranges, and I really don’t know what the best choice would be. We could take the reference RX 470 and GTX 980, for example, since they have very similar (boost) clocks and configurations, though the 980 has twice as many ROPs, which leads to a much higher pixel fillrate. Or maybe it would be better to take the R9 390 and GTX 1080: exactly the same configuration, but quite a difference in clocks and production process.
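To put a rough number on the ROP difference, here is a minimal sketch of the theoretical peak pixel fillrate, assuming one pixel per ROP per clock. The ROP counts and boost clocks (32 ROPs at 1206 MHz for the RX 470, 64 ROPs at 1216 MHz for the GTX 980) are reference-card figures as I recall them, not numbers from the chart, so treat them as assumptions. The function name is just for illustration.

```python
def pixel_fillrate_gpix(rops, clock_mhz):
    """Theoretical peak pixel fillrate in Gpixel/s (one pixel per ROP per clock)."""
    return rops * clock_mhz / 1000.0

# Assumed reference specs: (ROPs, boost clock in MHz)
cards = {
    "RX 470": (32, 1206),
    "GTX 980": (64, 1216),
}

fillrates = {name: pixel_fillrate_gpix(*spec) for name, spec in cards.items()}
ratio = fillrates["GTX 980"] / fillrates["RX 470"]

for name, rate in fillrates.items():
    print(f"{name}: {rate:.1f} Gpixel/s")
print(f"GTX 980 / RX 470 fillrate ratio: {ratio:.2f}")
```

This is only a paper figure. Real fillrate also depends on memory bandwidth and on how long the cards actually hold their boost clocks.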
Let’s take a look at a performance chart*. We can see that the GTX 980 is ~20% faster than the RX 470, but is the difference caused by a better architecture or by double the number of ROPs (pixel fillrate)? The GTX 1080 is ~70% faster than the R9 390, but it is also clocked ~70% higher. Does that mean AMD’s and nVidia’s chips have the same performance per clock?
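The per-clock comparison is just a division, sketched below with the rough percentages quoted above. The clock ratio of 1.0 for the RX 470 vs GTX 980 pair is my assumption, based on their boost clocks being very similar, and the helper name is made up for illustration.

```python
def per_clock_ratio(perf_ratio, clock_ratio):
    """Relative performance per clock of card A vs card B.

    perf_ratio  = performance of A / performance of B
    clock_ratio = clock of A / clock of B
    """
    return perf_ratio / clock_ratio

# GTX 1080 vs R9 390: ~70% faster, ~70% higher clock
gp104_vs_hawaii = per_clock_ratio(1.70, 1.70)

# GTX 980 vs RX 470: ~20% faster, clocks assumed equal
gm204_vs_p10 = per_clock_ratio(1.20, 1.00)

print(f"GTX 1080 vs R9 390, per clock: {gp104_vs_hawaii:.2f}")
print(f"GTX 980 vs RX 470, per clock:  {gm204_vs_p10:.2f}")
```

With these inputs the first pair comes out even per clock, while the second still shows the ~20% gap, which is exactly the part that could be down to either the architecture or the extra ROPs.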
So it looks like the main (not to say only) advantage nVidia has is the ability to clock its GPUs much higher than AMD’s. But since GPUs are made for parallel tasks, that problem is easily solved by adding more cores, which is what AMD is doing. Of course, that leads to larger chips and therefore higher cost, though that doesn’t necessarily mean P10 is more expensive than GP106, since we don’t know GF’s and TSMC’s prices.
But GPUs are used for tasks other than gaming, so we don’t know where nVidia’s efficiency comes from or, for example, how much the DP compute capabilities in GCN affect max clocks and power consumption. We do know GCN has an advantage in those tasks. How much could AMD increase clock speeds and lower TDP if they decided to cut DP performance? I suppose the efficiency of the blocks for audio, video encoding, CF/SLI, display controllers, etc. differs too, though that’s only a small part of overall card efficiency.
So after all, I think it’s hard to say how far AMD is behind nVidia when it comes to chip design, and whether it is better for them to increase R&D funds to improve the architecture or go with a brute-force approach if they can get a lower production price (GF vs TSMC). The end customer only wants as much FPS per dollar as possible and doesn’t care what’s inside the chip.
* I really hate those charts, since it is impossible to make a relevant one. Depending on the games chosen, relative performance between two products can vary a lot. If in one game card A is 40% faster than card B, and in the next one card B is 20% faster, it looks silly to say card A is usually 10% faster than card B. And we can’t be sure whether the reason lies in hardware or in software. You could also say the developers of game C are much better than those who developed game D. But we have to use something, and here’s one to take a look at:
https://www.computerbase.de/thema/grafikkarte/rangliste/
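To show why a single "average" figure from such a chart is slippery, here is a small sketch using the two-game example from the footnote (card A 40% faster in one game, card B 20% faster in the other). It compares the arithmetic mean of the per-game ratios with the geometric mean. Which averaging method any particular chart uses is not something I know, so this is only an illustration.

```python
from statistics import geometric_mean, mean

# Per-game performance ratios A/B: A is 40% faster in game 1,
# B is 20% faster in game 2 (so A/B = 1 / 1.20 there).
a_over_b = [1.40, 1 / 1.20]
b_over_a = [1 / r for r in a_over_b]

# Arithmetic mean depends on which card is taken as the baseline.
arith_a = mean(a_over_b)
arith_b = mean(b_over_a)

# Geometric mean gives the same answer from either side.
geo_a = geometric_mean(a_over_b)
geo_b = geometric_mean(b_over_a)

print(f"arithmetic: A/B = {arith_a:.3f}, B/A = {arith_b:.3f}")
print(f"geometric:  A/B = {geo_a:.3f}, B/A = {geo_b:.3f}")
```

With B as the baseline, the arithmetic mean says A is roughly 12% faster, but with A as the baseline B comes out only about 4% slower, so the two views contradict each other. The geometric mean is consistent (about 8% either way), but it still hides the fact that each card wins one game.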