Making draw calls multithreaded on DX11 in a way that is developer and game-engine agnostic was only possible due to the SW scheduler in the NVIDIA driver.
Yes, I have seen that.
With GCN, AMD hoped that developers would code their games such that the draw call thread is kept as free as possible so it can efficiently feed the ALUs via the command processor, i.e. its hardware scheduler.
Yes, I have seen that too.
However, the SW scheduler of the NVIDIA driver has its own overhead, and this shows in comparatively recent DX11 games like COD:BLOPS 3 and Witcher 3 where it causes near-max CPU utilization across multiple cores in CPU-limited scenarios.
Yes, I have seen even that. It has to be noted, though, that near-max CPU utilization is already present in those games; the nV driver just makes it worse in this case.
Incidentally, the RX 480 is faster than the GTX 1060 in Titanfall 2.
*After a minute of checking* Technically correct, but I fail to see the relevance, since that game is hardly CPU-heavy in the first place.
This is also carried over to DX12 where NVIDIA GPUs take a hit regardless of what CPUs are being used.
Go on... I mean, AMD GPUs take a hit on a regular basis too, but it is a fair statement that nV GPUs do so far more often.
The only way to overcome this, as the video points out, is when developers take their time to code in a way that prevents driver bottlenecks
Yes, it is kind of a twisted mirror of the GCN DX11 situation. Make no mistake, I have no issue admitting that nV's approach is flawed too, if that's your point.
examples of this being TW:W and Doom.
Wait, I listened to the entire video, but that's one thing I missed. Did he mention it at some point, or is that your addition?
This is basically the difference between AMD and NVIDIA software and hardware implementation that the video talks about, and if you claim that it only strengthens your claim that GCN is flawed, I suggest you rewatch it.
It does, because when I claimed GCN is flawed I meant the implementation, not the idea itself. A video highlighting the design flaw of GCN in certain situations is an addition I did not have in mind, but granted, it does more to explain why badly made games run worse on AMD hardware than I ever hoped to. Too bad I saw it only after writing my original post.
This exact reason is why you NEED a very fast CPU, rather than a slightly slower one with a higher number of cores, in most of today's game engines. But brace yourselves. Shader Model 6.0 will change this.
And here I am thinking it is because games are ultimately ALWAYS bound by single-threaded performance, or rather, by the performance of the event loop thread.
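To put rough numbers on that, here is a toy Amdahl-style model of a single frame. Everything in it is an assumption for illustration: the function names are made up, the millisecond figures are invented rather than measured from any game, and it pretends that the parallel part splits perfectly across cores and that a faster clock speeds up all work equally.

```python
def frame_time_ms(serial_ms, parallel_ms, cores):
    """Time for one frame: the serial part (event loop / main thread)
    runs on one core, the parallel part is split evenly across all cores."""
    return serial_ms + parallel_ms / cores


def fps(serial_ms, parallel_ms, cores, clock_scale=1.0):
    """Frames per second; clock_scale > 1.0 models a proportionally faster CPU
    (assumed to speed up serial and parallel work alike)."""
    return 1000.0 * clock_scale / frame_time_ms(serial_ms, parallel_ms, cores)


if __name__ == "__main__":
    # Hypothetical workload: 8 ms of main-thread work, 8 ms of parallelizable work.
    baseline = fps(8, 8, cores=4)                          # 4 cores, stock clock
    more_cores = fps(8, 8, cores=8)                        # double the cores
    faster_clock = fps(8, 8, cores=4, clock_scale=1.25)    # same cores, 25% faster

    print(f"4 cores, stock clock: {baseline:.1f} fps")
    print(f"8 cores, stock clock: {more_cores:.1f} fps")
    print(f"4 cores, +25% clock:  {faster_clock:.1f} fps")
```

With these made-up numbers, doubling the core count only shaves the parallel half of the frame, while the faster clock also shortens the main-thread part, so the faster 4-core CPU comes out ahead. The more of the frame sits on the event loop thread, the bigger that gap gets.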