It is also worth noting that once Folding Core 17 is out of beta, AMD will likely also have the performance advantage in
Folding@Home.
Whether that holds in production code remains to be seen, as the core is currently in beta.
If you take the FahCore17 benchmark utility, which is built on the OpenCL interface, the GTX Titan is slightly faster than the 7970. (With CUDA, the Titan is 58% faster than with its OpenCL interface.) I recently added 4 x 7970, and as in other areas, the answer to performance questions normally depends on many factors, not one.
For example:
The assertion made in the first post is that AMD enjoys a lead in certain application areas because its integer and bit instruction set is better aligned with these requirements.
With the GK110, NVidia introduced a new bit instruction (available only with SM 3.5, not on GK104 and earlier GPU chips). It might improve NV performance in these kinds of workloads, but first the app developers need to upgrade their apps to use this extension.
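To make the point about bit instructions a bit more concrete, here is a small Python sketch of a 32-bit rotate, the kind of operation that hash and crypto style integer workloads use constantly. I am assuming here that the new GK110 instruction is the funnel shift; the function names are my own, and the second function is only a software model of the idea, not NVidia's implementation.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits using two shifts and an OR.

    Without a dedicated instruction, hardware has to issue roughly
    this sequence of separate operations.
    """
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def funnel_shift_left(hi, lo, n):
    """Software model of a funnel shift (assumed semantics).

    Concatenate hi:lo into a 64-bit value, shift it left by n,
    and return the upper 32 bits.
    """
    n %= 32
    return (((hi << 32) | lo) << n >> 32) & MASK32


if __name__ == "__main__":
    x = 0x12345678
    # Feeding the same word in as hi and lo turns the funnel shift into a rotate.
    print(hex(rotl32(x, 8)), hex(funnel_shift_left(x, x, 8)))
```

If the hardware can do the second form in one instruction, a rotate costs one operation instead of three, which is why apps have to be rebuilt to benefit from it.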
Similarly, OpenCL as a standard is evolving, and if IT patterns repeat here as well, the maturity of the interface and the quality of the JIT compiler in the driver will improve over time as more experience and more workloads drive the use of this interface. This will be the case for both manufacturers.
The more generic, and in the long term more important, question is about the future of benchmarking as an Art & Science.
Historically, benchmark codes leveraged the 30+ years of speedup driven primarily by increases in frequency, memory hierarchy developments and better ILP. All these factors drove a huge innovation wave for 3 decades on the hardware side of IT, giving software developers basically a free ride to enjoy the performance improvements without changing their software.
We are now entering the massively parallel era, and the only big way to improve system performance is by explicitly building parallelism into every piece of SW to the maximum extent. So the speed increase in the future will most likely come from re-architected software, where the parallelism is explicitly expressed. The free ride is over.
It's visible in well known benchmarks like 3DMark. Add more CPU cores to your rig and then check the perf increase (welcome to the new software driven speed-up world), or overclock the existing CPU and enjoy the increased score (this is the outgoing HW driven speed-up era).
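A rough way to see the difference between the two eras is Amdahl's law. The sketch below is my own illustration, not anything 3DMark does internally: the function names and the example numbers are assumptions, and treating a clock increase as a straight linear speed-up ignores memory and other bottlenecks.

```python
def speedup_from_cores(parallel_fraction, cores):
    """Amdahl's law: speed-up over one core when only a fraction
    of the work can be spread across the cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def speedup_from_clock(clock_ratio):
    """Idealised HW-driven speed-up: every part of the program,
    serial or parallel, runs faster by the clock ratio."""
    return clock_ratio


if __name__ == "__main__":
    # Hypothetical example: going from 4 to 8 cores vs a 10% overclock.
    for p in (0.5, 0.95):
        gain = speedup_from_cores(p, 8) / speedup_from_cores(p, 4)
        print(f"parallel fraction {p}: 4 -> 8 cores gives x{gain:.2f}, "
              f"10% overclock gives x{speedup_from_clock(1.10):.2f}")
```

The overclock helps regardless of how the software is written, while the extra cores only pay off in proportion to how much parallelism the software actually expresses.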
As with other things in life, some technical solutions will be more appropriate for one type of work and less so for others. But I assume the search for the universal champion will continue to entertain people.
Andy
PS:
Besides performance, other qualities of a product are quite often decisive: energy consumption, 24/7 stability, heat, price, software driver stability, etc.
In the end, with the current generation of GPUs we enjoy a performance scale which, for some workloads, matches the capability of the fastest supercomputer of 12 years ago. 4 x Titans are approximately as fast as ASCI Red, the $120m supercomputer of the Department of Energy in 2000, listed as no. 1 in the Top500.
4 x Titan & 4 x 7970 = ca. 19,000 cores
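For anyone who wants to check the core count, the arithmetic can be sketched as follows. The per-card figures are the published shader counts (2688 CUDA cores for the GTX Titan, 2048 stream processors for the HD 7970); the dictionary and function names are just my own for this example, and the two vendors' "cores" are not directly comparable units.

```python
# Published shader counts per card.
CORES_PER_GPU = {"GTX Titan": 2688, "HD 7970": 2048}


def total_cores(cards):
    """Sum the shader cores over a dict of {card name: number of cards}."""
    return sum(CORES_PER_GPU[name] * count for name, count in cards.items())


if __name__ == "__main__":
    rig = {"GTX Titan": 4, "HD 7970": 4}
    print(total_cores(rig))  # 4 * 2688 + 4 * 2048
```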