Fair way to compare? So I guess we should only compare AMD CPUs to 2-year-old Intel chips, since the new Intel chips are on a newer node? No, obviously not. You compare the best available to the best available.
There is no fair in business. Products compete as soon as they are released, and if Nvidia doesn't release its next GPU for 6 months or a year, it'll suffer in the meantime. That is Nvidia's fault, not AMD's.
You are confusing comparing "products" and comparing "architectures." The comment was specifically related to "efficiency of the architectural design", not efficiency of the product.
Please re-read this part of my post more carefully.
"The only way to fairly compare the efficiency of both architectures is to place them on equal nodes."
Notice, you keep talking about comparing products, while my post was discussing architectures, not products. You can compare efficiency of SKUs/products vs. each other, even if they are on different nodes. However, the only way to compare the "architecture" itself is to compare it on the same node.
Think about it: if the GCN architecture were built on a 90nm node, it would get blown away by a 40nm Cayman. So you'd conclude that the Cayman architecture is way more efficient? If the Sandy Bridge architecture were built on a 65nm node, would you compare it against a Nehalem architecture on a 32nm node? You see how absurd it is to compare architectures across different nodes and try to derive any meaningful information from that about the efficiency of the architecture? If you took the Pentium 4 architecture, put it on a 22nm node and clocked it to 20 GHz (because a smaller node allows for higher transistor switching speeds), it would suddenly appear far better than an Ivy Bridge architecture on 180nm, and so on.
You can't directly conclude which architecture is actually more efficient across different nodes because the node differences affect:
1) Transistor density (performance/transistor);
2) Transistor power consumption (performance/watt);
3) Transistor switching speed.
Here is another way of understanding this principle. If someone put a 40nm Fermi architecture up against a 28nm Fermi architecture, without any changes to the Fermi architecture itself, we would see a dramatic improvement in performance/transistor, performance/watt and performance per clock. But in fact, the efficiency of the Fermi architecture would be EXACTLY the same.
* 28nm transistors offer up to 60% higher performance than 40nm at comparable leakage with up to 50% lower energy per switch and 50% lower static power. ~ GlobalFoundries
So using your premise that you can compare architectures across 2 different nodes, the 28nm Fermi architecture would be "more efficient" than the 40nm Fermi architecture. Since that conclusion is false (the Fermi architecture itself is constant in efficiency and performance), your premise that you can compare architectures across 2 different nodes is incorrect. If you can't isolate the architecture as the only variable, the comparison is not conclusive, since the node shrink by itself brings the 3 major advantages listed above. A comparison across 2 different nodes can tell you about the efficiency of 2 products, but not much about the architectures themselves.
Yet another way of looking at it: Product A's architecture might be 20% more efficient than Product B's architecture. But if Product B is built on a 28nm node (which brings up to 60% higher performance with up to 50% lower power consumption vs. a 40nm product), then architecture B will suddenly appear to be more efficient, while in fact it was the node that made the product more efficient. We can't know which factor is responsible, since 2 variables are changing at once (Variable 1: node, Variable 2: architecture).
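To put rough numbers on that, here is a quick Python sketch. Everything in it is made up for illustration: the efficiency scores are arbitrary units, and the 28nm scaling factors simply take the "up to" foundry figures quoted above at face value and assume both apply in full at the same time, which is optimistic for any real chip.

```python
# Illustrative only: hypothetical numbers, not measurements of any real GPU.
# Node scaling relative to 40nm, assuming the quoted "up to" figures
# (60% more performance, 50% less power) both apply fully at once.
NODE_SCALING = {
    "40nm": {"perf": 1.0, "power": 1.0},
    "28nm": {"perf": 1.6, "power": 0.5},
}

def apparent_perf_per_watt(arch_efficiency, node):
    """Perf/watt you would measure on a product: architecture times node effect."""
    scale = NODE_SCALING[node]
    return arch_efficiency * scale["perf"] / scale["power"]

# Hypothetical architecture efficiency in arbitrary units on an equal node.
ARCH_A = 1.2  # 20% more efficient than B by design
ARCH_B = 1.0

a_40 = apparent_perf_per_watt(ARCH_A, "40nm")  # product A on the old node
b_28 = apparent_perf_per_watt(ARCH_B, "28nm")  # product B on the new node
b_40 = apparent_perf_per_watt(ARCH_B, "40nm")  # B if it were on the same node as A

print(f"A @ 40nm: {a_40:.2f}")
print(f"B @ 28nm: {b_28:.2f}")
print(f"B @ 40nm: {b_40:.2f}")
```

Under these assumptions product B on 28nm measures far ahead of product A on 40nm, even though architecture A wins as soon as both sit on the same node. The product comparison is real, but it says nothing reliable about which architecture is the better design.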
The only way to see clear causation is to keep all the other variables constant and change only the variable you are trying to compare. In this case, if you are comparing architectures, the changing variable has to be the architecture.