Interesting observation, but the 6850 at 850/1150 only beats the stock 6870 in ME2 and Civ5, so it's not universal. Also, note that the stock 6870 core clock is 900MHz, not 950MHz. So, yes, in a sense, the 1150MHz memory clock on the 6850 comes close to making up for a 50MHz clock deficit and fewer SMUs.
Yeah, my bad on the clocks thing. Not sure what I was thinking on that. And you're right, there is some error on the Anandtech commentary where they said the 850/1150 6850 beats the 6870 in 3/5. In reality, it wins 2, ties 1, and loses 2. Seeing as the margins are all fairly close, it's reasonable to call them the same speed.
Also keep in mind that, besides the 50MHz deficit and fewer SPs, the 6850 also has proportionally fewer TMUs.
So roughly, in terms of compute (and texturing I suppose), 6870 has a:
(900/850) * (1120/960) = 1.235x advantage over our 850/1150 6850.
In terms of memory bandwidth our 6850 has a:
1150/1050 = 1.095x advantage over a stock 6870.
Putting that all together, and assuming both configurations perform equally as per the Anand results, Barts is more sensitive to memory clock than it is to core by:
0.235/0.095 = 2.47 = ~2.5x
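For anyone who wants to check the arithmetic, here is a quick Python sketch of the same back-of-the-envelope estimate. The variable names are mine, the clocks and SP counts are the ones quoted above, and it leans on the same assumption that the two configurations perform equally, so treat it as a rough estimate rather than a measurement.

```python
# Figures quoted in the posts above (stock 6870 vs. the 850/1150 6850).
core_6870_mhz = 900       # stock 6870 core clock
core_6850_oc_mhz = 850    # overclocked 6850 core clock
sp_6870 = 1120            # stream processors on the 6870
sp_6850 = 960             # stream processors on the 6850
mem_6850_oc_mhz = 1150    # overclocked 6850 memory clock
mem_6870_mhz = 1050       # stock 6870 memory clock

# Compute (and texturing) advantage of the 6870: clock ratio times unit ratio.
compute_adv = (core_6870_mhz / core_6850_oc_mhz) * (sp_6870 / sp_6850)

# Memory bandwidth advantage of the overclocked 6850 (same bus width on both).
bandwidth_adv = mem_6850_oc_mhz / mem_6870_mhz

# ASSUMPTION: both cards perform equally, so the extra bandwidth is taken
# to offset the entire compute deficit.
sensitivity = (compute_adv - 1) / (bandwidth_adv - 1)

print(f"compute advantage:   {compute_adv:.3f}x")
print(f"bandwidth advantage: {bandwidth_adv:.3f}x")
print(f"memory vs. core sensitivity: ~{sensitivity:.2f}x")
```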
So, yeah, I was a bit off due to idiotically saying 6870 has a 950MHz core clock, but 2.5x more sensitive to memory clock is still very significant. Especially when you take into account that previous ATI designs have traditionally been much more sensitive to core clock, and that Barts already has a much lower compute/bandwidth ratio than Cypress. You're still looking at a hypothetical 512-bit Barts competing at a GTX 480/580 level assuming that kind of scaling doesn't quickly fall off.
And I don't quite agree that Cayman "will be completely hamstrung" on a 256-bit bus. The problem is not the width, it's the quality. If Cayman's memory bus is at least as good as the 5870's (allowing ~1250-1300MHz clocks on the memory), it probably won't be a bottleneck.
That Cayman will be hamstrung by a 256-bit bus is the most reasonable assumption. Obviously, with Cayman looking to be a different or at least highly modified architecture anything can happen, but:
a. The most obvious reason for Barts being so bandwidth limited is that the shaders are much more efficient at game code than Cypress (which itself is horribly inefficient at game code, losing to GTX 480 despite having more texturing power and theoretically over twice as much compute) and thus have higher bandwidth requirements.
b. Even at 1400MHz, a 256-bit Cayman has only a third more bandwidth than Barts.
c. Cayman is going to have quite a few more shaders than Barts, with the exact number still up in the air. More shaders to feed = more bandwidth needed to feed them.
d. If Cayman is indeed VLIW4, which is looking to be the case right now, then it's going to be even more efficient at game code than Barts is, somewhere much closer to Fermi in efficiency as opposed to Cypress. This is going to increase bandwidth needs per shader to keep the chip fed.
e. No matter what you clock the memory at, it's probable that a 256-bit Cayman is going to be far more memory sensitive than the 2.5x we currently see with Barts. Anything greater than 4x can easily be defined as "hamstrung". Heck, depending on how you define hamstrung, one might even go as far as to use that word to describe Barts.
Cayman, as other people said, will use a more complex 256-bit memory controller (not all memory controllers of the same width are created equal) with faster memory on top.
It will probably use a more complex controller, and yes, admittedly there is a good chance that it will be 256-bit. However, the reason AMD used a Redwood-derived controller in Barts in the first place was that the memory controller in Cypress was already highly expensive, and that one only got them to 1.2GHz. Given the signals we've been getting from AMD, and the path that Nvidia has taken, wider, less complex memory controllers may be the way to go if you have the die size to support it.
@HurleyBird,
The 6970 is rumored to have a 256-bit bus and up to 6GHz (effective) memory speeds.
(6GHz / 4 (GDDR5) = 1500MHz).
256-bit bus width x 1.500GHz / 8 bits per byte x 4 = 192.0 GB/s memory bandwidth.
192 GB/s is 43% more memory bandwidth than a 6870 has.
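The same bandwidth math as a small Python sketch, in case anyone wants to plug in other clocks. The helper name is made up for illustration, results are in gigabytes per second, and it assumes GDDR5 moves 4 bits per pin per memory clock, as in the calculation above. The 6 GHz effective (1500MHz) figure is only the rumor quoted here; the 1050MHz figure is the stock 6870 memory clock mentioned earlier in the thread.

```python
def gddr5_bandwidth_gb_s(bus_width_bits, mem_clock_mhz):
    """Peak bandwidth in GB/s, assuming GDDR5 transfers 4 bits per pin per clock."""
    bytes_per_transfer = bus_width_bits / 8   # bus width in bytes
    transfers_per_clock = 4                   # GDDR5 effective rate multiplier
    return bytes_per_transfer * mem_clock_mhz * transfers_per_clock / 1000

rumored_6970 = gddr5_bandwidth_gb_s(256, 1500)   # rumored: 6 GHz effective
stock_6870 = gddr5_bandwidth_gb_s(256, 1050)     # stock 6870 memory clock
cayman_1400 = gddr5_bandwidth_gb_s(256, 1400)    # the 1400MHz case from point b

print(f"rumored 6970: {rumored_6970:.1f} GB/s")
print(f"stock 6870:   {stock_6870:.1f} GB/s")
print(f"gain over 6870 at 1500MHz: {rumored_6970 / stock_6870 - 1:.0%}")
print(f"gain over 6870 at 1400MHz: {cayman_1400 / stock_6870 - 1:.0%}")
```

Since the bus width is the same on both cards, the gain is just the ratio of the memory clocks, which is why the 1400MHz case works out to about a third.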
Firstly, even if 6GHz-rated chips are used, it's unlikely in the extreme that the memory controller will be up to the task of running them at those speeds. And this is going by, well, pretty much every GDDR5-using graphics card that has ever been released. But even then, 43% more memory bandwidth might not even be enough to balance *Barts*, and does anyone here really expect Cayman to be less than 43% more powerful than Barts when it comes to compute/core? Anyone?