Originally posted by: chizow
Originally posted by: toslat
On being CPU limited:
A simple test using two quads at two different clocks (all else same) should put this to rest
I've been linking them throughout, but it's *very* obvious in any bench that uses 4GHz and some of the exotic configs out there, like Tri-SLI or CrossFireX. Anyway, here are two really good examples from Tweaktown:
4870 Crossfire @ 3GHz
4870 Crossfire @ 4GHz
Summary: Up to 1920, there is very little difference in performance between 4870CF and 4850CF, and going from a single 4870 to 4870CF scales much less than going from a single 4850 to 4850CF. Scaling at 2560 is less consistent, but you still see big gains from the higher CPU clock, meaning you are not completely GPU bottlenecked. Tweaktown also has 4850CF @ 3GHz and 4GHz and GTX 280 in SLI/Tri-SLI that echo similar results. What you will notice is that the GTX 280 also scales very well with CPU speed even as a single GPU, meaning it is also CPU bottlenecked up to 1920. I'd expect similar from a faster 4870 variant, up until it capped out at about the same point as the GTX 280.
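The two-clock test suggested above boils down to comparing how much the frame rate moves against how much the CPU clock moved. Here's a rough sketch of that arithmetic in Python. The function name and the FPS figures are made up for illustration and are not taken from the Tweaktown results; a ratio near 1 would suggest the setup is mostly CPU limited, while a ratio near 0 would suggest it is mostly GPU limited.

```python
def cpu_scaling_ratio(fps_low_clock, fps_high_clock, low_ghz, high_ghz):
    """Fraction of the CPU clock increase that shows up as an FPS increase."""
    fps_gain = fps_high_clock / fps_low_clock - 1.0
    clock_gain = high_ghz / low_ghz - 1.0
    return fps_gain / clock_gain


# Hypothetical numbers, NOT from any review: same cards, same settings,
# only the CPU clock changes from 3GHz to 4GHz.
cpu_bound = cpu_scaling_ratio(60.0, 78.0, 3.0, 4.0)  # FPS follows the clock
gpu_bound = cpu_scaling_ratio(60.0, 61.0, 3.0, 4.0)  # FPS barely moves

print(f"CPU-limited case: {cpu_bound:.2f}")
print(f"GPU-limited case: {gpu_bound:.2f}")
```

Running the same calculation per resolution would show where the bottleneck shifts from CPU to GPU, assuming everything else in the test rig stays the same.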
On PLX 'bridge' and shared PCIe:
The PLX chips on x870X2 do not split lanes, i.e. 16x switched is not the same as 8x:8x. It is wrong to use 8x:8x performance of 4850 CF to predict performance of 4870X2.
IIRC the 3870X2 has a 48 lane 3 port v1.1 PEX8547 switch (not a bridge) and thus each GPU gets full 16x access when it is granted and not 8x. The advantage of this is that since the PCIe connection to the north bridge carries bursty traffic (actual traffic pattern depends on the game), the impact of latency is reduced and bandwidth preserved, as opposed to the bandwidth bottlenecking that would occur with a fixed 8x:8x split.
In the 4870X2, the new switch (possibly PEX8648) will support v2.0 which will double the available data rate (though with slight latency increase).
Also, it is rumored that inter-GPU traffic will improve. I'm not sure if inter-GPU traffic went through the NB (and memory) in the 3870X2, since the switch could functionally route traffic between the GPUs. So I expect, at the least, that inter-GPU traffic is switched at the PEX chip, and if possible they might have a common/duplicate memory area.
So there is no basis yet to state that the 4870X2 will be bottlenecked by the 'bridge' unless you can provide stats that show that the traffic pattern on the bus is sustained at >50% for a 4870 (single or CF), and even then show that (2x individual data - common data) exceeds 100% of 16x in CF.
That does look to be true about the PLX switch, but that inherently assumes a normal PCIe card will be using the bus at less than 50% efficiency, plus you add latency into the equation compared to a straight split. Here's a pretty good write-up on the 3870X2 and the switch, with a diagram (about halfway down):
Digital-Daily 3870X2 Bridge Chip. It actually seems like there's more overhead with a switch than with a splitter, regardless of available bandwidth.
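The saturation condition quoted above (2x individual data minus common data exceeding 100% of 16x) can be written out as a quick check. This is only a sketch: the per-lane figures are the standard PCIe 1.1 and 2.0 rates after 8b/10b encoding (250 MB/s and 500 MB/s per lane, per direction), but the function names and the traffic numbers are hypothetical placeholders, since nobody in this thread has posted measured bus traffic for a 4870.

```python
# Usable bandwidth per lane, per direction, in MB/s (after 8b/10b encoding).
PCIE_LANE_MB_S = {"1.1": 250, "2.0": 500}


def link_bandwidth_mb_s(gen, lanes=16):
    """Peak one-way bandwidth of a PCIe link of the given generation and width."""
    return PCIE_LANE_MB_S[gen] * lanes


def shared_link_saturated(per_gpu_mb_s, common_mb_s, gen, lanes=16):
    """True if two GPUs behind one switch need more than the uplink provides.

    Combined demand is 2x the per-GPU traffic minus whatever data is common
    to both GPUs, which is assumed here to cross the uplink only once.
    """
    combined = 2 * per_gpu_mb_s - common_mb_s
    return combined > link_bandwidth_mb_s(gen, lanes)


# Hypothetical sustained traffic: 2500 MB/s per GPU, 500 MB/s of it shared.
print(link_bandwidth_mb_s("1.1"), link_bandwidth_mb_s("2.0"))
print(shared_link_saturated(2500, 500, "1.1"))  # 4500 MB/s vs a v1.1 x16 uplink
print(shared_link_saturated(2500, 500, "2.0"))  # 4500 MB/s vs a v2.0 x16 uplink
```

This only looks at bandwidth. It says nothing about the added switch latency raised in the reply, which would need actual measurements to settle.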