Originally posted by: ViRGE
Originally posted by: nullpointerus
Theory is fine, but multi-core will be necessary when manufacturing hits its limits.
Where are those 10 GHz P4's from Intel's slides?
We need multi-core in the CPU arena.
Now, GPUs are more parallel, but progress has slowed down greatly in the last two generations. G80 was the performance king for a LONG time before being dethroned by a die-shrink refresh part (the Ultra), which also lasted a bit longer than anyone expected. So why has progress slowed?
Do you even understand why Intel never produced the 10GHz P4?
Yep, it had something to do with leakage current at high frequencies. The higher Intel pushed the P4 cores in frequency, the less efficient they became (more and more of the power drawn was simply lost as heat), to the point that manufacturing 4+ GHz P4s became infeasible. While the 10 GHz P4 slides sounded nice at the time, and Intel probably could have gotten individual transistors to run reliably at that frequency, putting them together in a real CPU turned out to be an intractable manufacturing problem. The design worked in theory, but not in practice.
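To make that efficiency argument concrete, here's a quick back-of-the-envelope sketch in Python. Every constant in it is a made-up placeholder, NOT a real Pentium 4 figure; the point is only the shape of the scaling: dynamic power grows with frequency and with the square of the voltage needed to sustain that frequency, on top of a leakage floor.

```python
# Back-of-the-envelope CMOS power model: dynamic switching power plus leakage.
#   P_total = C_eff * V^2 * f  +  V * I_leak
# Every constant below is an illustrative placeholder, not real P4 data.

def total_power_watts(freq_hz, vdd, c_eff_farads=15e-9, i_leak_amps=20.0):
    dynamic = c_eff_farads * vdd ** 2 * freq_hz   # grows with f, and with V^2
    leakage = vdd * i_leak_amps                   # in reality this also climbs with V and temperature
    return dynamic + leakage

# A ~3.4 GHz part at a typical desktop voltage...
print(round(total_power_watts(3.4e9, 1.3)))   # ~112 W
# ...vs. a hypothetical 10 GHz part that needs a higher voltage to close timing.
print(round(total_power_watts(10.0e9, 1.6)))  # ~416 W -- far beyond what a desktop cooler can handle
```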
The limit isn't complexity in a single die, it's frequency.
Don't we have a frequency problem with GPUs? The layout and function of transistors in a chip definitely affect the frequency at which the chip is feasible to manufacture, right? GPUs are massively parallel, true, but that means a LOT of transistors are packed into a single die, which sees a very high utilization under typical loads. That's why we don't have 2.6 GHz GPUs, right?
(And what about Itanium processors, which had different frequency limits due to their different die design?)
Yet we still have speed-binning of GPU cores, and overclocking, and heat issues, and all the other things that come with (relatively) high frequencies. And we're back to the same problem: ~800 MHz chips that suck down tons of power and put out lots of heat...and little to no increase in the actual performance of a single die for a very long time (in this market), perhaps because nVidia and ATI are so busy focusing on reducing price, power, and heat?
The way I see it, here are some solutions:
- put MORE transistors onto the same die?
- significantly increase clock speed?
- transition the market to multi-die solutions?
It seems like a manufacturing problem. In theory, doubling the number of stream processors, adding more ROPs, and doing whatever else you think should be done is EASY. In theory. But is it feasible to manufacture such chips? i.e. Are the yields and profit margins high enough? Will they be easy to cool? Energy efficient?
Last I heard, R700 is designed to be a multi-die solution across the board...
I'm no expert, but, given the current layout of these chips, 2x600 MHz GPU dies look a heck of a lot more feasible than 1x1000 MHz GPU die, even if SLi/CFX performance scaling could only be improved (through better API and driver support) to a reliable 70% (vs. the 30-50% we have now).
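As a sanity check on that comparison, here's the trivial arithmetic, assuming (naively) that effective performance scales linearly with clock and that both designs have identical per-die functional-unit counts; real scaling is obviously messier.

```python
# Naive model: effective performance ~ clock_mhz * total_scaling_factor,
# where the scaling factor folds in how much the second die actually adds.
# Assumes identical functional-unit counts per die -- a big simplification.

single_1000mhz  = 1000 * 1.0    # one fast die, no multi-GPU overhead
dual_600_today  = 600 * 1.4     # second die adds ~30-50% today, call it 40%
dual_600_better = 600 * 1.7     # second die reliably adds 70%

print(single_1000mhz, dual_600_today, dual_600_better)   # 1000.0 840.0 1020.0
```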
Maybe I'm just imagining the frequency wall here...it IS lower for GPUs than for CPUs, isn't it?
Dynamic power consumption with CMOS transistors is P = C * V^2 * F, where C is switched capacitance, V is voltage, and F is frequency. A 10 GHz processor would eat 3x as much power as a 3.33 GHz processor (before optimizations) solely due to the fact that it's running at 10 GHz.
Generalizing what you just said (power rises at least linearly with frequency, higher clocks usually demand higher voltages, and power scales with the square of that voltage, so everything compounds), increases in clock frequency make manufacturing the dies disproportionately harder, which is why 600 MHz GPUs would see much higher yields (and thus be much cheaper to produce) than 1000 MHz GPUs of the same die.
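One way to picture the binning side of that: suppose (purely hypothetically) that the maximum stable clock of the dies coming off a wafer is roughly normally distributed. The numbers below are invented for illustration, not taken from any real process, but the fall-off as the target clock rises is the point.

```python
import math

# Hypothetical assumption: max stable clock (Fmax) of dies on a wafer is roughly
# normal with mean 750 MHz and standard deviation 100 MHz. Invented numbers.
MEAN_FMAX_MHZ = 750.0
SIGMA_MHZ = 100.0

def fraction_of_dies_reaching(target_mhz):
    """Fraction of dies whose Fmax meets or exceeds the target clock."""
    z = (target_mhz - MEAN_FMAX_MHZ) / SIGMA_MHZ
    return 0.5 * math.erfc(z / math.sqrt(2))   # P(Fmax >= target) for a normal distribution

print(f"{fraction_of_dies_reaching(600):.0%}")    # ~93% of dies could ship as 600 MHz parts
print(f"{fraction_of_dies_reaching(1000):.2%}")   # ~0.62% could ship as 1000 MHz parts
```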
This isn't a problem with GPUs; they don't need to hit massive core speeds because everything can be done in parallel (when you're on the same die). There's no need for GPUs with multiple dies; you can just add more functional units to the current die, until you reach what you feel is the biggest die you want to have.
Add more functional units? Just like that? OK...
Won't you run into another exponential scaling problem with functional units per die?
Specifically, what is "the biggest die you want to have," and how would that impact yields when compared to multiple lower-frequency dies with fewer of those functional units?
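On the yield question specifically, here's a minimal sketch using the textbook Poisson defect model, Y = e^(-D*A), where D is defect density and A is die area. The specific numbers (0.5 defects/cm^2, a 4 cm^2 monolithic die vs. 2 cm^2 dies) are assumptions picked for illustration only.

```python
import math

DEFECT_DENSITY_PER_CM2 = 0.5   # assumed; real foundry defect densities are closely guarded

def poisson_yield(die_area_cm2, d0=DEFECT_DENSITY_PER_CM2):
    """Fraction of dies with zero killer defects under a simple Poisson model."""
    return math.exp(-d0 * die_area_cm2)

big_die   = poisson_yield(4.0)   # one monolithic 4 cm^2 die
small_die = poisson_yield(2.0)   # one 2 cm^2 die; a board would pair two good ones

print(f"4 cm^2 monolithic yield: {big_die:.0%}")    # ~14%
print(f"2 cm^2 die yield:        {small_die:.0%}")  # ~37%
# Good dies are picked independently off the wafer, so a dual-die board uses
# twice the silicon but wastes far less of it: per cm^2 of *good* silicon,
# the smaller die works out ~2.7x cheaper under these assumptions.
print(small_die / big_die)                          # ~2.72
```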
From what I have read on the subject, microprocessor manufacturing isn't about making the most efficient use of a given number of transistors, but about lowering costs and improving yields. Intel, AMD, nVidia, IBM, etc. are businesses first and technology developers second. All I was saying is that if the industry has reached a point where it is becoming significantly cheaper to produce multi-die GPUs, then SLi/CF scaling will have to improve fundamentally.
Why do *you* think GPU progress has slowed?