Originally posted by: ViRGE
Originally posted by: nullpointerus
Theory is fine, but multi-core will be necessary when manufacturing hits its limits.
Where are those 10 GHz P4's from Intel's slides?
We need multi-core in the CPU arena.
Now, GPUs are more parallel, but progress has slowed down greatly in the last two generations. G80 was the performance king for a LONG time before being dethroned by a die-shrink refresh part (the Ultra), which also lasted a bit longer than anyone expected. So why has progress slowed?
Do you even understand why Intel never produced the 10GHz P4?
Yep, it had something to do with leakage current at high frequencies. The higher Intel pushed the P4 cores in frequency, the less efficient they became (more and more of the power drawn was simply lost as heat), to the point that manufacturing 4+ GHz P4s became infeasible. While the 10 GHz P4 slides sounded nice at the time, and Intel probably could have gotten individual transistors to run reliably at that frequency, putting them together in a real CPU turned out to be an intractable manufacturing problem. The design worked in theory, but not in practice.
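To make that efficiency argument concrete, here's a quick back-of-the-envelope sketch in Python. Every constant in it is a made-up placeholder, NOT a real Pentium 4 figure; the point is only the shape of the scaling: dynamic power grows with frequency and with the square of the voltage needed to sustain that frequency, on top of a leakage floor.

```python
# Back-of-the-envelope CMOS power model: dynamic switching power plus leakage.
#   P_total = C_eff * V^2 * f  +  V * I_leak
# Every constant below is an illustrative placeholder, not real P4 data.

def total_power_watts(freq_hz, vdd, c_eff_farads=15e-9, i_leak_amps=20.0):
    dynamic = c_eff_farads * vdd ** 2 * freq_hz   # grows with f, and with V^2
    leakage = vdd * i_leak_amps                   # in reality this also climbs with V and temperature
    return dynamic + leakage

# A ~3.4 GHz part at a typical desktop voltage...
print(round(total_power_watts(3.4e9, 1.3)))   # ~112 W
# ...vs. a hypothetical 10 GHz part that needs a higher voltage to close timing.
print(round(total_power_watts(10.0e9, 1.6)))  # ~416 W -- far beyond what a desktop cooler can handle
```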
The limit isn't complexity in a single die, it's frequency.
Don't we have a frequency problem with GPUs? The layout and function of transistors in a chip definitely affect the frequency at which the chip is feasible to manufacture, right? GPUs are massively parallel, true, but that means a LOT of transistors are packed into a single die, which sees a very high utilization under typical loads. That's why we don't have 2.6 GHz GPUs, right?
(And what about Itanium processors, which had different frequency limits due to their different die design?)
Yet we still have speed-binning of GPU cores, and overclocking, and heat issues, and all the other things that come with (relatively) high frequencies. And we're back to the same problem: ~800 MHz chips that suck down tons of power and put out lots of heat...and little to no increase in the actual performance of a single die for a very long time (in this market), perhaps because nVidia and ATI are so busy focusing on reducing price, power, and heat?
The way I see it, here are some solutions:
- put MORE transistors onto the same die?
- significantly increase clock speed?
- transition the market to multi-die solutions?
It seems like a manufacturing problem. In theory, doubling the number of stream processors, adding more ROPs, and doing whatever else you think should be done is EASY. In theory. But is it feasible to manufacture such chips? i.e. Are the yields and profit margins high enough? Will they be easy to cool? Energy efficient?
Last I heard, R700 is designed to be a multi-die solution across the board...
I'm no expert, but, given the current layout of these chips, 2x600 MHz GPU dies look a heck of a lot more feasible than 1x1000 MHz GPU die, even if SLi/CFX performance scaling could only be improved (through better API and driver support) to a reliable 70% (vs. the 30-50% we have now).
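As a sanity check on that comparison, here's the trivial arithmetic, assuming (naively) that effective performance scales linearly with clock and that both designs have identical per-die functional-unit counts; real scaling is obviously messier.

```python
# Naive model: effective performance ~ clock_mhz * total_scaling_factor,
# where the scaling factor folds in how much the second die actually adds.
# Assumes identical functional-unit counts per die -- a big simplification.

single_1000mhz  = 1000 * 1.0    # one fast die, no multi-GPU overhead
dual_600_today  = 600 * 1.4     # second die adds ~30-50% today, call it 40%
dual_600_better = 600 * 1.7     # second die reliably adds 70%

print(single_1000mhz, dual_600_today, dual_600_better)   # 1000.0 840.0 1020.0
```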
Maybe I'm just imagining the frequency wall here...it IS lower for GPUs than for CPUs, isn't it?
Dynamic power consumption with CMOS transistors is P = C * V^2 * F, where C is switched capacitance, V is voltage, and F is frequency. A 10 GHz processor would eat 3x as much power as a 3.33 GHz processor (before optimizations) solely due to the fact that it's running at 10 GHz.
Generalizing what you just said (power rises at least linearly with frequency, higher clocks usually demand higher voltages, and power scales with the square of that voltage, so everything compounds), increases in clock frequency make manufacturing the dies disproportionately harder, which is why 600 MHz GPUs would see much higher yields (and thus be much cheaper to produce) than 1000 MHz GPUs of the same die.
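One way to picture the binning side of that: suppose (purely hypothetically) that the maximum stable clock of the dies coming off a wafer is roughly normally distributed. The numbers below are invented for illustration, not taken from any real process, but the fall-off as the target clock rises is the point.

```python
import math

# Hypothetical assumption: max stable clock (Fmax) of dies on a wafer is roughly
# normal with mean 750 MHz and standard deviation 100 MHz. Invented numbers.
MEAN_FMAX_MHZ = 750.0
SIGMA_MHZ = 100.0

def fraction_of_dies_reaching(target_mhz):
    """Fraction of dies whose Fmax meets or exceeds the target clock."""
    z = (target_mhz - MEAN_FMAX_MHZ) / SIGMA_MHZ
    return 0.5 * math.erfc(z / math.sqrt(2))   # P(Fmax >= target) for a normal distribution

print(f"{fraction_of_dies_reaching(600):.0%}")    # ~93% of dies could ship as 600 MHz parts
print(f"{fraction_of_dies_reaching(1000):.2%}")   # ~0.62% could ship as 1000 MHz parts
```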
This isn't a problem with GPUs; they don't need to hit massive core speeds because everything can be done in parallel (when you're on the same die). There's no need for GPUs with multiple dies; you can just add more functional units to the current die, until you reach what you feel is the biggest die you want to have.
Add more functional units? Just like that? OK...
Won't you run into another exponential scaling problem with functional units per die?
Specifically, what is "the biggest die you want to have," and how would that impact yields when compared to multiple lower-frequency dies with fewer of those functional units?
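On the yield question specifically, here's a minimal sketch using the textbook Poisson defect model, Y = e^(-D*A), where D is defect density and A is die area. The specific numbers (0.5 defects/cm^2, a 4 cm^2 monolithic die vs. 2 cm^2 dies) are assumptions picked for illustration only.

```python
import math

DEFECT_DENSITY_PER_CM2 = 0.5   # assumed; real foundry defect densities are closely guarded

def poisson_yield(die_area_cm2, d0=DEFECT_DENSITY_PER_CM2):
    """Fraction of dies with zero killer defects under a simple Poisson model."""
    return math.exp(-d0 * die_area_cm2)

big_die   = poisson_yield(4.0)   # one monolithic 4 cm^2 die
small_die = poisson_yield(2.0)   # one 2 cm^2 die; a board would pair two good ones

print(f"4 cm^2 monolithic yield: {big_die:.0%}")    # ~14%
print(f"2 cm^2 die yield:        {small_die:.0%}")  # ~37%
# Good dies are picked independently off the wafer, so a dual-die board uses
# twice the silicon but wastes far less of it: per cm^2 of *good* silicon,
# the smaller die works out ~2.7x cheaper under these assumptions.
print(small_die / big_die)                          # ~2.72
```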
From what I have read on the subject, microprocessor manufacturing isn't about making the most efficient use of a given number of transistors, but about lowering costs and improving yields. Intel, AMD, nVidia, IBM, etc. are businesses first and technology developers second. All I was saying is that if the industry has reached a point where it is becoming significantly cheaper to produce multi-die GPUs, then SLi/CF scaling will have to improve fundamentally.
Why do *you* think GPU progress has slowed?