NVIDIA Kepler GPU Speculation thread


Genx87

Lifer
Apr 8, 2002
41,095
513
126
It would barely be faster than the hd7970, a larger chip, and wouldn't be out until some time after the hd7970. It would hardly be exciting, unless it was priced ridiculously low.

I don't know what benchmarks you are looking at, but the AnandTech review put the 7970 about midway between the 580 and 590 on most benchmarks. I would hardly call outperforming your competition by 20-30% and nearly doubling the performance of your previous generation's single GPU a failure.

I don't know why people create such lofty and unrealistic expectations for GPUs.
 

rgallant

Golden Member
Apr 14, 2007
1,361
11
81
TSMC won't be up to its full 2012 28nm capacity until March 2012, when the latest plant comes online (if it's on time), so will any products really be in full swing before June or so?
It won't matter who has what if you can't buy them.
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
Seriously if the highest end Kepler part only has gtx590 performance, it will be a fail in my book.

It would barely be faster than the hd7970, a larger chip, and wouldn't be out until some time after the hd7970. It would hardly be exciting, unless it was priced ridiculously low.

So 25-30% more performance over the HD7970 would be a failure then? Because that's how much the GTX590 has over the 7970 in AnandTech's review. Consider that AMD took 14 months to barely get 25% over the GTX580, and AnandTech's review shows an average of just 14%.

I think you are underestimating how fast GTX590 is and/or overestimating how fast HD7970 is over GTX580. If Kepler ~ GTX590, it would be far more impressive of an advantage over 7970 than what 7970 now has over the GTX580.

If Kepler manages to get to GTX590 performance level, that would be astounding! Check out the math below.



Crysis: Warhead
GTX590 vs. HD7970 = +27%
HD7970 vs. GTX580 = +21%

More importantly, check out the minimum framerate difference





Metro 2033
GTX590 vs. HD7970 = +18%
HD7970 vs. GTX580 = +21%



Dirt 3
GTX590 vs. HD7970 = +41%
HD7970 vs. GTX580 = +6%

And now minimum framerate difference





Shogun 2
GTX590 vs. HD7970 = +15%
HD7970 vs. GTX580 = +37%



Batman: AC
GTX590 vs. HD7970 = +26%
HD7970 vs. GTX580 = +14%
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126


Portal 2
GTX590 vs. HD7970 = +44%
HD7970 vs. GTX580 = +9%



BF3
GTX590 vs. HD7970 = +42%
HD7970 vs. GTX580 = +5%



Starcraft 2
GTX590 vs. HD7970 = +38%
HD7970 vs. GTX580 = +2%



Civ 5 (1200P bench was not available)
GTX590 vs. HD7970 = +35%
HD7970 vs. GTX580 = +13%

Average performance advantage at 1920x1200:
GTX590 vs. HD7970 = +32%
HD7970 vs. GTX580 = +14%
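
For anyone who wants to check the averages, here is a quick Python sketch. It assumes the two averages are plain arithmetic means of the nine per-game percentages listed in these posts; the dictionary and function names are made up for illustration.

```python
# Per-game percentage leads at 1920x1200, copied from the lists above.
gtx590_over_hd7970 = {
    "Crysis: Warhead": 27, "Metro 2033": 18, "Dirt 3": 41,
    "Shogun 2": 15, "Batman: AC": 26, "Portal 2": 44,
    "BF3": 42, "Starcraft 2": 38, "Civ 5": 35,
}
hd7970_over_gtx580 = {
    "Crysis: Warhead": 21, "Metro 2033": 21, "Dirt 3": 6,
    "Shogun 2": 37, "Batman: AC": 14, "Portal 2": 9,
    "BF3": 5, "Starcraft 2": 2, "Civ 5": 13,
}


def mean_lead(leads):
    """Arithmetic mean of the per-game percentage leads."""
    return sum(leads.values()) / len(leads)


avg_590 = mean_lead(gtx590_over_hd7970)
avg_7970 = mean_lead(hd7970_over_gtx580)

print(f"GTX590 vs. HD7970 = +{avg_590:.0f}%")
print(f"HD7970 vs. GTX580 = +{avg_7970:.0f}%")
```

A simple mean of percentages weights every game equally, so one outlier title can move the result a fair bit.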

GTX590 and HD6990 completely dominate the HD7970. So if any GPU is as fast as those 2 at stock speeds, that would be insanely impressive.

I would hardly call outperforming your competition by 20-30% and nearly doubling the performance of your previous generations single GPU a failure.

Ya, seriously. HD7970 is only 40% faster than HD6970 and people found that to be impressive despite a $200 price hike. But suddenly beating GTX580 by anything less than 2x is unacceptable? If Kepler is "only" 60% faster than GTX580, it would already wipe the floor with the 7970.
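
To put a rough number on that last point, here is a small Python sketch. It assumes the average leads compose multiplicatively and uses the +14% and +32% averages from this post; the hypothetical Kepler figure is just the 60% mentioned above, not a leak or a prediction.

```python
# Average leads at 1920x1200 quoted in this post, as fractions.
HD7970_OVER_GTX580 = 0.14  # HD7970 is ~14% faster than GTX580
GTX590_OVER_HD7970 = 0.32  # GTX590 is ~32% faster than HD7970


def lead_over_hd7970(lead_over_gtx580):
    """Convert a lead over GTX580 into the implied lead over HD7970."""
    return (1 + lead_over_gtx580) / (1 + HD7970_OVER_GTX580) - 1


# Hypothetical Kepler that is 60% faster than GTX580.
kepler_vs_7970 = lead_over_hd7970(0.60)

# Lead over GTX580 that a single GPU would need in order to match GTX590.
needed_vs_580 = (1 + GTX590_OVER_HD7970) * (1 + HD7970_OVER_GTX580) - 1

print(f"Kepler at +60% over GTX580 -> +{kepler_vs_7970:.0%} over HD7970")
print(f"Matching GTX590 needs +{needed_vs_580:.0%} over GTX580")
```

Under these assumptions, a Kepler that is 60% faster than GTX580 would land about 40% ahead of the 7970, and matching GTX590 would take roughly a 50% gain over GTX580.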
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
Yeah maybe I was underestimating the gtx590, but I still think Kepler has a legitimate shot at being faster than it. Fermi included so many big changes, and its potential IMO was never fully realized on 40nm (hence the big overclocks the gtx580 can attain, but at the cost of a very large power draw increase). With JHH continuing to preach how much more performance per watt Kepler will have over Fermi (I'd hope some of the improvements apply not only to compute but to graphics as well), I'm hoping they will have a lot more wiggle room when it comes to clocking the chips high and keeping the power consumption at acceptable levels.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Then there is the matter of TSMC's ~1.6x (optimistic) gain.
This is from a TSMC spokesperson, giving an "optimistic" outlook on gains.

I believe you are talking about the following,



http://blog.shrinkingviolence.com/2010/02/TSMC-40nm-28nm-yield.html
TSMC’s senior vice president of R&D, Shang-yi Chiang

“There are misunderstandings that TSMC introduced RDR mainly because of the gate-last [structure]. I’d like to share with you the reason we use RDR had little to do with the choice of gate-last. The [reason] is more for the performance and for the control which will result in [better] yield and cost.”

There is a penalty to introducing this kind of restriction: gate density plummets. Chiang said a 70 per cent linear shrink generally leads to a two-fold improvement in areal density from process generation to generation. Factor in the restricted design rules, and the areal improvement is just 1.6x. “So there’s a 40 per cent layout penalty and that’s very, very significant. However, this can be recovered if we adopt a different layout style, and it’s not difficult to do,” Chiang claimed.

It’s not clear what this style looks like and how it affects standard-cell design in particular. But Chiang added: “TSMC has demonstrated that it will be very willing to share with our customers on how we did that and I’m sure there is more than one way to do that.”



AMD Cayman: 2.64B transistors at 389mm2 = about 6.79 million transistors per mm2
AMD GCN: 4.31B transistors at 365mm2 = about 11.8 million transistors per mm2

AMD got 1.74x higher density from TSMC 28nm. Since 28nm with RDR gives 1.95x the density of the previous 40nm, I will guess that GCN's transistors are bigger than Cayman's, which is why they can easily operate at more than 1GHz.
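
The density numbers can be reproduced with a few lines of Python. This is only a sketch using the transistor counts and die areas given in this post; the dictionary layout and function name are illustrative.

```python
# Transistor counts and die areas as given in this post.
chips = {
    "Cayman (40nm)": {"transistors": 2.64e9, "area_mm2": 389},
    "GCN (28nm)": {"transistors": 4.31e9, "area_mm2": 365},
}


def density_mtr_per_mm2(chip):
    """Transistor density in millions of transistors per mm^2."""
    return chip["transistors"] / chip["area_mm2"] / 1e6


cayman = density_mtr_per_mm2(chips["Cayman (40nm)"])
gcn = density_mtr_per_mm2(chips["GCN (28nm)"])

print(f"Cayman: {cayman:.2f} M transistors/mm^2")
print(f"GCN:    {gcn:.2f} M transistors/mm^2")
print(f"Density gain: {gcn / cayman:.2f}x")
```

The gain comes out at about 1.74x, below the 1.95x assumed for the process, which is what the guess about larger GCN transistors rests on.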

520mm^2 on 40nm -----> 520/1.6 = 325mm^2 on 28nm.

Since a 520mm^2 chip has 512 cores, a 1024-core chip would likely be twice as big.

Okay.... 325mm^2 doubled up = 650mm^2.



Then you add in new features that might take up more space, etc..... yeah...
A 1024-core Fermi for the 680 is definitely out of the picture.

With 1.95x density and by keeping the same transistor characteristics, a 28nm GF110 could be close to 260-280mm2.

Doubling the core count will not double the die size if we keep the rest (MC, L2, front end, etc.) the same. But because we will add 2 more memory controllers, plus 256KB of L2, plus 2 more render back ends, plus some more features, I will guess it will raise the die size by 80 to 90%.

That will make a 1024-core Kepler close to 500-550mm2.
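
Here is a rough Python sketch of both die-size estimates side by side. It assumes a 520mm2 GF110 on 40nm as in the quoted post, and treats the density gains (1.6x and 1.95x) and the 80-90% growth figure as given; the helper names are made up for illustration.

```python
GF110_AREA_40NM = 520.0  # mm^2, the figure used in the quoted post


def shrink(area_mm2, density_gain):
    """Area after a process shrink with the given areal density gain."""
    return area_mm2 / density_gain


def grow(area_mm2, growth):
    """Area after adding hardware that enlarges the die by `growth` (0.8 = +80%)."""
    return area_mm2 * (1 + growth)


# Quoted estimate: 1.6x density, and doubling the cores doubles the die.
quoted = grow(shrink(GF110_AREA_40NM, 1.6), 1.0)

# This post's estimate: 1.95x density, then +80% to +90% for a 1024-core part.
shrunk = shrink(GF110_AREA_40NM, 1.95)
low, high = grow(shrunk, 0.80), grow(shrunk, 0.90)

print(f"Quoted estimate:  {quoted:.0f} mm^2")
print(f"28nm GF110:       {shrunk:.0f} mm^2")
print(f"1024-core Kepler: {low:.0f}-{high:.0f} mm^2")
```

Starting from exactly 520/1.95 gives roughly 480-507mm2, slightly under the 500-550mm2 guess, which starts from a 260-280mm2 shrink and leaves headroom for extra features.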



That said, Nvidia doesn't need that big an increase to beat the 7970.
I'm guessing 768 cores or so..... which should put it around 20-25% faster than the 7970.

NV will have to compete with its own GF110 Fermi first. Tesla/Quadro products will not look impressive with only 25% higher performance than the previous generation, and that would have a tremendous negative effect on HPC product sales. Not to mention that desktop products will not have a big lead against the competition with only 25% higher performance over the GTX580.
 

notty22

Diamond Member
Jan 1, 2010
3,375
0
0
I agree, that design goals have to target /compelling reasons to upgrade current Fermi workstation cards.
..............below is a copy and paste................

Nvidia Kepler and Maxwell will Focus Mainly on Performance per Watt


Nvidia Kepler to arrive by 2012


Nvidia in the event of GPU Technology Conference (GTC) 2011, has revealed some details on upcoming GPUs based on Kepler and Maxwell. It is expected that Kepler will be released sometime in 2012, and will be manufactured on a 28nm process.
Nvidia’s Maxwell to Offer 16 Times Higher Performance than Fermi


Nvidia claimed that the Fermi architecture would achieve a typical double precision (DP) performance of 1.5GFLOPS per watt. In the Kepler architecture the performance per watt is increased by about 3 to 4 times, hence we can expect 1.125TFLOPS – 1.50TFLOPS of DP performance from a chip with a 250W thermal design power.
What’s new with Nvidia Kepler?


Last month, Nvidia’s CEO, Jen-Hsun Huang, revealed that Kepler will have lower power consumption than current GPUs based on the Fermi architecture and is also expected to provide high energy efficiency. Kepler has promised to improve the capability for GPU computing and enhance programming features that simply double the number of stream processors. Kepler, ultimately, is expected to have a slightly different design from the current Fermi architecture in order to meet all their objectives.
Future of Nvidia Kepler and Maxwell


The specifications and performance of Kepler have not been revealed yet, but it is expected to have features similar to Fermi. Echelon will be its future chip with 1024 cores and 10 teraflops of compute power, which will mainly focus on high performance computing. It will run on 128 mini-Streaming Multiprocessors (SM), each consisting of 8 cores.
http://lenzfire.com/2011/12/nvidia-...l-focus-mainly-on-performance-per-watt-81669/
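
The 1.125-1.50 TFLOPS range in the article is just the Fermi efficiency figure scaled up and multiplied by the power budget. A minimal Python sketch, assuming the article's numbers (1.5 GFLOPS per watt, a 3x to 4x gain, 250W) and a made-up function name:

```python
FERMI_DP_GFLOPS_PER_WATT = 1.5  # figure claimed in the article
TDP_WATTS = 250                 # thermal design power used in the article


def projected_dp_tflops(perf_per_watt_gain):
    """Projected DP throughput in TFLOPS for a given gain over Fermi."""
    gflops = FERMI_DP_GFLOPS_PER_WATT * perf_per_watt_gain * TDP_WATTS
    return gflops / 1000


for gain in (3, 4):
    print(f"{gain}x perf/watt -> {projected_dp_tflops(gain):.3f} TFLOPS DP")
```

Note that this says nothing about gaming performance; it only restates the DP-per-watt claim at a fixed power budget.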
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
notty22, great slides there. I've never seen those before. I am seeing a lot of emphasis on Performance/Watt and optimization for lower power consumption & efficiency. Perhaps NV doesn't want a 300W monster card after all.

The 1.125TFLOPS – 1.50TFLOPS of DP performance by a chip with 250W thermal design power sounds impressive for GPGPU compute applications, but consumer GPUs will for sure be capped at 1/4-1/8th of their SP. That's how NV separates their Tesla line from consumer GeForce line.

Also, we know that Fermi was a monster for GPGPU compute tasks but compared to even Cypress (HD5870), its double precision performance was very weak:

Techreport
GTX480 Single Precision = 1.35 Tflops
GTX480 Double Precision = 168 Gflops

vs.

HD5870 Single Precision = 2.72 Tflops
HD5870 Double Precision = 544 Gflops

I think if we just start comparing double-precision performance, that by itself won't tell us a lot about real world improvements in GPGPU computation. After all, Fermi was superior for GPGPU tasks vs. HD5870 with only 30% of AMD's DP performance!
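
For reference, the DP-to-SP ratios behind those Techreport numbers can be worked out in Python. This is only a sketch using the four figures listed above; the dictionary keys and function name are illustrative.

```python
# Peak throughput figures listed above, in Gflops.
cards = {
    "GTX480": {"sp_gflops": 1350, "dp_gflops": 168},
    "HD5870": {"sp_gflops": 2720, "dp_gflops": 544},
}


def dp_to_sp_ratio(card):
    """Fraction of single-precision peak available in double precision."""
    return card["dp_gflops"] / card["sp_gflops"]


for name, card in cards.items():
    ratio = dp_to_sp_ratio(card)
    print(f"{name}: DP is {ratio:.1%} of SP (about 1/{1 / ratio:.0f})")

# Fermi's DP peak as a share of Cypress's DP peak.
fermi_share = cards["GTX480"]["dp_gflops"] / cards["HD5870"]["dp_gflops"]
print(f"GTX480 DP is {fermi_share:.0%} of HD5870 DP")
```

The GTX480 works out to about 1/8 of its SP rate and the HD5870 to 1/5, with the GTX480 at roughly 31% of the HD5870's DP peak.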
 

Eisenstein

Junior Member
Dec 25, 2011
6
0
61
I agree, that design goals have to target /compelling reasons to upgrade current Fermi workstation cards.
..............below is a copy and paste................

Nvidia Kepler and Maxwell will Focus Mainly on Performance per Watt


Nvidia Kepler to arrive by 2012


Nvidia in the event of GPU Technology Conference (GTC) 2011, has revealed some details on upcoming GPUs based on Kepler and Maxwell. It is expected that Kepler will be released sometime in 2012, and will be manufactured on a 28nm process.
Nvidia’s Maxwell to Offer 16 Times Higher Performance than Fermi


Nvidia claimed that the Fermi architecture would achieve a typical double precision (DP) performance of 1.5GFLOPS per watt. In the Kepler architecture the performance per watt is increased by about 3 to 4 times, hence we can expect 1.125TFLOPS – 1.50TFLOPS of DP performance from a chip with a 250W thermal design power.
What’s new with Nvidia Kepler?


Last month, Nvidia’s CEO, Jen-Hsun Huang, revealed that Kepler will have lower power consumption than current GPUs based on the Fermi architecture and is also expected to provide high energy efficiency. Kepler has promised to improve the capability for GPU computing and enhance programming features that simply double the number of stream processors. Kepler, ultimately, is expected to have a slightly different design from the current Fermi architecture in order to meet all their objectives.
Future of Nvidia Kepler and Maxwell


The specifications and performance of Kepler have not been revealed yet, but it is expected to have features similar to Fermi. Echelon will be its future chip with 1024 cores and 10 teraflops of compute power, which will mainly focus on high performance computing. It will run on 128 mini-Streaming Multiprocessors (SM), each consisting of 8 cores.
http://lenzfire.com/2011/12/nvidia-...l-focus-mainly-on-performance-per-watt-81669/


Somehow I'm a little skeptical about Maxwell being 16x... Leaked slides etc show Kepler to be a max of 2x faster and for Maxwell to be 16x that would require an 8x jump in just one more generation. I think 4x performance for Maxwell is more realistic...
 

Eisenstein

Junior Member
Dec 25, 2011
6
0
61
notty22, great slides there. I've never seen those before. I am seeing a lot of emphasis on Performance/Watt and optimization for lower power consumption & efficiency. Perhaps NV doesn't want a 300W monster card after all.

The 1.125TFLOPS – 1.50TFLOPS of DP performance by a chip with 250W thermal design power sounds impressive for GPGPU compute applications, but consumer GPUs will for sure be capped at 1/4-1/8th of their SP. That's how NV separates their Tesla line from consumer GeForce line.

Also, we know that Fermi was a monster for GPGPU compute tasks but compared to even Cypress (HD5870), its double precision performance was very weak:

Techreport
GTX480 Single Precision = 1.35 Tflops
GTX480 Double Precision = 168 Gflops

vs.

HD5870 Single Precision = 2.72 Tflops
HD5870 Double Precision = 544 Gflops

I think if we just start comparing double-precision performance, that by itself won't tell us a lot about real world improvements in GPGPU computation. After all, Fermi was superior for GPGPU tasks vs. HD5870 with only 30% of AMD's DP performance!



That is because AMD's so-called performance was theoretical performance. VLIW5 and VLIW4 give you mad projected performance if you assume that the compiler can optimize perfectly by packing every instruction with 5 or 4 (respectively) meaningful ops. This is rarely the case. Hence NVIDIA's SIMD is actually superior, though it doesn't appear as such in these specification tables.
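
To illustrate the packing point, here is a toy Python model. It assumes effective throughput scales linearly with the average number of VLIW slots the compiler manages to fill; the fill rates below are hypothetical, not measurements, and the model ignores memory bandwidth, clocks, and everything else that matters in practice.

```python
HD5870_PEAK_SP_TFLOPS = 2.72  # peak figure from the table quoted above
GTX480_PEAK_SP_TFLOPS = 1.35  # peak figure from the table quoted above
VLIW5_SLOTS = 5


def effective_tflops(peak_tflops, slots_per_bundle, avg_slots_filled):
    """Scale peak throughput by the fraction of VLIW slots doing useful work."""
    if not 0 <= avg_slots_filled <= slots_per_bundle:
        raise ValueError("avg_slots_filled must be between 0 and slots_per_bundle")
    return peak_tflops * avg_slots_filled / slots_per_bundle


# Hypothetical average fill rates, chosen only to show the trend.
for filled in (5.0, 3.5, 2.5):
    eff = effective_tflops(HD5870_PEAK_SP_TFLOPS, VLIW5_SLOTS, filled)
    print(f"{filled} of {VLIW5_SLOTS} slots filled -> {eff:.2f} Tflops effective")

# Fill rate at which the HD5870 would only match the GTX480's peak figure.
break_even_slots = GTX480_PEAK_SP_TFLOPS / HD5870_PEAK_SP_TFLOPS * VLIW5_SLOTS
print(f"Break-even: about {break_even_slots:.2f} slots per bundle")
```

In this toy model the HD5870's paper advantage disappears once fewer than about 2.5 of the 5 slots are filled on average, and that comparison already flatters the GTX480 by assuming it hits its own peak.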

Edit: Also my last post doesn't seem to have appeared. Based on leaked slides

I'd say Kepler will be ~2x Fermi. So I find it hard to believe Maxwell will be 16x

NVIDIA GTX 780 vs GTR 785 looks to be an interesting match up.
 

Borealis7

Platinum Member
Oct 19, 2006
2,914
205
106
one thing i did read about Kepler, is that nVidia will stop listing 3 different clock speeds (Core, Shaders, Memory, when Shaders = Core X 2) and will tie down the Shader clockspeed to the Core clock and only list 2 speeds.

it's been on a major hardware site i'll find the link.

(source pending)
 

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
one thing i did read about Kepler, is that nVidia will stop listing 3 different clock speeds (Core, Shaders, Memory, when Shaders = Core X 2) and will tie down the Shader clockspeed to the Core clock and only list 2 speeds.

it's been on a major hardware site i'll find the link.

(source pending)

It's been mentioned in the Beyond3D forums...
 

blastingcap

Diamond Member
Sep 16, 2010
6,654
5
76
Anybody know if Kepler can do at least 5760x1080 without the need for an adapter? Or if it does require one, what kind? I'm guessing mDP?
 

Jacky60

Golden Member
Jan 3, 2010
1,123
0
0
I reckon Kepler will demolish 7970 performance this time. Nvidia have far more money and more time than last time to deliver something serious. Also the fallout from the 480 cycle should have settled (it was a bit hot and not vastly superior to the 5870). I reckon 50% more performance than 7970 at 2560x1600.
 

Arzachel

Senior member
Apr 7, 2011
903
76
91
I reckon Kepler will demolish 7970 performance this time. Nvidia have far more money and more time than last time to deliver something serious. Also the fallout from the 480 cycle should have settled (it was a bit hot and not vastly superior to the 5870). I reckon 50% more performance than 7970 at 2560x1600.

Yeah, the 50% more performance sounds about right... but everything points to it being even later than Fermi. If Nvidia expected Kepler to come out ~6 months after SI, they wouldn't shrink Fermi and would just ride it out on 40nm - a mature process with cheaper wafers.

You don't pay for R&D on a product that gets obsoleted by your own so quickly. I believe that either Nvidia will only release Kepler as the top end part, filling the rest of the next gen lineup with shrinks, or the wait for Kepler will be "slightly" longer than some have anticipated.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
Yeah, the 50% more performance sounds about right... but everything points to it being even later than Fermi. If Nvidia expected Kepler to come out ~6 months after SI, they wouldn't shrink Fermi and would just ride it out on 40nm - a mature process with cheaper wafers.

Nvidia hasn't released any 28nm Fermi products - not on notebooks and not on desktops. The only rumors that existed regarding 28nm Fermi were for notebooks, and they pointed to a December launch, but that didn't happen. Shrinking an existing GF114, GF116, or GF118 GPU would likely have been a much easier task, and a quicker-to-market option, than AMD engineering and releasing its new Tahiti chip. I think it's safe to say that rumor was probably BS.

Kepler in some form or another is coming out this quarter, otherwise I do not think Apple would have signed Nvidia on to deliver GPUs for their various 2012 Mac models.
 

Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
Kepler in some form or another is coming out this quarter, otherwise I do not think Apple would have signed Nvidia on to deliver GPUs for their various 2012 Mac models.
Both AMD and Nvidia "rebadged" their mobile parts to the 7xxx and 6xx card names.
However, they're just the same old same old... with a new name.

It just means Apple is probably gonna use the "same" old 40nm mobile GPU tech for their laptops
as everyone else (for now).

Rumors are still saying around 6 months or more for Nvidia to launch its high end (AFAIK).
 

VulgarDisplay

Diamond Member
Apr 3, 2009
6,193
2
76
Anybody know if Kepler can do at least 5760x1080 without the need for an adapter? Or if it does require one, what kind? I'm guessing mDP?

By an adapter do you mean a second graphics card? To the best of my knowledge Nvidia can only do "surround" resolutions with dual GPUs. AMD's Eyefinity can be done on a single card.
 

bunnyfubbles

Lifer
Sep 3, 2001
12,248
3
0
Coming from a GTX580 (actually a pair of them, as just one won't cut it for the likes of Skyrim/BF3, etc.), the 7970 is rather disappointing (overclocking may just save it, we'll have to see). For Kepler, however, I am certainly hoping for something along the lines of 1024 shaders, 64 ROPs, 128 TMUs, assuming a 512-bit bus with GDDR5 or maybe a 256-bit bus with XDR2 (some rumors seem hopeful of that but I'm very skeptical).

Nvidia hasn't released any 28nm Fermi products - not on notebooks and not on desktops. The only rumors that existed regarding 28nm Fermi were for notebooks, and they pointed to a December launch, but that didn't happen. Shrinking an existing GF114, GF116, or GF118 GPU would likely have been a much easier task, and a quicker-to-market option, than AMD engineering and releasing its new Tahiti chip. I think it's safe to say that rumor was probably BS.

Kepler in some form or another is coming out this quarter, otherwise I do not think Apple would have signed Nvidia on to deliver GPUs for their various 2012 Mac models.

With Apple moving ever away from desktop/workstation units, high end Kepler is going to be largely irrelevant to them, their decision to move to nVidia will likely have nothing to do with GK100. So unless we see die shrunk Fermi re-badged as Kepler, it really won't mean much.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
With Apple moving ever away from desktop/workstation units, high end Kepler is going to be largely irrelevant to them, their decision to move to nVidia will likely have nothing to do with GK100. So unless we see die shrunk Fermi re-badged as Kepler, it really won't mean much.

Agreed that it may not have anything to do with high end Kepler, but I'll bet a paycheck Apple isn't using 40nm Fermi GPUs in 2012 Mac models, which means that (as I said) Kepler in some form or another is going to be released in Q1 2012. Traditionally Nvidia has always launched high end GPUs first. In fact, as far as I am aware, they have never led into a new product line with anything but the high end. So either this will be the first time Nvidia doesn't release high end GPUs first, or current "rumors" are considerably off.
 

blackened23

Diamond Member
Jul 26, 2011
8,548
2
0
It's time for "let's google 3 benchmarks out of 500 that support my argument"! We all know certain posters do this every day and now it's my turn. I can handpick a couple of benchmarks that show a 6970 faster than a GTX 580 just to make a ridiculous argument on a forum, but is the 6970 faster than the 580? No.









and my favorite

 