GP106 speculations


antihelten

Golden Member
Feb 2, 2012
1,764
274
126
GP106 should be much faster than a GTX 970:
GP106: 1280C x 1750 MHz x 2 = 4.5 TFLOPs
GM204: 1664C x 1200 MHz x 2 = 3.99 TFLOPs

That alone is a 15% increase in compute performance, nearly on par with the GTX 980. Combine this with 1.33x (Single Pass Stereo) and 1.33x (Lens Matched Shading) and GP106 should be roughly 2x faster than the GTX 970 in VR.

By that logic the 1080 should be about 45% faster than the 980 Ti, and yet it is only about 32% faster.

1080: 2560C x 1750 MHz x 2 = 8.96 TFLOPs
980 Ti: 2816C x 1100 MHz x 2 = 6.20 TFLOPs

8.96 / 6.20 = 1.45

So we're looking at a performance deficit of about 9% (1.32/1.45=0.91). This will more or less cancel out the 15% increase in theoretical performance, and thus GP106 should end up about even with a 970.

The various VR specific features come on top of this of course.
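For anyone who wants to redo the arithmetic in this exchange, here is a minimal Python sketch of the FLOPS-and-deficit reasoning. The core counts, clocks, and the 1.32x real-world gap are simply the figures quoted above, treated as rough inputs rather than measured facts.

```python
def tflops(cores, clock_mhz):
    """Theoretical FP32 throughput: cores x clock x 2 FLOPs (FMA) per cycle."""
    return cores * clock_mhz * 2 / 1e6   # result in TFLOPS

gp106    = tflops(1280, 1750)   # ~4.48 TFLOPS
gtx970   = tflops(1664, 1200)   # ~3.99 TFLOPS
gtx1080  = tflops(2560, 1750)   # ~8.96 TFLOPS
gtx980ti = tflops(2816, 1100)   # ~6.20 TFLOPS

theoretical_gap = gtx1080 / gtx980ti             # ~1.45x
actual_gap      = 1.32                           # observed 1080-vs-980 Ti performance gap
scaling_factor  = actual_gap / theoretical_gap   # ~0.91, i.e. a ~9% deficit

print(f"GP106 vs GTX 970, theoretical: {gp106 / gtx970:.2f}x")
print(f"Expected after deficit:        {(gp106 / gtx970) * scaling_factor:.2f}x")
```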
 

R0H1T

Platinum Member
Jan 12, 2013
2,582
162
106
The current GP106 leak, from sources that are relatively trustworthy, is that GP106 is 6GB, 192-bit, 90-100W, with ~970 and ~980 performance in its 2 bins, at $249-279 for the top bin. In other words, somewhat equal to Polaris 10, just with much better perf/watt and multi-projection for VR.
You keep parroting that as if it's true, and at the moment it's not. Also, with the super high clocks (you know, the 2 GHz boost that many dreamt of) on the 1080, all that efficiency gets thrown out of the window, which is the same reason the AIB models require additional power connectors.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
By that logic the 1080 should be about 45% faster than the 980 Ti, and yet it is only about 32% faster.

1080: 2560C x 1750 MHz x 2 = 8.96 TFLOPs
980 Ti: 2816C x 1100 MHz x 2 = 6.20 TFLOPs

8.96 / 6.20 = 1.45

So we're looking at a performance deficit of about 9% (1.32/1.45=0.91). This will more or less cancel out the 15% increase in theoretical performance, and thus GP106 should end up about even with a 970.

The various VR specific features come on top of this of course.

Clocks are a little off.

http://www.anandtech.com/show/9306/the-nvidia-geforce-gtx-980-ti-review/16

http://www.computerbase.de/2016-05/...bschnitt_bis_zu_1785_mhz_takt_unter_dauerlast

1080: 2560C x 1650 MHz x 2 = 8.45 TFLOPs
980 Ti: 2816C x 1150 MHz x 2 = 6.48 TFLOPs

The 1080 (FE) has about 30% more FLOPS. Pretty much in line with performance.
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
Current rumours from Chiphell/Baidu mention 60-65% GeForce GTX 1080 performance, which implies more than 1/2 of GP104 IMHO.
 

khon

Golden Member
Jun 8, 2010
1,319
124
106
You have no idea of the RX 480's power consumption. 150W is the max board power (75W from the PCIe slot + 75W from the 6-pin). It will most likely draw a lot less.

Edit: Interesting point.
https://youtu.be/PJ5wYVu-tM8?t=562

The calculation is wrong, since that same slide shows 51% utilization for the second RX 480.

So it shouldn't be 180/2, it should be 180/1.51 = ~120W, which would make perfect sense for a card with a single 6-pin connector, and is in line with earlier leaks of 110-130W power.
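A minimal sketch of the calculation this post proposes, assuming the quoted 180W combined draw and the 51% utilization figure from the slide are accurate (a later post in this thread disputes this method):

```python
# Per-card power estimate as proposed above: treat the second card's 51%
# utilization as 0.51 of a card's worth of work, so 180 W covers 1.51 "cards".
combined_power_w = 180
second_card_utilization = 0.51

effective_cards = 1 + second_card_utilization          # 1.51
power_per_card_w = combined_power_w / effective_cards  # ~119 W

print(f"Estimated per-card power: {power_per_card_w:.0f} W")
```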
 

renderstate

Senior member
Apr 23, 2016
237
0
0
Except price of course. And GCN can also render to multiple viewports in a single pass (aka "simultaneous multi-projection" in marketing terms); even Maxwell can.



Not true. Maxwell can render to multiple viewports at once in a single pass and at full speed; GCN can only select one viewport out of many. If you render to multiple viewports at once on GCN you have to use a geometry shader, and performance goes in the gutter.

Pascal can render both views (left and right eye) at once at full speed. Viewports *are not* views; you can't render the left and right eye at once on Maxwell, because viewports only allow simple 2D transformations, while stereo rendering requires more complex transformations that only Pascal supports in one pass and at full speed.
 

renderstate

Senior member
Apr 23, 2016
237
0
0
Then why don't they do it?



Because they can't without going much slower; otherwise they would expose it in LiquidVR. GCN has always been bad at geometry shaders. There is hope that the RX 480 fixes this, since AMD is emphasizing VR so much for this part that you'd expect them to have something more than great pricing to show for it.
 

khon

Golden Member
Jun 8, 2010
1,319
124
106
My guess for the 1060 is that it will be half of a 1080, just like the 960 was half of a 980. But this time around it will have a more adequate amount of memory.

In the matchup against the RX 480, I think the 1060 will be a little more expensive and have slightly worse performance, but better power efficiency and feature set. Also, I think the 1060 will be more readily available, since TSMC is a much more reliable supplier than GF.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
The GTX 1080 FE doesn't run at 1750 MHz. The actual clock is way lower: http://www.computerbase.de/2016-05/...bschnitt_bis_zu_1785_mhz_takt_unter_dauerlast

GP106 will also have a 256-bit memory interface (based on shipping information from Zauba and pictures of Drive PX2).

You're the one who used the 1750 number in your original post, so I just went with that.

Besides, it's only the Founders Edition that throttles.

Clocks are a little off.

http://www.anandtech.com/show/9306/the-nvidia-geforce-gtx-980-ti-review/16

http://www.computerbase.de/2016-05/...bschnitt_bis_zu_1785_mhz_takt_unter_dauerlast

1080: 2560C x 1650 MHz x 2 = 8.45 TFLOPs
980 Ti: 2816C x 1150 MHz x 2 = 6.48 TFLOPs

The 1080 (FE) has about 30% more FLOPS. Pretty much in line with performance.

As I said above, I just used the same clocks that Sontin used, and as mentioned, throttling is only an issue with the Founders Edition.

Also, you shouldn't mix and match reviews, since they often use different settings and test cases.

TPU, who test on an open-air bench (a case with one side removed), got 1783 MHz for the 1080 and 1128 MHz for the 980 Ti, a theoretical increase of 43.7%. The actual performance gap is only 31.6%, though, so a performance deficit of 8.4%. Computerbase.de also tested with the temperature limit raised to prevent throttling and got 1778 MHz for the 1080 and 1101 MHz for the 980 Ti, a 46.8% theoretical increase and an actual performance increase of 37%, so a deficit of about 6.5%. Using stock settings, Computerbase.de shows a deficit of 5.5%.

Regardless of whether the deficit is 5.5%, 6.5%, or 8.4%, GP106 would still only end up about even with a 970, since basically every 970 in existence boosts higher than the 1200 MHz number Sontin used. The lowest-boosting 970 I could find averaged 1280 MHz, which would put GP106's theoretical advantage at only 5%. Even using the 1200 MHz number that Sontin posted, you still only end up with a 12% theoretical gap, which after the aforementioned deficit would shrink to about 3-6%.
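A quick numerical check of the figures above, using the TPU clocks and the 1280 MHz 970 boost clock quoted in this post; these are review-reported clocks, not guaranteed values:

```python
# Ratio of theoretical FP32 throughput (cores x clock x 2 FLOPs; the factor of 2 cancels).
def flops_ratio(cores_a, mhz_a, cores_b, mhz_b):
    return (cores_a * mhz_a) / (cores_b * mhz_b)

# GTX 1080 vs 980 Ti at TPU's observed clocks
theoretical = flops_ratio(2560, 1783, 2816, 1128)   # ~1.44x
actual = 1.316                                      # measured performance gap
print(f"Deficit vs theory: {1 - actual / theoretical:.1%}")         # ~8.4%

# GP106 (1280C @ 1750 MHz) vs a GTX 970 boosting to an average of 1280 MHz
print(f"GP106 vs 970: {flops_ratio(1280, 1750, 1664, 1280):.2f}x")  # ~1.05x theoretical
```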

The calculation is wrong, since that same slide shows 51% utilization for the second RX 480.

So it shouldn't be 180/2, it should be 180/1.51 = ~120W, which would make perfect sense for a card with a single 6-pin connector, and is in line with earlier leaks of 110-130W power.

The 51% utilization number was only for the normal batches. The medium and heavy batches had 71.9% and 92.3% utilization respectively, so the average would be about 71.7% (assuming the benchmark spends an equal amount of time on all batch sizes).

Secondly, you can't use the utilization percentage to predict power usage in the manner that you did.

Unless something weird was going on (one GPU having a higher average load than the other, or running hotter), the power usage should be identical for the two GPUs. So if they used 180W combined, then each one would use 90W on its own.

If the average utilization is 71.7%, then at 98% utilization (what the 1080 showed as a single GPU) the power usage would be 90W * 98% / 71.7% = 123W. This of course assumes that the RX 480 can achieve the same level of utilization as the 1080, and that power usage scales linearly with utilization, neither of which is certain.
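A minimal sketch of this corrected estimate, assuming (as the post does) that power scales linearly with utilization and that each card drew half of the quoted 180W:

```python
# Weighted utilization across the three batch sizes, then scale 90 W per card
# up to the 98% single-GPU utilization the 1080 showed.
utilizations = [0.51, 0.719, 0.923]                 # normal / medium / heavy batches
avg_util = sum(utilizations) / len(utilizations)    # ~0.717

power_per_card_w = 180 / 2                          # 90 W each at avg_util
target_util = 0.98                                  # single-GPU utilization of the 1080

estimated_power_w = power_per_card_w * target_util / avg_util   # ~123 W
print(f"Average utilization: {avg_util:.1%}; estimated single-card power: {estimated_power_w:.0f} W")
```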
 
Last edited:

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
Current rumours from Chiphell/Baidu mention 60-65% GeForce GTX 1080 performance, which implies more than 1/2 of GP104 IMHO.

60-65% of GTX 1080 performance is about on par with a GTX 980. That could be achieved with 1280 shaders at higher clocks; a 2000 MHz boost clock would easily do it.
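A back-of-the-envelope check of that claim, using the ~1650 MHz sustained GTX 1080 clock cited earlier in the thread; the 2000 MHz GP106 boost is of course purely hypothetical at this point:

```python
# Theoretical FP32 throughput: shaders x clock (MHz) x 2 FLOPs per cycle -> TFLOPS
gp106_tflops   = 1280 * 2000 * 2 / 1e6   # ~5.12 TFLOPS at a hypothetical 2 GHz boost
gtx1080_tflops = 2560 * 1650 * 2 / 1e6   # ~8.45 TFLOPS at the cited sustained clock

print(f"GP106 @ 2 GHz vs GTX 1080: {gp106_tflops / gtx1080_tflops:.0%}")   # ~61%
```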

Remember, whatever the raw silicon specs for GP106 are, they would have been finalized long ago. Clock speeds, on the other hand, can be tweaked almost right up until the moment of release as market needs dictate.

Besides, I can't see any shader count except 1280 making much sense, given the current positioning of GP104.

GTX 1080 has 2560 shaders; GTX 1070 has 1920. The same rumor sites that accurately predicted these figures indicate that there will also be a third-tier GP104 salvage part, probably OEM-only. That part will probably have a shader count between 1536 and 1664. Any lower and it would be encroaching on GP106, any higher and there wouldn't be enough to differentiate it from GTX 1070.

In theory, they could use modified GPCs with 384 CUDA cores, and give GP106 four such GPCs for a total of 1536 shaders. But that would result in a part that comes too close to GP104. Why would they spend extra on that die space? GM206 wasn't very popular among the enthusiasts who read AnandTech Forums, but it sold extremely well. A GP106 with GM204/Hawaii-level performance would be a good product, and would sell well even if Polaris 10 beat it in one or more significant metrics. And every mm^2 over what it takes to provide reasonably competitive performance is, from Nvidia's perspective, wasted money.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
My guess for the 1060 is that it will be half of a 1080, just like the 960 was half of a 980. But this time around it will have a more adequate amount of memory.

I think there's a good chance we will see 3GB and 6GB versions of the full chip (assuming the 192-bit bus rumors are true). The cut-down version (GTX 1050 or whatever they call it) will probably be officially 3GB only, though a few AIBs may offer 6GB versions anyway.

In the matchup against the RX 480, I think the 1060 will be a little more expensive and have slightly worse performance, but better power efficiency and feature set. Also, I think the 1060 will be more readily available, since TSMC is a much more reliable supplier than GF.

We'll see. I think it's going to be close, but Polaris 10 will have a slight edge in perf/watt. A lot of this depends on what clock rates AMD and Nvidia decide to use for competitive reasons. A boost clock of 2000 MHz on GP106 would at least rival, if not defeat, Polaris 10 in performance (assuming there isn't a hidden 2560 SP version in AMD's back pocket), but at that clock rate, expect to see it approach 150W and occasionally throttle if the power limit is set for 1x6-pin.

As for availability, TSMC unquestionably has a more mature process, but don't forget that Nvidia is only one of their many customers, and far from the most important. Apple always takes precedence. GloFo may have lower yields until they get their act together, but AMD has contractual first dibs to their output. There's also the possibility (though not certainty) that AMD could resort to dual-sourcing from Samsung, since the GloFo 14LPP process is just a licensed version of Samsung's.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Why would Polaris 10 have an edge in perf/watt?
A GTX1070 has a 150W TDP rating and uses ~13xW in games.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
Why would Polaris 10 have an edge in perf/watt?
A GTX1070 has a 150W TDP rating and uses ~13xW in games.

This is the only 1070 review I've seen where they actually isolate the GPU power usage (the other sites that do this haven't posted their 1070 reviews yet).

They show the 1070 using 145-150W.

If the RX 480 is somewhere around 75-80% of the 1070's performance, then it would have to use less than 110-120W to beat the 1070 in efficiency.
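A quick sketch of that break-even point, taking the 145-150W figure and the assumed 75-80% relative performance at face value:

```python
# Power the RX 480 would need to stay under to match the 1070's perf/watt.
gtx1070_power_w = (145, 150)    # isolated GPU power range cited above
relative_perf   = (0.75, 0.80)  # assumed RX 480 performance relative to the 1070

for power, perf in zip(gtx1070_power_w, relative_perf):
    print(f"At {perf:.0%} of 1070 performance: break-even below {power * perf:.0f} W")
```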
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136

It peaks at almost exactly 150W. We need to compare apples to apples. Average (non-peak) gaming load is often much lower than TDP, for any GPU.

The bottom line is that we won't know complete details about perf/watt for Polaris 10 or GP106 until both are tested in the real world. I suspect for both products, clock rates were not finalized until the last minute, and this can make a big difference.
 

Stormflux

Member
Jul 21, 2010
140
26
91
If Lens Matched Shading is effective at reducing the number of shaded pixels, and if developers adopt it... even supposedly smaller SKUs based on GP106/7 might pull off some VR miracle.

If ANY of these video-card-specific VR technologies becomes widely adopted, it would be interesting. However, neither Nvidia's nor AMD's suite of tools has really been employed enough to say these aren't just marketing points. They have far worse support than regular SLI/CF at this point. I think it's safer to just bet on raw performance.
 

renderstate

Senior member
Apr 23, 2016
237
0
0
If ANY of these video-card-specific VR technologies becomes widely adopted, it would be interesting. However, neither Nvidia's nor AMD's suite of tools has really been employed enough to say these aren't just marketing points. They have far worse support than regular SLI/CF at this point. I think it's safer to just bet on raw performance.



True, but if you plan to buy a new GPU every two to three years and you are interested in VR, you probably want to go with the most future-proof one, although we just don't know which one that is yet.

People here make a big fuss about 5-10% better performance from async compute, but the VR-related features in Pascal could easily boost performance by 50% or more once adopted. I guess VR is not mainstream enough (yet).
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Now, I think we do know it: currently, the most future-proof option for VR == the fastest thing possible.

I know they set the 970 as the original standard, but I imagine they'll want more than that soon enough (or ideally now) - I think they're going to scale the headset resolutions quite fast?

I wouldn't really be sure about even the 1080. Really high resolutions/refresh rates probably make for massive memory bandwidth requirements, so big Pascal/Vega etc. in about a year's time, perhaps.

That's for VR gaming, mind you; I'm very unsure what the requirements for stuff like smooth VR videos/tourism/walking simulators are going to be like. That sort of stuff could very easily be where it breaks into the mainstream.
 

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
I don't think a 1060 perf/watt advantage is a given at all. The 1080 doesn't look to be the perf/watt improvement over Maxwell that the RX 480 is over 28nm GCN parts.

We still need more info, but if AMD's claims are true, then Polaris should have a perf/watt advantage.

Two RX 480s with 2x 8GB of less efficient VRAM were both faster and more power-efficient than a 1080 in that AoS demo.
 
Last edited:

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
So an average of about 135W in RotTR and The Witcher 3, 146W in Risen 3, 148W in Anno 2070, and 149W in Crysis 3.

That gives an average of 143W across those five games.

So the RX 480 would have to use 107-114W or less to have better efficiency (again assuming that it has 75-80% of the 1070's performance).

I think GP104 (1080/1070) may be more efficient (perf/watt) than Polaris 10. I am much more doubtful that GP106 will be. And it will be GP106 that is competing in Polaris 10's market segment. For an apples-to-apples comparison in the >300mm^2 segment, we will have to wait for Vega.

As I stated before, we saw this with Maxwell: GM204 was the most efficient 28nm GPU, with GM206 lagging behind in perf/watt. And back in the Kepler days, GTX 660 (full GK106) had a 140W TDP. I wouldn't be surprised to see the same thing again, especially since GP106 may be clocked as high as GP104 or even higher. As long as it fits on one 6-pin connector, it's not that big a deal for desktops; no one really cares much about whether it draws 125W or 150W when gaming. It's a bigger deal for laptops and other thermally-constrained designs, such as AIOs (iMac).
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Well yes, but said notebooks are probably a bigger market for these chips than desktops, so I think they'll at least make sure that GP106 can be a very efficient notebook chip.

It's entirely possible that both companies might clock them outside the optimal efficiency range for desktop, of course.
 