The technical merits: Polaris vs. Pascal


R0H1T

Platinum Member
Jan 12, 2013
2,582
162
106
In techpowerup charts 980 is significantly ahead of 980 ti, which is a much bigger chip. Performance per watt suffers when these things inevitably run into a bandwidth bottleneck, and all flagship cards do. So the size argument is dubious.

With Maxwell, Nvidia did a similar thing: they released the small 750 Ti first, instantly denying every claim AMD held over the budget market. Now, jumping one and a half nodes, we finally get a card from AMD that has the same (load) power efficiency as the 750 Ti. But it's pretty clear that AMD has yet to implement the kind of fine-grained power management and power gating that allowed Maxwell to basically jump a generation without a node shrink. This isn't a matter of sizes.
We'll know more about bottlenecks with the upcoming 470, though it's not just bandwidth that holds a design or a particular GPU back. Agreed that die size is not that important: the 750 Ti and 980 are more efficient than anything else in the Maxwell lineup at 1080p.
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
Here is something to think about - Nvidia got a lot more out of their die shrink than AMD did.

Consider the following:
GM204 - 5.2 billion transistors on a 28nm process
GP104 - 7.2 billion transistors on a 16nm process
GM200 - 8 billion transistors on a 28nm process

In terms of number of transistors, GP104 is less complicated than the GM200 in the GTX Titan X. However, it performs better while using less power.

Now consider the AMD scenario:
Tonga XT: 5 billion transistors on a 28nm process
Polaris 10: 5.7 billion transistors on a 14nm process
Hawaii XT: 6.2 billion transistors on a 28nm process

Polaris 10 has slightly fewer transistors than Hawaii XT, but despite being manufactured on a smaller process, does not outperform it. It can sometimes match an R9 390, but doesn't outright beat it. It does use less power, but does not perform better than an R9 390X.

So, Nvidia got more out of their die shrink/rearchitecture. They were able to drastically reduce power consumption while simultaneously improving performance over the previous generation. AMD was not able to do both at the same time, so they opted to reduce power consumption.
 

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
Here is something to think about - Nvidia got a lot more out of their die shrink than AMD did.

Consider the following:
GM204 - 5.2 billion transistors on a 28nm process
GP104 - 7.2 billion transistors on a 16nm process
GM200 - 8 billion transistors on a 28nm process

In terms of number of transistors, GP104 is less complicated than the GM200 in the GTX Titan X. However, it performs better while using less power.

Now consider the AMD scenario:
Tonga XT: 5 billion transistors on a 28nm process
Polaris 10: 5.7 billion transistors on a 14nm process
Hawaii XT: 6.2 billion transistors on a 28nm process

Polaris 10 has slightly fewer transistors than Hawaii XT, but despite being manufactured on a smaller process, does not outperform it. It can sometimes match an R9 390, but doesn't outright beat it. It does use less power, but does not perform better than an R9 390X.

So, Nvidia got more out of their die shrink/rearchitecture. They were able to drastically reduce power consumption while simultaneously improving performance over the previous generation. AMD was not able to do both at the same time, so they opted to reduce power consumption.
Hawaii lacked many of the features that both Maxwell and Tonga had, such as memory compression and codecs for new video standards. Pascal also implements AVFS, which Hawaii didn't have. So I don't know if you can draw any conclusion from it by looking purely at perf/transistor.
 

biostud

Lifer
Feb 27, 2003
18,398
4,963
136
Do you think Nvidia can add all these DX12 features to their GPU without compromising some of the pure graphics workload efficiency? I doubt it.

Maybe not, but it seems they are very far ahead in efficiency, so even if they drop by 20% they are still ahead of AMD.

And if Nvidia can produce the same or better fps without these features, why would they spend a lot of resources implementing them?
 

renderstate

Senior member
Apr 23, 2016
237
0
0
I don't think it's the power gating, I think it's simply the fact that Maxwell stripped a lot of the compute resources from Kepler in order to achieve the efficiency it gets in purely graphical workloads.

Example:


If you calculated perf/watt in these types of workloads Maxwell's efficiency would be abysmal compared to Kepler and GCN.


So all AMD has to do is to remove support for fast double precision math on their gaming oriented GPUs? This is naive at best.

Clearly there is so much more to NVIDIA HW power efficiency that we don't necessarily know about and that we might never know.
 

Leadbox

Senior member
Oct 25, 2010
744
63
91
So all AMD has to do is to remove support for fast double precision math on their gaming oriented GPUs? This is naive at best.

Clearly there is so much more to NVIDIA HW power efficiency that we don't necessarily know about and that we might never know.

AMD pairs up SPs for DP; as far as I know they don't use a separate, removable block like NV does. Their marketing missed another trick here: this little thing is anywhere from 9-29% faster in DX12 than both the 970 and 980 (HC review). Along with all the VR talk, they really should be making more noise about that. Who buys GPUs to play Crysis 3 in 2016? Yet it is these dated titles that are used to inform our opinion on what's faster or better.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Here is something to think about - Nvidia got a lot more out of their die shrink than AMD did.

Consider the following:
GM204 - 5.2 billion transistors on a 28nm process
GP104 - 7.2 billion transistors on a 16nm process
GM200 - 8 billion transistors on a 28nm process

In terms of number of transistors, GP104 is less complicated than the GM200 in the GTX Titan X. However, it performs better while using less power.

Now consider the AMD scenario:
Tonga XT: 5 billion transistors on a 28nm process
Polaris 10: 5.7 billion transistors on a 14nm process
Hawaii XT: 6.2 billion transistors on a 28nm process

Polaris 10 has slightly fewer transistors than Hawaii XT, but despite being manufactured on a smaller process, does not outperform it. It can sometimes match an R9 390, but doesn't outright beat it. It does use less power, but does not perform better than an R9 390X.

So, Nvidia got more out of their die shrink/rearchitecture. They were able to drastically reduce power consumption while simultaneously improving performance over the previous generation. AMD was not able to do both at the same time, so they opted to reduce power consumption.

You've got this the wrong way round.

Polaris 10 is a Tonga replacement and GP104 is a GM204 replacement.

So let's see how each of them compares against the older 28nm chip it is replacing.


Tonga XT: 5 billion transistors on a 28nm process
Polaris 10: 5.7 billion transistors on a 14nm process = 14% more transistors

GM204 - 5.2 billion transistors on a 28nm process
GP104 - 7.2 billion transistors on a 16nm process = 38,5% more transistors

Also take the TDP into consideration.

Tonga XT = 190W TDP
Polaris 10 = 150W TDP = -21%

vs

GM204 = 170W TDP
GP104 = 180W TDP = +6%

Now have a look at the reviews and see how much faster Polaris 10 (RX 480) is vs Tonga XT (R9 380X) and how much faster GP104 (GTX 1080) is vs GM204 (GTX 980).

I will take the Guru3d RX 480 review because it has all the cards we need.

http://www.guru3d.com/articles-pages/amd-radeon-r9-rx-480-8gb-review,1.html

1080p DX-11 + OpenGL

AMD

R9 380X = 561fps / 10 games = 56,1fps
RX 480 = 776fps / 10 games = 77,6fps

38,32% faster than R9 380X

Perf/watt
R9 380X = 56,1 /190W TDP = 0,29
RX 480 = 77,6 / 150W TDP = 0,51

RX 480 is 75,86% more efficient

perf/mm2
R9 380X = 56,1 / 359mm2 = 0,15
RX 480 = 77,6 / 232mm2 = 0,33

RX 480 has 122% (2.2x) higher perf/mm2

--------------------------

NVIDIA

GTX 980 = 858fps / 10 games = 85,8fps
GTX 1080 = 1348fps / 10 games = 134,8fps

57,10% faster than GTX 980

Perf/watt
GTX 980 = 85,8 /170W TDP = 0,50
GTX 1080 = 134,8 / 180W TDP = 0,74

GTX 1080 is 48% more efficient

perf/mm2
GTX 980 = 85,8 / 398mm2 = 0,21
GTX 1080 = 134,8 / 314mm2 = 0,42

GTX 1080 has 100% higher (2x) perf/mm2


---------------------------------------------------

Now, if you take the performance increase that the RX 480 and GTX 1080 got over the 28nm chips and divide it by the increase in transistor count:

RX 480 = 38,32% (increase of performance over Tonga XT) / 14% more transistors = 2,73 points of performance gain per point of transistor growth

GTX 1080 = 57,10% (increase of performance over GM204) / 38,5% more transistors = 1,48 points of performance gain per point of transistor growth

PS: I hope I haven't made any calculation mistakes.
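
For anyone who wants to re-run the arithmetic in this post, here is a rough Python sketch. The fps averages, TDPs, die sizes and transistor counts are the ones quoted above; the `Card` class and the helper names are made up for illustration, and TDP is used as a stand-in for measured power draw, exactly as the post does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    avg_fps: float   # 10-game 1080p average quoted in the post
    tdp_w: float     # board TDP, a stand-in for measured power draw
    die_mm2: float   # die area quoted in the post
    xtors_bn: float  # transistor count in billions


def pct_gain(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1.0) * 100.0


def compare(old: Card, new: Card) -> dict:
    """Generation-over-generation gains, all in percent."""
    return {
        "perf": pct_gain(new.avg_fps, old.avg_fps),
        "perf_per_watt": pct_gain(new.avg_fps / new.tdp_w, old.avg_fps / old.tdp_w),
        "perf_per_mm2": pct_gain(new.avg_fps / new.die_mm2, old.avg_fps / old.die_mm2),
        "xtors": pct_gain(new.xtors_bn, old.xtors_bn),
    }


r9_380x = Card("R9 380X", 56.1, 190, 359, 5.0)
rx_480 = Card("RX 480", 77.6, 150, 232, 5.7)
gtx_980 = Card("GTX 980", 85.8, 170, 398, 5.2)
gtx_1080 = Card("GTX 1080", 134.8, 180, 314, 7.2)

for old, new in ((r9_380x, rx_480), (gtx_980, gtx_1080)):
    gains = compare(old, new)
    summary = ", ".join(f"{key} {value:+.1f}%" for key, value in gains.items())
    print(f"{new.name} vs {old.name}: {summary}")
```

Because this keeps full precision instead of rounding the intermediate ratios to two decimals, a couple of figures come out a little lower than in the post (roughly 114% rather than 122% for the RX 480's perf/mm2 gain, for example). The ordering of the two vendors does not change.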
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
From my post above it is very clear that AMD went for the higher efficiency and performance increase per transistor (we could also call it higher IPC) and NVIDIA went for the absolute max Performance to retain the top performance GPU KING of the hill crown.
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
From my post above it is very clear that AMD went for the higher efficiency and performance increase per transistor (we could also call it higher IPC) and NVIDIA went for the absolute max Performance to retain the top performance GPU KING of the hill crown.
Very interesting. You could add every metric divided by the number of xtors.

Going by your numbers amd is actually in a better position than before, which is quite the opposite to what you hear.
 

biostud

Lifer
Feb 27, 2003
18,398
4,963
136
From my post above it is very clear that AMD went for the higher efficiency and performance increase per transistor (we could also call it higher IPC) and NVIDIA went for the absolute max Performance to retain the top performance GPU KING of the hill crown.

Although you have to take in the fact that AMD had a worse starting point, so the conclusion could just as well be that the 380X has a really crappy performance/watt ratio, and so the 480 will shine, when you compare it to a turd.

Hopefully AMD will fare better with Vega, but gcn seems to be an over complicated design compared to nvidia.
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Well, this is very misleading number play on AtenRa's part.
AMD managed to get 21% better perf/xtor (138.32/114)
NV managed to get 13% better perf/xtor (157,1/138,5)

But look at the base perf/xtor:
AMD Tonga 11.2 (56,1fps/5b xtors)
NV Maxwell 16.5 (85,8fps/5,2b xtors)

If you apply the improvements:
AMD 11,2*1,21 = 13.6, which is 2,4 more fps/b xtors
NV 16,5*1,13 = 18.7, which is 2,2 more fps/b xtors

AMD has almost twice the % improvement per xtor compared to NV, yet the absolute gap between them stayed almost exactly as it was.

Increasing a low base by 50% will sometimes yield a smaller benefit than increasing a high base by 10%. It depends on the ratio between those base values.
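
The relative-versus-absolute point can be checked with a few lines of Python. This is only a sketch: the fps averages and transistor counts are the ones quoted earlier in the thread, and the `cards` table and function names are hypothetical.

```python
from typing import Tuple

# (average fps, transistors in billions), as quoted earlier in the thread
cards = {
    "R9 380X": (56.1, 5.0),
    "RX 480": (77.6, 5.7),
    "GTX 980": (85.8, 5.2),
    "GTX 1080": (134.8, 7.2),
}


def fps_per_bn(name: str) -> float:
    """Average fps per billion transistors."""
    fps, xtors_bn = cards[name]
    return fps / xtors_bn


def generation_step(old: str, new: str) -> Tuple[float, float]:
    """Return (relative gain in %, absolute gain in fps per billion transistors)."""
    before, after = fps_per_bn(old), fps_per_bn(new)
    return (after / before - 1.0) * 100.0, after - before


amd_rel, amd_abs = generation_step("R9 380X", "RX 480")
nv_rel, nv_abs = generation_step("GTX 980", "GTX 1080")
gap_before = fps_per_bn("GTX 980") - fps_per_bn("R9 380X")
gap_after = fps_per_bn("GTX 1080") - fps_per_bn("RX 480")

print(f"AMD: {amd_rel:+.1f}% relative, {amd_abs:+.2f} fps per bn xtors")
print(f"NV:  {nv_rel:+.1f}% relative, {nv_abs:+.2f} fps per bn xtors")
print(f"NV lead: {gap_before:.2f} before, {gap_after:.2f} after")
```

The larger percentage gain on the AMD side is applied to a smaller base, so the absolute gains end up close to each other and the lead in fps per billion transistors barely moves.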
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Although you have to take in the fact that AMD had a worse starting point, so the conclusion could just as well be that the 380X has a really crappy performance/watt ratio, and so the 480 will shine, when you compare it to a turd.

Hopefully AMD will fare better with Vega, but gcn seems to be an over complicated design compared to nvidia.

The difference is that GCN is oriented more toward DX-12 workloads, while NVIDIA's Kepler, Maxwell and Pascal focused more on DX-11.

If you compare Tonga vs Maxwell in DX-12, the situation changes dramatically.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Well, this is very misleading number play on Aternra's part.
amd managed to get 21% better perf/xtor (138.32/114)
nv managed to get 13% better perf/xtor (157,1/138,5)

But because of the base perf/xtor,:
AMD tonga 11.2 (55,1fps/5b xtors)
NV maxwell 16.5 (85,8fps/5,2b xtors)

If you apply improvements:
AMD 11,2*1,21= 13.6 which is 2,4 more fps/b xtors
NV 16,5*1,13=18.7 which is 2,2 more fps/b xtors

While amd have almost 2 times the % improvement per xtor compared to nv, yet the difference between them stayed exactly as is were previously.

Increasing a low base by 50% will sometimes yield smaller benefit than increasing a high base by 10%. It depends on the ration between those base values.

As should be obvious, my analysis above only covers DX-11 plus one OpenGL game (Doom). If you compare in DX-12, the perf/transistor picture changes dramatically.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
You got this the wrong way,

Polaris 10 is a Tonga replacement and GP104 is a GM204 replacement.

So lets see how each of them are compared against the older 28nm Chips they are replacing.


Tonga XT: 5 billion transistors on a 28nm process
Polaris 10: 5.7 billion transistors on a 14nm process = 14% more transistors

GM204 - 5.2 billion transistors on a 28nm process
GP104 - 7.2 billion transistors on a 16nm process = 38,5% more transistors

Also take in to consideration the TDP.

Tonga XT = 190W TDP
Polaris 10 = 150W TDP = -21%

vs

GM 204 = 170W TDP
GP 104 = 180W TDP = +6%

Now have a look at the reviews and see how much faster Polaris 10 (RX 480) is vs Tonga XT (R9 380X) and how much faster GP104 (GTX 1080) is vs GM204 (GTX 980).

I will take the Guru3d RX 480 review because it has all the cards we need.

http://www.guru3d.com/articles-pages/amd-radeon-r9-rx-480-8gb-review,1.html

1080p DX-11 + OpenGL

AMD

R9 380X = 561fps / 10 games = 56,1fps
RX 480 = 776fps / 10 games = 77,6fps

38,32% faster than R9 380X

Perf/watt
R9 380X = 56,1 /190W TDP = 0,29
RX 480 = 77,6 / 150W TDP = 0,51

RX 480 is 75,86% more efficient

perf/mm2
R9 380X = 56,1 / 359mm2 = 0,15
RX 480 = 77,6 / 232mm2 = 0,33

RX 480 has 122% (2.2x) higher perf/mm2

--------------------------

NVIDIA

GTX 980 = 858fps / 10 games = 85,8fps
GTX 1080 = 1348fps / 10 games = 134,8fps

57,10% faster than GTX 980

Perf/watt
GTX 980 = 85,8 /170W TDP = 0,50
GTX 1080 = 134,8 / 180W TDP = 0,74

GTX 1080 is 48% more efficient

perf/mm2
GTX 980 = 85,8 / 398mm2 = 0,21
GTX 1080 = 134,8 / 314mm2 = 0,42

GTX 1080 has 100% higher (2x) perf/mm2


---------------------------------------------------

Now, if you take the performance increase of RX 480 and GTX 1080 they got over the 28nm chips and divide it by the increase of Transistor count

RX 480 = 38,32% (increase of performance over Tonga XT) / 14% more transistors = 2.73x higher performance over Tonga XT

GTX 1080 = 57,10% (increase of performance over GM204) / 38,5% more transistors = 1,48x higher performance over GM204

ps. hope I havent made any calc mistakes

Definitely interesting metrics to keep in mind. Tonga was a turd though, so I'm sure that's bumping the numbers some. At the same time, this is the Tonga successor, so the comparison makes the most sense from a pure product-slot standpoint. Good post.
 

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
Tonga was the best selling card in its price range in the last two quarters.
 

Irenicus

Member
Jul 10, 2008
94
0
0
I don't think it's the power gating, I think it's simply the fact that Maxwell stripped a lot of the compute resources from Kepler in order to achieve the efficiency it gets in purely graphical workloads.

Example:


If you calculated perf/watt in these types of workloads Maxwell's efficiency would be abysmal compared to Kepler and GCN.

This is what I've heard before too. Discussed here with David Kanter at 11:18

https://www.youtube.com/watch?v=v_eRwxqhAGo#t=11m18s


Maxwell seems to have offloaded task scheduling largely to the driver, doing much less of it in GPU hardware. GCN already had a dedicated hardware scheduler, and Polaris has two hardware schedulers, does it not?

Any task that took advantage of this vs maxwell had an advantage because of maxwells crippled performance with those workloads (one of many reasons anyone recommending 970s over an rx 480 is pissing down peoples throats if they chose a badly aging maxwell part like that over something much more modern).

I have read/heard that pascal has better and more fine grained preemption, but I still don't think it has the same type of power hungry hardware schedulers built onto the chip like amd does. For this reason alone, we should all expect gcn cards to use more power than an equivalent nvidia part, how much more I do not know, but it would be surprising for any of them to use less.

That is tolerable IF you have some tangible gain over nvidia parts under certain workloads. They did vs maxwell, but with the new tricks pascal is capable of executing, does it still have some advantage? If so what? And if there is such an advantage, why don't we see pascal suffer as much as maxwell? Is it the better software/hardware preemption? Is it just the much higher clocks masking the weak points?


These are the questions that need to be answered. If the gains almost entirely rely on some game developer making heroic efforts to add advanced and aggressive concurrent graphics/asynch workloads, then who knows whether we will see the same sorts of advancements against pascal we did vs maxwell.
 

know of fence

Senior member
May 28, 2009
555
2
71
To adjust for transistor delta, divide performance by transistor ratio:
1.8x / 1.26x = 1.43x more performance

1120/1266 MHz (Radeon RX 480)
1607/1733 MHz (GTX 1080)
Base/Boost
1.43x / 1.37x higher clocks

It's funny how the result is exactly 1.43x.

I have to correct myself there: Nvidia's boost clock is actually more of a typical average (according to Nvidia's Tom Petersen on PCPer), and base is the minimum clock. AMD's base is the typical average, while boost is the absolute maximum.

Maximum Clock (AMD Boost)
Typical Average (AMD Base) (Nvidia Boost)
Minimum Clock (Nvidia Base)

So the clock difference between the competitors is 1120 to 1733 MHz, or 1.55x.
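
The ratios in this post are easy to reproduce. The snippet below is a minimal sketch that uses only the clocks and the 1.8x / 1.26x figures quoted above; the variable names are hypothetical.

```python
# Clocks in MHz as quoted in the post: (base, boost)
rx_480 = (1120, 1266)
gtx_1080 = (1607, 1733)

# Performance ratio divided by transistor ratio, both as stated in the post
perf_per_xtor = 1.8 / 1.26

base_ratio = gtx_1080[0] / rx_480[0]     # base vs base
boost_ratio = gtx_1080[1] / rx_480[1]    # boost vs boost
typical_ratio = gtx_1080[1] / rx_480[0]  # Nvidia boost vs AMD base (typical vs typical)

print(f"perf per transistor: {perf_per_xtor:.2f}x")
print(f"base/base:   {base_ratio:.2f}x")
print(f"boost/boost: {boost_ratio:.2f}x")
print(f"typical:     {typical_ratio:.2f}x")
```

Pairing Nvidia's boost clock with AMD's base clock follows the correction above, which treats both as the typical sustained clock for each vendor.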
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
You got this the wrong way,

Polaris 10 is a Tonga replacement and GP104 is a GM204 replacement.

So lets see how each of them are compared against the older 28nm Chips they are replacing.

No, Polaris 10 replaces Pitcairn in terms of die size, hierarchy, and power consumption (well, maybe not so much power consumption, but that's because AMD can never get things just right on a technical level).

Tonga was an oddball turd meant to replace Tahiti, and P10 is no "Tahiti" of FinFET.
 