Full AMD Polaris 10 GPU has 2304 Stream Processors

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
PCGH: Is the P10 a 36-CU-part and no hidden additional CUs?

Evan Groenke: I can absolutely confirm with you right here, that Polaris 10 in its full configuration defined by the silicon is a 36 Compute Unit configuration there’s nothing else hidden on that product that end users might be looking forward to unlocking. This is the pinnacle, the latest and greatest of the Polaris 10 product.

www.pcgameshardware.de/AMD-Polaris-Hardware-261587/News/Interview-1200545/
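
For reference, the 2304 figure in the thread title follows from the 36 CUs confirmed above. A minimal sketch, assuming 64 stream processors per GCN compute unit (that per-CU figure is not stated in the interview, and the function name is just illustrative):

```python
# Assumption: each GCN compute unit contains 64 stream processors.
SP_PER_CU = 64


def stream_processors(compute_units: int, sp_per_cu: int = SP_PER_CU) -> int:
    """Total stream processors for a given compute-unit count."""
    return compute_units * sp_per_cu


full_polaris_10 = stream_processors(36)
print(f"36 CUs x {SP_PER_CU} SPs/CU = {full_polaris_10} stream processors")
```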
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
Nothing wrong with paying for and getting a full chip as far as I can see. Yields must be decent at GloFo if the RX 480 is the full chip. Seems silly but I'd rather have the full chip. Guess some dead silicon could somewhat help with heat issues a little bit.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
Looks like the problem is the process, for this power range. RX 470 itself is a much more efficient design and Polaris 11 is looking to be even more efficient. 14LPP seems to have a problem scaling over 100 Watts.


The higher-end Vega parts could be fabbed on TSMC16FF+.
 
Mar 10, 2006
11,715
2,012
126
Looks like the problem is the process, for this power range. RX 470 itself is a much more efficient design and Polaris 11 is looking to be even more efficient. 14LPP seems to have a problem scaling over 100 Watts.

Don't blame the process. NVIDIA said when it launched Pascal that if it didn't spend all that time tweaking the critical paths in the chip and just did a straight shrink, it would have gotten only about 1300MHz.

Let this serve as a lesson to those who were mocking Pascal as being a "die shrink" of Maxwell -- a lot of work goes into being able to increase frequency by ~40% generation on generation, even with a new process, while keeping power consumption low.

Pascal uArch might not have changed much from Maxwell, but the layout/implementation saw a ton of work.

There's more to delivering a good architecture than whether it supports some buzzword feature.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
Don't blame the process. NVIDIA said when it launched Pascal that if it didn't spend all that time tweaking the critical paths in the chip and just did a straight shrink, it would have gotten only about 1300MHz.

Nvidia says a lot of things, not all of them true. Of course they want to play up the amount of R&D they put into Pascal, so they can justify increasing the price point for medium-sized chips yet again.

Don't forget that the increased clock speeds of Pascal do come with a cost: lower shader density. 2560 shaders for a 314mm^2 chip on 16nm FinFET really isn't that high. Nvidia presumably would have had lower clocks if they'd packed the shaders more densely on the die, as AMD did. In this particular case it appears the trade-off was worth it. But it's closer than you think. GTX 1080 peaks at ~8.9 TFlops at max default boost clock. RX 480 with its 232mm^2 die peaks at ~5.8 TFlops at max default boost. This means Polaris 10 has ~65% of the raw computing power of GP104, at ~74% of the die size. The reason Nvidia comes out much further ahead than that is because Nvidia's drivers and architecture are much better at translating TFlops into real-world gaming performance - at least in DX11.
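
The ratios above can be checked with a quick back-of-the-envelope script. This is only a sketch under stated assumptions: the boost clocks (1733 MHz for GTX 1080, 1266 MHz for RX 480) and the 2 FLOPs per shader per clock (one fused multiply-add) are not given in the post; the shader counts and die sizes are the ones quoted in this thread.

```python
def peak_tflops(shaders: int, boost_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak FP32 throughput in TFLOPs, assuming one FMA (2 FLOPs) per shader per clock."""
    return shaders * flops_per_clock * boost_mhz * 1e6 / 1e12


# Shader counts and die areas come from the thread; boost clocks are assumed.
gtx_1080 = {"shaders": 2560, "boost_mhz": 1733, "die_mm2": 314}
rx_480 = {"shaders": 2304, "boost_mhz": 1266, "die_mm2": 232}

tflops_1080 = peak_tflops(gtx_1080["shaders"], gtx_1080["boost_mhz"])
tflops_480 = peak_tflops(rx_480["shaders"], rx_480["boost_mhz"])

compute_ratio = tflops_480 / tflops_1080            # raw compute, RX 480 vs GTX 1080
area_ratio = rx_480["die_mm2"] / gtx_1080["die_mm2"]  # die size, Polaris 10 vs GP104

print(f"GTX 1080: {tflops_1080:.2f} TFLOPs, RX 480: {tflops_480:.2f} TFLOPs")
print(f"Compute ratio: {compute_ratio:.1%}, die area ratio: {area_ratio:.1%}")
```

With unrounded inputs the compute ratio lands a little above the ~65% quoted, since that figure was derived from the rounded 5.8 and 8.9 TFLOPs values.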
 
Mar 10, 2006
11,715
2,012
126
Nvidia says a lot of things, not all of them true. Of course they want to play up the amount of R&D they put into Pascal, so they can justify increasing the price point for medium-sized chips yet again.

They did put a lot of R&D into Pascal, and that R&D has clearly paid off in terms of very high clocks and an efficient, compact design.

Don't forget that the increased clock speeds of Pascal do come with a cost: lower shader density. 2560 shaders for a 314mm^2 chip on 16nm FinFET really isn't that high. Nvidia presumably would have had lower clocks if they'd packed the shaders more densely on the die, as AMD did.

GP104 features more TMUs than Polaris 10 (160 vs 144), twice the ROPs (64 versus 32), and there are obviously other parts of the GPU that aren't related to shader count (Polymorph engine, Simultaneous Multiprojection block, GDDR5X controller, etc.) that may add to the area/xtor count while not ballooning shader count.

In this particular case it appears the trade-off was worth it. But it's closer than you think. GTX 1080 peaks at ~8.9 TFlops at max default boost clock. RX 480 with its 232mm^2 die peaks at ~5.8 TFlops at max default boost. This means Polaris 10 has ~65% of the raw computing power of GP104, at ~74% of the die size. The reason Nvidia comes out much further ahead than that is because Nvidia's drivers and architecture are much better at translating TFlops into real-world gaming performance - at least in DX11.

As I said above, there's more to gaming performance than just raw FLOPs.

In terms of xtor density NVIDIA put 7.2 billion xtors in a 314mm^2 area, while AMD put 5.7 billion in 232mm^2. AMD's chip has ~24.57 million transistors/mm^2, while NVIDIA's is at ~23 million/mm^2.

AMD's design is slightly denser, but the slight areal disadvantage that NVIDIA has is more than offset by the perf/mm^2 advantage that NVIDIA has.
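
The density figures are easy to reproduce from the transistor counts and die areas quoted above. A minimal sketch (the helper name is just illustrative):

```python
def density_m_per_mm2(transistors: float, die_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimetre."""
    return transistors / die_mm2 / 1e6


# Transistor counts and die areas as quoted in this post.
gp104_density = density_m_per_mm2(7.2e9, 314)
polaris10_density = density_m_per_mm2(5.7e9, 232)

# How much denser Polaris 10 is than GP104, as a fraction.
density_advantage = polaris10_density / gp104_density - 1

print(f"GP104: {gp104_density:.2f} M/mm^2")
print(f"Polaris 10: {polaris10_density:.2f} M/mm^2")
print(f"Polaris 10 is {density_advantage:.1%} denser")
```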
 

coercitiv

Diamond Member
Jan 24, 2014
6,393
12,826
136
NVIDIA said when it launched Pascal that if it didn't spend all that time tweaking the critical paths in the chip and just did a straight shrink, it would have gotten only about 1300MHz.
So Nvidia said that if they hadn't tweaked the critical paths, going from 980 to 1080 frequency would have stayed more or less the same at... what.. same TDP?! Can I take this with a grain of salt?
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
So Nvidia said that if they hadn't tweaked the critical paths, going from 980 to 1080 frequency would have stayed more or less the same at... what.. same TDP?! Can I take this with a grain of salt?

Yeah, I can buy that a straight port of Maxwell to 16FF+ might have only gotten them to, say, 1600-1700 MHz instead of the 2000-2100 MHz they actually got, but not that they wouldn't get any gains at all.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
Don't blame the process. NVIDIA said when it launched Pascal that if it didn't spend all that time tweaking the critical paths in the chip and just did a straight shrink, it would have gotten only about 1300MHz.

RX 470 packs 87% of RX 480 performance at 2.7x the efficiency of the R9 290, and will be rated at 110W. We have never seen such a disparity in efficiency between cards of the same generation and node. AMD has also already stated that they achieved even more than this with Polaris 11. There IS a process problem.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
So Nvidia said that if they hadn't tweaked the critical paths, going from 980 to 1080 frequency would have stayed more or less the same at... what.. same TDP?! Can I take this with a grain of salt?

For sure they could have botched it and ended up with a card that clocks no better and offers little gain in power efficiency; we saw what happened with Fermi.
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
So Nvidia said that if they hadn't tweaked the critical paths, going from 980 to 1080 frequency would have stayed more or less the same at... what.. same TDP?! Can I take this with a grain of salt?

They said same frequency, not TDP, but I do not agree with this point; it's about 100% sure that frequency would still have increased substantially. The lower density is due to 16FF+, which has less density than Samsung's 14nm LPP. I guess that was forgotten by Arachnotronic when he explained that the higher frequency was due to said lower density.
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
They did put a lot of R&D into Pascal, and that R&D has clearly paid off in terms of very high clocks and an efficient, compact design.

According to Hardware.fr their perf/watt improvement was more substantial than AMD's, and looking at where they were with Maxwell, that's significant.





Note that Tonga is more efficient than Hawaii according to TPU, and it wasn't included here.
 

geoxile

Senior member
Sep 23, 2014
327
25
91
RX 470 packs 87% of RX 480 performance at 2.7x the efficiency of the R9 290, and will be rated at 110W. We have never seen such a disparity in efficiency between cards of the same generation and node. AMD has also already stated that they achieved even more than this with Polaris 11. There IS a process problem.

If that's true then that guy on OCN might have been right about Polaris 10 having problems. Not sure why you think it's a process problem when we've only seen one bad case and heard of one hypothetically good case.
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
Note that 480 is more efficient than fury and furyx according to TPU

Only at 1080p and below. Fury is ahead and Fury X is closer at 1440p (Hardware.fr resolution).



And Nano beats regular Fury.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
Only at 1080p and below. Fury is ahead and Fury X is closer at 1440p (Hardware.fr resolution).



And Nano beats regular Fury.

You are right, but RX480 achieves its maximum potential at 1080p.

Also Nano is a very special low power bin, designed to cap a ~240W card into a much lower power limit, where its efficiency shines!
 
Feb 19, 2009
10,457
10
76
It may well be, but we will know the *truth* when the Mac refresh is here. Anything from PR needs to be taken with a grain of salt as you all should be fully aware of that by now!
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
It may well be, but we will know the *truth* when the Mac refresh is here. Anything from PR needs to be taken with a grain of salt as you all should be fully aware of that by now!

Sorry but no. Polaris 10 is a 36-CU GPU.

Evan Groenke is Senior Product Manager for Polaris 10 and he was deeply involved in the development of Polaris. He should know if the full configuration of Polaris is 40 compute units (like some websites state) or 36. Groenke made absolutely clear that the latter is the case. Here is his full quote of the interview, you will find the whole audio interview at the end of the article.

PCGH: Is the P10 a 36-CU-part and no hidden additional CUs?

Evan Groenke: I can absolutely confirm with you right here, that Polaris 10 in its full configuration defined by the silicon is a 36 Compute Unit configuration there's nothing else hidden on that product that end users might be looking forward to unlocking. This is the pinnacle, the latest and greatest of the Polaris 10 product.

They could not have made it more clear than this.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
How many times did AMD claim that Tonga only had a 256bit memory bus? Turned out not to be true.
 