AMD's next GPU uarch is called "Polaris"

Page 16 - AnandTech community

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
I have been having a lot of fun reading your posts in this thread and I know this will sound crazy to most, but knowing how marketing thinks, this might actually be a clue.

Very much out there, but still fun to speculate.

Polaris, a double star that appears as a single one.

Correction, a multiple star

Speculation allows one to think outside of the box.

Box = known to exist things

Without thinking out of the box no new fun and exciting things will be invented.

The future is unpredictable in the end.

A crazy-sounding post could be all that is needed to trigger an out-of-the-box experience.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Just bringing us back to earth for a moment.

What do we know:
2 new 14nm GPU die this year [Raja Koduri]
1st die released probably = 100-110 mm^2 [GTX950 price range @PcPer]

Infer:
A 250mm^2 die needed to equal FuryX assuming some architectural gains
A 300 mm^2 die needed to give FuryX + 20%
Implies 100mm^2 and 300mm^2 as the two new die designs

Problems:
A huge gap between them [worse ratio than R7 260 : R9 290X]
Lots of value wasted in die harvesting

Conclusion:
Unrealistic for cash poor AMD
Missing some important information

Suggested Solution:
100-110 mm^2 Gddr5 die
200-225 mm^2 HBM die
Interposer multi-die approach for high end market [Fury interposer is big enough]



Actually, it's well known that 16/14nm yields for die sizes above 200 sq mm are really bad. In fact, only TSMC 16FF+ is going to be feasible for 300+ sq mm GPUs in terms of yields. But TSMC will be capacity constrained, as demand far outstrips supply at TSMC 16FF+. Moreover, TSMC will give first priority to the Apple A9, A9X and A10/A10X (Q3 2016 release). Nvidia and AMD are better served by using 16FF+ for high performance GPUs selling at USD 300+. AMD's choice to go with two GPU dies, a 110-120 sq mm low power GPU die fabbed at GF 14LPP and a high performance 300 sq mm GPU die fabbed at TSMC 16FF+, thus makes sense.
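To put rough numbers on the claim that yield falls off quickly with die size, here is a minimal sketch using the simple Poisson yield model, Y = exp(-A * D0). The defect density `ASSUMED_D0` is a made-up placeholder for an immature process, not a figure from any foundry, and the model ignores die harvesting and parametric yield loss.

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies expected to have zero defects (simple Poisson model)."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical defect density for an early FinFET process (assumption).
ASSUMED_D0 = 0.5  # defects per cm^2

# Die sizes discussed in the thread: small die, mid die, big die.
for area in (110, 200, 300):
    fraction = poisson_yield(area, ASSUMED_D0)
    print(f"{area} mm^2 -> {fraction:.1%} defect-free dies")
```

The only point of the sketch is the shape of the curve: under this model, yield drops exponentially with area, so the gap between a 110 sq mm and a 300 sq mm die widens as the assumed defect density rises.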

My guess is that the low power GPU specs will be:
R7 470 - 768 sp,
R7 470X - 1024 sp, 1 geometry engine, 1 raster engine, 32 ROPs, 128 bit memory bus, 8 GHz GDDR5

The performance will be on par with GTX 960 for the fully enabled SKU and GTX 950 for the salvage SKU.

The high performance GPU using HBM2 will power 4 SKUs, as I expect yields to be really bad for 300 sq mm GPUs in 2016. I think there are going to be heavily salvaged SKUs in 2016 to fill the product stack. We will see a dedicated mid range chip in 2017 once yields are much better.

R9 490X - 4096 sp, 4 geometry engines, 4 raster engines, 128 ROPs, 2048 bit HBM2, 512 GB/s, 8 GB.

R9 490 - 3072 sp, 4 geometry engines, 4 raster engines, 128 ROPs, 2048 bit HBM2, 512 GB/s, 8 GB.

R9 480X - 2048 sp, 2 geometry engines, 2 raster engines, 64 ROPs, 1024 bit HBM2, 256 GB/s, 4 GB.
R9 480 - 1792 sp, 2 geometry engines, 2 raster engines, 64 ROPs, 1024 bit HBM2, 256 GB/s, 4 GB.
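The bandwidth figures in these guessed specs follow from bus width times per-pin data rate. A quick sketch of the arithmetic, assuming 2 Gbps per pin for HBM2 (the rate that makes the 512 GB/s and 256 GB/s figures work out) and reading "8 GHz GDDR5" as 8 Gbps effective per pin:

```python
def bandwidth_gb_per_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: bits per transfer * rate / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8.0

# (bus width in bits, assumed per-pin data rate in Gbps)
configs = {
    "128-bit GDDR5 @ 8 Gbps": (128, 8.0),
    "1024-bit HBM2 @ 2 Gbps": (1024, 2.0),
    "2048-bit HBM2 @ 2 Gbps": (2048, 2.0),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {bandwidth_gb_per_s(width, rate):.0f} GB/s")
```

On these assumptions the small GDDR5 die would get 128 GB/s, a quarter of the speculated flagship.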

AMD's approach makes a lot of sense: they use GF 14LPP to serve the high volume GPU market, as AMD has a WSA to meet. GF will be able to yield a 110-120 sq mm die reasonably well, and AMD can try to push as much volume as possible from GF 14LPP. TSMC 16FF+ will be used for the bleeding edge GPUs of 2016.

I expect 4th gen GCN to have significant improvements in perf/sp and thus I think we can expect a 25-30% faster flagship R9 490X GPU compared to Fury X. I think Nvidia will come out with a faster GPU as Maxwell already has impressive perf/cc and Pascal should bring more. I think the Nvidia GPU will be 10% faster than AMD's flagship GPU.



I guess we're on opposite ends of this.

You actually see <50% shaders as a harvested die? Wow. I would never have considered that.

My belief is that we are still mentally trapped in the old world of monolithic designs. The use of interposers radically changes the old design limits. Interposers are NOT PCBs.

With your above example, the 490X is a bit better than FuryX, maybe 20-25% performance wise, not power.

Two 200-225mm^2 die should allow a 490X to be 80 % better than FuryX AND maybe cost the same as a 300mm^2 one, if your prediction of terrible yields as you approach 300mm^2 is true.
You will need an interposer and HBM in both cases, and the present one in Fury is big enough for (2) 225 GPU die.
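The "two mid-size dies might cost the same as one 300mm^2 die" argument can be sanity-checked with a back-of-the-envelope sketch. Everything numeric here is an assumption for illustration: a 300 mm wafer, the common dies-per-wafer approximation with an edge-loss term, a Poisson yield model, a hypothetical defect density of 0.5 per cm^2, and a normalized wafer cost. Interposer assembly cost and revenue from harvested dies are left out.

```python
import math

WAFER_DIAMETER_MM = 300.0  # assumed wafer size
ASSUMED_D0 = 0.5           # hypothetical defects per cm^2, not a foundry figure
ASSUMED_WAFER_COST = 1.0   # normalized; only the ratio between options matters

def gross_dies_per_wafer(die_area_mm2: float) -> int:
    """Approximate die count: wafer area / die area, minus an edge-loss term."""
    radius = WAFER_DIAMETER_MM / 2.0
    whole = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * WAFER_DIAMETER_MM / math.sqrt(2.0 * die_area_mm2)
    return int(whole - edge_loss)

def poisson_yield(die_area_mm2: float, d0: float) -> float:
    """Fraction of defect-free dies under a simple Poisson model."""
    return math.exp(-(die_area_mm2 / 100.0) * d0)

def cost_per_good_die(die_area_mm2: float, d0: float = ASSUMED_D0) -> float:
    """Wafer cost spread over the expected number of defect-free dies."""
    good = gross_dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, d0)
    return ASSUMED_WAFER_COST / good

two_mid_dies = 2 * cost_per_good_die(225)  # dual-die on interposer
one_big_die = cost_per_good_die(300)       # monolithic die
print(f"two 225 mm^2 dies / one 300 mm^2 die cost ratio: {two_mid_dies / one_big_die:.2f}")
```

With the assumed 0.5 defects/cm^2, the two options land within a few percent of each other in silicon cost; a higher assumed defect density tilts the result toward the pair of smaller dies, a lower one toward the monolithic die.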

IF AMD wants to regain market share, they can't dance with Nvidia. They have to be clear in front. Raja must know this.


Also, if you see 300mm^2 as being poor yielding at first, why are so many Nvidia fans expecting a big GP100 early?

How much do you think Fiji on GF 28nm factors in here?

I am thinking that in the early FinFET generation it might still exist for at least two reasons (and possibly three):

1. I'm sure AMD has accumulated some harvested dies by now (beyond what we see with R9 Fury).

2. Harvested chips would still make nice Nano type cards.

3. GF 28nm WSA?
 

Timmah!

Golden Member
Jul 24, 2010
1,463
729
136
In MrTeal's defense, his next paragraph conceded that you could have a massively parallel interface between die.

1) All of the elements are in place for this as they have never been in the past. Interposer tech. Synthesizable IP blocks.
2) The improvement in yield of a 200mm^2 die over a 300mm^2 die [to be heavily harvested] could make it cheap enough to accept a lower perf/mm^2 and still lower costs.
3) A separate small and mid/big die strategy with the same IP blocks but physically arranged differently.
4) Small die as shown for notebooks and low end discrete.
Finally:
Does anyone here see an AMD market-share comeback soon if they don't take risks?
In any case, are the risks as great as one might think at first?
All of the tech is there and has been tested. Remember the game console SOCs gave them a lot of experience in mixing and integrating IP blocks to suit the desired product.

I have been having a lot of fun reading your posts in this thread and I know this will sound crazy to most, but knowing how marketing thinks, this might actually be a clue.

Very much out there, but still fun to speculate.

Polaris, a double star that appears as a single one.

Correction, a multiple star

Does that mean multiple dies on a single PCB? :O
 

xpea

Senior member
Feb 14, 2014
449
150
116
Also, if you see 300mm^2 as being poor yielding at first, why are so many Nvidia fans expecting a big GP100 early?
because Nvidia won a government contract to power 2016 NOAA supercomputer with Pascal:
http://www.anandtech.com/show/9791/...ation-to-build-tesla-weather-research-cluster
Then, by 2017, you will have VOLTA / POWER PC ORNL Summit and LLNL Sierra FLOPS monsters (up to 300 peta FLOPS on Summit).
Nvidia cannot lose their market share in HPC, especially with Intel KNL around the corner, thus the big daddy GP100 coming first
 

MrTeal

Diamond Member
Dec 7, 2003
3,586
1,747
136
How much do you think Fiji on GF 28nm factors in here?

I am thinking that in the early FinFET generation it might still exist for at least two reasons (and possibly three):

1. I'm sure AMD has accumulated some harvested dies by now (beyond what we see with R9 Fury).

2. Harvested chips would still make nice Nano type cards.

3. GF 28nm WSA?

Isn't Fiji TSMC?
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
because Nvidia won a government contract to power 2016 NOAA supercomputer with Pascal:
http://www.anandtech.com/show/9791/...ation-to-build-tesla-weather-research-cluster
Then, by 2017, you will have VOLTA / POWER PC ORNL Summit and LLNL Sierra FLOPS monsters (up to 300 peta FLOPS on Summit).
Nvidia cannot lose their market share in HPC, especially with Intel KNL around the corner, thus the big daddy GP100 coming first

I expect it will be like Titan, they can sell big Pascal exclusively to the Fed Gov at likely a very high price per unit for a year before they'll sell it to the world at large. This lets them actually make money off a poor yield on big chips on a new process, while the process improves for the eventual mass market release. It's as close to a perfect plan as you can get from the chipmaker's perspective
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
because Nvidia won a government contract to power 2016 NOAA supercomputer with Pascal:
http://www.anandtech.com/show/9791/...ation-to-build-tesla-weather-research-cluster
Then, by 2017, you will have VOLTA / POWER PC ORNL Summit and LLNL Sierra FLOPS monsters (up to 300 peta FLOPS on Summit).
Nvidia cannot lose their market share in HPC, especially with Intel KNL around the corner, thus the big daddy GP100 coming first
How do you know the GPU size from this info?
 
Feb 19, 2009
10,457
10
76
How do you know the GPU size from this info?

You don't but supercomputers in the past have used top end Teslas.

Each slot is precious and they have to cram in as much compute power as possible, for links, cooling, space etc.

Fermi was the same, on a crap node with low yields, apparently single digit %, but Teslas have such high margins it doesn't matter.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
How much do you think Fiji on GF 28nm factors in here?

I am thinking that in the early FinFET generation it might still exist for at least two reasons (and possibly three):

1. I'm sure AMD has accumulated some harvested dies by now (beyond what we see with R9 Fury).

2. Harvested chips would still make nice Nano type cards.

3. GF 28nm WSA?
1) Maybe not that many. Quite a few people are having success activating additional CUs in Fury, suggesting that yields must be very high.

2) True, but what about power consumption? AMD appears to be stressing this metric going forward.

3) I think TSMC makes Fiji. Last year some thought that GloFlo might manufacture the die but it seems they didn't.

I suppose we might get some new Fury LE sort of card with the lowest end die for the Chinese market as has happened in the past.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
2) True, but what about power consumption? AMD appears to be stressing this metric going forward.

I was thinking they could be used for Nano cards.

So even though it's 28nm, it's still relatively low power (i.e., a big die at lower voltage).

3) I think TSMC makes Fiji. Last year some thought that GloFlo might manufacture the die but it seems they didn't.

I could be wrong.

Earlier info (I believe early this year) made me believe Fiji was made at GF, unlike the other GCN GPUs.
 

beginner99

Diamond Member
Jun 2, 2009
5,223
1,598
136
Earlier info (I believe early this year) made me believe Fiji was made at GF, unlike the other GCN GPUs.

It's TSMC. Before it was released, it was just a rumor that it would be from GF due to GF having a better 28 nm process. This was in combination with the exaggerated hype about its performance, and the speculation was that a better process would help it. But then it turned out it doesn't perform that well and is made at TSMC. Nothing special.

Yeah, dual-die on an interposer sounds cool, but I would bet a lot that it won't happen. Even if it is possible to sync 2 GPUs, I highly doubt AMD has the resources for that research and especially for validation of such a design.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Interesting. Kind of leads me to believe both Nvidia and AMD are unfortunately going to stagger their new gpu releases to fully milk all performance gains from the process node.

Well, if they are going to be stuck on the same node for 4 years it would make sense.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
I didn't even see there was a page two, thanks for finding it MrTeal. :thumbsup:

So, going back to the point that started all this, Nvidia is faster than AMD (980 Ti vs Fury X) in DX12 Ashes with AMD using async compute and Nvidia not?

It took vendor specific code for nVidia. They haven't done any optimizations for anyone else's hardware.

Personally, I think one could just as easily make the claim that we were biased toward Nvidia as the only 'vendor' specific code is for Nvidia

It's early days. Enjoy it while you can.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
because Nvidia won a government contract to power 2016 NOAA supercomputer with Pascal:
http://www.anandtech.com/show/9791/...ation-to-build-tesla-weather-research-cluster
Then, by 2017, you will have VOLTA / POWER PC ORNL Summit and LLNL Sierra FLOPS monsters (up to 300 peta FLOPS on Summit).
Nvidia cannot lose their market share in HPC, especially with Intel KNL around the corner, thus the big daddy GP100 coming first

That doesn't address the point. If yields are going to be problematic on even a 300mm^2 chip how are they possibly going to make a +500mm^2 chip? Also along with the shrink they are dealing with a new FF process and their first attempt at HBM.
 
Feb 19, 2009
10,457
10
76
That doesn't address the point. If yields are going to be problematic on even a 300mm^2 chip how are they possibly going to make a +500mm^2 chip? Also along with the shrink they are dealing with a new FF process and their first attempt at HBM.

Early Fermi was reported to be ~6% yielding IIRC, enough only for HPC contracts hence the 480 was delayed 9 months.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
You should look at their right/wrong numbers. People defended SA/Charlie with the same logic.
I can understand your thinking IF you have no other info, but a reasonable person should at the very least, judge each case by its own merit.

Jumping on a narrow minded bandwagon all of the time might get one accused of bigoted thinking.

We wouldn't want that, would we?
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I can understand your thinking IF you have no other info, but a reasonable person should at the very least, judge each case by its own merit.

Jumping on a narrow minded bandwagon all of the time might get one accused of bigoted thinking.

We wouldn't want that, would we?

Please enlighten me: why are the biggest trash sites on the internet good?
 

96Firebird

Diamond Member
Nov 8, 2010
5,712
316
126
It took vendor specific code for nVidia. They haven't done any optimizations for anyone else's hardware.

It's early days. Enjoy it while you can.

Who says the "vendor specific code" isn't just disabling async compute and going with serial compute for Nvidia cards?
 