Question 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
May 15, 2014
559
292
136
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping Turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in the forum members' thoughts.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,058
7,478
136
It probably depends on how much they improved the RT over Turing, but even if it's as good as rumored I don't see a lot of point in including it in the low end cards that might still struggle to hit acceptable performance even with RT turned off.

I suppose that it would make for good marketing though and at some point they need to have a bigger install base to get developers to invest more time into adding ray tracing elements into their games.

- I figure DLSS will do a bunch of the heavy lifting on the lowest end cards.

Even if the cards cannot get acceptable RTRT performance (by our standards), they can move them with the promise of letting folks buy into the feature set and getting IQ improvements where they can afford the FPS hit.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
It probably depends on how much they improved the RT over Turing, but even if it's as good as rumored I don't see a lot of point in including it in the low end cards that might still struggle to hit acceptable performance even with RT turned off.

I suppose that it would make for good marketing though and at some point they need to have a bigger install base to get developers to invest more time into adding ray tracing elements into their games.
With DLSS they only need 540p for a 1080p resolution. With the massively upgraded ray tracing chances are the new cards will have no trouble running it at 540p at max settings that are actually necessary (you don't need 8k textures at 540p) which means 1080p with ray tracing on probably working fine on a 3050.
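
To put the 540p figure in numbers, here is a minimal sketch. It assumes 16:9 frames (960x540 internal, 1920x1080 output), which the post does not spell out, and the helper name `pixel_count` is purely illustrative.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in a frame of the given size."""
    return width * height


# Assumed 16:9 resolutions; the post only says "540p" and "1080p".
internal = pixel_count(960, 540)
output = pixel_count(1920, 1080)

# Fraction of the output pixels that are actually rendered before upscaling.
render_fraction = internal / output

print(f"internal: {internal} px, output: {output} px, fraction: {render_fraction:.2f}")
```

Under those assumptions the card shades a quarter of the pixels it displays. This says nothing about how much frame time the upscaling step itself costs.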
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,413
136
With DLSS they only need 540p for a 1080p resolution. With the massively upgraded ray tracing chances are the new cards will have no trouble running it at 540p at max settings that are actually necessary (you don't need 8k textures at 540p) which means 1080p with ray tracing on probably working fine on a 3050.
That's a mighty big assumption. Can I transact with you using Fermi estimation?
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Does anyone think it's unreasonable for the 3050 to at least match the 2060 in rasterization? Is it safe to assume the Ampere parts will be better at ray tracing than the Turing parts?

If the 3050 doesn't beat the 2060 in ray tracing I would consider it a disappointment. Also, I wouldn't expect nVidia to have any parts without ray tracing, it was one thing when they could compare the 16xx parts to Radeon VII for feature set, another thing entirely to be below a console chip.
 
Reactions: psolord

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Does anyone think it's unreasonable for the 3050 to at least match the 2060 in rasterization? Is it safe to assume the Ampere parts will be better at ray tracing than the Turing parts?

If the 3050 doesn't beat the 2060 in ray tracing I would consider it a disappointment. Also, I wouldn't expect nVidia to have any parts without ray tracing, it was one thing when they could compare the 16xx parts to Radeon VII for feature set, another thing entirely to be below a console chip.

I don't believe the 3050 will reach RTX2060/RX5700 raster performance. They would have to make a roughly 10 billion transistor chip, which even at 8/7nm would be in 250-300mm2 territory. That is too big for a 3050-category graphics card.

And you are expecting faster RT performance on top of that? That will increase the die size even further.
 
Reactions: Konan

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
A die shrunk 2060 with a moderate clock speed bump would check all the boxes necessary. As far as die size, the 1650 is 200mm squared, the 950 was just under 230 mm squared.

Just because we saw obscenely expensive small die parts this generation from one vendor, it doesn't mean that trend will continue.
 
Reactions: GodisanAtheist

Thala

Golden Member
Nov 12, 2014
1,355
653
136
I don't believe the 3050 will reach RTX2060/RX5700 raster performance. They would have to make a roughly 10 billion transistor chip, which even at 8/7nm would be in 250-300mm2 territory. That is too big for a 3050-category graphics card.

And you are expecting faster RT performance on top of that? That will increase the die size even further.

You can double the number of RT units and we would still be looking at a single digit percentage die size increase. But yes, you would need to invest a gate count similar to 2060 of course.
 

Glo.

Diamond Member
Apr 25, 2015
5,761
4,666
136
Does anyone think it's unreasonable for the 3050 to at least match the 2060 in rasterization? Is it safe to assume the Ampere parts will be better at ray tracing than the Turing parts?

If the 3050 doesn't beat the 2060 in ray tracing I would consider it a disappointment. Also, I wouldn't expect nVidia to have any parts without ray tracing, it was one thing when they could compare the 16xx parts to Radeon VII for feature set, another thing entirely to be below a console chip.
Depends on SM count. A 20 SM 107 die will have zero chance of reaching the performance of the RTX 2060, but will 100% reach GTX 1660 Ti performance.

A 24 SM 107 die should have a 100% chance of reaching 90-95% of the performance of the RTX 2060.

The 20 SM die can be the 3050, and the 24 SM die can be the 3050 Ti SKU.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
A die shrunk 2060 with a moderate clock speed bump would check all the boxes necessary. As far as die size, the 1650 is 200mm squared, the 950 was just under 230 mm squared.

Just because we saw obscenely expensive small die parts this generation from one vendor, it doesn't mean that trend will continue.

GTX950 is not in the same category as GTX1650; the GTX950 (GM206) is what the RTX2060 (TU106) is today.

GTX750 (GM107) is in the same category as GTX1050Ti (GP107) and GTX1650 (TU117).

The 3050 should use the recently taped-out GA107.
 
Reactions: Glo. and Konan

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
The TU106 in 2060 form is the TU106 chip with parts disabled and lowered power targets, your estimation for die size is one based on matching the x70 tier, not the x60 tier.

If we take into consideration the disabled die space combined with a full node drop we end up with a die size roughly equal to the current 1650(4.7B to roughly 8B). This is a full node drop, just matching the prior gen x60 is actually not very good at all, I'm just expecting nVidia to add more tiers this generation in the mainline series.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
The TU106 in 2060 form is the TU106 chip with parts disabled and lowered power targets, your estimation for die size is one based on matching the x70 tier, not the x60 tier.

If we take into consideration the disabled die space combined with a full node drop we end up with a die size roughly equal to the current 1650(4.7B to roughly 8B). This is a full node drop, just matching the prior gen x60 is actually not very good at all, I'm just expecting nVidia to add more tiers this generation in the mainline series.

It's not actually a full node drop. It is a smaller node, to be sure, but 12nm shrunk to 8nm is not a full node drop.

But, to your point, I would also expect an actual 3050 this series. I expect there to not be a GTX replacement for the GTX 1660. There will be an RTX card in that position, which should be the 3050.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
A full node is 0.7, half node 0.9- the transition in areal density actually exceeded a full node drop by a reasonable margin going from Vega to RDNA. I think that's what's throwing you off.

10.3B transistors, 251 mm2: 41M per mm2
12.5B transistors, 495 mm2: 25M per mm2

That's 0.64
A node and then another half node is 0.63 as point of reference.

Given we are all currently expecting the lower tier nVidia cards to be on Samsung 8nm a full node drop, 0.7, may even be a bit conservative.
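
For anyone who wants to rerun the arithmetic, here is a minimal sketch that recomputes the densities from the figures quoted above. The function name is illustrative, and the sketch does not try to settle whether the 0.7 "full node" factor should be read as a linear or an areal scale, so it prints both ratios.

```python
def density_m_per_mm2(transistors_billion: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors_billion * 1000.0 / area_mm2


# Figures as quoted in the post: 10.3B on 251 mm^2 and 12.5B on 495 mm^2.
newer = density_m_per_mm2(10.3, 251.0)
older = density_m_per_mm2(12.5, 495.0)

# Area needed per transistor on the newer chip relative to the older one.
area_ratio = older / newer
# Equivalent per-side (linear) scale factor.
linear_ratio = area_ratio ** 0.5

# "A node and then another half node", using the 0.7 and 0.9 factors from the post.
node_and_half = 0.7 * 0.9

print(f"newer: {newer:.1f} M/mm^2, older: {older:.1f} M/mm^2")
print(f"area ratio: {area_ratio:.2f}, linear ratio: {linear_ratio:.2f}")
print(f"node + half node: {node_and_half:.2f}")
```

With the unrounded inputs the area ratio comes out a little above 0.6, in the same neighbourhood as the node-plus-half-node figure.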
 

Glo.

Diamond Member
Apr 25, 2015
5,761
4,666
136
A full node is 0.7, half node 0.9- the transition in areal density actually exceeded a full node drop by a reasonable margin going from Vega to RDNA. I think that's what's throwing you off.

10.3B transistors, 251 mm2: 41M per mm2
12.5B transistors, 495 mm2: 25M per mm2

That's 0.64
A node and then another half node is 0.63 as point of reference.

Given we are all currently expecting the lower tier nVidia cards to be on Samsung 8nm a full node drop, 0.7, may even be a bit conservative.
If RDNA2 is more than 40 mln xTors/mm2 then you can throw your calculations into the toilet.

Transistor density is not defined by what is possible on this node, but by how good a job the physical design team has done.

Apple achieves 80 mln xTors/mm2 on this node, and Renoir achieves 60 mln xTors/mm2.

Let RDNA2 have 60 mln xTors/mm2 and your calculations are plainly wrong.

Secondly, your calculations would be correct, not for the N7 process, but for the N10 process from TSMC.
 

Glo.

Diamond Member
Apr 25, 2015
5,761
4,666
136
The TU106 in 2060 form is the TU106 chip with parts disabled and lowered power targets, your estimation for die size is one based on matching the x70 tier, not the x60 tier.

If we take into consideration the disabled die space combined with a full node drop we end up with a die size roughly equal to the current 1650(4.7B to roughly 8B). This is a full node drop, just matching the prior gen x60 is actually not very good at all, I'm just expecting nVidia to add more tiers this generation in the mainline series.
Expect 30-35 mln xTors/mm2 on 8 NM LPP from Samsung, for next gen gaming cards from Nvidia.

107 die will also be sub 200 mm2 die, most likely around 160 mm2. In terms of die sizes - we may be back to sanity again.
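
As a rough sanity check of what those numbers imply, here is a minimal sketch that converts die area and density into a transistor budget. The 160 mm2 area and the 30-35 mln xTors/mm2 range are the guesses from this post, not measured values, and the function name is illustrative.

```python
def transistors_billion(area_mm2: float, density_m_per_mm2: float) -> float:
    """Transistor count in billions for a die of the given area and density."""
    return area_mm2 * density_m_per_mm2 / 1000.0


# Guessed values from the post, not measurements.
die_area = 160.0
low = transistors_billion(die_area, 30.0)
high = transistors_billion(die_area, 35.0)

print(f"{die_area:.0f} mm^2 at 30-35 M/mm^2: {low:.1f}B to {high:.1f}B transistors")
```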
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
The TU106 in 2060 form is the TU106 chip with parts disabled and lowered power targets, your estimation for die size is one based on matching the x70 tier, not the x60 tier.

If we take into consideration the disabled die space combined with a full node drop we end up with a die size roughly equal to the current 1650(4.7B to roughly 8B). This is a full node drop, just matching the prior gen x60 is actually not very good at all, I'm just expecting nVidia to add more tiers this generation in the mainline series.

Nope, GTX950 is exactly what RTX2060 is today. GTX960 which is using a full GM206 is what the RTX2070 is today.
 
Reactions: Glo.

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Do not take the Ampere GA100 as representative of standard 7nm density; most probably the GA100 is using the 7nm High Density (HD) library with reduced FMax. Consumer Ampere chips will use the less dense 7nm High Performance (HP) library for higher performance.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Nope, GTX950 is exactly what RTX2060 is today. GTX960 which is using a full GM206 is what the RTX2070 is today

The 2060 Super exists. 2060 is a third tier part off of that die.

If RDNA2 is more than 40 mln xTors/mm2 then you can throw your calculations into the toilet.

Math doesn't change because you are happy with something, and what does RDNA2 have to do with how the 3050 will compare to the 2060....?

Transistor density is not defined by what is possible on this node

Yes, it is, by definition.

If AMD builds one chip with a huge cache amount and another with a very small cache amount the chip with the larger cache will have greater density. More transistors in and of themselves aren't going to help- put 1GB of L1 cache on RDNA and the density would jump sharply, and performance would barely budge.

When comparing like products, like two graphics chips from the same company, a relative balance of what each chip consists of can be assumed. Some outliers like the A100 are going to look different because of the balance of the design.

How big the actual transistors are is something determined by the process node- that is what a process node means.
 

Glo.

Diamond Member
Apr 25, 2015
5,761
4,666
136
It should be a lot better than that. TU102 is 24.7m/mm2, V100 is 25.8. Should be more like 45-50. A100 is 66, btw.
A100 is N7, not 8LPP.

Remember, 8LPP is a slightly tweaked 10 nm process node, which is one node behind N7. 45-50 mln xTors/mm2 is the high end of what is possible on this node, just like 60-80 mln xTors/mm2 is the highest achieved on N7.

I personally expect anywhere between 36-38 mln xTors/mm2, in transistor density for next gen Nvidia products, on this node.
 

Glo.

Diamond Member
Apr 25, 2015
5,761
4,666
136
Yes, it is, by definition.

If AMD builds one chip with a huge cache amount and another with a very small cache amount the chip with the larger cache will have greater density. More transistors in and of themselves aren't going to help- put 1GB of L1 cache on RDNA and the density would jump sharply, and performance would barely budge.

When comparing like products, like two graphics chips from the same company, a relative balance of what each chip consists of can be assumed. Some outliers like the A100 are going to look different because of the balance of the design.

How big the actual transistors are is something determined by the process node- that is what a process node means.
What I mean: on the N7 node it is possible to achieve 80 mln xTors/mm2, just like Apple did with their designs.

By comparison AMD achieved only 40 mln, and Nvidia achieved 60 mln. This is what I meant by saying: transistor density is not defined by the process node itself, but by how good a job the physical design team has done.

N7 is two nodes away from 14 nm GloFo/16 nm TSMC, with 10 nm in between them.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
A100 is N7, not 8LPP.

Remember, 8LPP is a slightly tweaked 10 nm process node, which is one node behind N7. 45-50 mln xTors/mm2 is the high end of what is possible on this node, just like 60-80 mln xTors/mm2 is the highest achieved on N7.

I personally expect anywhere between 36-38 mln xTors/mm2, in transistor density for next gen Nvidia products, on this node.

Snapdragon S845 has 55 mln xTors/mm2 on 10nm. 8nm is 10% denser.
On 14nm S825 had 30 mln xTors/mm2 and GP107 had 25 mln xTors/mm2.

So don't make stuff up.
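
One way to read the numbers above is sketched below. It applies the stated 10% gain to the 10nm mobile figure, then assumes (this is an assumption for illustration, not something the post states) that the GPU-to-mobile density ratio seen on 14nm carries over to 8nm. All variable names are illustrative.

```python
# Figures as quoted in the post, in millions of transistors per mm^2.
mobile_10nm = 55.0   # Snapdragon S845 on 10nm
mobile_14nm = 30.0   # the "S825" figure on 14nm
gpu_14nm = 25.0      # GP107 on 14nm
gain_8nm = 1.10      # "8nm is 10% denser"

# Mobile-class density after applying the 10% gain.
mobile_8nm = mobile_10nm * gain_8nm

# ASSUMPTION: a GPU keeps the same fraction of mobile density as on 14nm.
gpu_to_mobile = gpu_14nm / mobile_14nm
gpu_8nm = mobile_8nm * gpu_to_mobile

print(f"mobile-class 8nm: {mobile_8nm:.1f} M/mm^2")
print(f"GPU-class 8nm under the carry-over assumption: {gpu_8nm:.1f} M/mm^2")
```

Whether that carry-over holds depends on the library choice argued about in the following replies.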
 
Reactions: Lodix

Glo.

Diamond Member
Apr 25, 2015
5,761
4,666
136
Snapdragon S845 has 55 mln xTors/mm2 on 10nm. 8nm is 10% denser.
On 14nm S825 had 30 mln xTors/mm2 and GP107 had 25 mln xTors/mm2.

So don't make stuff up.
Even on TSMC's 16 and 12 nm processes it was possible to pack 35 mln xTors/mm2. So why did neither Nvidia nor AMD achieve that transistor density, and why was the best they could come up with 25 mln xTors/mm2?

I think you are smart enough to understand the difference between high-density libraries that are used in ultra-mobile chips and high-performance libraries used in high-performance chips.

You do realize that if Nvidia uses high-density libraries, they will sacrifice the clock speeds and performance of their products?

Tone your expectations down.
 

Lodix

Senior member
Jun 24, 2016
340
116
116
Samsung's 8LPP is a full node shrink from 14nm (which has a slight density advantage over TSMC 16nm).

Samsung 8LPP has a theoretical density of ~62 MTr/mm^2 compared to ~29 MTr/mm^2 for TSMC's 16nm.
 