Question Speculation: RDNA3 + CDNA2 Architectures Thread

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Just wanted to point out I don't necessarily think that IC for N33 must be larger than N21. As for my reasoning:

AMD split RDNA2 into three main dies, each targeting a different resolution as their primary focus. N21 targeted 4K, N22 targeted 1440p and N23 targeted 1080p.

The same would surely go for RDNA3. The way I see it, if N31 is truly 256b bus + 512MB IC, then its focus is probably 8K gaming. Seeing as we've basically skipped 2880p as a resolution for displays, it makes the most sense to me that N32 would be targeting 4K, and then N33 would be targeting 1440p.

At 1440p, 128MB is like a 67% hit-rate or thereabouts. Compared to N21 you'd be looking at lower effective memory bandwidth most likely - but at the same time, you don't need as much bandwidth when targeting a lower resolution anyway. Balance wise 128b bus + 128MB IC may be enough for 1440p high refresh rate, especially if we get improved colour compression, or improved per-WGP/SA caches to alleviate bandwidth requirements even further.
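
A rough way to put numbers on that trade-off, as a sketch only: assume every IC hit avoids a trip to VRAM entirely, so DRAM only has to serve the misses. The 16 Gbps GDDR6 data rate and the function name are assumptions for illustration, not something from the post; the 67% hit rate is the 1440p figure quoted above. The model also ignores any bandwidth ceiling on the cache itself, so treat the result as an upper bound.

```python
def effective_bandwidth_gbs(bus_width_bits, data_rate_gbps, hit_rate):
    """Bandwidth the GPU 'sees' if cache hits cost no DRAM traffic.

    Simplified model: only misses go to DRAM, so the raw bandwidth is
    stretched by a factor of 1 / (1 - hit_rate).
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    raw = bus_width_bits / 8 * data_rate_gbps  # raw DRAM bandwidth in GB/s
    return raw / (1.0 - hit_rate)

# 128-bit bus at an assumed 16 Gbps per pin
raw_128b = effective_bandwidth_gbs(128, 16.0, 0.0)   # no cache at all
with_ic = effective_bandwidth_gbs(128, 16.0, 0.67)   # 67% IC hit rate at 1440p
print(f"{raw_128b:.0f} GB/s raw, ~{with_ic:.0f} GB/s effective")
```

Under those assumptions a 128b bus behaves like something around three times wider, which is why a narrow bus plus 128MB IC is not obviously too little for 1440p.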
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
Just wanted to point out I don't necessarily think that IC for N33 must be larger than N21. As for my reasoning:

AMD split RDNA2 into three main dies, each targeting a different resolution as their primary focus. N21 targeted 4K, N22 targeted 1440p and N23 targeted 1080p.

The same would surely go for RDNA3. The way I see it, if N31 is truly 256b bus + 512MB IC, then its focus is probably 8K gaming. Seeing as we've basically skipped 2880p as a resolution for displays, it makes the most sense to me that N32 would be targeting 4K, and then N33 would be targeting 1440p.

At 1440p, 128MB is like a 67% hit-rate or thereabouts. Compared to N21 you'd be looking at lower effective memory bandwidth most likely - but at the same time, you don't need as much bandwidth when targeting a lower resolution anyway. Balance wise 128b bus + 128MB IC may be enough for 1440p high refresh rate, especially if we get improved colour compression, or improved per-WGP/SA caches to alleviate bandwidth requirements even further.
And for 1440p target, AMD can "get away" with 8 GB VRAM buffer.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Sorry, I should have added to that - even with a 128b bus 16GB is still possible, it would require clamshell 16Gb modules though.
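
The capacity arithmetic behind that, as a small sketch: a GDDR6 chip normally presents a 32-bit interface, and in clamshell mode two chips share each 32-bit slice of the bus (16 bits each), doubling the chip count. The function and constant names are made up for illustration.

```python
GDDR6_CHIP_INTERFACE_BITS = 32  # one GDDR6 package normally uses a 32-bit interface

def vram_capacity_gb(bus_width_bits, density_gbit, clamshell=False):
    """Total VRAM in GB for a given bus width and per-chip density (in Gbit)."""
    # Clamshell: each chip runs at half width, so twice as many chips fit
    # on the same bus.
    bits_per_chip = GDDR6_CHIP_INTERFACE_BITS // 2 if clamshell else GDDR6_CHIP_INTERFACE_BITS
    chips = bus_width_bits // bits_per_chip
    return chips * density_gbit / 8  # 8 Gbit = 1 GB

normal = vram_capacity_gb(128, 16)                  # 4 chips x 2 GB
clam = vram_capacity_gb(128, 16, clamshell=True)    # 8 chips x 2 GB
print(f"128b, 16Gb modules: {normal:.0f} GB normal, {clam:.0f} GB clamshell")
```

So the same 128b bus gives 8GB with four 16Gb modules and 16GB with eight in clamshell.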
 
Reactions: Tlh97

beginner99

Diamond Member
Jun 2, 2009
5,223
1,598
136
Our houses are double walled, the windows have two glass panes, one on the outside and one on the inside, to handle the cold.

off-topic:

I find it hilarious that one needs to mention this. I don't live somewhere very north or cold, yeah we get snow in winter, a couple of times and it can be freezing for some weeks in a row (well used to be, not very often in recent years), but double windows is like the minimum. And not just 2 panes but with insulation gas between the panes. 3 panes also isn't that uncommon albeit modern 2 panes are just as good AFAIK, just need a bigger space and more gas in between the panes, looking at my windows I would guess at least 1.5-2 inches. Walls mostly aren't double but with a very thick outer insulation. But yeah no AC as well, even in new apartment buildings and most older commercial buildings don't have any either. To be fair the thick insulation also keeps the heat out. 2 weeks of 35°C (and nights > 20°C) and it never gets any warmer than 27°C inside with proper "heat management" (windows open at night, blinds shut during day).
 
Reactions: Tlh97 and scineram

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,243
136
off-topic:

I find it hilarious that one needs to mention this. I don't live somewhere very north or cold, yeah we get snow in winter, a couple of times and it can be freezing for some weeks in a row (well used to be, not very often in recent years), but double windows is like the minimum. And not just 2 panes but with insulation gas between the panes. 3 panes also isn't that uncommon albeit modern 2 panes are just as good AFAIK, just need a bigger space and more gas in between the panes, looking at my windows I would guess at least 1.5-2 inches. Walls mostly aren't double but with a very thick outer insulation. But yeah no AC as well, even in new apartment buildings and most older commercial buildings don't have any either. To be fair the thick insulation also keeps the heat out. 2 weeks of 35°C (and nights > 20°C) and it never gets any warmer than 27°C inside with proper "heat management" (windows open at night, blinds shut during day).
You mean proper "active thermal management". But manually operated.
TBH though, I was just trying to point out, in case some posters don't know, that at the higher latitudes the problem in summer is no better than in the tropics. I assume many posters here are from the tropics, southern Europe/US and Aus.
I have a mate from Aus, and he is not aware of this. He called me a noob when I said 28 degrees is a bit too high for my comfort.

I don't doubt it will cost an arm and a leg. N32 is supposedly also multi-die, so the only "reasonably" priced part would be N33, but even that's questionable if performance similar to or higher than the RX 6900XT is really true.
I guess we will never find out how a monolithic N33 would perform on N5. The PPA gains look sweet. 200W for 6900XT performance would have been incredible.
I am really curious to see the mobile part with 40CU on N5. It would probably run 2.4 GHz at less than 100W TGP.
 
Reactions: Tlh97

eek2121

Diamond Member
Aug 2, 2005
3,051
4,276
136
Some of the speculation here is bonkers. Here is mine:

If AMD goes MCM for Navi3x, it will be for the entire stack unless N33 and N34 are RDNA2 rebadges. It is possible AMD may bring RDNA3 to 6nm, I am not ruling it out, but a rebadge is cheaper.

Chiplets will have all of the functionality of a monolithic die. 1 chiplet is essentially a monolithic die. Each chiplet has 40CUs. Top SKU has 4 chiplets, bottom SKU has 1.

GDDR6 with IC makes a return, but with more IC.

RT performance: A 160CU SKU would have 2.5-3x RT performance over the 6900XT. Most of this is due to double the RT units.

Raster performance: not as high as some think. 80%-90% faster. 200-220% in some scenarios. Higher increases at 4K-8K.

TDP: TDP will be in the 300-350W range for top skus, but thanks to MCM, cooling may not be an issue. There may be a 400W special edition SKU, but AMD is super focused on keeping thermals and power consumption in check. Expect something closer to 300W as opposed to 350W.

By producing one small die and binning it for defects and clocks, AMD can hit 55+% margins while making cheap, fast GPUs.

Clocks likely won’t increase much over Navi2x, as AMD is eating up thermal budget with MCM.

There will be more SKUs than the 6000 series.

AMD might launch a program similar to NVIDIA founders edition. (they are also rumored to be considering this for motherboards).

EDIT: A 40CU module will be 250-275mm2.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
All of the RDNA3 cards appear to have the same GFX family ID. So it appears Navi 33 will be the same architecture as Navi 31 and 32, but different from Navi 21, 22 and 23.



So here is my speculation on why such a big GPU will be "mainstream" (can we really call a GPU that big mainstream anymore?).

Powerful APUs will take out the sub-$200, or even sub-$300, GPU market. Nothing on the roadmaps points to this, but we are starting to hear rumors that the PS5's APU may be coming to PCs.

Whether it has 6 or 8 active CPU cores, or 16, 24 or 32 active CUs on the GPU side, does not matter. It may point to a reality that is right before us for the future of entry-level products.
 

Mopetar

Diamond Member
Jan 31, 2011
8,011
6,455
136
Chiplets will have all of the functionality of a monolithic die. 1 chiplet is essentially a monolithic die. Each chiplet has 40CUs. Top SKU has 4 chiplets, bottom SKU has 1.

Really just sounds like running a 4-way crossfire setup at that point, only now it's all on a single card.

Building the chiplets that way does seem like it wastes some die space, since all of them would need the necessary video-out IO, decoders, etc. but I suppose someone who really wanted to process a lot of video wouldn't mind having that additional hardware.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,064
7,490
136
AMD Radeon 7900XT to feature up to 15k cores!

That's 3x 6900XT. Another 7970 moment? There are some misreportings in the article like N33 being 7800XT but long time since we got something new.

Can someone translate this, and what does he mean by 6x10?

- If the "Chiplet" die is essentially a 6900XT (80CU/5,120SP die on a 5nm process), and AMD has essentially *solved* scaling, then the rumors start lining up.

Most likely more red herrings or theorycrafting this early in the game, but it does make one wonder. Shovel more coal on the hypetrain boiler.

Back to:
7900XT - 3x N33 dies = 240CU = 15,360SPs
7800XT - 2x N33 dies = 160CU = 10,240SPs
7700XT - 1x N33 die = 80CU = 5,120SPs = Basically a 6900XT with maybe a clock-bump.
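
The numbers in that list are just multiples of one die, which a few lines make explicit. This is only a sketch of the rumour as stated: the 80CU die and the RDNA2-style 64 SPs per CU are the assumptions, and the names are made up.

```python
SP_PER_CU = 64    # RDNA2 figure, assumed unchanged here
CU_PER_DIE = 80   # the rumoured "6900XT-like" N33 die from the list above

def stack_config(dies):
    """Return (total CUs, total SPs) for a card built from `dies` identical dies."""
    cus = dies * CU_PER_DIE
    return cus, cus * SP_PER_CU

for sku, dies in [("7900XT", 3), ("7800XT", 2), ("7700XT", 1)]:
    cus, sps = stack_config(dies)
    print(f"{sku}: {dies}x N33 = {cus}CU = {sps:,}SPs")
```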
 
Reactions: Tlh97 and Leeea

Saylick

Diamond Member
Sep 10, 2012
3,386
7,151
136
Damn! If the 15360 SP rumour is true, then the 2.7x perf increase over the 6900xt seems far more likely.

Same here. This is way too cryptic for me.
My guess is 6x10 meaning the grid of WGPs? That would mean 60 WGPs, but if Kopite7kimi is to be believed, Navi 31 has 15360 stream processors, that means each WGP has 256 SPs. If each CU is still 64 SPs, then that implies each WGP has 4 CUs, double that of RDNA 2.

Now, let's assume that this isn't the case, that the WGP:CU ratio is still 1:2 like RDNA 2. If Navi 31 is 2xN33, that implies that "6x10" refers to the number of WGPs in a single chiplet/N33 die. That's 60 WGPs or 120 CUs per chiplet, or 240 CUs in total, which is triple that of N21.
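
Both readings can be checked with a bit of arithmetic. A sketch only: the 15360 SP total is the Kopite7kimi rumour, 64 SPs per CU and N21's 80 CUs are the RDNA 2 figures, and the function name is made up.

```python
RDNA2_SP_PER_CU = 64
N21_CUS = 80
RUMOURED_SPS = 15360  # Kopite7kimi's figure, unconfirmed

def total_sps(chiplets, wgp_per_chiplet, cu_per_wgp, sp_per_cu=RDNA2_SP_PER_CU):
    """Shader count for a given chiplet / WGP / CU layout."""
    return chiplets * wgp_per_chiplet * cu_per_wgp * sp_per_cu

# Reading 1: "6x10" is 60 WGPs for the whole GPU.
sp_per_wgp = RUMOURED_SPS // 60                 # SPs each WGP must hold
cu_per_wgp = sp_per_wgp // RDNA2_SP_PER_CU      # CUs per WGP if a CU stays at 64 SPs

# Reading 2: "6x10" is 60 WGPs per chiplet, two chiplets, WGP:CU still 1:2.
reading2_sps = total_sps(chiplets=2, wgp_per_chiplet=60, cu_per_wgp=2)
reading2_cus = 2 * 60 * 2
ratio_vs_n21 = reading2_cus / N21_CUS

print(sp_per_wgp, cu_per_wgp, reading2_sps, reading2_cus, ratio_vs_n21)
```

Either way the rumoured total works out; the two readings only differ in whether the extra shaders come from fatter WGPs or from more of them.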
 

Mopetar

Diamond Member
Jan 31, 2011
8,011
6,455
136
Back to:
7900XT - 3x N33 dies = 240CU = 15,360SPs
7800XT - 2x N33 dies = 160CU = 10,240SPs
7700XT - 1x N33 die = 80CU = 5,120SPs = Basically a 6900XT with maybe a clock-bump.

I don't know what other dies they would have planned, but it would be a little odd for the X700 part to be the top die. Obviously there's no rule that says AMD has to stick to the conventions they've used before, but they usually do.

There's also a matter of what they do with dies that aren't fully enabled. There are still going to be some defective parts (and large GPU dies make this a lot more likely) and those that need some hardware disabled to hit voltage targets. This seems to ignore that these exist at all.

With that much potential for variation it's hard to say how they'll handle the naming, but I wouldn't be too surprised if we see a departure from the current scheme to something that's better suited for a multi-chip approach.

I'm also not even sure that games would benefit all that much from such a massive number of shaders. We've already seen how much performance scaling starts to fall off after the 3080 even at 4K. The raw performance is probably great for compute workloads, but I don't think we'll see 2.7x when it comes to gaming.
 

gdansk

Platinum Member
Feb 8, 2011
2,492
3,387
136
Scaling at low resolutions will probably be awful for all the next generation GPUs but I wonder what portion of that you can actually attribute to the GPUs and not the CPUs feeding them.
 

jrdls

Junior Member
Aug 19, 2020
12
12
51
My guess is 6x10 meaning the grid of WGPs? That would mean 60 WGPs, but if Kopite7kimi is to be believed, Navi 31 has 15360 stream processors, that means each WGP has 256 SPs. If each CU is still 64 SPs, then that implies each WGP has 4 CUs, double that of RDNA 2.

Now, let's assume that this isn't the case, that the WGP:CU ratio is still 1:2 like RDNA 2. If Navi 31 is 2xN33, that implies that "6x10" refers to the number of WGPs in a single chiplet/N33 die. That's 60 WGPs or 120 CUs per chiplet, or 240 CUs in total, which is triple that of N21.

60 WGP with 2 CU each in a Navi31 die makes perfect sense but it doesn't add up with what we know from the Linux kernel.

I believe num_se should be 6. Also, @uzzi38 believes that a WGP in RDNA3 is not the same as a WGP in RDNA2 (probably with good reason), so who knows what's going on with RDNA3.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
60 WGP with 2 CU each in a Navi31 die makes perfect sense but it doesn't add up with what we know from the Linux kernel.

I believe num_se should be 6. Also, @uzzi38 believes that a WGP in RDNA3 is not the same as a WGP in RDNA2 (probably with good reason), so who knows what's going on with RDNA3.
Only thing I can say for now is that it's 60WGPs and the performance target (for now) is >2.6x.

I can't imagine that performance target would even be possible with just 120CUs - there have to be significant changes to what makes up each WGP.

Give it a few weeks or so, there's still time before we get any concrete information.
 

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
Only thing I can say for now is that it's 60WGPs and the performance target (for now) is >2.6x.

I can't imagine that performance target would even be possible with just 120CUs - there have to be significant changes to what makes up each WGP.

Give it a few weeks or so, there's still time before we get any concrete information.

60 WGPs total or per die? Assuming total then to hit 15k shaders means either 128 shaders per CU or 4 CUs per WGP.

If it is 60WGPs per die then they can keep the current 1:2:64 ratio of WGPs to CUs to Shaders.
 

Trumpstyle

Member
Jul 18, 2015
76
27
91
60 WGP with 2 CU each in a Navi31 die makes perfect sense but it doesn't add up with what we know from the Linux kernel.


I believe num_se should be 6. Also, @uzzi38 believes that a WGP in RDNA3 is not the same as a WGP in RDNA2 (probably with good reason), so who knows what's going on with RDNA3.
It can make sense: Navi31 has 3 CU dies with 5120 shaders each, and 5120 shaders = 80 CUs. That is if we go by Kopite saying that Navi31 has 15360 shaders.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
60 WGPs total or per die? Assuming total then to hit 15k shaders means either 128 shaders per CU or 4 CUs per WGP.

If it is 60WGPs per die then they can keep the current 1:2:64 ratio of WGPs to CUs to Shaders.
Total. I'm pretty sure 15k shaders is speculation - in truth shader count is still unknown. Probably the right speculation, but speculation nonetheless
 

Trumpstyle

Member
Jul 18, 2015
76
27
91
Total. I'm pretty sure 15k shaders is speculation - in truth shader count is still unknown. Probably the right speculation, but speculation nonetheless
Yes, I have noticed this with leakers: you don't know when they are leaking stuff or just speculating. Despite the new info I still think what I speculated in my previous post is accurate.


So 10240 shaders for Navi31.
 
Reactions: Tlh97 and Leeea

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
Total. I'm pretty sure 15k shaders is speculation - in truth shader count is still unknown. Probably the right speculation, but speculation nonetheless

60 WGP total means each WGP has had a pretty large overhaul if they want to hit that 2.6x perf target.

With such a large overhaul it makes speculating on performance based on shader counts or anything pretty impossible so I think I will wait until early performance leaks are available or if AMD give us more data on their perf/watt target increase.
 
Reactions: Tlh97 and Leeea