Question Speculation: RDNA3 + CDNA2 Architectures Thread


uzzi38

Platinum Member
Oct 16, 2019
2,702
6,405
146

PJVol

Senior member
May 25, 2020
619
549
136
It seems like if the hardware was better utilized more power would be used. If it was some game specific optimization that reduced wasted effort by the hardware then perhaps power usage would remain the same.

I am just curious, not really expecting an answer.
That's a good question. While both usually contribute to performance gains, I believe it's more of the former in the case of TLOU.
 

Attachment: TLOU 23.4.1 vs. 23.7.1.png

menhera

Junior Member
Dec 10, 2020
21
66
61

According to AMD, each cache request shown in its profiling tool is either 64 or 128 bytes: 128 bytes for vector memory and 64 bytes for instruction/scalar. So why not calculate the data transferred in real-world scenarios?



I wrote a simple program that calculates bandwidth from the given values.
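For anyone who wants to reproduce this kind of calculation, here is a minimal sketch of the idea, not the actual program used for the graphs. It assumes you already have per-frame request counts from the profiler; the function name, parameter names, and example counts are hypothetical. Only the 128-byte and 64-byte request sizes come from the post above.

```python
# Request sizes as stated above: 128 bytes for vector memory,
# 64 bytes for instruction/scalar requests.
VECTOR_REQUEST_BYTES = 128
SCALAR_INSTR_REQUEST_BYTES = 64


def bandwidth_gb_per_s(vector_requests, scalar_instr_requests, frame_time_ms):
    """Average bandwidth (GB/s) implied by cache request counts in one frame.

    The counts and the frame time are hypothetical inputs that would be
    read from a profiler capture; this is not an AMD tool API.
    """
    total_bytes = (vector_requests * VECTOR_REQUEST_BYTES
                   + scalar_instr_requests * SCALAR_INSTR_REQUEST_BYTES)
    seconds = frame_time_ms / 1000.0
    return total_bytes / seconds / 1e9


# Made-up example: 1M vector requests and 500K scalar/instruction
# requests in a 16 ms frame.
example = bandwidth_gb_per_s(1_000_000, 500_000, 16.0)
print(f"{example:.1f} GB/s")
```

Running the same function per cache level (L0, L1, L2) with that level's request counts gives the kind of per-level bandwidth comparison discussed here.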



I tested these games with my 6900 XT at 4K simply because I have them installed on my PC.



Unfortunately AMD doesn't expose hitrate for L3, but this explains why AMD added L3 to RDNA 2. VRAM would be a serious bottleneck without L3.

RDNA 2's L1 cache is really bad. It spills too many requests. That's still acceptable in rasterization, where the other caches mitigate the misses from L1. However, it becomes problematic in ray tracing. Ray tracing hurts L0 quite a bit, as shown in RE: V, and L1 can't mitigate the misses. L2 and (presumably) L3 do all the work instead, and they're just too slow for ray tracing. This happens in almost every title with ray tracing support.

Although everyone is belittling RDNA 3, I wonder how RDNA 3's increased register files, L0, and L1 behave over entire frames. Chips and Cheese covered it a bit, but only for specific workloads.
 

menhera

Junior Member
Dec 10, 2020
21
66
61

Forgot to mention that the graphs show the average bandwidth required per frame. A single frame consists of thousands of workloads, and their characteristics all seem to be different. While some workloads mostly rely on L0, there are certainly a number of workloads where L0 and L1 can't do anything. If not for L3, my 6900 XT with 512 GB/s of bandwidth would choke on those memory-intensive workloads.
 

SteinFG

Senior member
Dec 29, 2021
520
610
106
Then you should mention that in your table. It's not like I know where you got the effective BW data, except the one for N21. Where did you find them?
Also, what you calculated and the data from the graph are quite different.
N21 58% vs 62%. This isn't that different.
N22 60% vs 69%. This is very different.
N23 and N24 look weird, but you already know that.
N22: AMD's website, 6700 XT specs page
N23: Review guide for RX 7600, where they compare it to 6600
N24: AMD's website, 6500 XT specs page
N31: AMD's website, 7900 XTX specs page
As for why the data is different, I have no idea. But if you look at my calculated L3 hitrate for N21 and N22, they are very close to the nearest whole number, which suggests that those were the exact numbers used in their calculations. The rest doesn't look good though ¯\_(ツ)_/¯ I'm too lazy to do any more digging.

switched to RDNA3 thread because there was nothing about RDNA4
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,423
2,914
136
N22: AMD's website, 6700 XT specs page
N23: Review guide for RX 7600, where they compare it to 6600
N24: AMD's website, 6500 XT specs page
N31: AMD's website, 7900 XTX specs page
As for why the data is different, I have no idea. But if you look at my calculated L3 hitrate for N21 and N22, they are very close to the nearest whole number, which suggests that those were the exact numbers used in their calculations. The rest doesn't look good though ¯\_(ツ)_/¯ I'm too lazy to do any more digging.

switched to RDNA3 thread because there was nothing about RDNA4
Thanks for the info about the data. I will check it out. I wouldn't be surprised if AMD marketing made a mistake in the calculation.

edit:
I think the N21 hitrate was at 4K, N23 at 1440p and N24 at 1080p; then it would be comparable to the graph I provided.
N22 is still different, but let's just say they made a mistake.

What I find interesting is that AMD managed to increase IC channels 3x per MB, meaning that with 128MB and the same clock it would provide 3x higher BW. BTW, it looks like the N31 IC runs at 2.3GHz.
 

SteinFG

Senior member
Dec 29, 2021
520
610
106
N31 on N32 package spotted in the wild. 84 CUs, 16GB:

top image is Nitro+ 7900XTX, bottom is 7900 GRE
 

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
N31 on N32 package spotted in the wild. 84 CUs, 16GB:

top image is Nitro+ 7900XTX, bottom is 7900 GRE

So a card that will sometimes match the 7900XT and other times fall behind due to bandwidth. Reminds me of the 5600XT.

If this launches globally and is priced in the $600 - $650 range, it would be a good card IMO: similar performance to a 4070Ti in pure raster, but with more VRAM and bandwidth to give it longer legs, and priced a tad above the standard 4070.
 

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
Here is the full spec of the 7900 GRE from this Chinese review.

Model: AMD Radeon RX 7900 GRE
Core code: Navi 31
Number of Stream Processor Units: 5120
Base Frequency (MHz): 1287
Boost Frequency (MHz): 2245
Memory Speed (Gbps): 18
Memory type: GDDR6
Memory capacity (GB): 16
Memory interface width (bit): 256
Video interface: HDMI 2.1 ×1, DisplayPort 2.1 ×2, USB Type-C ×1
Power supply interface: 2 × 8-pin
TBP (W): 260
Size (L×W×H): 268mm × 51mm × 97mm

So slower RAM as well. 4K performance puts it quite a bit behind the 7900XT but quite a way ahead of the 4070. Taking the delta to the 7900XT, it looks like this is about on par with the 6950XT, which with the above spec is hardly surprising tbh.
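To put a number on the slower RAM, here is a quick sketch of the usual peak-bandwidth arithmetic (per-pin data rate times bus width, divided by 8 bits per byte). The GRE inputs come from the spec list above; the 7900 XT figures (20 Gbps, 320-bit) are not from the review and are included only as an assumed comparison point. The function name is made up for illustration.

```python
def memory_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Theoretical peak GDDR6 bandwidth in GB/s.

    data_rate_gbps: per-pin data rate in Gbps
    bus_width_bits: memory interface width in bits
    """
    return data_rate_gbps * bus_width_bits / 8


# 7900 GRE, from the spec list above: 18 Gbps on a 256-bit bus.
gre = memory_bandwidth_gb_s(18, 256)

# 7900 XT, assumed here for comparison: 20 Gbps on a 320-bit bus.
xt = memory_bandwidth_gb_s(20, 320)

print(f"7900 GRE: {gre:.0f} GB/s")
print(f"7900 XT:  {xt:.0f} GB/s")
print(f"GRE has {gre / xt:.0%} of the XT's raw bandwidth")
```

This only covers raw VRAM bandwidth; it says nothing about Infinity Cache hit rates, which also differ between the two cards.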



Package certainly looks smaller than the one on the 7900 XTX / XT



Compared to N32 we have

5SE vs 3SE
80CU vs 60CU.

To overcome the raw hardware deficit, N32 needs 66% higher clocks to match in terms of ROPs, or 33% higher clocks to match in terms of shader performance. Given the performance, I don't think 160 ROPs are really required. Based on the spec boost clock of 2245 MHz, that would mean a 3 GHz N32-based part with 16GB of 18 Gbps RAM should perform in a similar ballpark to this thing.
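The clock figures above follow from simple ratios. A small sketch of the arithmetic, assuming ROP throughput scales with shader engine count and shader throughput with CU count, and that performance scales linearly with clock (a simplification, and the variable names are made up for illustration):

```python
# Figures from the post: 7900 GRE (cut-down N31) vs a full N32.
GRE_SHADER_ENGINES, GRE_CUS, GRE_BOOST_MHZ = 5, 80, 2245
N32_SHADER_ENGINES, N32_CUS = 3, 60


def required_clock_uplift(bigger, smaller):
    """Fractional clock increase the smaller chip needs to match the
    bigger one, assuming throughput = unit count * clock."""
    return bigger / smaller - 1.0


rop_uplift = required_clock_uplift(GRE_SHADER_ENGINES, N32_SHADER_ENGINES)
shader_uplift = required_clock_uplift(GRE_CUS, N32_CUS)

# Clock an N32 part would need to match the GRE's shader throughput
# at the GRE's spec boost clock.
n32_match_mhz = GRE_BOOST_MHZ * GRE_CUS / N32_CUS

print(f"ROP parity needs    +{rop_uplift:.1%} clock")
print(f"Shader parity needs +{shader_uplift:.1%} clock")
print(f"N32 clock for shader parity: {n32_match_mhz:.0f} MHz")
```

The last value lands just under 3 GHz, which is where the "3 GHz N32" estimate comes from.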
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,058
7,478
136
Let's just sit for a moment and admire what a flaming trash fire AMD's execution has been this generation.

It really is a sight to behold.

How does AMD so deftly, and seemingly skillfully, go from tightly executed and well-oiled launches like RDNA2 to... whatever the hell is happening right now?

And off in the distance is NV chugging away, selling crappy chips one tier up for absurd prices across a full product stack.
 

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
Let's just sit for a moment and admire what a flaming trash fire AMD's execution has been this generation.

It really is a sight to behold.

How does AMD so deftly, and seemingly skillfully, go from tightly executed and well-oiled launches like RDNA2 to... whatever the hell is happening right now?

And off in the distance is NV chugging away, selling crappy chips one tier up for absurd prices across a full product stack.

I think it stems from the fact that RDNA3 missed the 1.5x perf/watt increase they were expecting.

Had they hit that, the 7900XTX would be closer to the 4090 and the 7900XT would probably match the 4080. That would make the initial $900 price of the 7900XT far more palatable, and they probably would have charged $1200 for the XTX.

From that, a 3 GHz+ N32 would be up there ahead of AIB 6950XTs with pricing room to be sold for $700 ish, and the cut-down 7700XT could come in at $450 or so with 6800XT-tier performance.

The 7600 would also perform better in its power envelope, pushing it to around 6700XT-tier performance.

Missing this has meant AMD have had to scramble to get their stack launched in a world where top N32 is either massively underperforming or has a price ceiling due to the poor 7900XT performance and the crashing prices.
 

Aapje

Golden Member
Mar 21, 2022
1,467
2,031
106
@Timorous

The real issue was not that they missed their targets, but that they thought that they could fix it in the drivers. So they didn't want to price it too low and then offer too much price/performance once the drivers were fixed.

But with the driver never being able to unlock that supposed potential, the positioning was just wrong.
 

jpiniero

Lifer
Oct 1, 2010
14,831
5,444
136

Videocardz makes it sound like the 7900GRE will launch globally but only through OEMs outside of China.
 
Jul 27, 2020
17,849
11,642
116
Faster than a 4070? Meh. They have such low ambitions. Could have targeted the 4070 Ti at least at $600. Morons.

I feel like going out in the desert and launching a few rockets into the sand dunes to cool off.
 

eek2121

Diamond Member
Aug 2, 2005
3,045
4,266
136
They should have named it the 7900 and launched it globally. I hate AMD’s naming scheme, however. Their scheme has folks mentally comparing 7s, 8s, and 9s with the competition, which makes them look bad. The 7900XTX is a 7800XT at best.

Maybe next gen they will do better. “Hey AMD, give me a reason to upgrade my 4090 to an AMD product.”
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
It costs a lot more than N21, especially since it's likely the price gap between N5/N4 and N7/N6 has widened.

This is just a dud dump.

If N32 is as bad as the rest of the lineup, AMD will absolutely need an N31 16GB to fill that gap in the lineup.

Only if N32 turns out MUCH better than expected, as I keep hoping (significantly better perf/CU than other RDNA, and maybe more features (RDNA 3.5)), can they get by without a 16 GB N31.

If AMD eventually releases (very late) N32 cards, and they are no better than the rest of RDNA3 lineup, it just adds to the picture of AMD GPU incompetence.
 