Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,754
6,631
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not that far off!
AMD usually takes around three quarters to land support in LLVM and amdgpu. Lately, since RDNA2, they have shortened the window in which they push support for new devices, to prevent leaks.
But judging by the flurry of code in LLVM, this is a lot of commits. Maybe it's because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

Thunder 57

Diamond Member
Aug 19, 2007
3,283
5,389
136
Putting aside that 6600s were barely available outside the US for much of the past few years (think 2022/23), Nvidia is practically the only choice for CS majors who need to learn programming with GPUs (and game a bit for R&R). The 6600 being 40% more powerful or whatever doesn't matter when ROCm is/was a tire fire.

I meant for gamers, as that is the primary seller, and I wouldn't know about outside the US. I do know that inside the US, 3050s were all over Amazon's top sellers compared to 6600s. I doubt many of those buyers were using ROCm. But I'll shut up about this now, as it's already pretty well known anyway and there's no need to create more controversy.
 

PJVol

Senior member
May 25, 2020
792
776
136
N31's GCD was around that size if I recall correctly, assuming you consider that a monolithic GPU like RDNA 4 has IO on it as well. If N48 can't even hit similar levels of performance, that would be just sad.
N32 is 200mm2 N5 + 150mm2 N6. Consider that going back to a monolithic design on N4 lets you save some area from the bulky fabric PHYs on N48. This alone makes one doubt that the estimated number of CUs for the 380mm2 SKU is correct, or that they are comparable to the old ones, which is what "optimized" on the presentation slide hinted at.
I never even finished The Ancient Gods Part 2. It seems to try to force you to play "their way", i.e. using certain weapon mods to kill demons
Actually they have done that since Part 1 ("spirit"), but I agree that the 2nd DLC is not as good.
before they nerf it to the ground
Not the first time I've heard this, and it's quite an exaggeration. More like they rebalanced both parts.
 
Reactions: exquisitechar

eek2121

Diamond Member
Aug 2, 2005
3,202
4,635
136
VideoCardz mentions the CoD test showed artifacts, so there is a possibility of FPS being lower once the rendering is fixed with proper drivers.
The drivers didn’t even recognize the card. Very early drivers. AMD also said that nearly all performance leaks weren’t accurate (rare statement, usually they decline to comment) and that no OEMs had final drivers.

My original prediction was that the top part would be close to 7900XTX in terms of performance, even possibly beating it. I stand by that statement. I suspect raster may lag a bit (up to 15%), heavy RT will be on par/ahead, and light RT will vary on workload, but mostly be close.

While there is a huge SM difference, the clocks on the 9070 are significantly higher, it has RT and there are architectural improvements.

Predictions:

Price : Between $449 and $499

It will trade blows or possibly beat the 5070 (ignoring NVIDIA’s marketing nonsense)

9070 non XT will be a bit behind and will be $50 less.

EDIT: The non-XT has 15% fewer shaders and slightly different clocks according to the rumor mill (which could be right or wrong)
 
Reactions: Tlh97 and Win2012R2

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,107
136
I'll wait for reviews, find it hard to believe 5090 is just 25%-ish over 4090 (without fake frames obviously)


I don't. At some point people have to realize that the fat gains of the good old days are gone. The last time I saw people happy with a release was GTX 10 series.

Everything is flatlining on the process side. Dennard scaling fell apart years ago, transistor scaling slowed down, and cost per transistor really took the biggest hit. It all has consequences.

There is still progress but that progress has slowed. Instead of 70% gains with a new generation, 20%-30% may be the new normal.
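
To put the slowdown in numbers, here is a minimal sketch of how per-generation gains compound. The 70% and 25% figures are just the rough numbers from this post (25% being the middle of the 20%-30% range), and `relative_performance` is a hypothetical helper name, not anything from a vendor.

```python
def relative_performance(per_gen_gain: float, generations: int) -> float:
    """Cumulative speedup after compounding a fixed per-generation gain.

    per_gen_gain is a fraction, e.g. 0.70 for +70% per generation.
    """
    return (1.0 + per_gen_gain) ** generations


if __name__ == "__main__":
    # Illustrative inputs only: the rough figures quoted in the post.
    for gain in (0.70, 0.25):
        for gens in (1, 2, 3):
            speedup = relative_performance(gain, gens)
            print(f"+{gain:.0%} per gen, {gens} gen(s): {speedup:.2f}x")
```

Under these assumptions, three generations at +70% compound to roughly 4.9x, while three generations at +25% compound to a bit under 2x.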
 

Gideon

Golden Member
Nov 27, 2007
1,921
4,668
136
The drivers didn’t even recognize the card. Very early drivers. AMD also said that nearly all performance leaks weren’t accurate (rare statement, usually they decline to comment) and that no OEMs had final drivers.

My original prediction was that the top part would be close to 7900XTX in terms of performance, even possibly beating it. I stand by that statement. I suspect raster may lag a bit (up to 15%), heavy RT will be on par/ahead, and light RT will vary on workload, but mostly be close.

While there is a huge SM difference, the clocks on the 9070 are significantly higher, it has RT and there are architectural improvements.

Predictions:

Price : Between $449 and $499

It will trade blows or possibly beat the 5070 (ignoring NVIDIA’s marketing nonsense)

9070 non XT will be a bit behind and will be $50 less.

EDIT: The non-XT has 15% fewer shaders and slightly different clocks according to the rumor mill (which could be right or wrong)

My predictions were initially somewhat similar, but if the die size is correct (and why shouldn't it be?) at ~390 mm², it changes the equation a bit.

It's comparable to the GB203 (which is in the 5070 Ti and 5080) and about 100 mm² larger than the AD104 that's used in the 4070 SUPER and 4070 Ti. (The GB205 used in the 5070 probably has a similar size to AD104, but that's not confirmed yet.)

If AMD only "trades blows" with the 5070 (in rasterization!) with a chip that's bigger than the one used in the 5080, it'll be a bit disappointing (miles better than Intel, but still).

TBF the RT uplift on the RTX 5070 appears to be quite large, but I'm pretty sure the raster uplift is actually <30%, i.e. within 10% of the 4070 SUPER.
 

Saylick

Diamond Member
Sep 10, 2012
3,798
8,666
136
I don't. At some point people have to realize that the fat gains of the good old days are gone. The last time I saw people happy with a release was GTX 10 series.

Everything is flatlining on the process side. Dennard scaling fell apart years ago, transistor scaling slowed down, and cost per transistor really took the biggest hit. It all has consequences.

There is still progress but that progress has slowed. Instead of 70% gains with a new generation, 20%-30% may be the new normal.
Even Nvidia’s approach of maximizing tensor performance, so that they can leverage software improvements instead of pure raster improvements, is going to come to a halt soon enough. Right now, Blackwell’s tensor units go down to INT4/FP4… I suppose they can try INT2/FP2, but with precision that low I’m not convinced it will be a meaningful improvement.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,107
136
Even Nvidia’s approach of maximizing tensor performance, so that they can leverage software improvements instead of pure raster improvements, is going to come to a halt soon enough. Right now, Blackwell’s tensor units go down to INT4/FP4… I suppose they can try INT2/FP2, but with precision that low I’m not convinced it will be a meaningful improvement.

LOL!

I'm surprised they can get anything useful out of FP4/INT4. I wonder if people are really using those, or if their best use case is inflating benchmarks (kind of an analog to fake frame generation in games).
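
For a sense of how coarse FP4 is, here is a rough sketch that enumerates every value a 4-bit float can hold. It assumes the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1, no Inf/NaN), which is the FP4 element format in the OCP Microscaling spec; whether that matches exactly what a given GPU does internally is an assumption here. The function names are made up for illustration.

```python
def fp4_e2m1_values() -> list[float]:
    """Decode all 16 bit patterns of an assumed E2M1 4-bit float."""
    values = []
    for bits in range(16):
        sign = -1.0 if bits & 0b1000 else 1.0
        exponent = (bits >> 1) & 0b11
        mantissa = bits & 0b1
        if exponent == 0:
            # Subnormal: no implicit leading 1, so only 0 or 0.5.
            magnitude = mantissa * 0.5
        else:
            # Normal: implicit leading 1, exponent bias of 1.
            magnitude = (1.0 + mantissa * 0.5) * 2.0 ** (exponent - 1)
        values.append(sign * magnitude)
    return values


def quantize_to_fp4(x: float) -> float:
    """Round x to the nearest representable value (no block scaling)."""
    return min(fp4_e2m1_values(), key=lambda v: abs(v - x))


if __name__ == "__main__":
    magnitudes = sorted({abs(v) for v in fp4_e2m1_values()})
    print("representable magnitudes:", magnitudes)
    for x in (0.3, 2.4, 5.2):
        print(f"{x} -> {quantize_to_fp4(x)}")
```

That is eight distinct magnitudes in total. In practice such formats are paired with a per-block scale factor, which this sketch leaves out, so the raw grid overstates how bad the error is on real tensors.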
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,016
136
Even Nvidia’s approach of maximizing tensor performance, so that they can leverage software improvements instead of pure raster improvements, is going to come to a halt soon enough. Right now, Blackwell’s tensor units go down to INT4/FP4… I suppose they can try INT2/FP2, but with precision that low I’m not convinced it will be a meaningful improvement.
The people repeating "raster is dead" never finish the sentence. Raster is dead and raytracing will never catch up.

But since machines will be inferring most pixels and most frames, it makes raytracing relatively more appealing and the deficit increasingly smaller.

It seems most of the models used to do this will be overfit and hard to replicate. So we're heading toward something really swell, I'm sure.
 

Saylick

Diamond Member
Sep 10, 2012
3,798
8,666
136
LOL!

I'm surprised they can get anything useful out of FP4/INT4. I wonder if people are really using those, or if their best use case is inflating benchmarks (kind of an analog to fake frame generation in games).
I mean, how else would they keep this plot going exponentially upwards if it doesn’t include lower precision? If the precision bottoms out, it will start to plateau at the same rate you mentioned of 20-30% per new node generation.
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,016
136
I mean, how else would they keep this plot going exponentially upwards if it doesn’t include lower precision? If the precision bottoms out, it will start to plateau at the same rate you mentioned of 20-30% per new node generation.
View attachment 114495
We're going to infer inference. Why actually calculate 4 whole bits when we can calculate 1 and predict the other 3. And since we're predicting the least significant bits it has little impact, really.
 