Discussion RDNA4 + CDNA3 Architectures Thread

Page 298 - AnandTech community forums

DisEnchantment

Golden Member
Mar 3, 2017
With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced, to prevent leaks.
But the flurry of code landing in LLVM amounts to a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of no host CPU capable of PCIe 5 arriving in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts; the MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

gaav87

Senior member
Apr 27, 2024
A Chinese user came to my Twitter and said the Chiphell boss said, I quote, "i will chop my dk off if the price is above 5k yuan". Other Chinese users said that's a bad Google translation and it actually reads "no more than 5k yuan".
So the price is at most roughly $690 (5k yuan).
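A quick sanity check of the conversion (the exchange rate below is an assumption; actual rates fluctuate):

```python
# Rough CNY -> USD sanity check; 7.25 CNY/USD is an assumed rate,
# not a quoted market figure.
price_cny = 5_000
cny_per_usd = 7.25
price_usd = price_cny / cny_per_usd
print(round(price_usd))  # 690
```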
 

ToTTenTranz

Senior member
Feb 4, 2021
$749 will be funny. I need it at $749.

$749 would be super funny. I'd get to watch no one buying RDNA4 because AMD suddenly got too high on the smell of their own farts.


Blackwell is indeed a failure on the hardware front, but on software (number of RTX optimized games, transformer-based DLSS4 looking spectacular, DLSS4 tool doing automatic DLL replacements, deals to get RTX Mega Geometry going, etc.) Nvidia has never been as strong.


AMD's leadership is mighty stupid and insane if they believe this is the time they can get away with the "Nvidia -15% price for -10% performance" strategy that's been failing for a decade.
 

adroc_thurston

Diamond Member
Jul 2, 2023
$749 would be super funny. I'd get to watch no one buying RDNA4 because AMD suddenly got too high on the smell of their own farts.
No that would be based.
(number of RTX optimized games, transformer-based DLSS4 looking spectacular, DLSS4 tool doing automatic DLL replacements, deals to get RTX Mega Geometry going, etc.) Nvidia has never been as strong.
meh.
AMD's leadership is mighty stupid and insane if they believe this is the time they can get away with the "Nvidia -15% price for -10% performance" strategy that's been failing for a decade.
It actually worked since they stopped selling boards at a loss (sans turds like Polaris).
 

carrotmania

Member
Oct 3, 2020
Blackwell is indeed a failure on the hardware front, but on software (number of RTX optimized games, transformer-based DLSS4 looking spectacular, DLSS4 tool doing automatic DLL replacements, deals to get RTX Mega Geometry going, etc.) Nvidia has never been as strong.

Transformer brings its own set of problems; it is nowhere near perfect. FSR4 is a bigger leap over FSR3 than Transformer is over CNN. All AMD has to do is come up with a fancy name for their new model. Go-bot Model!
 

Meteor Late

Senior member
Dec 15, 2023
Transformer brings its own set of problems; it is nowhere near perfect. FSR4 is a bigger leap over FSR3 than Transformer is over CNN. All AMD has to do is come up with a fancy name for their new model. Go-bot Model!

Yeah, no shit, the difference between DLSS and FSR was like a galaxy apart; FSR was that bad. So FSR4 being a good improvement should indeed reduce the difference, because it was gigantic.
But there is no doubt AMD should be in a much better position software-wise, or should I say feature-wise. The RT difference looks like it will be fairly small this time.
 

basix

Member
Oct 4, 2024
I assume that FSR4 uses a Vision Transformer as well. The first ViT paper was published in 2020, afaik. And when AMD did their option evaluation for FSR4 around a year ago, ViT was surely one of the options, not only CNN.

But let's wait and see. The first footage from CES looks promising. But SR is not the whole thing; RR is also in research at AMD. And what about an improved AFMF and RSR?
And because Nvidia has brought MFG to the table, AMD is likely to be "forced" to bring a MFG contender as well.

But it is true: FSR4 will likely close the gap to DLSS. Maybe not completely, but closer than before.
 

GTracing

Senior member
Aug 6, 2021
For those who missed it, Hardware Unboxed has a pretty good video of FSR4 from CES. It's off-screen, but you can still tell that it's a huge improvement. I'm not sure if it's as good as DLSS 3.8, but it's close.


Digital Foundry has a video comparing FSR 3.1 to alternatives in the same game. It does a good job of pointing out the flaws of FSR3.

 

gaav87

Senior member
Apr 27, 2024
I assume that FSR4 uses a Vision Transformer as well. The first ViT paper was published in 2020, afaik. And when AMD did their option evaluation for FSR4 around a year ago, ViT was surely one of the options, not only CNN.

But let's wait and see. The first footage from CES looks promising. But SR is not the whole thing; RR is also in research at AMD. And what about an improved AFMF and RSR?
And because Nvidia has brought MFG to the table, AMD is likely to be "forced" to bring a MFG contender as well.

But it is true: FSR4 will likely close the gap to DLSS. Maybe not completely, but closer than before.
AFMF 2.1 is already in the beta drivers and looks better than 2.0, that's for sure; still some ghosting, but way better.
 

Tup3x

Golden Member
Dec 31, 2016
It really wasn't. DLSS was craptastic up to 3.5, not long ago. 3.8 is still awful in many cases. FSR4 looks to be better than 3.8, whereas Transformer (D4) is better in some regards and actually worse in other cases.
No one has done a proper comparison yet. All we know is that FSR4 looks better than the previous version. Also, it sounds like you haven't tested the latest DLSS/DLAA (preset K).
 
Reactions: Tlh97 and Gideon

steen2

Junior Member
Aug 21, 2024
It really wasn't. DLSS was craptastic up to 3.5, not long ago. 3.8 is still awful in many cases. FSR4 looks to be better than 3.8, whereas Transformer (D4) is better in some regards and actually worse in other cases.
DLSS had the added benefit of looking good in static scenes. Motion is a blur/occlusion fest. FSR is naturally worse.
And because Nvidia has brought MFG to the table, AMD is likely to be "forced" to bring a MFG contender as well.

But it is true: FSR4 will likely close the gap to DLSS. Maybe not completely, but closer than before.
FSR 3 has had the option for MFG, but it was never enabled in release drivers.
No one has done a proper comparison yet. All we know is that FSR4 looks better than the previous version. Also, it sounds like you haven't tested the latest DLSS/DLAA (preset K).
There's some regression going from preset J to K. Work in progress with FP8, depending on the E4M3/E5M2 datatypes for AD/GB.
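For context on the two datatypes mentioned: E4M3 and E5M2 split the 8 bits differently between exponent and mantissa, trading range for precision. A small sketch of their largest finite values, following the OCP FP8 conventions (the function name is mine):

```python
def fp8_max(exp_bits, man_bits, ieee_like):
    """Largest finite value of a miniature float format.

    ieee_like=True  -> top exponent reserved for inf/NaN (E5M2 convention)
    ieee_like=False -> only the all-ones mantissa at the top exponent
                       is NaN (E4M3 convention, per OCP FP8)
    """
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        emax = (2 ** exp_bits - 2) - bias      # top exponent unusable
        frac = 2 - 2 ** -man_bits              # all-ones mantissa usable
    else:
        emax = (2 ** exp_bits - 1) - bias      # top exponent usable
        frac = 2 - 2 * 2 ** -man_bits          # all-ones mantissa is NaN
    return frac * 2 ** emax

print(fp8_max(5, 2, True))   # 57344.0 -> E5M2: huge range, coarse steps
print(fp8_max(4, 3, False))  # 448.0   -> E4M3: small range, finer steps
```

The upshot: E5M2 covers a much wider dynamic range, while E4M3 spends the extra mantissa bit on precision, which is why a renderer can prefer one or the other depending on the tensor being quantized.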
 
Reactions: Tlh97 and coercitiv

branch_suggestion

Senior member
Aug 4, 2023

SpudLobby

Golden Member
May 18, 2022
BTW, with the latest OpenAI o3, as I understand it, the advances come mainly from inference: the scales have totally tipped from the biggest advances coming from training (in the past) to the biggest advances coming from inference (in o3).

And then the best inference queries can consume huge amounts of compute resources. Over and over...
Man, you guys really are behind here. Adroc/Spec is the worst and has just straight-up denied AI is useful, thinking the whole thing is a bubble, but lol.

Test-time compute is a more efficient way of doing inference, as opposed to scaling inference laterally with much more parallel compute and memory (more parameters). That is literally the entire point: a bit more time as the model searches and reasons over the problem (which it is both trained and RL'd to do). The other beneficial side effect is that training (and/or applying reinforcement learning to) a base model to turn it into a test-time reasoning model is also less expensive to develop than a vastly larger "equivalent" model (if such a thing can be said).

It trades off a good bit of space and energy for time at inference, which offers massively improved "efficiency" in the sense of performance/$/FLOP at complex tasks. Specifically, in one paper, test-time models beat 14x larger models by using a bit more time (being trained to reason). You can also just go see the DeepSeek R1 30-32B model on Macs blow everything else out for a local LLM. The V3 base model was good, but this is about TTC.
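The "14x larger" claim can be sanity-checked with the common ~2 FLOPs per parameter per generated token rule of thumb for inference cost. The model sizes and token counts below are illustrative assumptions, not real configs:

```python
def inference_flops(params, tokens):
    # Rough rule of thumb: ~2 FLOPs per parameter per generated token
    return 2 * params * tokens

# Illustrative numbers, not real model configs:
big   = inference_flops(70e9, 1_000)  # large model, short answer
small = inference_flops(5e9, 8_000)   # 14x smaller, 8x longer reasoning trace

print(small < big)  # True: the reasoning model is still cheaper per query
```

Even with an 8x longer output, the smaller reasoning model spends fewer total FLOPs per query, which is the economic argument for test-time compute.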

The effective result here isn’t that larger models are going away, but rather that this is a new (efficient adjusting for the performance) scaling avenue across the board which makes even more expensive and humongous inference economically viable.

And o3-mini is over an order of magnitude less expensive than o1, while offering more performance.
 
Reactions: xpea