Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,754
6,631
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US government is starting to prepare the software environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of no host CPU capable of PCIe 5 being available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

Keller_TT

Member
Jun 2, 2024
113
112
76
Don't remember, I just remember the graph and people talking about it.
I peg it at 4070 Ti for raster and 4070 Ti Super for RT based on detailed specs. The charts seem to be in that range too.
Jensen couldn't go beyond $549 as it's 12 GB, and AMD would be near enough for a $400 16 GB, 4070S-level undercut even if it's not as efficient.
 

soresu

Diamond Member
Dec 19, 2014
3,501
2,782
136
There is also a 5070 compared to 4070 bar graph.

Given that raster perf generally improves significantly more slowly per gen than RT, we can infer that how much it has changed isn't going to be any great surprise.
 

Keller_TT

Member
Jun 2, 2024
113
112
76
Looks like Sony is the most judicious when it comes to maximizing PPA and PPW for an enthusiast console.
Their custom Navi2 6800 + Navi 48 uGPU paired with Zen2 is like a 3.6 GHz 3700X + 3070 Ti with 16 GB of VRAM on tap. For mid-range, price-conscious gamers, that's the best bang for the buck even if it isn't as subsidized as the PS5.
It will improve optimization for the PC too.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,108
136
Nvidia is all AI focused. EDIT: Gaming is an afterthought.

Despite some political rhetoric, corporations aren't people. They can focus on multiple things.

There are ZERO signs they are spending less on gaming R&D, and I would bet they are probably spending more. The people that work in those divisions are 100% focused on gaming.

They have more money from and for AI, but that hasn't taken anything away from their gaming division, which is most likely better funded and better staffed than it has ever been.
 

soresu

Diamond Member
Dec 19, 2014
3,501
2,782
136
Anyone have any clue if RDNA4 has any cooperative vectors for neural rendering like Nvidia and Intel (VPU) have?
Neural rendering in this context is just code for "uses ML ops to augment performance or image quality".

In which case yes, RDNA4 reportedly receives a major uptick for computing many ML data types/ops through improvements to its CUs.

Tensor cores, matrix cores, Intel VPU etc are just domain specific (tuned to/designed for a specific workload type) accelerators for AI/ML ops, whereas CUs and CUDA cores are general compute that work with everything, albeit not at its most optimal performance or efficiency.
 

poke01

Diamond Member
Mar 8, 2022
3,036
4,018
106
Anyone have any clue if RDNA4 has any cooperative vectors for neural rendering like Nvidia and Intel (VPU) have?

View attachment 114850
View attachment 114851
Neural rendering in this context is just code for "uses ML ops to augment performance or image quality".

In which case yes, RDNA4 reportedly receives a major uptick for computing many ML data types through improvements to its CUs.

Tensor cores, matrix cores, Intel VPU etc are just domain specific (tuned to/designed for a specific workload type) accelerators for AI/ML ops, whereas CUs and CUDA cores are general compute that work with everything, albeit not at its most optimal performance or efficiency.
AMD can support this with RDNA4
 

soresu

Diamond Member
Dec 19, 2014
3,501
2,782
136
AMD can support this with RDNA4
That would be my assumption yes.

At the end of the day nVidia have a problem trying to create a walled garden in gaming, because AMD has sewn up the high end in consoles. Most games produced will therefore need to be flexible in hardware implementation if nVidia want said features to actually be used by game devs.

The same is true with nVidia doing research into speeding up path tracing with ReSTIR etc in order to further their push towards RT gaming.

At the end of the day it needs to be flexible enough to have wide hardware support so that devs don't have to write entirely redundant code to support multiple platforms.
 

Win2012R2

Senior member
Dec 5, 2024
647
609
96
AMD can support this with RDNA4
They could "support" ray tracing too in RDNA2, just very slowly... seems inevitable since Nvidia specifically said they've got some kind of "light" tensors very tightly integrated with shaders or something along those lines, so for light neural stuff it will be very efficient.

But in any case it will take Microsoft at least a year to push an update out to DX, and then years before any major game starts supporting it. By that point DLSS 8 will be playing games for you...
 

gaav87

Senior member
Apr 27, 2024
452
794
96
AMD can support this with RDNA4
Neural rendering in this context is just code for "uses ML ops to augment performance or image quality".

In which case yes, RDNA4 reportedly receives a major uptick for computing many ML data types/ops through improvements to its CUs.

Tensor cores, matrix cores, Intel VPU etc are just domain specific (tuned to/designed for a specific workload type) accelerators for AI/ML ops, whereas CUs and CUDA cores are general compute that work with everything, albeit not at its most optimal performance or efficiency.

Hm, would this mean they have separate HW for RT now?
 

gaav87

Senior member
Apr 27, 2024
452
794
96
During the keynote what Jensen said was they're now running tensor ops in the cuda cores and not just the tensor cores, and that allows them to use AI in some game-specific instructions.
I don't think that means they can just add the tensor output from cuda/shader cores to the tensor cores, as they still share the same L1 and L2 IIRC.




Underwhelming or not, they're completely alone from ~$500 up, which means they can ask however much they want for the 5090.
And what if they are running neural shader ops on tensor cores with the use of cooperative vectors (from the screenshot on the Nvidia website)?
Why would they use expensive GDDR7 for just a +10% performance uplift? And what if the +33-45% leather jacket showed is correct and that made AMD scared? Who knows...

 

Keller_TT

Member
Jun 2, 2024
113
112
76
Nope, RDNA4 RT is done on CUs.
RDNA 4 does add RT-specific hardware, though, for BVH traversal and RT calculations. RDNA 3 introduced WMMA support to speed up matrix multiplication, which on RDNA 2 was also done on the CUs and was a very inefficient resource hog. RT was an afterthought for RDNA2.
AMD didn't add any special "matrix cores"; they just beefed up the CUs with hardware acceleration for RT & ML data structures.

On paper, RDNA4 doubles RT ops over RDNA 3, and AMD also had bottlenecks that kept RDNA 3 from unlocking the full potential of WMMA, which should be rectified in the new architecture. So overall, I would definitely expect a 2-3x boost over RDNA2, depending on the game. Except for heavy path tracing, the gap to 5070-class Blackwell should be significantly smaller.
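
For anyone unsure what a WMMA-style op actually computes, here is a rough sketch in plain Python. It is only a conceptual illustration, not GPU code and not AMD's implementation: the 16x16 tile size and the name `mma_tile` are assumptions picked for the example. The point is that one such op replaces the whole triple loop below with a single fused multiply-accumulate over a tile, D = A x B + C.

```python
# Conceptual sketch only: plain Python, not real GPU code.
# TILE and mma_tile are illustrative assumptions, not an AMD API.
TILE = 16  # assumed tile edge length


def mma_tile(a, b, c):
    """Return D = A x B + C for TILE x TILE matrices given as lists of lists."""
    d = [row[:] for row in c]  # start from the accumulator, leave c untouched
    for i in range(TILE):
        for j in range(TILE):
            acc = d[i][j]
            for k in range(TILE):
                acc += a[i][k] * b[k][j]  # multiply-accumulate step
            d[i][j] = acc
    return d


# Example inputs: identity for A, a ramp for B, all ones for C.
a = [[1 if i == j else 0 for j in range(TILE)] for i in range(TILE)]
b = [[i * TILE + j for j in range(TILE)] for i in range(TILE)]
c = [[1] * TILE for _ in range(TILE)]
d = mma_tile(a, b, c)
```

Without matrix instructions, each of those inner multiply-adds is issued as ordinary vector math, which is the "resource hog" case described above.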
 