Discussion RDNA4 + CDNA3 Architectures Thread

DisEnchantment

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, there are a lot of commits. Maybe this is because the US government is starting to prepare the software environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before changes get merged on GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

poke01

SLOW
These benchmarks are all over the place, from 150k on the 9070 XT to 195k on the 9070 non-XT
GB is so useless...
This big variance occurs when people run GB in the background while they run other GPU benchmarks.

Otherwise, HOW do you explain this 18729 score and the other big differences?

I run GPU tests in GB on AMD, NV, Apple, etc., and they are always consistent.

 

branch_suggestion

More WMMA details.
 

tsamolotoff

These benchmarks are all over the place, from 150k on the 9070 XT to 195k on the 9070 non-XT


Same as CPU Geekbench: the workloads are so short that unless you fix the frequency, the GPU simply doesn't ramp up to full 3D clocks.
 

Kepler_L2


Heartbreaker

That's a lot of ifs for stuff that isn't baked into N48 already.

It's going to be a lot easier for them to just make a 600-700mm² 128 CU die, if that's what they want, than to screw around with dual GCDs. Nvidia's hand was forced with Blackwell because it's impossible to manufacture a 1600mm² die, and even with Nvidia's resources they've had manufacturing issues getting B200 out the door.

Also, splitting the main compute die is MUCH easier for compute than it is for gaming. They have been doing multiple dies for GPU compute with much slower interfaces with no issues.

Gaming needs to work precisely like one die, or you are hosed, back to needing something like SLI/CrossFire software and split memory pools.

I really liked the RDNA 3 Mem Controller/Cache chiplets as a good way to avoid the split compute problem.

Of course, an even simpler way to avoid it is to just stay smaller and avoid MCM problems completely.
 

eek2121

Heartbreaker said:
Also, splitting the main compute die is MUCH easier for compute than it is for gaming. They have been doing multiple dies for GPU compute with much slower interfaces with no issues.

Gaming needs to work precisely like one die, or you are hosed, back to needing something like SLI/CrossFire software and split memory pools.

I really liked the RDNA 3 Mem Controller/Cache chiplets as a good way to avoid the split compute problem.

Of course, an even simpler way to avoid it is to just stay smaller and avoid MCM problems completely.

Chasing clocks would be the sensible thing to do IMO. They also really need to get the chiplet thing figured out.

Unsure why they were unable to make chiplets work. I imagine synchronization is an issue, but it seems that should be solvable.

EDIT: Imagine if they could use 16-32CU CCDs and just chain them together to make up products. Clocks would also be up due to better heat distribution.

One day…
 

CastleBravo

Heartbreaker said:
Also, splitting the main compute die is MUCH easier for compute than it is for gaming. They have been doing multiple dies for GPU compute with much slower interfaces with no issues.

Gaming needs to work precisely like one die, or you are hosed, back to needing something like SLI/CrossFire software and split memory pools.

I really liked the RDNA 3 Mem Controller/Cache chiplets as a good way to avoid the split compute problem.

Of course, an even simpler way to avoid it is to just stay smaller and avoid MCM problems completely.

With Zen 6 X3D solving the 3D-stacking thermal issue by putting the CCD on top of the cache, I bet they could do a 3D-stacked GPU with CUs on top of cache/IO die(s).
 

gaav87

eek2121 said:
Chasing clocks would be the sensible thing to do IMO. They also really need to get the chiplet thing figured out.

Unsure why they were unable to make chiplets work. I imagine synchronization is an issue, but it seems that should be solvable.

EDIT: Imagine if they could use 16-32CU CCDs and just chain them together to make up products. Clocks would also be up due to better heat distribution.

One day…

Clocks are fine. This is a Gigabyte 9070 XT Gaming, so a mid-tier card:
3250 MHz at 329 W

 