Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,682
6,197
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around 3 quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (maybe to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5 would be available in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

adroc_thurston

Diamond Member
Jul 2, 2023
3,298
4,721
96
chip design is freaking expensive.
Not really, no.
Taping out GPU derivatives is trivial since it's all automated SIMD machine layouts.
So, assuming this isn't just someone's hopeful fantasy, could it be that the ASIC known as "Navi4C" (or whatever it's called now) was originally planned for the RDNA5 launch schedule?
no and he doesn't know anything
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,414
2,906
136
If I get it right, the rumoured config doesn't imply MCD being a separate chiplet anymore, instead it's logic and SRAM incorporated into interposer die to maintain coherence.
What I meant by that was that it maybe doesn't matter how many SEDs are actually used, as long as they are placed in the correct positions on the package. In the same way, it doesn't matter for functionality if you use fewer MCDs with RDNA3, because they sit in the same places and just one or two are missing.

Of course I could be very wrong about this, considering that under those SEDs are 3 active interposers, in other words 3 SEDs per active interposer.
And this is what I don't know: whether it matters if you put 1, 2 or 3 on top of that active interposer.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,414
2,906
136
I presume you need coherency of data, so the layout has to be symmetric.

Potentially this may be a requirement for scalable geometry, if you think about it.
If it needs to be symmetric, then as I said, you are limited to only 3 possible configs:
3SEDs or 6SEDs or 9SEDs.
And in this case there is no reason to have 3SEDs per AID, when you can't have fewer than 3 SEDs per AID. A single bigger SED per AID would do the same thing.

edit:
The only reason for these small SEDs would be that there is another smaller interposer, where you can pack only 2SEDs per AID.
So you will end up with:
2SEDs + small AID
3SEDs + big AID
4SEDs + 2 small AIDs
6SEDs + either 2 big AIDs or 3 small AIDs
9SEDs + 3 big AIDs

From these 2 AIDs and 1 SED you can make a full lineup. The non-XT models use cut-down SEDs.
2SED -> 36(40)WGP RX 9500(XT), 64MB IC
3SED -> 54(60)WGP RX 9600(XT), 96MB IC
4SED -> 72(80)WGP RX 9700(XT), 128MB IC
6SED -> 108(120)WGP RX 9800(XT), 192MB IC
9SED -> 162(180)WGP RX 9900(XT), 288MB IC
And you wouldn't even need to design another SED.
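
The lineup above is plain multiplication, so it can be sketched in a few lines of Python. This is only an illustration of the post's arithmetic, not anything confirmed: the 18 and 20 WGPs per SED are the cut-down and full figures implied by the list, the 64MB and 96MB per-AID cache sizes are the guesses given further down in this post, and the names `AID_TYPES`, `LINEUP` and `describe` are made up for the example.

```python
# Hypothetical sketch of the speculated lineup; none of these figures are confirmed.
FULL_WGP_PER_SED = 20   # XT models (full SED)
CUT_WGP_PER_SED = 18    # non-XT models (cut-down SED)

# AID type -> (SEDs it carries, Infinity Cache in MB), as guessed in the post
AID_TYPES = {"small": (2, 64), "big": (3, 96)}


def describe(aids):
    """Return (SED count, cut-down WGPs, full WGPs, IC in MB) for a list of AID types."""
    seds = sum(AID_TYPES[a][0] for a in aids)
    ic_mb = sum(AID_TYPES[a][1] for a in aids)
    return seds, seds * CUT_WGP_PER_SED, seds * FULL_WGP_PER_SED, ic_mb


LINEUP = {
    "RX 9500(XT)": ["small"],
    "RX 9600(XT)": ["big"],
    "RX 9700(XT)": ["small", "small"],
    "RX 9800(XT)": ["big", "big"],        # three small AIDs give the same totals
    "RX 9900(XT)": ["big", "big", "big"],
}

for name, aids in LINEUP.items():
    seds, cut, full, ic = describe(aids)
    print(f"{seds}SED -> {cut}({full})WGP {name}, {ic}MB IC")
```

Under these assumptions the cache works out to 32MB per SED whichever AID size is used, which is why 2 big AIDs and 3 small AIDs land on the same 6SED totals.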

BTW, 20 WGPs per SED looks like overkill. The SED alone should be pretty small (<50mm2), but how do you feed 9 SEDs with a supposed 180 WGPs -> 360 CUs? That's 3.75x more than N31 if clocks stay the same.
Even with a 512-bit bus and 32Gbps GDDR7 you end up with only 2.13x the BW of N31.
You would need an 896-bit bus paired with 32Gbps GDDR7, or a large amount of Infinity Cache.
Maybe 64MB of IC per smaller AID, and 96MB of IC per bigger AID.
That gives the IC amounts listed in the lineup above.
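
The ratios in this paragraph can be checked with a short Python sketch. It assumes full N31 (RX 7900 XTX) as the baseline, meaning 96 CUs on a 384-bit bus with 20Gbps GDDR6, and uses the usual peak-bandwidth formula of bus width times per-pin data rate divided by 8. The helper name `bandwidth_gbs` is made up for the example, and the 512-bit and 896-bit GDDR7 configurations are the speculative ones from the post.

```python
def bandwidth_gbs(bus_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: bus width (bits) * per-pin rate (Gbps) / 8."""
    return bus_bits * gbps_per_pin / 8


# Baseline: full N31 (RX 7900 XTX), 96 CUs, 384-bit bus, 20Gbps GDDR6
N31_CUS = 96
N31_BW = bandwidth_gbs(384, 20)

# Speculated 9-SED part: 180 WGPs = 360 CUs
cu_ratio = 360 / N31_CUS

# Speculated memory configurations with 32Gbps GDDR7
bw_ratio_512 = bandwidth_gbs(512, 32) / N31_BW
bw_ratio_896 = bandwidth_gbs(896, 32) / N31_BW

print(f"CU ratio vs N31:            {cu_ratio:.2f}x")
print(f"512-bit @ 32Gbps BW vs N31: {bw_ratio_512:.2f}x")
print(f"896-bit @ 32Gbps BW vs N31: {bw_ratio_896:.2f}x")
```

Only the 896-bit configuration brings the bandwidth ratio close to the CU ratio, which is the point of the bus-width comparison. The sketch ignores clocks and cache hit rates, so treat it as a rough sanity check.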

edit2: I think 15WGPs per SED looks more realistic.
 

PJVol

Senior member
May 25, 2020
616
547
136
no and he doesn't know anything
Of course he doesn't, if he referred to his "sources", although neither do you, based on your "very informative" comment.

Anyway, regardless of how far off the mark all of these configs are, assuming AMD keeps the naming scheme for the RDNA5 SKUs, it would be nice to see history repeat the success of the 9700 Pro.
 

Glo.

Diamond Member
Apr 25, 2015
5,759
4,666
136
If It needs to be symmetric, then as I said, you are limited to only 3 possible configs.
3SEDs or 6SEDs or 9SEDs.
No, you are not limited to three configs.

1, 2, 3, 4, 6, 9 SEDs are possible configurations. Each die schedules its own geometry, but it has to be processed in a symmetric way.

So if we have a display, then each SED has to process part of the display, in a similar fashion to Split Frame Rendering.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,414
2,906
136
No, you are not limited to three configs.

1, 2, 3, 4, 6, 9 SEDs are possible configurations. Each die schedules its own geometry, but it has to be processed in a symmetric way.

So if we have a display, then each SED has to process part of the display, in a similar fashion to Split Frame Rendering.
You can't split the frame evenly between 3 or 9 SEDs, but you can split it evenly between 5 SEDs, which you didn't even list as a possible configuration.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,854
3,404
136
Why even bother making gpu for the DIY market then, if you can't make it cheap enough to compete?
Just drop all this RTG nonsense and voila.
But isn't that the point? You want AMD to compete so that your NV GPU gets cheaper. If AMD builds a GPU that is way bigger and way faster, they can charge way more. Then they are no longer being used to make NV GPUs cheaper, and they make NV-style margins instead.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,298
4,721
96
Why even bother making gpu for the DIY market then, if you can't make it cheap enough to compete?
Just drop all this RTG nonsense and voila.
Because you build The Stick which wins it all then sell cheaper parts down the stack leveraging the halo effect.
A $3k halo isn't for you, but a $999 high-end part might be.
 

PJVol

Senior member
May 25, 2020
616
547
136
Because you build The Stick which wins it all then sell cheaper parts down the stack leveraging the halo effect.
A $3k halo isn't for you, but a $999 high-end part might be.
You mean Sony and MS are likely to drop the custom design approach and go MCM for their next gens? )
As for the "halo", it's hard to believe that AMD is going to release a $3K SKU in the Radeon lineup in the near future.
 

moinmoin

Diamond Member
Jun 1, 2017
4,993
7,763
136
I'd like to remind that RDNA3 was supposed to be that Chungus that will give Nvidia run for its money.
Well, every gen's goal is to create that Chungus. The question is which gen finally succeeds at that. Guess we won't get that info anytime soon, likely because, unlike with MI300 for DC, there simply isn't one on the horizon yet.
 