Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,666
6,125
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Since RDNA2, though, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

soresu

Platinum Member
Dec 19, 2014
2,912
2,125
136
Yeah, since it's not a chiplet design the BOM should be a good bit less than the 7900xt.
Chiplets can also be multi-process, as with Ryzen 7xxx, and even more so with 7xxxX3D.

Some of those chiplets are fabbed on an older, less advanced process, which makes them cheaper to design, to produce masks for, and to buy capacity for.

Consequently a chiplet package can be cheaper than an equivalent monolithic package depending on its size, even with the extra costs of MCM packaging.

The larger the overall design and the more it relies on a chunky cache, the more it will benefit from chiplets.
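
As a rough illustration of the size dependence described above, here is a minimal Python sketch using a simple Poisson yield model. Everything numeric in it (wafer prices, defect density, the flat packaging adder, the die sizes) is a made-up assumption for illustration, not real TSMC or AMD data, and the function names are hypothetical.

```python
import math

WAFER_DIAMETER_MM = 300.0  # standard wafer size

def die_yield(area_mm2, defects_per_mm2):
    """Poisson yield model: probability that a die has zero defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

def cost_per_good_die(area_mm2, wafer_cost, defects_per_mm2):
    """Wafer cost spread over the good dies (ignores edge loss and scribe lines)."""
    wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2
    gross_dies = wafer_area / area_mm2
    return wafer_cost / (gross_dies * die_yield(area_mm2, defects_per_mm2))

def package_cost(dies, defects_per_mm2, packaging_cost=0.0):
    """dies: list of (area_mm2, wafer_cost) pairs; packaging_cost is a flat adder."""
    silicon = sum(cost_per_good_die(a, w, defects_per_mm2) for a, w in dies)
    return silicon + packaging_cost

# All numbers below are ASSUMED for illustration only.
NEW_NODE, OLD_NODE = 17000.0, 10000.0  # hypothetical wafer prices
D0 = 0.001                              # hypothetical defects per mm^2 (same on both nodes)
MCM_ADDER = 20.0                        # hypothetical extra packaging cost

big_mono = package_cost([(600, NEW_NODE)], D0)
big_chiplet = package_cost([(300, NEW_NODE)] + [(50, OLD_NODE)] * 6, D0, MCM_ADDER)
small_mono = package_cost([(100, NEW_NODE)], D0)
small_chiplet = package_cost([(50, NEW_NODE), (50, OLD_NODE)], D0, MCM_ADDER)

print(f"large: monolithic {big_mono:.0f} vs chiplet {big_chiplet:.0f}")
print(f"small: monolithic {small_mono:.0f} vs chiplet {small_chiplet:.0f}")
```

With these assumed inputs the chiplet package comes out cheaper at the large size and more expensive at the small size, because the yield savings only outweigh the packaging adder once the dies get big. Different assumptions will move the crossover point.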
 

blackangus

Member
Aug 5, 2022
115
161
76
Chiplets can also be multi-process, as with Ryzen 7xxx, and even more so with 7xxxX3D.

Some of those chiplets are fabbed on an older, less advanced process, which makes them cheaper to design, to produce masks for, and to buy capacity for.

Consequently a chiplet package can be cheaper than an equivalent monolithic package depending on its size, even with the extra costs of MCM packaging.

The larger the overall design and the more it relies on a chunky cache, the more it will benefit from chiplets.
The genius of this is not lost on me, but once you open this door there is a much larger conversation about complicated trade-offs. So I was just making a general statement that I believe is applicable when comparing N48 vs N31.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,417
1,742
136
That's interesting, why is that?
The memory is roughly
The node is slightly more expensive?
Both are monolithic.

What's the story?

N32 is not monolithic. (N31 and N32 use the same MCD, different GCD). N33 is monolithic. I think it just comes down to total die size: N32 has 4x37.5mm² of N6 and 196mm² of N5. I don't think N4P is priced much above N5 anymore, so a small enough monolithic solution shouldn't have a hard time beating N32 on cost.
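
To put numbers on the total die size point, a quick Python sketch using only the N32 areas quoted above (the variable names are just for illustration):

```python
# N32 die areas as quoted above, in mm^2
mcd_count, mcd_area = 4, 37.5   # MCDs on N6
gcd_area = 196.0                # graphics die on N5

n6_area = mcd_count * mcd_area      # 150.0
total_area = n6_area + gcd_area     # 346.0

print(f"N6: {n6_area} mm^2, N5: {gcd_area} mm^2, total: {total_area} mm^2")
print(f"share of silicon on the older node: {n6_area / total_area:.0%}")
```

So a monolithic competitor has about 346 mm² of total N32 silicon to undercut, before counting the cost of packaging five dies together.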
 

blackangus

Member
Aug 5, 2022
115
161
76
N32 is not monolithic. (N31 and N32 use the same MCD, different GCD). N33 is monolithic. I think it just comes down to total die size: N32 has 4x37.5mm² of N6 and 196mm² of N5. I don't think N4P is priced much above N5 anymore, so a small enough monolithic solution shouldn't have a hard time beating N32 on cost.
Ahh wait. So N32 was the 7900xt and N33 was the 7800xt?
If so, I was confusing them. I thought Kepler was saying the BOM was going to be less than the 7800xt.
My bad!
(Now you're going to tell me the 7800xt was also not monolithic! I will admit I wasn't in the market and didn't pay too close attention.)
Just had surgery today... so I'm blaming this on the meds!
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,417
1,742
136

beginner99

Diamond Member
Jun 2, 2009
5,221
1,594
136
Why? It's lower BOM than N32.

Because it will still sell well enough at $600, as it would offer better value at $600 than any other offering (again, this assumes the performance predictions are true, which I doubt). Of course this assumes a release in Q3 or maybe Q4, before Blackwell. If it is indeed 2025 and Blackwell is on the market, then yeah, $500 might be a better price, but given Nvidia, there is a fair chance that even after the Blackwell release a $600 N48 would be the best value.

It's not about BOM but what people are willing to pay.
 

Vattila

Senior member
Oct 22, 2004
805
1,394
136

Looks impressive and promising. I note that the authors are AMD superstars and Corporate Fellows Sam Naffziger and Michael Mantor, with colleagues Mark Fowler and Mark Leather. It seems they have cracked the scaling problem beyond reticle size.

"A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology."
 

Vattila

Senior member
Oct 22, 2004
805
1,394
136
So the RX 9990 XTX is on, then?

It will be interesting to see what they can bring to market, and whether the design allows them to scale up beyond what Nvidia can do with monolithic designs (assuming they are not on similar chiplet designs already).

Notably, as in MI300, hybrid bonding of GPU chiplets on top of front-end (FE) chiplets seems the obvious embodiment. And hybrid bonding is one important area where AMD is leading the industry.

For interconnecting the FE chiplets, the patent describes "bridges", but mentions both active and passive silicon as possible embodiments. MI300 uses a large silicon interposer to interconnect base dies and HBM, doesn't it? It seems the patent covers this embodiment, as well as elevated fanout bridges (EFB), like in MI200 (and Radeon 7000?). The latter may be more cost-effective for consumer products, perhaps.
 

soresu

Platinum Member
Dec 19, 2014
2,912
2,125
136
It seems they have cracked the scaling problem beyond reticle size
This is just a patent, and implementation at the silicon/metal level + overheads is another thing entirely.

It's not cracked until it's etched in silicon and performing to expectations.

Remember that Bulldozer's CMT architecture sounded plenty good on paper.
 

Mopetar

Diamond Member
Jan 31, 2011
7,988
6,395
136
Bulldozer's CMT wasn't that bad, but AMD trying to treat it like an 8-core CPU played hell with the Windows scheduler. The other issue is that it was a much better design for server workloads, but AMD was at such a node disadvantage on top of other issues that no enterprise customers were interested. They might have actually fared better there if they had treated/sold each module as a single core, since anyone paying software costs per core wasn't going to want to use Bulldozer cores.
 
Reactions: Tlh97

soresu

Platinum Member
Dec 19, 2014
2,912
2,125
136
Bulldozer's CMT wasn't that bad, but AMD trying to treat it like an 8-core CPU played hell with the Windows scheduler. The other issue is that it was a much better design for server workloads, but AMD was at such a node disadvantage on top of other issues that no enterprise customers were interested. They might have actually fared better there if they had treated/sold each module as a single core, since anyone paying software costs per core wasn't going to want to use Bulldozer cores.
Whatever its advantages and disadvantages, it effectively played out like Itanium vs AMD64.

There's little point defending it, as apparently even a new stab at the idea for Cortex-A510 doesn't seem to be particularly impressive.

Don't get me wrong - I'd love to defend AMD, but they clearly bet on the wrong horse one way or another.

If I weren't so rigidly against going back to Intel I probably would have ditched my Piledriver setup for whatever the Core µArch of the time was long before Zen1 hit the ground.
 
Mar 11, 2004
23,148
5,615
146
Whatever its advantages and disadvantages, it effectively played out like Itanium vs AMD64.

There's little point defending it, as apparently even a new stab at the idea for Cortex-A510 doesn't seem to be particularly impressive.

Don't get me wrong - I'd love to defend AMD, but they clearly bet on the wrong horse one way or another.

If I weren't so rigidly against going back to Intel I probably would have ditched my Piledriver setup for whatever the Core µArch of the time was long before Zen1 hit the ground.

What's missed is that Bulldozer was meant to be the first step towards the entire reason they bought ATI: they were looking to integrate CPU and GPU into each other, leveraging the strengths of each to build their heterogeneous processing utopia. They never got close, because the mix of the ATI purchase, Intel's tactics, and other poor management crippled them and prevented AMD from even really attempting the idea (it would be interesting to know if they even got to the design phase). Someone there should have realized they didn't have the resources to get there, and that there was no way they'd have enough clout to get Microsoft to adopt it. Of course it's fun to imagine what if AMD had pulled it off, and ended up basically combining both Intel's and Nvidia's ideas about GPUs (Nvidia making them highly programmable like CPUs, and Intel doing that via x86 cores in Larrabee; arguably AMD had the right idea, with embedded x86 cores easing the development path as Intel was aiming for, but with the strengths of the GPU making it better suited than Intel's design).

But yes, this isn't the thread to dredge up that massive failure by AMD. And yes, patents are meaningless if they don't actually lead to something worthwhile. Heck, didn't AMD have patents pertaining to this exact issue before, which is why people were hyped about RDNA 3? Believe it only when you see it with AMD GPUs.
 
Reactions: Tlh97 and marees

marees

Member
Apr 28, 2024
98
64
46
there was no way they'd have enough clout to get Microsoft to adopt it.
This is a major issue for AMD in becoming a leader in the PC space: getting Microsoft to tailor their software for AMD when Intel & Nvidia hold the major share of the hardware market.

AMD can't be a first mover if that depends on Microsoft implementing changes on their side to support it.
 
Reactions: Tlh97 and inquiss

marees

Member
Apr 28, 2024
98
64
46
Believe it only when you see it with AMD GPUs.
Power consumption of RDNA 3 in gaming scenarios was completely unexpected & not in alignment with the leaks.

The GPU can go close to 4 GHz, but not in real-life gaming scenarios.

Considering that Navi 36, 41, 42, & 43 were scrapped with no replacement in sight, and not even a refresh of Navi 31, my guess is that whatever issues AMD has right now are unfixable for another year or more.
 
Reactions: Mopetar and Tlh97

Rekluse

Member
Sep 16, 2022
33
42
51
Power consumption of RDNA 3 in gaming scenarios was completely unexpected & not in alignment with the leaks.

The GPU can go close to 4 GHz, but not in real-life gaming scenarios.

Considering that Navi 36, 41, 42, & 43 were scrapped with no replacement in sight, and not even a refresh of Navi 31, my guess is that whatever issues AMD has right now are unfixable for another year or more.
My understanding is that the leapfrogging design teams mean the RDNA 3 speed-power issues haven't touched RDNA 4, and that it was rather the multi-chip tiling coherency being unworkable that prevented the top RDNA 4 chips from being taped out.
 

soresu

Platinum Member
Dec 19, 2014
2,912
2,125
136
Considering that Navi 36, 41, 42, & 43 were scrapped with no replacement in sight, and not even a refresh of Navi 31, my guess is that whatever issues AMD has right now are unfixable for another year or more.
More likely just concentrating all efforts on RDNA5 for a smooth rollout with a very complete stack of SKUs.
 

soresu

Platinum Member
Dec 19, 2014
2,912
2,125
136
My understanding is that the leapfrogging design teams mean the RDNA 3 speed-power issues haven't touched RDNA 4, and that it was rather the multi-chip tiling coherency being unworkable that prevented the top RDNA 4 chips from being taped out.
The teams may be leapfrogging, but that doesn't mean there is a Chinese wall separating the design process between them.
 

eek2121

Diamond Member
Aug 2, 2005
3,038
4,237
136
Power consumption of RDNA 3 in gaming scenarios was completely unexpected & not in alignment with the leaks.

The GPU can go close to 4 GHz, but not in real-life gaming scenarios.

Considering that Navi 36, 41, 42, & 43 were scrapped with no replacement in sight, and not even a refresh of Navi 31, my guess is that whatever issues AMD has right now are unfixable for another year or more.
Nah, the clock thing is fixed. If RDNA4 launches, it will be clocked higher than RDNA3.
 