Question Speculation: RDNA3 + CDNA2 Architectures Thread


uzzi38

Platinum Member
Oct 16, 2019
2,702
6,405
146

Joe NYC

Platinum Member
Jun 26, 2021
2,324
2,929
106
That makes sense, but it's all predicated on whether or not AMD have solved the issue of getting multiple dies to behave as one, especially for gaming workloads. I won't believe it until I see it either.

The issue is connecting the 2 dies with (using round numbers) infinite bandwidth and zero latency and zero power overhead, which can only be achieved by stacking and hybrid bond connection.

AMD already showed how to go about it with CDNA 3: stack 2 dies on top of a common base.



We will see if this type of design will make it to RDNA 4.

In the picture it is the simplest CDNA 3 APU. For discrete GPU, it could have 2 GPU chiplets.

In the MLID video, he showed how more of these can be connected (probably with an interposer).
 

HurleyBird

Platinum Member
Apr 22, 2003
2,725
1,342
136
We saw it in Apple, that's exactly what the M1 Ultra already is, it's not new, AMD is behind.

Between RDNA 3, CDNA 3, and Zen-X at AMD, Sapphire Rapids, Meteor Lake, and Ponte Vecchio at Intel, and M1 Ultra/Max at Apple, all three companies are doing impressive work with advanced packaging. There are similarities and differences between them. I wouldn't say that AMD is behind Apple though, considering how revolutionary CDNA 3 looks to be.

The company that, at least on the surface level, looks to be far behind (but who knows all of what's going on in their R&D labs) is Nvidia.
 

Kaluan

Senior member
Jan 4, 2022
503
1,074
106

Some people are 'PTSDing' it up, claiming that the rumored 8x PCIE5.0 on Navi33 will be a massive problem on platforms like Gen10 or 5000G or older stuff. They are wrong, IMHO.

It's the 4GB framebuffer that did 6500XT dirty, not the 4x lanes (this only made it worse). Check RX 5500 XT with 4GB AND 8GB tests on 4x PCIE3.0 and 4.0, the 8GB version sees almost no perf degradation from 4x 3.0.

Same story with high end cards and 8x: 10GB 3080s work just fine, even over 8x PCIE3.0. Performance is not optimal in 5 out of 10 titles (a few % lower perf), but it's nowhere near as bad as 4GB 6500XTs over 3.0.

Hopefully when Sapphire's 8GB 6500XT gets reviewed (on both 3.0 and 4.0), people will finally start to understand this.
(and to think some Negative Nancys were criticizing 8GB as being useless on a 6500XT as well lol... after most complained the other way around when it launched with just 4GB)

8x PCIE 3.0 on a RX 7600 XT will be fine, calm down people. You'll likely only see measurable degradation in VRAM-heavy titles & at native 4K, which shouldn't really be the target audience anyway. Just as we've seen 3070s fall over at 4K in games like DOOM Eternal + RT (and be bested by 12GB 3060s lmao).
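For anyone who wants to put rough numbers on the lane counts being argued about, here is a minimal sketch. It assumes the standard per-lane signalling rates (8, 16 and 32 GT/s for PCIe 3.0, 4.0 and 5.0) with 128b/130b encoding, and it ignores packet/protocol overhead, so real-world throughput will be somewhat lower. The function name `pcie_gb_per_s` is just an illustrative helper, not from any library.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation.
GT_PER_S = {"3.0": 8.0, "4.0": 16.0, "5.0": 32.0}

# PCIe 3.0 and later use 128b/130b line encoding.
ENCODING = 128 / 130


def pcie_gb_per_s(gen: str, lanes: int) -> float:
    """Approximate one-direction link bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


if __name__ == "__main__":
    for gen, lanes in [("3.0", 4), ("4.0", 4), ("3.0", 8), ("4.0", 8), ("4.0", 16)]:
        print(f"PCIe {gen} x{lanes}: {pcie_gb_per_s(gen, lanes):.2f} GB/s")
```

The point it illustrates: 8x on 3.0 gives the same link bandwidth as 4x on 4.0 (a bit under 8 GB/s), which is twice what a 4-lane card gets on a 3.0 board.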
 
Reactions: Frenetic Pony

DeathReborn

Platinum Member
Oct 11, 2005
2,755
751
136
This is why I'm excited for chiplet GPUs. The design costs are lower, and so are the production risks. If you, say, split the highest end GPU in half, then you can just allocate chiplets based on sales. If the big, headline-worthy product doesn't sell a lot, you just use the chiplets for the much more popular, half-sized product.

It's why we should be seeing a big, ultra high performance product from AMD. Not only can they sell that for the biggest profit margin in the lineup, it's also relatively low risk compared to previous generations.



AMD's CDNA cards seem popular for HPC stuff. And the HPC market has gotten huge all of a sudden, AMD seems to be profiting a lot there. Even if CUDA is easier to work with it's not some magic for supercomputers, where AMD is getting a lot of its money from. Their "bigger numbers than the competition" products might be relatively specialized, but they seem to be selling, and CDNA3 seems set to continue the trend.

I do notice that whenever people compare die sizes they never seem to account for the huge piece of silicon under the GCD/MCDs. Yes, it's cheap, but it is quite large. In my opinion it should be counted, as without it the GCD & MCDs can't communicate.

CDNA is priced to beat Nvidia, not just compete: when you are buying by the thousand, the cost of Nvidia means the same money gets you 800 Nvidia cards for every 1000 AMD cards.
 

Zepp

Member
May 18, 2019
170
163
116
It's the 4GB framebuffer that did 6500XT dirty, not the 4x lanes (this only made it worse). Check RX 5500 XT with 4GB AND 8GB tests on 4x PCIE3.0 and 4.0, the 8GB version sees almost no perf degradation from 4x 3.0.
The reviews I saw indicated it was the 64-bit bus width that was the hindrance. There was a noticeable improvement going from PCIe 3.0 to 4.0.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
We saw it in Apple, that's exactly what the M1 Ultra already is, it's not new, AMD is behind.

The M1 (in all its variations) does not use a chiplet design. Apple simply adds or removes cores from the GPU, which is exactly what AMD and Nvidia have been doing for ages.

What AMD is attempting to do with chiplets is to use multiple discrete GPUs as one singular device, which is novel, and nobody else has done it.

EDIT: Quick correction, the Ultra is two Max chips, not just an increase in GPU cores like M1->Pro->Max is.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
The M1 Ultra is basically two M1 Max dies fused together.

Sorry, I was thinking of the Max.

However, while the Ultra is basically two Max chips, its performance in gaming is far from that of two Max chips. In Rise of the Tomb Raider, which is a macOS native game (no emulation), the Max gets 123fps and the Ultra gets 143, which is a worse improvement than what older SLI/CF systems would see.
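To make the scaling claim concrete, here is a quick sketch using only the two frame rates quoted above (123fps and 143fps). The `scaling` helper and the idea of dividing speedup by the number of dies to get an "efficiency" figure are my own framing for illustration, not something from the benchmark itself.

```python
def scaling(single_fps: float, multi_fps: float, n_dies: int = 2) -> tuple[float, float]:
    """Return (speedup, efficiency) for a multi-die part versus a single die.

    speedup    = multi_fps / single_fps
    efficiency = speedup / n_dies  (1.0 would be perfect linear scaling)
    """
    speedup = multi_fps / single_fps
    return speedup, speedup / n_dies


if __name__ == "__main__":
    # Figures quoted in the post: M1 Max at 123fps, M1 Ultra at 143fps.
    speedup, efficiency = scaling(123, 143)
    print(f"speedup: {speedup:.2f}x, scaling efficiency: {efficiency:.0%}")
```

With those numbers the second die adds roughly 16% in this one title, i.e. a bit under 60% of what perfect doubling would give.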

The doubled GPU is a huge help in compute scenarios, where multiple GPUs work fine, and that is really where the system is intended to be used.

But I don't think Apple solved anything in terms of multiple GPU in gaming.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
I do notice that whenever people compare die sizes they never seem to account for the huge piece of silicon under the GCD/MCDs. Yes, it's cheap, but it is quite large. In my opinion it should be counted, as without it the GCD & MCDs can't communicate.

CDNA is priced to beat Nvidia, not just compete: when you are buying by the thousand, the cost of Nvidia means the same money gets you 800 Nvidia cards for every 1000 AMD cards.
There isn't any big piece of silicon underneath.
 

Kaluan

Senior member
Jan 4, 2022
503
1,074
106
The reviews I saw indicated it was the 64-bit bus width that was the hindrance. There was a noticeable improvement going from PCIe 3.0 to 4.0.
I think that's a tertiary issue at best. 144GB/s + 16MB of Infinity Cache does the job well enough for its performance class in the RDNA2 stack.
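For reference, the 144GB/s figure falls straight out of the 64-bit bus. A minimal sketch of the arithmetic, assuming 18 Gbps GDDR6 (that per-pin data rate is my assumption here, it is not stated in this thread) and using a made-up helper name:

```python
def mem_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: (bus width in bytes) x (per-pin data rate in Gbps)."""
    return bus_width_bits / 8 * data_rate_gbps


if __name__ == "__main__":
    # Assumed configuration: 64-bit bus, 18 Gbps GDDR6.
    print(mem_bandwidth_gb_s(64, 18.0))   # 64-bit bus
    print(mem_bandwidth_gb_s(128, 18.0))  # a hypothetical 128-bit bus at the same speed
```

So doubling the bus width at the same memory speed would double the raw bandwidth, which is the part the cache is there to paper over.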

I think N24's issues are ranked like this ->
1. VRAM is too small
2. Just 4 PCIE lanes
3. Bandwidth

It's pretty clear from this RX 5500 XT 4GB v 8GB on 4x 3.0 v 4x/8x 4.0 test done by TechSpot/HU that VRAM size is its (the 6500 XT's) Achilles heel. Everything falls apart because of it when on 3.0 (vs 8x 4.0 BTW):


Massive performance degradation on 4GB version

Tiny performance degradation on 8GB version


But like I said, we'll know for sure when (if?) Sapphire's custom 8GB RX 6500 XT gets reviewed by a competent tech outlet.

There are also tests of high end RX 6000 and RTX 3000 GPUs on 8x 3.0/8x 4.0/16x 4.0, and the perf degradation is tiny. In no way, shape or form is it the 30-60% the 4GB 6500 XT is seeing on 4x 3.0.

So I think even if N33 is 8x PCIE (still a rumor for now), it will be fine.
 

Aapje

Golden Member
Mar 21, 2022
1,467
2,031
106
Very interesting. My guess is that RDNA 4 will have multiple compute chiplets on the high end and that MI250 doubles as a testbed for RDNA 4. Latency is less of an issue for CDNA, so sub-optimal latency is acceptable. Then they can improve on this for RDNA 4.

Other than that I see a lot of overlap between the RDNA 3 rumors and these slides.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
Very interesting. My guess is that RDNA 4 will have multiple compute chiplets on the high end and that MI250 doubles as a testbed for RDNA 4. Latency is less of an issue for CDNA, so sub-optimal latency is acceptable. Then they can improve on this for RDNA 4.

Other than that I see a lot of overlap between the RDNA 3 rumors and these slides.

Explains a lot of the rumors and confusion around big Navi having multiple chiplets with shaders. Those parts existed, but were actually CDNA.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
Explains a lot of the rumors and confusion around big Navi having multiple chiplets with shaders. Those parts existed, but were actually CDNA.

We've known CDNA2 was like this for a long, long time. Just because the last "big leak" was on a WordPress blog doesn't mean it's actually true; those sites take like $20 and an hour to set up, for reference. The number of "no wait it's actually" "revisions" the leaks have gone through should tell anyone that exactly 0 GPU leaks should be counted as reliable, no matter how many fancy words they use.

Hells I suspect that stupid thing outed itself right in the post. Out of order operations on a GPU? AFAIK RDNA2 has no such thing and that hasn't gotten past the theoretical stage for any GPU, but here it is claiming it was hidden there all along despite no searches turning up any such thing.
 
Reactions: GodisanAtheist

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
Yep why try to explain the leaks with "oh there were multiple competing products and blah blah blah" when "people were just making **** up for clicks and money" works just as well and better fits the scatter gun leak approach we've gotten.

Because it's the same disingenuous type of claim without evidence that one would be making against leakers.

If someone were to accuse me of making up stuff for attention and money I'd want them to have some kind of evidence behind that accusation.

I think it would be far more fair to say that many leakers go too far in speculating about what the small pieces of information they get actually mean, or that they'll gladly pass along anything that might be a leak without trying to verify it, but that's what's going to separate a leaker from a journalist.
 
Reactions: Tlh97 and maddie

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
Because it's the same disingenuous type of claim without evidence that one would be making against leakers.

If someone were to accuse me of making up stuff for attention and money I'd want them to have some kind of evidence behind that accusation.

I think it would be far more fair to say that many leakers go too far in speculating about what the small pieces of information they get actually mean, or that they'll gladly pass along anything that might be a leak without trying to verify it, but that's what's going to separate a leaker from a journalist.
Well we know for sure, that many "journalists" outright lie or are propagandists. I guess I'll take the "leakers" first.

As an aside, being right or wrong is not 50-50, as you often hear from simple minds. There is one way to be right and infinite ways to be wrong.
 

xpea

Senior member
Feb 14, 2014
447
142
116
AMD's CDNA cards seem popular for HPC stuff. And the HPC market has gotten huge all of a sudden, AMD seems to be profiting a lot there. Even if CUDA is easier to work with it's not some magic for supercomputers, where AMD is getting a lot of its money from. Their "bigger numbers than the competition" products might be relatively specialized, but they seem to be selling, and CDNA3 seems set to continue the trend.
Going by AMD's own financial numbers, their CDNA sales are insignificant compared to A100 (not even 10%). So no, except for the exascale government deals and one EU supercomputer, AMD GPUs have zero presence in HPC (yet). CDNA3 may change that, though.
 

xpea

Senior member
Feb 14, 2014
447
142
116
The company that, at least on the surface level, looks to be far behind (but who knows all of what's going on in their R&D labs) is Nvidia.
The M1 Ultra is basically two M1 Max dies fused together.
This is what the Grace Superchip does using NVLink-C2C. Hopper also had this feature, but it was canceled/postponed (the fused-die solution was called GH202).
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,149
136
yep exactly like Grace Superchip. And Hopper had also this feature but it was canceled/postponed (the fused dies solution was called GH202)
I wouldn't quite characterize Grace and/or Hopper "superchips" (not a fan of that term) at the same level as Apple's M1 Ultra in terms of die-to-die interconnect. Doesn't M1 Ultra use a silicon bridge? Grace Hopper is like Infinity Fabric in that the interconnect is through substrate, no?
 

desrever

Member
Nov 6, 2021
122
302
106
Going by AMD's own financial numbers, their CDNA sales are insignificant compared to A100 (not even 10%). So no, except for the exascale government deals and one EU supercomputer, AMD GPUs have zero presence in HPC (yet). CDNA3 may change that, though.
There are multiple EU and US supercomputers with MI250X, more than a handful on the TOP500 list. It's a good start.
 