Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,754
6,631
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to land support in LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,501
96
I didn't want to get too technical with ray-triangle intersects and all
That's also not a real metric, you're rarely if ever limited by ray-tri hit throughput in video games.
RTRT is basically an exercise in building the lowest latency tree-walking machine.
I.e. we should retvrn to Larrabee.
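To make the tree-walking point concrete, here is a minimal Python sketch of a stack-based BVH traversal. The `Node` layout, `hits_box` and `traverse` are made up for illustration and are not how any actual GPU or driver does it. The point is only that the next node to test isn't known until the current one has been fetched and tested, so the loop is bound by latency rather than by raw intersection throughput.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                          # min corner of the bounding box
    hi: Vec3                          # max corner of the bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prim: Optional[int] = None        # hypothetical primitive id, set on leaves only


def hits_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Slab test: does the ray (t >= 0) overlap the axis-aligned box?"""
    tmin, tmax = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must already be inside it.
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        tmin, tmax = max(tmin, t0), min(tmax, t1)
        if tmin > tmax:
            return False
    return True


def traverse(root: Node, origin: Vec3, direction: Vec3) -> Tuple[List[int], int]:
    """Return (candidate primitive ids, number of nodes visited)."""
    candidates: List[int] = []
    visited = 0
    stack = [root]
    while stack:
        node = stack.pop()            # dependent load: the "pointer chase"
        visited += 1
        if not hits_box(origin, direction, node.lo, node.hi):
            continue
        if node.prim is not None:
            candidates.append(node.prim)   # ray-tri test would happen here
        else:
            stack.append(node.right)
            stack.append(node.left)
    return candidates, visited


leaf_a = Node((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), prim=0)
leaf_b = Node((5.0, 0.0, 0.0), (6.0, 1.0, 1.0), prim=1)
root = Node((0.0, 0.0, 0.0), (6.0, 1.0, 1.0), left=leaf_a, right=leaf_b)
result = traverse(root, (0.5, 0.5, -1.0), (0.0, 0.0, 1.0))
```

In this toy scene the ray visits three nodes but only one leaf survives to the triangle test, which is the usual shape of the workload: many box tests and dependent fetches, few actual ray-tri hits.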
 

Keller_TT

Member
Jun 2, 2024
113
112
76
That's also not a real metric, you're rarely if ever limited by ray-tri hit throughput in video games.
RTRT is basically an exercise in building the lowest latency tree-walking machine.
I.e. we should retvrn to Larrabee.
Larrabee... That's a name that reverbs. I think I still have Larrabee technical papers uploaded in my cloud. There was talk that AMD could go that route with their Fusion APUs and I really dreaded it as I saw the whole thing as an Intel lock-in. Those were the days when Intel was at its anti-competitive best, and particularly against the green CPU boys.
AMD bet on OpenCL and it's just sad that it fell so far back with too many cooks.
But I'm in touch with Alma Mater fellows at Uni Heidelberg. They started this thing called hipSYCL (renamed OpenSYCL) and a few other projects targeting AMD architectures to move beyond CUDA.
 

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,501
96
AMD bet on OpenCL and it's just sad that it fell so far back with too many cooks
They actually bet on HSA but that gained zero traction (partly because AMD had no DC h/w to ship). By the time they had, ROCm aka not-CUDA was the only option.
Overshadowing Nvidia's showcase by....not showing performance numbers, release dates, price, or anything concrete
NVidia's showcase will be extremely mediocre gains outside of GB202, so anything AMD does might be good nuff.
 

eek2121

Diamond Member
Aug 2, 2005
3,202
4,635
136
That's also not a real metric, you're rarely if ever limited by ray-tri hit throughput in video games.
RTRT is basically an exercise in building the lowest latency tree-walking machine.
I.e. we should retvrn to Larrabee.
Larrabee would have been amazing if they had made it work.

Even now I wonder what a modern day version would look like.

Oh if I were a billionaire…
 

eek2121

Diamond Member
Aug 2, 2005
3,202
4,635
136
Hell no, it's good at exactly one thing GPUs suck at (pointer chasing aka raytracing).
Come on, you gotta think bigger! If your GPU uses the same ISA as your CPU, why would you need a CPU? Imagine having 128-256 cores that you could dynamically allocate between graphics and non-graphics workloads. 🤣

I actually made a simple game engine that worked like this when I had my old 1950X system. It used all the cores and didn’t use a GPU at all. It could dynamically allocate as few or as many cores for logic as needed, the rest were for rendering. It was a prototype, of course, but the results were interesting. Forgot to back up the source code before FFR. 😭
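A rough sketch of that kind of per-frame core split, in Python. The original engine source is lost, so `split_cores`, the `logic_load` metric, and the proportional policy are all assumptions for illustration, not a reconstruction.

```python
from typing import Tuple


def split_cores(total_cores: int, logic_load: float, min_logic: int = 1) -> Tuple[int, int]:
    """Split a fixed core budget into (logic_cores, render_cores).

    logic_load is a hypothetical metric in [0, 1]: the share of the last
    frame's work that was game logic rather than rendering.
    """
    if total_cores < 2:
        raise ValueError("need at least one core each for logic and rendering")
    load = min(max(logic_load, 0.0), 1.0)              # clamp bad inputs
    logic = max(min_logic, round(load * total_cores))  # proportional share
    logic = min(logic, total_cores - 1)                # always keep one core rendering
    return logic, total_cores - logic


# 16 physical cores, as on a Threadripper 1950X
logic_cores, render_cores = split_cores(16, 0.25)
```

With a 25% logic load on 16 cores this gives 4 cores to logic and 12 to rendering. A real engine would re-evaluate the split every frame or every few frames and would need some hysteresis so workers don't bounce between pools.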
 

Josh128

Senior member
Oct 14, 2022
612
1,001
106
So it seems, according to HXL on Xitter via Chiphell, that the yet unannounced RDNA 4 reveal / embargo lift is delayed again, to an undisclosed date. They say it is to let "Huang go first" again. Why though, if not to try and maximize pricing at the last minute? They already have 5000 series pricing and specs.

Seems they are back to their old trolling ways, thinking back to camping outside an Nvidia presentation with the R9 290X, and "Jebaiting" the price of the RX 5700 XT, except even worse now, as they already have Nvidia's pricing and specs.
 

CastleBravo

Member
Dec 6, 2019
174
405
136
So it seems, according to HXL on Xitter via Chiphell, that the yet unannounced RDNA 4 reveal / embargo lift is delayed again, to an undisclosed date. They say it is to let "Huang go first" again. Why though, if not to try and maximize pricing at the last minute? They already have 5000 series pricing and specs.

Seems they are back to their old trolling ways, thinking back to camping outside an Nvidia presentation with the R9 290X, and "Jebaiting" the price of the RX 5700 XT, except even worse now, as they already have Nvidia's pricing and specs.

Bummer. Guess I'll try for a 5080 on launch day then.
 
Reactions: blckgrffn

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,108
136
So it seems, according to HXL on Xitter via Chiphell, that the yet unannounced RDNA 4 reveal / embargo lift is delayed again, to an undisclosed date. They say it is to let "Huang go first" again. Why though, if not to try and maximize pricing at the last minute? They already have 5000 series pricing and specs.

Seems they are back to their old trolling ways, thinking back to camping outside an Nvidia presentation with the R9 290X, and "Jebaiting" the price of the RX 5700 XT, except even worse now, as they already have Nvidia's pricing and specs.

IMO, this is nothing like that "Jebaited" nonsense. I don't see any trolling.

I think Frank Azor responded well in one of the last videos linked. They decided to wait and see what NVidia had, to better target their response.

That is just decent strategy when it's David vs Goliath. I don't blame them at all.

Their only mistake was not making that decision before CES. It's the last minute change that didn't look good.

Further, I could see them wanting a better idea of NVidia's actual performance (since NVidia has hidden it behind MFG 4X smoke and mirrors) before finalizing pricing.
 

Keller_TT

Member
Jun 2, 2024
113
112
76
Amidst the RDNA4 non-launch, and given that there's no top tier Navi4 board, I was wondering why AMD didn't do a refresh of the 7900 XTX by fixing the silicon glitches that plagued RDNA3. Release it as a 7950 XTX on N4P that runs cooler, draws say 10% less power and adds 15% more performance for $800? It can still remain relevant vs 5080 with that VRAM and bandwidth.
N4P is part of the same 5nm design stack and enhances efficiency notably. They do refreshes of refreshes for CPUs with misleading names just to sell cheap dies with good margins.
 

CastleBravo

Member
Dec 6, 2019
174
405
136
Amidst the RDNA4 non-launch, and given that there's no top tier Navi4 board, I was wondering why AMD didn't do a refresh of the 7900 XTX by fixing the silicon glitches that plagued RDNA3. Release it as a 7950 XTX on N4P that runs cooler, draws say 10% less power and adds 15% more performance for $800? It can still remain relevant vs 5080 with that VRAM and bandwidth.
N4P is part of the same 5nm design stack and enhances efficiency notably. They do refreshes of refreshes for CPUs with misleading names just to sell cheap dies with good margins.

Wouldn't that be roughly the same amount of work as scaling RDNA4 up to 5-600mm^2?
 

Keller_TT

Member
Jun 2, 2024
113
112
76
Wouldn't that be roughly the same amount of work as scaling RDNA4 up to 5-600mm^2?
Is it really?
They're not going to redesign and tape out. I would think it's more like a further stepping to prune the bugs after tape out. RDNA3 launch looked like they were caught out by some bugs at a late stage that they couldn't fix in time for launch, and there was internal dishonesty about it before Lisa went on stage with her slides. That's what came out from the insiders apparently, and that N32 had no problems. Well, that was even worse and deserves to be binned.
N4P can do a shrink for an existing 5nm design.
 

CastleBravo

Member
Dec 6, 2019
174
405
136
Is it really?
They're not going to redesign and tape out. I would think it's more like a further stepping to prune the bugs after tape out. RDNA3 launch looked like they were caught out by some bugs at a late stage that they couldn't fix in time for launch, and there was internal dishonesty about it before Lisa went on stage with her slides. That's what came out from the insiders apparently, and that N32 had no problems. Well, that was even worse and deserves to be binned.
N4P can do a shrink for an existing 5nm design.

They can fix a bug in the architecture and port it to a new process without needing to tape out again?
 

Josh128

Senior member
Oct 14, 2022
612
1,001
106
Is it really?
They're not going to redesign and tape out. I would think it's more like a further stepping to prune the bugs after tape out. RDNA3 launch looked like they were caught out by some bugs at a late stage that they couldn't fix in time for launch, and there was internal dishonesty about it before Lisa went on stage with her slides. That's what came out from the insiders apparently, and that N32 had no problems. Well, that was even worse and deserves to be binned.
N4P can do a shrink for an existing 5nm design.
100% much more complicated, expensive, and time-consuming than just making do with what was produced and focusing all resources on the iterative architecture, which is what they've done. No halo card was coming regardless. People just don't buy them, they buy Nvidia instead.
 

Keller_TT

Member
Jun 2, 2024
113
112
76
They can fix a bug in the architecture and port it to a new process without needing to tape out again?
They're just moving to an improved 5nm process. It's not a whole new node step of a different generation. Intel would've called it 5+++.
It should be substantially more straightforward than even Zen to Zen+ from GloFo 14nm to 12nm because TSMC's process readiness and support is the industry's best.
You're making it look way bigger than it is.
 
Reactions: MarutTheSlayer

techjunkie123

Member
May 1, 2024
123
250
96
I'm wishing I could return my 4070, but it's a little late for that.

I think AMD would have a very good shot at getting my money if I was buying this generation.
If the performance is indeed 4080 in raster and 4070 Ti Super in RT, and the price is $600 or less, I'm buying immediately. That's solid value IMO. That's plenty of card for 1440p w/o upscaling or 4K with upscaling.
 
Reactions: Tlh97 and Mopetar

Keller_TT

Member
Jun 2, 2024
113
112
76
100% much more complicated, expensive, and time-consuming than just making do with what was produced and focusing all resources on the iterative architecture, which is what they've done. No halo card was coming regardless. People just don't buy them, they buy Nvidia instead.
I was only talking of a further stepping if they knew what the flaw was. Yes, they would tape out after the fix but my point was it's not a major redesign and tape out process all over. It's the same process that they do before any launch version after the design is sent for first tape out and testing the silicon.

Wasn't Navi 31 A0 silicon anyways? SkyJuice had held back publishing pre-launch performance results as something was amiss.

I would think the only reason is we have the benefit of hindsight of Blackwell 5080 ballpark and the 5070 Ti, and AMD didn't. They hugely overestimated it.
 

reaperrr3

Member
May 31, 2024
55
188
66
I was only talking of a further stepping if they knew what the flaw was. Yes, they would tape out after the fix but my point was it's not a major redesign and tape out process all over. It's the same process that they do before any launch version after the design is sent for first tape out and testing the silicon.
Even if they applied not just some bugfixes, but also RDNA3.5 improvements and shrank to N4P, I agree that should've been far less design and validation work.

IIRC, there were rumors that an RDNA3 refresh with some improvements was under consideration until some point in 2022, but was dropped due to limited market prospects (splitting the market window between 2 RDNA3 gens would've resulted in poor ROI for both) and in favor of focusing on RDNA4 for time-to-market reasons (only to then cancel chiplet-RDNA4 as well...).

Having to sell off huge stockpiles of over-produced RDNA2 parts didn't help the RDNA3 refresh case either, at least for N32 and N33.

But yeah, I agree that an N31b on N4P with some RDNA3.5 improvements applied (or just finishing that 8 SE/128CU N36 they allegedly had in the works) might've been interesting.
Would've needed to come out a year ago, though; at this point the poor RT perf and lack of dedicated FSR4 acceleration would hurt it badly vs. both Nvidia and N48, even if it were relatively competitive in raster.

I would think the only reason is we have the benefit of hindsight of Blackwell 5080 ballpark and the 5070 Ti, and AMD didn't. They hugely overestimated it.
To be fair, considering how earlier rumors were pointing to 60 SM for GB205 and 96 SM for GB203, I wouldn't be surprised if Nvidia prepared multiple designs with different SM counts and Jensen decided on those cheaper configs at the last second, when it became apparent that there wouldn't be any bigger RDNA4 chips.

So who knows whether the full GB203 would've really been that low-specced if AMD had pulled through with the chiplet RDNA4s.
 