Question Speculation: RDNA3 + CDNA2 Architectures Thread

Page 222

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146

Saylick

Diamond Member
Sep 10, 2012
3,385
7,151
136
IMO, AMD still doesn't have dedicated Machine Learning Tensor HW in RDNA3 cards. They are just using General FP Compute HW to brute force it, and RDNA 3 boosted FP compute a lot.

For AI, AMD is touting RDNA3 bfloat16 improvements over RDNA 2, but the gain is only proportional to RDNA3's overall floating-point improvement.

Here is the AMD AI improvement claim for RDNA3:


This is just the proportion of General FP compute performance, not some new dedicated HW.

IMO RDNA 4 will get the dedicated AI-ML HW, that Phoenix APU already appears to have.
An FPGA inside a GPU? Seems interesting considering that the FPGA could also be reconfigured to do other things, like encoding or signal processing.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
An FPGA inside a GPU? Seems interesting considering that the FPGA could also be reconfigured to do other things, like encoding or signal processing.
I don’t know where you saw FPGA, but it will be fixed function units. FPGA makes no sense on a consumer GPU. Even on a server grade GPU, it would only make sense as an on package 'chiplet'.
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,151
136
I don’t know where you saw FPGA, but it will be fixed function units. FPGA makes no sense on a consumer GPU. Even on a server grade GPU, it would only make sense as an on package 'chiplet'.
Mmm, you're right. It's a tiled approach with each tile having vector units. I had the impression that since it was Xilinx tech, it was an FPGA by default. I guess I was wrong.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
Mmm, you're right. It's a tiled approach with each tile having vector units. I had the impression that since it was Xilinx tech, it was an FPGA by default. I guess I was wrong.

Xilinx has a lot more than just FPGA. Along with the dedicated AI cores, they also have great media encoder cores, that AMD GPUs could really use.

I'm expecting RDNA 4 will at minimum have the AI cores, but hopefully the Xilinx Media encoder as well.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,322
4,790
96
If they can add them to an APU, they can add them to a GPU.
Again, those are very very different things built for different reasons.
ML cores are table stakes now. If AMD can't figure out how to add ML cores to a GPU, they should just give up.
Is there a single workload in client GPUs that uses matrix cores?
Like, phones have had piles of matrix math accelerators for 5 years and the workloads still don't exist.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
Again, those are very very different things built for different reasons.

Not really. AMD is obviously recognizing the need for ML cores. If they are incorporating them in their APUs, they will definitely be adding them to either their desktop CPUs or their desktop GPUs. I would bet on GPUs.

Is there a single workload in client GPUs that uses matrix cores?

ML cores are used for high quality temporal scaling (DLSS, XeSS), and there are more non gaming use cases that people want to use them for like image processing, or generative image creation with applications like Stable Diffusion.
 
Jul 27, 2020
17,967
11,710
116
If AMD can't figure out how to add ML cores to a GPU, they should just give up.

Stay tuned for future RDNA 3 WMMA support in rocWMMA. This library is portable with nvcuda::wmma and it supports MFMA and (soon) WMMA instructions, thus allowing your application to have hardware-accelerated ML in both RDNA 3 and CDNA 1/2 based systems.
Functionality is there. Software isn't ready yet.
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,151
136
You know, now that I've read a little bit more on Phoenix's AI Engine, I doubt it will be useful for RDNA4 when the GPU itself is already a wide vector engine. Nvidia specifically uses a tensor unit within each SM that can do low-precision matrix math much faster than executing it as multiple vector instructions. For RDNA4 to be comparable, it needs a method to fuse vector units into a single tensor unit that has much higher throughput. RDNA3 already has instructions that let it do matrix math via vector instructions, but it's not going to have the same throughput as a dedicated tensor path. There's a reason why CDNA has dedicated tensor units.
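
To make the "matrix math via vector instructions" point a bit more concrete, here is a rough Python sketch of a tiled multiply-accumulate (D = A*B + C) issued as a string of vector FMAs. It is a toy model only: `TILE`, `vector_fma`, `mma_tile`, `tiled_matmul` and `count_vector_fmas` are made-up names, the 16-wide tile is an assumption, and nothing here reflects real RDNA3 or Nvidia instruction encodings or throughput.

```python
# Toy model: a tile multiply-accumulate expressed as vector FMAs.
# All names are hypothetical; this is not a real ISA or driver API.

TILE = 16  # assumed tile edge, chosen for illustration


def vector_fma(acc, scalar, vec):
    """One 'vector instruction': acc[i] += scalar * vec[i] on every lane."""
    return [a + scalar * v for a, v in zip(acc, vec)]


def mma_tile(a, b, c):
    """D = A*B + C for one tile, issued as (rows x K) vector FMAs."""
    d = [row[:] for row in c]
    for i in range(len(a)):
        for k in range(len(b)):
            d[i] = vector_fma(d[i], a[i][k], b[k])
    return d


def tiled_matmul(a, b, tile=TILE):
    """Multiply square matrices whose size is a multiple of `tile`."""
    n = len(a)
    assert n % tile == 0
    out = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, tile):
        for bj in range(0, n, tile):
            acc = [[0.0] * tile for _ in range(tile)]
            for bk in range(0, n, tile):
                a_t = [row[bk:bk + tile] for row in a[bi:bi + tile]]
                b_t = [row[bj:bj + tile] for row in b[bk:bk + tile]]
                acc = mma_tile(a_t, b_t, acc)
            for i in range(tile):
                out[bi + i][bj:bj + tile] = acc[i]
    return out


def count_vector_fmas(n, tile=TILE):
    """Vector FMA issues this model needs for an n x n matmul."""
    return (n // tile) ** 3 * tile * tile
```

In this model every tile operation costs `tile * tile` separate vector issues, whereas a dedicated tensor path would, conceptually, retire the whole tile as one operation. That difference in issue count is the gap being described above; the real hardware numbers will of course differ.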
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
AMD's problem is and seemingly always will be in the software, not the hardware.

Adding ML hardware to their dies should be a piece of cake for the HW team. Making sure anything actually uses it on the other hand is where the real rub is.

They can't exactly have DL Temporal scaling software, until they have HW that can do it fast enough, so it actually improves the frame rate a competitive amount.

My bet is RDNA 4 gets a big leap in ML performance, and new DL scaling software to go with it.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,322
4,790
96
AMD is obviously recognizing the need for ML cores.
Yes they need it for marketing.
This is a mere replica of 2018 phone AI rage.
Been there, seen that.
ML cores are used for high quality temporal scaling (DLSS, XeSS)
You can do it without ML just as well, see UE5 TSR.
and there are more non gaming use cases that people want to use them for like image processing, or generative image creation with applications like Stable Diffusion.
Margin of error percentage of relevant user base in client.
AMD's problem is and seemingly always will be in the software, not the hardware.
AMD's problem in DC GPUs until MI300 has explicitly been the hardware.
My bet is RDNA 4 gets a big leap in ML performance, and new DL scaling software to go with it.
Lol no.
 
Reactions: Tlh97 and Joe NYC

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
Yes they need it for marketing.

Not that I buy that theory, but even if you did, why don't you think it matters at least equally for GPU marketing?

Testing Stable Diffusion is kind of normal now, and there will only be more ML applications and benchmarks going forward. It looks pretty bad when AMD trails all NVidia and Intel cards.

Even if you believe it's only for marketing, they still need it.

You can do it without ML just as well, see UE5 TSR.

As with FSR, TSR produces inferior image quality.

Margin of error percentage of relevant user base in client.

It's more than that and growing. I'm definitely in the camp, that AMD's lack of ML capability has relegated them to: "Only buy if it's at STEEP discount to NVidia".
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,322
4,790
96
Not that I buy that theory, but even if you did, why don't you think it matters at least equally for GPU marketing?
Because no OEM and no ISV mandates marketing points in client dGPs, unlike in mobile.
Testing Stable Diffusion is kind of normal now, and there will only be more ML applications and benchmarks going forward. It looks pretty bad when AMD trails all NVidia and Intel cards.
We're talking active userbase and not benchmarkers being silly.
As with FSR, TSR produces inferior image quality.
Nah; and again, not worth the area spent.
It's more than that and growing
No lol, NV only ever does it for cheapo CUDA devkit purposes.
I'm definitely in the camp, that AMD's lack of ML capability has relegated them to: "Only buy if it's at STEEP discount to NVidia".
Good that you've said it outright, AMD will never ever make your green stuff cheaper.
Enjoy!
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
Because no OEM and no ISV mandates marketing points in client dGPs, unlike in mobile.

We're talking active userbase and not benchmarkers being silly.

Nah; and again, not worth the area spent.

No lol, NV only ever does it for cheapo CUDA devkit purposes.

Good that you've said it outright, AMD will never ever make your green stuff cheaper.
Enjoy!

If AMD listened to you, they would soon be in third place behind Intel. I think they are a little more interested in competing than that.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,322
4,790
96
If AMD listened to you
It's the opposite, I quote them most of the time.
Their cost analysis teams are some of the best in the industry so they definitely know it better.
they would soon be in third place behind Intel
Intel is considered a running joke in both NV and AMD GPU circles.
just look at ponte trainwreccio.
I think they are a little more interested in competing than that.
AMD is a margin-driven company, 'competing' in your definition is making NV stuff cheaper by shedding dGP gm%% which ain't happening.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,422
1,759
136
They've cleaned off the inventory long ago, listen to their earnings calls.
AMD might have cleared it off their balance sheets, but N22 and N21 are still being actively sold in significant volume. AMD can't launch a new product that obsoletes them until they are gone, unless they want to really hurt their board/channel partners. If they launch N32 right now, they have to do it at a price point where it is no more attractive than N22 and N21.
 
Reactions: insertcarehere

adroc_thurston

Diamond Member
Jul 2, 2023
3,322
4,790
96
AMD can't launch a new product that obsoletes them until they are gone
Yes they can, things take a while to ramp in the channel.
If they launch N32 right now, they have to do it at a price point where it is no more attractive than N22 and N21.
I mean they've launched N33 just fine into a ton of attractive N23 deals so...
 

RnR_au

Golden Member
Jun 6, 2021
1,822
4,454
106
Adding ML hardware to their dies should be a piece of cake for the HW team. Making sure anything actually uses it on the other hand is where the real rub is.
From my understanding, AMD is reluctant to add 'tensor' silicon to consumer hardware. They would rather have flexible silicon that can be used in multiple ways. Unreal Engine, via Lumen, is showing how you can solve a tough global illumination problem without dedicated hardware. I believe other game engines are working on similar solutions.

It would be interesting to see what silicon could come out if AMD went to Epic and asked what they would prefer to get hardware accelerated. But maybe this is already coming via the FPGA on their CPUs... if that is still coming.

FPGA on GPUs? Why not.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,322
4,790
96
I still believe that the RX 7600 got a release because they overbought N33 wafers
Oh no, AMD is extremely careful on wafer allocation.
Whether there's still any demand from OEMs for N32 laptop will probably end up deciding if they go forward with N32's release.
It missed the cycle so the demand is none until the next year.
Better luck next time! haha
 
Reactions: Tlh97 and moinmoin