Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,754
6,631
136





With the GFX940 patches in full swing since the first week of March, it is looking like MI300 is not far off!
Usually AMD takes around 3 quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

Keller_TT

Member
Jun 2, 2024
113
112
76
All_The_Watts (formerly rogane I think) says 9070XT is on par or slightly faster than the 4070 Ti Super but not quite the 4080. It also hits >3 GHz with power above 300W.

9070 non-XT is faster than 4070 Super, but slower than 4070 Ti Super. Clocks in the 2 GHz range with power also in the 200s W.
If the 9070 XT is >300W for ~7900 XT performance on the MBA card, and slower than the 4080, then it is massively underwhelming, and a lot will come under scrutiny: die size, density, bulk for ML and RT cores, PPA, PPW.
I hope this leak is wrong, at least in the wattage figures for the reference cards.

Looks like the most AMD can do is barely catch up to Ada in performance and efficiency after 2 years, not better it. Then there's the software gap for gaming (FSR), design, engineering. AMD's doctor always orders less hopium and more copium.
 

GTracing

Senior member
Aug 6, 2021
276
646
106
All_The_Watts (formerly rogane I think) says 9070XT is on par or slightly faster than the 4070 Ti Super but not quite the 4080. It also hits >3 GHz with power above 300W.

9070 non-XT is faster than 4070 Super, but slower than 4070 Ti Super. Clocks in the 2 GHz range with power also in the 200s W.
How long does he spend with all these emojis?
 

iLLusiveMan

Member
Dec 13, 2022
33
58
61
All_The_Watts (formerly rogane I think) says 9070XT is on par or slightly faster than the 4070 Ti Super but not quite the 4080. It also hits >3 GHz with power above 300W.

9070 non-XT is faster than 4070 Super, but slower than 4070 Ti Super. Clocks in the 2 GHz range with power also in the 200s W.
Didn't he also say that N48 is 240mm^2? That aged like milk.
 

gaav87

Senior member
Apr 27, 2024
452
794
96
If the 9070 XT is >300W for ~7900 XT performance on the MBA card, and slower than the 4080, then it is massively underwhelming, and a lot will come under scrutiny: die size, density, bulk for ML and RT cores, PPA, PPW.
I hope this leak is wrong, at least in the wattage figures for the reference cards.

Looks like the most AMD can do is barely catch up to Ada in performance and efficiency after 2 years, not better it. Then there's the software gap for gaming (FSR), design, engineering. AMD's doctor always orders less hopium and more copium.
Worse, he is saying it's the >300W OC (AIB) 9070 XT that is >4070 Ti Super, not the MBA card.
 
Reactions: Tlh97 and Gideon

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,108
136
The 8800 series didn't last long; a lot of those GPUs died.

Mine was also extremely durable. It was in my daily PC for 14 years. It was still running when I retired that PC, though when I tried to boot that PC a month later to check something, the PSU gave a squeal and died...

Though after about 5 years, it could only play old games.
 

Keller_TT

Member
Jun 2, 2024
113
112
76
Worse, he is saying it's the >300W OC (AIB) 9070 XT that is >4070 Ti Super, not the MBA card.
If that turns out true, then the MBA card might be rated for 265W, perform like 7900GRE + 5% (~4070 Ti), then a +15% TDP AIB card with good cooler could push TDP up to 310W, and make it like a 7900XT raster for the same PPW.

It's 7800 XT all over again then. So, can only be <$500 because it could well flunk against the 5070 in RT+DLSS. That's why the reports of $479 for the base XT perhaps.
Beating 5070 Ti and 4080 huh?
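
For what it's worth, the arithmetic behind that guess can be sketched roughly as below. Every input is a speculative number from the post above (265W MBA TDP, 7900 GRE + 5%, +15% TDP on the AIB card), and "same PPW" is read as performance scaling linearly with power, which is an assumption and not how real cards behave at the top of the V/F curve. The function name `aib_estimate` is made up for illustration.

```python
# Back-of-the-envelope check of the guess above. All inputs are the
# speculative numbers from the post, not measured or confirmed specs.

def aib_estimate(mba_tdp_w, mba_perf, tdp_uplift):
    """Scale power and performance by the same factor (constant perf-per-watt)."""
    aib_tdp_w = mba_tdp_w * (1 + tdp_uplift)
    # Assumption: "same PPW" means perf grows linearly with power.
    aib_perf = mba_perf * (1 + tdp_uplift)
    return aib_tdp_w, aib_perf

# Performance is expressed relative to a 7900 GRE = 1.00.
aib_tdp_w, aib_perf = aib_estimate(mba_tdp_w=265, mba_perf=1.05, tdp_uplift=0.15)
print(f"AIB TDP: {aib_tdp_w:.0f} W, perf vs 7900 GRE: {aib_perf:.2f}x")
```

Under those assumptions the AIB card lands at about 305W (a little under the 310W ceiling mentioned above) and roughly 1.2x a 7900 GRE. Whether that equals 7900 XT raster depends on which review numbers you pick.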
 

Attachments

  • AMD 7900GRE OC efficiency.jpg

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,503
96
If that turns out true, then the MBA card might be rated for 265W, perform like 7900GRE + 5% (~4070 Ti), then a +15% TDP AIB card with good cooler could push TDP up to 310W, and make it like a 7900XT raster for the same PPW
All of that is wrong. Next.
That's why the reports of $479 for the base XT perhaps.
The price is last minute. Always. It's a GPU war classic.
Beating 5070 Ti and 4080 huh?
Well it depends.
 

soresu

Diamond Member
Dec 19, 2014
3,502
2,783
136
real Tensor core equivalents
RDNA4 has drastically increased the performance of certain ML data types, but it's still not the matrix cores CDNA is using.

Just overhauled CUs for now.

Presumably UDNA will change that, though hopefully not at the expense of CU-level integration. I would imagine losing that would add latency to any ML-based technique that enhances perf or image quality, like FSR4 or neural radiance cache.
 

Keller_TT

Member
Jun 2, 2024
113
112
76
All of that is wrong. Next.

The price is last minute. Always. It's a GPU war classic.

Well it depends.
Ha ha. At least you give me a chance to hope for something better. But now my base expectations have been revised and I can take a pleasant surprise if it comes along.

Ofc I'm still curious about the actual silicon budget, density and the μarch and what they did to the CUs for ML and RT. So, if that's where the main focus & budget went, then I hope it was worth it and not below potential.
 

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,503
96
Dunno, but I do seem to remember Turing having some problems with latency due to jumping between CUDA cores and tensor cores.
It's not a problem with latency, just that the GEMM units hoard the VRF.
So, if that's where the main focus & budget went
The main focus went into bumping per-CU and per-bit oomph.
It's very evident too given the perf.
 

Keller_TT

Member
Jun 2, 2024
113
112
76
Dunno, but I do seem to remember Turing having some problems with latency due to jumping between CUDA cores and tensor cores.
This has been a focus area for Blackwell as it brings tighter integration between the CUDA cores and the Tensor cores. An overview is on their webpage, and the whitepaper is yet to be put up. But it felt like a 1st version of NV UDNA.
 
Reactions: Win2012R2

ToTTenTranz

Senior member
Feb 4, 2021
278
522
136
This has been a focus area for Blackwell as it brings tighter integration between the CUDA cores and the Tensor cores. An overview is on their webpage, and the whitepaper is yet to be put up. But it felt like a 1st version of NV UDNA.

During the keynote what Jensen said was they're now running tensor ops in the cuda cores and not just the tensor cores, and that allows them to use AI in some game-specific instructions.
I don't think that means they can just add the tensor output from cuda/shader cores to the tensor cores, as they still share the same L1 and L2 IIRC.



You're reading too much into puff marketing for a very very very underwhelming uarch.
Underwhelming or not, they're completely alone from the ~$500 up, which means they can ask however much they want for the 5090.
 
Reactions: Tlh97 and gaav87

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,503
96
they're now running tensor ops in the cuda cores and not just the tensor cores
They could always do that? GEMM is GEMM.
A 'tensor core' is just a hardwired unprogrammable GEMM accelerator.
which means they can ask however much they want for the 5090
They very much can't, which is why the 5090 price is right in line with the pretty massive BOM bump.
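
On the "GEMM is GEMM" point above, here is a minimal sketch of the idea in plain Python: the same D = A*B + C computed once with scalar multiply-adds (the kind of loop any programmable ALU can run) and once as calls to a fixed-size tile multiply-accumulate (a stand-in for a hardwired matrix unit). Nothing here models real NVIDIA or AMD hardware; the function names, the 2x2 tile size, and the requirement that the matrix dimensions are multiples of the tile size are all assumptions made for illustration.

```python
# Minimal sketch: D = A @ B + C computed two ways.
# Neither function models real GPU hardware; names are illustrative only.

def gemm_scalar(A, B, C):
    """One multiply-add at a time, as a plain triple loop."""
    m, k, n = len(A), len(B), len(B[0])
    D = [row[:] for row in C]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                D[i][j] += A[i][p] * B[p][j]
    return D

def mma_tile(a, b, c):
    """Stand-in for a fixed-function unit: one t x t tile multiply-accumulate."""
    t = len(a)
    return [[c[i][j] + sum(a[i][p] * b[p][j] for p in range(t)) for j in range(t)]
            for i in range(t)]

def gemm_tiled(A, B, C, t=2):
    """Same GEMM, expressed as calls to the tile op (dims must be multiples of t)."""
    m, k, n = len(A), len(B), len(B[0])
    D = [row[:] for row in C]
    for i0 in range(0, m, t):
        for j0 in range(0, n, t):
            # Accumulator tile starts as the matching tile of C.
            acc = [D[i][j0:j0 + t] for i in range(i0, i0 + t)]
            for p0 in range(0, k, t):
                a = [A[i][p0:p0 + t] for i in range(i0, i0 + t)]
                b = [B[p][j0:j0 + t] for p in range(p0, p0 + t)]
                acc = mma_tile(a, b, acc)
            for i in range(t):
                D[i0 + i][j0:j0 + t] = acc[i]
    return D

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
C = [[1] * 4 for _ in range(4)]
same = gemm_scalar(A, B, C) == gemm_tiled(A, B, C)
print(same)
```

Both paths give the same matrix here (integer inputs, so the comparison is exact). The difference between them is throughput and flexibility, not what gets computed.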
 

gaav87

Senior member
Apr 27, 2024
452
794
96
Damn you are on point.
During the keynote what Jensen said was they're now running tensor ops in the cuda cores and not just the tensor cores, and that allows them to use AI in some game-specific instructions.
I don't think that means they can just add the tensor output from cuda/shader cores to the tensor cores, as they still share the same L1 and L2 IIRC.




Underwhelming or not, they're completely alone from the ~$500 up, which means they can ask however much they want for the 5090.
 