Discussion RDNA4 + CDNA3 Architectures Thread

DisEnchantment · Mar 23, 2022

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around 3Qs to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits

History for llvm/lib/Target/AMDGPU - llvm/llvm-project (github.com)

Or Phoronix

More AMD "GFX940" Enablement Work Landing In LLVM - Phoronix (www.phoronix.com)

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5 would be available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.

Previous thread on CDNA2 and RDNA3 here

Question - Speculation: RDNA3 + CDNA2 Architectures Thread (forums.anandtech.com)

Hans Gruber · Jan 23, 2025

SolidQ said:
Some tweets from AMD

https://twitter.com/x/status/1882166390645203318

https://twitter.com/x/status/1882167126418665619

If you ain't first, you're last! There are no trophies for 2nd place. AMD's GPU team didn't get the memo. Nvidia has 90% of the GPU market. So please, take your time AMD. Don't rush out RDNA 4.

adroc_thurston · Jan 23, 2025

Hans Gruber said:
If you ain't first, you're last

Well yeah, they don't have a halo.

Timorous · Jan 23, 2025

branch_suggestion said:
The L3 including fabric was 90mm^2, so 430mm^2 is about as low as N31 mono could go with Zen5 densified L3.
And N48 is ~350mm^2 based on more thorough measurements, which is smaller than GB203, and it uses far cheaper and more plentiful GDDR6.

Cache.

350mm^2 is far more in line with what AMD have built for that middle-of-the-road x7 part. It is a bit larger than what a monolithic N32 would be and what N22 was, but not ridiculously so.

N32 by the same method is in the sub-300mm^2 region as a monolith. Take 200mm^2 for the GCD, subtract 13mm^2 for the no-longer-needed MCD PHYs, add 100mm^2 for 64MB of cache and a 256-bit bus, and you are in the 290mm^2 region. Add 20% for going to what seems like a design with 4 SEs of 16 CUs each, plus extra RT stuff and CU improvements, and 350mm^2 is about what one would expect.
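
For anyone who wants to check the arithmetic, here is a rough sketch of the estimate in Python. The input figures are just the ones quoted in this post (in mm^2, not measured values), and the function name is made up for illustration.

```python
def estimate_monolithic_area(gcd_mm2, mcd_phy_mm2, cache_and_bus_mm2, uplift):
    """Rough estimate: GCD minus unneeded PHYs plus cache/bus, then a percentage uplift."""
    base = gcd_mm2 - mcd_phy_mm2 + cache_and_bus_mm2
    return base, base * (1.0 + uplift)


# Figures from the post: 200 for the GCD, 13 for the MCD PHYs,
# 100 for 64MB cache + 256-bit bus, 20% for the wider design and RT/CU changes.
base, scaled = estimate_monolithic_area(200, 13, 100, 0.20)
print(f"monolithic N32 estimate: {base} mm^2")
print(f"with 20% uplift: {scaled:.0f} mm^2")
```

That gives 287 for the monolithic N32 guess and roughly 344 after the uplift, which is where the "290" and "350" round numbers come from.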

SolidQ · Jan 23, 2025

Someone seems to have had a real dream and launched it today!

https://videocardz.com/newz/somebody-didnt-get-a-memo-amd-ad-claims-you-can-enjoy-gaming-on-radeon-rx-9070-xt-already

poke01 · Jan 23, 2025

SolidQ said:
Someone seems to have had a real dream and launched it today!

https://videocardz.com/newz/somebody-didnt-get-a-memo-amd-ad-claims-you-can-enjoy-gaming-on-radeon-rx-9070-xt-already

So it was supposed to launch today? If it wasn't, why are they promoting a product with no release date?

Win2012R2 · Jan 23, 2025

poke01 said:
why are they promoting a product with no release date

RDNA4 has no release date, only a path to it...

dacostafilipe · Jan 23, 2025

Win2012R2 said:
It's not good, because if it's done in the shaders then upscaling of an already finished frame can't run in parallel while a new frame is being rendered in those same shaders. Specialised tensor units like Nvidia's are clearly the way to go, but that's probably 5 cents more expensive than AMD's solution, so we can't have that.

Yeah, no, that's not how it works. We have been able to do shading and compute "at the same time" (!) without a problem for years now.

Win2012R2 · Jan 23, 2025

dacostafilipe said:
Yeah, no, that's not how it works. We have been able to do shading and compute "at the same time" (!) without a problem for years now.

It might be cheaper, but not better. The undisputed market leader uses dedicated hardware for it, and in 2025 neural-based upscaling is clearly a must-have feature that will be used by a lot of games going forward. This isn't 2018, when AMD could have gotten away with it (if they even had hardware capable enough).

itsmydamnation · Jan 23, 2025

Win2012R2 said:
It's not good, because if it's done in the shaders then upscaling of an already finished frame can't run in parallel while a new frame is being rendered in those same shaders. Specialised tensor units like Nvidia's are clearly the way to go, but that's probably 5 cents more expensive than AMD's solution, so we can't have that.

Wrong, memory bandwidth will still limit you. If you have enough memory bandwidth to feed both the ALU and the GEMM engine at the same time, you can probably make it all just work on shaders the same.

Assuming the same format, a GEMM engine is mostly a register read/write optimisation over a regular ALU, which is really a saving in power at the cost of area.
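
To put rough numbers on the register read/write point, here is a back-of-the-envelope sketch. It assumes an idealised model, not any real AMD or Nvidia ISA: a plain ALU does every multiply-add as a separate FMA with 3 register reads and 1 write, while a GEMM unit reads each input tile once and writes the output tile once. The 16x16x16 tile size is an arbitrary example.

```python
def alu_register_accesses(m, n, k):
    """Register accesses if every multiply-add is a separate FMA (3 reads + 1 write)."""
    fmas = m * n * k
    return fmas * 4


def gemm_unit_register_accesses(m, n, k):
    """Register accesses if A, B and the accumulator are each read once and the result written once."""
    reads = m * k + k * n + m * n
    writes = m * n
    return reads + writes


m = n = k = 16  # hypothetical tile size
alu = alu_register_accesses(m, n, k)
gemm = gemm_unit_register_accesses(m, n, k)
print(f"ALU: {alu} accesses, GEMM unit: {gemm} accesses, ratio {alu / gemm:.0f}x")
```

Same number of multiply-adds in both cases. Under these assumptions the only thing that changes is how often the register file gets touched, which is where the power saving would come from.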

Win2012R2 · Jan 23, 2025

itsmydamnation said:
you can probably make it all just work on shaders the same.

Ok, so Nvidia is dumb to have dedicated tensors, why do they do it?

dacostafilipe · Jan 23, 2025

Win2012R2 said:
Ok, so Nvidia is dumb to have dedicated tensors, why do they do it?

No, they just made a choice, implemented it and are now updating it gen after gen.

Power/clockgating could be a reason.

maddie · Jan 23, 2025

Win2012R2 said:
Ok, so Nvidia is dumb to have dedicated tensors, why do they do it?

Originally, in gaming cards, as distinct from compute, tensor complexes were a feature looking for a usage scenario.

Win2012R2 · Jan 23, 2025

dacostafilipe said:
No, they just made a choice, implemented it and are now updating it gen after gen.

Power/clockgating could be a reason.

But is it a better choice performance-wise or not? It seems to me the answer is an obvious yes.

maddie said:
Originally, in gaming cards, as distinct from compute, tensor complexes were a feature looking for a usage scenario.

And they found it! And they still keep them separate, it seems, so it must be the better-performing option.

yuri69 · Jan 23, 2025

SolidQ said:
Someone seems to have had a real dream and launched it today!

https://videocardz.com/newz/somebody-didnt-get-a-memo-amd-ad-claims-you-can-enjoy-gaming-on-radeon-rx-9070-xt-already

AMD marketing at its best

igor_kavinski · Jan 23, 2025

Win2012R2 said:
RDNA4 has no release date, only a path to it...

Mark your calendars, folks!

https://www.bhphotovideo.com/c/product/1872844-REG/asus_prime_rx9070xt_o16g_radeon_rx_9070_xt.html

Yeah, end of March and maybe April if it takes off and stock is insufficient.

Win2012R2 · Jan 23, 2025

igor_kavinski said:
Yeah, end of March

There is no year!

maddie · Jan 23, 2025

igor_kavinski said:
Mark your calendars, folks!

https://www.bhphotovideo.com/c/product/1872844-REG/asus_prime_rx9070xt_o16g_radeon_rx_9070_xt.html

Yeah, end of March and maybe April if it takes off and stock is insufficient.

All hoping for April 1st? A fitting date for this launch.

igor_kavinski · Jan 23, 2025

maddie said:
All hoping for April 1st? A fitting date for this launch.

You get delivered an empty Radeon 9070 XT box with the message APRIL FOOL!

dacostafilipe · Jan 23, 2025

Win2012R2 said:
...

I mean, it's not like they use a single complex only for tensor cores, they are still split by SM. It seems that AMD has more fine-grained control over their matrix logic, but they also have bigger SEs, so in the end it should not really matter.

linkgoron · Jan 23, 2025

igor_kavinski said:
Mark your calendars, folks!

https://www.bhphotovideo.com/c/product/1872844-REG/asus_prime_rx9070xt_o16g_radeon_rx_9070_xt.html

Yeah, end of March and maybe April if it takes off and stock is insufficient.

This is such a pathetic launch by AMD. They have zero belief in their product. They scared themselves from having a true high-end product, and now can't even launch their mid/high card.

gdansk · Jan 23, 2025

linkgoron said:
This is such a pathetic launch by AMD. They have zero belief in their product. They scared themselves from having a true high-end product, and now can't even launch their mid/high card.

They saw @blckgrffn 's plans and decided to sabotage him in particular.

blckgrffn · Jan 23, 2025

gdansk said:
They saw @blckgrffn 's plans and decided to sabotage him in particular.

There are many not-safe-for-this-forum memes I'd love to post. (f that guy in particular ones)

I'm humbled that they have done so much to thwart my "safe" purchase!

I guess it underlines that, increasingly, you should just buy a good deal when you see it. Yeah, timing it might still pay off, but realistically future progress is going to be slowed by process tech, and most software advancements can/should be backwards compatible, because doing otherwise alienates part of your customer base. The fact that even first gen RTX is getting the new Transformer model is cool.

All of the "bad timing" purchases I made last fall, largely for nice builds ($370 6800, $699 7900 XT, $720 MSI Fancy 4070 Ti Super, $950 Zotac Amp+ 4080 Super and this fancy XFX XTX for $900, all prices pre-tax), are all... fine. It's worth noting that the 7900 XT was part of a bundle that featured a $225 7800X3D.

Point is those builds are done and have been running as intended and been enjoyed for months now. If I was going to feel regret, I am over it now.

Now I'm figuring out how to set the frame cap, frame gen 2, low lag and power tuning to get an effective, smooth and low(er) power "160" fps in my games with this beastly card, since all these features are just sitting there. This seems to be the new GPU tweaking frontier, instead of searching for a few more MHz.

igor_kavinski · Jan 23, 2025

blckgrffn said:
The fact that even first gen RTX is getting the new Transformer model is cool.

That's a ploy. They want to test the patience of those users until they can't stand the slow response times anymore and give up and upgrade.

itsmydamnation · Jan 23, 2025

Win2012R2 said:
Ok, so Nvidia is dumb to have dedicated tensors, why do they do it?

No, but you should learn to comprehend what I said. I gave all the key words needed to understand the advantage of a GEMM unit.

You need to look at things holistically. NV like big GEMM units because they sell cards whose workload is GEMM only, and in that situation the GEMM unit's performance advantage is the big reduction in register reads and writes.

But in a mixed workload there is more total compute across ALU and GEMM than there is bandwidth. So bandwidth limits you, because you still need your inputs and outputs.

There is still an advantage to the GEMM unit in this case, but it's just not as big a win as marketing makes out, because you could also have just had more ALU.
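
A roofline-style sketch of the same argument, with made-up numbers: attainable throughput is the lower of peak compute and memory bandwidth times arithmetic intensity. Every figure below is hypothetical and only there to show the shape of the argument, not to describe any real card.

```python
def attainable_tflops(peak_tflops, bandwidth_tb_s, flops_per_byte):
    """Roofline model: throughput is capped by compute or by memory bandwidth."""
    return min(peak_tflops, bandwidth_tb_s * flops_per_byte)


BANDWIDTH_TB_S = 0.64  # hypothetical memory bandwidth
ALU_TFLOPS = 50.0      # hypothetical shader ALU peak
GEMM_TFLOPS = 200.0    # hypothetical extra peak from a GEMM unit

for intensity in (20, 500):  # FLOPs per byte of memory traffic
    alu_only = attainable_tflops(ALU_TFLOPS, BANDWIDTH_TB_S, intensity)
    alu_plus_gemm = attainable_tflops(ALU_TFLOPS + GEMM_TFLOPS, BANDWIDTH_TB_S, intensity)
    print(f"{intensity} FLOP/byte: ALU only {alu_only:.1f}, ALU + GEMM {alu_plus_gemm:.1f} TFLOPS")
```

At low intensity both configurations sit on the bandwidth ceiling and the extra GEMM peak buys nothing. Only when the workload is dense enough (the GEMM-only case) does the extra peak show up.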

gaav87 · Jan 23, 2025

Well, the 5090 is CPU limited even at 4K...
Just give us a freakin' 384-bit, 24GB, glued $899 card. No need for more, it will be CPU limited anyway.
Or even a 9080 XT with 24GB of GDDR7 (3GB)...
