Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,773
6,747
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe the US government is starting to prepare the software environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5.0 was coming in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts; the MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

marees

Senior member
Apr 28, 2024
940
1,253
96
RDNA 4 in the mix for desktop APUs

Zen 6 Olympic Ridge IOD to have 8 RDNA 4 CUs


 

Gideon

Golden Member
Nov 27, 2007
1,981
4,901
136
That's the funny bit: DXR is old, almost 7 years old.
Pretty much the exact gap between the OG DirectX and DirectX 9 (and between DX9 and DX11). Too bad!
I really wish they'd create a DX Next and a Vulkan 2.0. GPUs have come a looooong way since the DX12 / Mantle days, and the APIs are still modelled on 10-year-old GPUs. Don't take my word for it, take sebbbi's:

(Sebastian Aaltonen is a graphics developer with over 20 years of experience including being a principal engineer for Ubisoft and Unity):


Sebastian Aaltonen said:
I am again writing my "No graphics API" blog post. Wrote the first C/C++ example that creates a shader pipeline, texture, RT, buffer, does a dispatch with sampler, output texture, buffer and root data struct containing all of them. Total 84 lines. Much better than Vulkan

And I am confident this API would be faster than Vulkan.

It is still missing barriers and some other stuff. But I have a good plan for barriers. Today barriers can be simpler than what they had to be with 10 years old hardware (that's how old DX12 and Vulkan already are).

I am sure some hardware vendor would say that due to some bug we can't do it like this and due to some limitation somewhere we can't do exactly like this. But that doesn't stop me. This is a visionary piece. Food for thought. It's designed to be 99% implementable with current HW.

The biggest trade-off is that this API can't support 10 year old hardware like Vulkan, Metal, DX12 and WebGPU can. Requires bindless. I am also assuming PCI-E resizable bar and ability to read compressed data without decompress steps, etc. Stuff that latest gen GPUs have.

I am also assuming hardware that could support all the latest Vulkan extensions such as the improved binding model extension, buffer ptr extension, 8/16 bit types in shader and data interfaces, etc, etc...

CLARIFICATION: I am not intending to ship a graphics API. I am only writing mock code to show how much simpler it would be to write GPU code without a complex graphics API.
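
For flavor, here is a hypothetical mock in the same spirit as the description above. Every type and function name in it (Device, Shader, createComputeShader, etc.) is invented for illustration; this is not sebbbi's actual code, just a sketch of the style he describes:

```cpp
// Hypothetical mock only -- invented names, not a real API.
#include <cstdint>

struct DispatchArgs {            // the "root data struct": raw addresses + bindless indices
    uint64_t inputBufferVA;      // buffer GPU virtual address (assumes ReBAR-era hardware)
    uint32_t inputTexture;       // bindless descriptor heap index
    uint32_t outputTexture;
    uint32_t sampler;
};

void run(Device& dev) {          // Device/Shader/Buffer/Texture are imaginary types
    // One call per resource; no descriptor-set layouts, no pipeline objects.
    Shader  cs  = dev.createComputeShader("postprocess.spv");
    Buffer  buf = dev.createBuffer(64 * 1024);
    Texture src = dev.createTexture(1920, 1080);
    Texture dst = dev.createTexture(1920, 1080);

    DispatchArgs args{ buf.gpuAddress(), src.bindlessIndex(),
                       dst.bindlessIndex(), dev.defaultSampler() };

    // The shader reads `args` directly as root data; no binding ceremony.
    dev.dispatch(cs, args, /*groupsX=*/240, /*groupsY=*/135, /*groupsZ=*/1);
    dev.wait();
}
```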

I really hope they just do a new ground-up API that only supports modern GPUs. One that is a lot thinner and includes GPU-driven draw calls from the start. And that, among other things, would allow tracing rays from anywhere (no arbitrary, stupid shader stages) and would have no pipelines (solving a respectable amount of the "shader compilation stutter" issues).
 

Gideon

Golden Member
Nov 27, 2007
1,981
4,901
136
How's that going to help solve it?
I'll have to dig up some posts about it from a couple of years ago, as I'm not an expert on the subject matter. Essentially, the pipeline stages of Vulkan are a rather poor fit for modern hardware and a bad abstraction. This cyber_kinetist comment agrees with what many devs have been saying for eons:



As for the shader compilation, that's due to the use of monolithic Pipeline State Objects (PSOs), which incorporate multiple shader stages and state configurations into a single object. Different combinations of render states and shaders create unique pipeline objects, leading to a combinatorial explosion of pipelines to compile.
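
To make that combinatorial explosion concrete, here is a hedged Vulkan sketch; the shader and state counts are made up, and the actual VkGraphicsPipelineCreateInfo plumbing is elided behind a declared helper:

```cpp
// Sketch: why monolithic PSOs explode combinatorially in Vulkan.
#include <vulkan/vulkan.h>
#include <vector>

// Full pipeline setup (VkGraphicsPipelineCreateInfo etc.) elided for brevity.
VkPipeline buildPipeline(VkDevice dev, VkShaderModule vs, VkShaderModule fs,
                         VkBool32 blendEnable, VkCompareOp depthOp);

void buildAllVariants(VkDevice dev,
                      const std::vector<VkShaderModule>& vertexShaders,   // e.g. 50
                      const std::vector<VkShaderModule>& fragmentShaders) // e.g. 200
{
    std::vector<VkPipeline> pipelines;
    // 50 VS x 200 FS x 2 blend modes x 4 depth ops = 80,000 pipelines,
    // each requiring its own driver-side compile.
    for (VkShaderModule vs : vertexShaders)
        for (VkShaderModule fs : fragmentShaders)
            for (VkBool32 blend : {VK_FALSE, VK_TRUE})
                for (VkCompareOp depth : {VK_COMPARE_OP_LESS, VK_COMPARE_OP_EQUAL,
                                          VK_COMPARE_OP_ALWAYS, VK_COMPARE_OP_GREATER})
                    pipelines.push_back(buildPipeline(dev, vs, fs, blend, depth));
}
```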

You can actually use Vulkan without pipelines today:

But rather than having some arbitrary extension, this should be a core feature at the heart of the new API.
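
For reference, the extension in question is presumably VK_EXT_shader_object. A minimal sketch of the pipeline-less model it enables (error handling omitted; in real code the EXT entry points must be loaded via vkGetDeviceProcAddr):

```cpp
// Sketch: drawing without any VkPipeline, via VK_EXT_shader_object.
#include <vulkan/vulkan.h>

void recordDraw(VkDevice dev, VkCommandBuffer cmd,
                const VkShaderCreateInfoEXT* vsInfo,
                const VkShaderCreateInfoEXT* fsInfo) {
    // Shaders are standalone objects, compiled independently of each other
    // and of render state.
    VkShaderEXT shaders[2];
    vkCreateShadersEXT(dev, 1, vsInfo, nullptr, &shaders[0]);
    vkCreateShadersEXT(dev, 1, fsInfo, nullptr, &shaders[1]);

    const VkShaderStageFlagBits stages[2] = {
        VK_SHADER_STAGE_VERTEX_BIT, VK_SHADER_STAGE_FRAGMENT_BIT };
    vkCmdBindShadersEXT(cmd, 2, stages, shaders);

    // All the state a PSO would have baked in is set dynamically instead.
    vkCmdSetDepthTestEnable(cmd, VK_TRUE);
    vkCmdDraw(cmd, 3, 1, 0, 0);
}
```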

The linked Casey Muratori post from 2015 (he was on the Vulkan advisory board) is still 100% on the money.


Not going to happen
Why not? It's not like DX12 or Vulkan 1.x is going away. Why must we forever be stuck on legacy support?


Call it DX14, DX Next, whatever ... It should be a brand-new API, a DX7 -> DX9 level transition with all modern features at its core:

The biggest trade-off is that this API can't support 10 year old hardware like Vulkan, Metal, DX12 and WebGPU can. Requires bindless. I am also assuming PCI-E resizable bar and ability to read compressed data without decompress steps, etc. Stuff that latest gen GPUs have.

I am also assuming hardware that could support all the latest Vulkan extensions such as the improved binding model extension, buffer ptr extension, 8/16 bit types in shader and data interfaces, etc, etc...

Also including GPU-driven draw calls, etc ...
 

Win2012R2

Senior member
Dec 5, 2024
792
795
96
I'll have to dig up some posts about it from a couple of years ago, as I'm not an expert on the subject matter. Essentially, the pipeline stages of Vulkan are a rather poor fit for modern hardware and a bad abstraction.
But what's shader compilation got to do with it?

Shaders are typically written in a higher-level language that can only be pre-compiled (thus solving compile-related shader stuttering) if the hardware target is known in advance and won't change, like on consoles. On PCs, with multiple vendors and even different uarchs, compilation of the shaders has to be done by the driver for that specific GPU, and even a new driver version can require a recompile. There is nothing one can do about it apart from compiling all shaders before the game starts proper; this should become quicker with DX adopting SPIR-V.
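
A minimal sketch of that split using the standard Vulkan calls, assuming the SPIR-V was produced offline (e.g. with dxc or glslang). Note that creating the module is cheap; the uarch-specific ISA compile still happens later, in the driver:

```cpp
// Sketch: loading offline-compiled SPIR-V. vkCreateShaderModule just wraps
// the IR; the real GPU-specific compile happens at pipeline-creation time.
#include <vulkan/vulkan.h>
#include <cstdint>
#include <fstream>
#include <vector>

VkShaderModule loadSpirv(VkDevice dev, const char* path) {
    std::ifstream f(path, std::ios::binary | std::ios::ate);
    if (!f) return VK_NULL_HANDLE;
    std::vector<char> bytes(static_cast<size_t>(f.tellg()));
    f.seekg(0);
    f.read(bytes.data(), static_cast<std::streamsize>(bytes.size()));

    VkShaderModuleCreateInfo info{};
    info.sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
    info.codeSize = bytes.size();
    info.pCode    = reinterpret_cast<const uint32_t*>(bytes.data());

    VkShaderModule module = VK_NULL_HANDLE;
    vkCreateShaderModule(dev, &info, nullptr, &module);
    return module;
}
```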

Why not? It's not like DX12 or Vulkan 1.x is going away. Why must we forever be stuck on legacy support?

Intel made their GPU more modern so it worked better in DX12, but it sucked in all previous versions, and that cost them a LOT to recover from. If a new GPU is made for a new, much better API, then it will very likely suck in previous DX versions, and who is going to buy it? It takes many years to get even small new hardware features incorporated into APIs, and even longer for developers to bother using them. The only place where such a thing could possibly work is in consoles, but even they need PC sales, so having radically new stuff is problematic.

For both AMD and Nvidia old APIs are their moats.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,616
2,375
136
I'm not talking about excessive undervolts, rather mild ones. "Gamer stable" may be what some are attempting, but when I had Polaris, mine was rock steady with a mild undervolt.

Yes, but among people other than you there is variability in temperatures, power supply and silicon quality. All of that influences stability, and if they push the voltages down, some marginal samples start to become unstable, and that is a huge problem for their business.

When you UV, you are extracting some of the safety margin that's there for people with worse cooling, a worse PSU and/or worse silicon. AMD cannot use this margin, even if you can.
 

Gideon

Golden Member
Nov 27, 2007
1,981
4,901
136
But what's shader compilation got to do with it?

Shaders are typically written in a higher-level language that can only be pre-compiled (thus solving compile-related shader stuttering) if the hardware target is known in advance and won't change, like on consoles. On PCs, with multiple vendors and even different uarchs, compilation of the shaders has to be done by the driver for that specific GPU, and even a new driver version can require a recompile. There is nothing one can do about it apart from compiling all shaders before the game starts proper; this should become quicker with DX adopting SPIR-V.
All of this is correct, and SPIR-V will certainly help. But DX12 / Vulkan are strictly worse here than DX11 and below, due to PSOs and the fact that "different combinations of render states and shaders create unique pipeline objects, leading to a combinatorial explosion of pipelines to compile." This is strictly a DX12 / Vulkan-generation issue.

You'll have fewer shaders to precompile and an easier time figuring out what needs to be compiled JIT (when not doing a full shader compilation on initial load), etc. A thread discussing it among devs:
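
For context, the main mitigation available today is a persistent VkPipelineCache, reloaded from disk each run (and silently invalidated by driver updates, as noted above). A minimal sketch:

```cpp
// Sketch: warming a pipeline cache from a previous run's blob.
#include <vulkan/vulkan.h>
#include <fstream>
#include <vector>

VkPipelineCache loadPipelineCache(VkDevice dev, const char* path) {
    std::vector<char> blob;
    std::ifstream f(path, std::ios::binary | std::ios::ate);
    if (f) {
        blob.resize(static_cast<size_t>(f.tellg()));
        f.seekg(0);
        f.read(blob.data(), static_cast<std::streamsize>(blob.size()));
    }
    VkPipelineCacheCreateInfo info{};
    info.sType           = VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO;
    info.initialDataSize = blob.size();                          // driver validates the
    info.pInitialData    = blob.empty() ? nullptr : blob.data(); // header, rejects stale data
    VkPipelineCache cache = VK_NULL_HANDLE;
    vkCreatePipelineCache(dev, &info, nullptr, &cache);
    // Pass `cache` to vkCreateGraphicsPipelines; persist it back out
    // with vkGetPipelineCacheData on shutdown.
    return cache;
}
```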


Intel made their GPU more modern so it worked better in DX12, but it sucked in all previous versions, and that cost them a LOT to recover from. If a new GPU is made for a new, much better API, then it will very likely suck in previous DX versions, and who is going to buy it? It takes many years to get even small new hardware features incorporated into APIs, and even longer for developers to bother using them. The only place where such a thing could possibly work is in consoles, but even they need PC sales, so having radically new stuff is problematic.

For both AMD and Nvidia old APIs are their moats.
I agree with most of it, but you missed a crucial point: nothing new needs to be added to modern GPUs (RDNA2, Turing and above should support everything in sebbbi's list). Currently there are two big abstraction layers, the API and the driver, both very far from the actual metal. All of these can be kept around (for DX11 and DX12), but eventually new APIs must be created. DX12 is already nearly 11 years old.

It might be too soon now, but the next-gen console launch looks like good timing for a switch. Eventually the oldest legacy APIs can be served by a wrapper that maps to the new paradigm, just like DXVK is used for DX9 and DX10 (and as the IHVs themselves do internally for OpenGL and other ancient APIs).
 

Mopetar

Diamond Member
Jan 31, 2011
8,298
7,302
136
No surprise. AMD has been juicing their cards since Polaris. I'm not sure why they can't figure out how to give them proper V/F curves.

It's because they only have two bins (there's probably a third, but that's for whatever isn't a complete brick, sold to some OEM to make limited-run computers for the Asian market). If they picked a more sane voltage setting, a lot fewer dies would qualify as a 9070 XT and more would get some cores disabled. The other alternative is to turn down the clock speeds, so every 9070 XT performs, say, 5% worse out of the box.

If AMD were willing to make more bins, they could have a 9070 (unchanged), a 9070 XT (full cores, reduced voltage and clocks), and a 9070 XTX (basically the 9070 XT as it is now). But most people would still just want the top card, and AMD would probably just charge more for that premium model instead of letting everyone play the silicon lottery and maybe get a card they can undervolt without losing any performance.

I'm not sure the added time required to bin cards that way is something AMD or their board partners care to spend. Instead they'll pick whatever specs allow 75% of the silicon (or whatever mix they're looking for) to qualify, and go with that.
 

eek2121

Diamond Member
Aug 2, 2005
3,276
4,821
136
If you are running Windows 11 24H2, it will screw up your gaming experience with Ryzen CPUs. Microsoft has not fixed this issue. AMD GPU drivers were buttery smooth before 24H2.
Microsoft finally pushed it down to me a couple of weeks ago. I've had zero issues with it. I've done a lot of gaming as of late because I am temporarily out of work (it was supposed to be permanent, but I enrolled in a medical study and the meds they gave me are changing my life... so I'm probably going back to work).
That's the funny bit: DXR is old, almost 7 years old.
Pretty much the exact gap between the OG DirectX and DirectX 9 (and between DX9 and DX11). Too bad!
DirectX has taken years to get to the point where it is at. Progress takes time. It started with fixed function, and things only gradually became more and more programmable.
I really wish they'd create a DX Next and a Vulkan 2.0. GPUs have come a looooong way since the DX12 / Mantle days, and the APIs are still modelled on 10-year-old GPUs. Don't take my word for it, take sebbbi's:

(Sebastian Aaltonen is a graphics developer with over 20 years of experience including being a principal engineer for Ubisoft and Unity):




I really hope they just do a new ground-up API that only supports modern GPUs. One that is a lot thinner and includes GPU-driven draw calls from the start. And that, among other things, would allow tracing rays from anywhere (no arbitrary, stupid shader stages) and would have no pipelines (solving a respectable amount of the "shader compilation stutter" issues).

Funny, I was thinking this the other day. If Microsoft would actually take control of progress and help push it forward, we'd be a lot better off. I feel like they really messed up when they decided DirectX 12 would be the last version ever. Programmable RT pipelines and a bunch of other stuff are still in the future, and a DirectX 13 could bridge the gap between 12 and 14. While I think they should absolutely have a conversation with the big hardware players, they should not build the API strictly based on said input. If NVIDIA had it their way... oh boy.

There is also still a lot that can be done to simplify the API itself. If you've ever used DirectX 7 or below in a project, you'd love how simple it is to use. 12 is a beast.
 