Question Speculation: RDNA3 + CDNA2 Architectures Thread

uzzi38 · Jan 23, 2021

Man I have been dying to make this one for a while now.

First rumours for RDNA3 are here so new thread time!

Just going to start off with this one for now, from kopite7kimi on Twitter: "@VideoCardz Ah, I mean a simple mcm design with 10240 cores is not enough. Because the lift from RDNA2 to RDNA3 is much bigger than from RDNA1 to RDNA2. We should expect many big improvements in GFX11. 🤔"

Ajay · Apr 9, 2022

jpiniero said:
I doubt N6 is cheaper than N7. You might get more wafers out of it. The density bump would make it cheaper per transistor all other things considered.

Apparently there are cost savings from the reduced number of process steps (fewer masks). The density difference is pretty small, but N7->N6 is supposedly a 'cheap' transition engineering-wise, so that's a plus. TSMC was reportedly trying to push customers to N6, and that just wouldn't work out unless there was a net cost advantage in good dies per wafer.
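To put the "good dies per wafer" argument in numbers, here is a rough sketch using the usual textbook dies-per-wafer approximation. Every input is a placeholder chosen for illustration: the relative wafer costs, the 5% area shrink, and the 80% yield are assumptions, not TSMC or AMD figures. Only the shape of the calculation matters.

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Textbook approximation: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

def cost_per_good_die(wafer_cost, die_area_mm2, yield_fraction):
    """Wafer cost spread over the dies that actually work."""
    good_dies = gross_dies_per_wafer(die_area_mm2) * yield_fraction
    return wafer_cost / good_dies

# Placeholder inputs, NOT real TSMC or AMD figures.
# Wafer cost is relative (N7 = 1.00); the N6 die is assumed 5% smaller.
n7 = cost_per_good_die(wafer_cost=1.00, die_area_mm2=300.0, yield_fraction=0.80)
n6 = cost_per_good_die(wafer_cost=0.95, die_area_mm2=285.0, yield_fraction=0.80)
print(f"N6 cost per good die relative to N7: {n6 / n7:.2f}")
```

Even a small shrink helps twice, since it raises the die count and lowers the edge loss, so a modest wafer-cost saving compounds with it.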

maddie · Apr 9, 2022

jpiniero said:
I doubt N6 is cheaper than N7. You might get more wafers out of it. The density bump would make it cheaper per transistor all other things considered.

Say you took Navi 21, shrank it to N6, cut the lanes down to 8 and the bus to 128-bit... you would think that would still be bigger than Navi 22. Maybe you could get close. Throw in RDNA3 as well and it'd be tough for the die to not bloat. RDNA3 would have to be pretty die-efficient to still be ~6900 XT performance.

Some rumors to reconcile, and that's before even considering that N6 is at least equivalent to N7 in cost and most probably cheaper.

N33 has a 128-bit bus
Higher clocks and "IPC" than N2x series
Higher perf/W than N2x series
Better RT performance

This leads to, for equal performance, smaller die, cheaper board and cooling.

How do you see a 6900XT as being cheaper to make?

biostud · Apr 9, 2022

Glo. said:
RGT gave more details, like shader count, a week ago:

https://twitter.com/x/status/1512739096581521408
The tweet is from today, but the video link is from almost a week ago.

So why didn't you post it here before today?

Glo. · Apr 9, 2022

biostud said:
So why didn't you post it here before today?

I forgot that this thread exists...

jpiniero · Apr 9, 2022

maddie said:
How do you see a 6900XT as being cheaper to make?

Perhaps I should rephrase - people would absolutely buy a used 6900 XT for $500 rather than this. That's the fear.

I'm unconvinced AMD would spend the effort to do the entire RDNA3 IP on N6 if Navi 33 is the only product that will launch with it. AFAIK Phoenix is N5 monolithic.

maddie · Apr 9, 2022

jpiniero said:
Perhaps I should rephrase - people would absolutely buy a used 6900 XT for $500 rather than this. That's the fear.

I'm unconvinced AMD would spend the effort to do the entire RDNA3 IP on N6 if Navi 33 is the only product that will launch with it. AFAIK Phoenix is N5 monolithic.

Why? 8G vs 16G, higher power, lower vs higher RT? What would decide the issue? What if N33 is around $500, then what?
You also appear to think there will be no RDNA3 GPUs for the mid-to-lower-end range? Where is this coming from?

Your arguments also appear to be contradictory. N3x is high cost & price, and yet, we're expected to believe that used GPUs will be cheap. In most scenarios, used GPUs and new GPUs correlate in price.

jpiniero · Apr 9, 2022

maddie said:
Why? 8G vs 16G, higher power, lower vs higher RT? What would decide the issue? What if N33 is around $500, then what?
You also appear to think there will be no RDNA3 GPUs for the mid-to-lower-end range? Where is this coming from?

Yep. No mid range and below. Maybe later I guess.

Your arguments also appear to be contradictory. N3x is high cost & price, and yet, we're expected to believe that used GPUs will be cheap. In most scenarios, used GPUs and new GPUs correlate in price.

In a theoretical mining collapse, the market would be flooded with used GPUs. AMD's been burnt on this (twice?). I don't think they want to get burnt again.

maddie · Apr 9, 2022

jpiniero said:
Yep. No mid range and below. Maybe later I guess.

In a theoretical mining collapse, the market would be flooded with used GPUs. AMD's been burnt on this (twice?). I don't think they want to get burnt again.

Do you plan for a mining collapse and refuse to design whole classes of GPUs? This, in effect, is what you're saying. What if it doesn't happen? Do you forgo the largest part of the market? Unrealistic, in my view.

beginner99 · Apr 9, 2022

jpiniero said:
In a theoretical mining collapse, the market would be flooded with used GPUs

This isn't theoretical at all. Ethereum will move to proof-of-stake soon, like June/July soon. Wonder why the used market suddenly has a lot more GPUs at reasonable prices? Yeah, from miners, usually smaller ones. They're betting on getting out early to recoup some of the cards' cost. In 2-3 months or so, used prices will be down by 50%.

jpiniero · Apr 9, 2022

maddie said:
Do you plan for a mining collapse and refuse to design whole classes of GPUs? This, in effect, is what you're saying. What if it doesn't happen? Do you forgo the largest part of the market? Unrealistic, in my view.

You rebrand Navi 2x instead.

maddie · Apr 9, 2022

jpiniero said:
You rebrand Navi 2x instead.

N6 costs less than N7, production-cost-wise. You keep ignoring this. Simpler N3x cards also have a lower BOM than N2x cards of equivalent performance. The amortization cost of new N3x designs on N6 is the deciding factor, and once you port all the blocks for one chip, the rest come at marginal cost. Where is the crossover point in total sales? I don't know.
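The crossover question can at least be framed as a break-even calculation: the one-off cost of porting the design has to be repaid by the per-unit saving. A minimal sketch, where the function name and all three dollar figures are hypothetical placeholders rather than anything known about AMD's costs:

```python
import math

def breakeven_units(one_off_cost, unit_cost_old, unit_cost_new):
    """Units that must be sold before the per-unit saving repays the one-off cost.

    Returns None when the new design is not cheaper per unit,
    because then there is no crossover at all.
    """
    saving = unit_cost_old - unit_cost_new
    if saving <= 0:
        return None
    return math.ceil(one_off_cost / saving)

# Hypothetical placeholder figures, purely for illustration.
port_cost = 50_000_000.0   # one-off cost of doing the IP on the new node
n2x_unit_cost = 200.0      # per-card cost of the existing design
n3x_unit_cost = 170.0      # per-card cost of the cheaper new design
print(breakeven_units(port_cost, n2x_unit_cost, n3x_unit_cost))
```

If several chips share the same ported blocks, the one-off cost is spread over their combined volume, which is the "do one, and the rest are marginal" point.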

You either do all or none on N6.

jpiniero · Apr 9, 2022

maddie said:
You either do all or none on N6.

That's what I mean. Doing RDNA3 on N6 makes no sense. A shrink of RDNA2 to N6 with some gutting would make more sense (and that doesn't appear to be what they are doing). But either of these isn't going to work if mining collapses.

Glo. · Apr 9, 2022

beginner99 said:
This isn't theoretical at all. Ethereum will move to proof-of-stake soon, like June/July soon. Wonder why the used market suddenly has a lot more GPUs at reasonable prices? Yeah, from miners, usually smaller ones. They're betting on getting out early to recoup some of the cards' cost. In 2-3 months or so, used prices will be down by 50%.

I believe a hefty portion of miners will move to mining Ergo or Ethereum Classic. Smaller miners who won't get an immediate ROI on their investment will sell their GPUs.

maddie · Apr 9, 2022

jpiniero said:
That's what I mean. Doing RDNA3 on N6 makes no sense. A shrink of RDNA2 to N6 with some gutting would make more sense (and that doesn't appear to be what they are doing). But either of these isn't going to work if mining collapses.

Well then, we will have to disagree until events prove one position wrong. If AMD operated in a sole supplier position, this could work, but they don't.

jpiniero · Apr 9, 2022

maddie said:
Well then, we will have to disagree until events prove one position wrong. If AMD operated in a sole supplier position, this could work, but they don't.

nVidia has the same problem. Ada costs are going way up too. They are, however, better positioned to deal with a mining collapse because of marketing and OEM deals.

And it's only possibly a problem if this thing isn't faster than the 6900 XT.

Frenetic Pony · Apr 9, 2022

None of these rumors seem very logical. Denser nodes like N5 and chiplets are both there to shrink giant monolithic chips and so save money. A 520mm² chip is quite big, exactly the sort of target you'd want for both chiplets and denser nodes. And how would this possibly be a mid-range chip? Nothing here suggests the BOM would be much lower than that of a 6900 XT, which is probably $300 or more. Unless AMD has suddenly decided to drop almost its entire profit margin, this doesn't make a lot of sense.

Nor does the tiny bus. I still don't see how that works. Some sort of magical prefetch based on static code analysis into a giant prefetch cache is the only thing I can think of, but that would require giant bubbles of available bandwidth.
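For scale, peak memory bandwidth is just bus width times per-pin data rate. The sketch below uses the 6900 XT's published configuration (256-bit GDDR6 at 16 Gbps) as the reference. The memory speed of the rumored 128-bit part is not given in this thread, so it is left as a parameter and the 16 Gbps used for it is only an assumption.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak DRAM bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8.0

def data_rate_needed_gbps(target_gb_s, bus_width_bits):
    """Per-pin data rate a given bus would need to reach a target bandwidth."""
    return target_gb_s * 8.0 / bus_width_bits

reference = peak_bandwidth_gb_s(256, 16.0)   # 6900 XT: 256-bit at 16 Gbps
narrow = peak_bandwidth_gb_s(128, 16.0)      # assumed: same memory speed, half the bus
print(f"256-bit: {reference:.0f} GB/s, 128-bit: {narrow:.0f} GB/s")
print(f"128-bit bus needs {data_rate_needed_gbps(reference, 128):.0f} Gbps per pin to match")
```

Halving the bus halves the raw bandwidth at equal memory speed, so the gap has to be closed by cache hits, compression, or something else on-die.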

I get the N6 idea at least a little, which seems accurate based on the LinkedIn profile I guess? (Unless that was a plant, which sounds ludicrous but I'll not underestimate people with too much time on the internet). Either way, the only possible use case for one chip being on N6 is if it's tiny enough to justify monolithic in the first place.

It'd make way more sense if AMD had managed to design a single GPU compute chiplet with like 40 CUs on N5, and then linked more and more of them together (possibly with a separate IO die) like they did with their CPU chiplets. You'd only have to design one GPU chiplet, and just like with CPUs they could scale that chiplet throughout the entire lineup by just linking more together.
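The yield side of the chiplet argument can be sketched with the simple Poisson defect model, yield = exp(-area × defect density). The 520 mm² total comes from the rumor discussed above; the defect density is a placeholder, and the model deliberately ignores the extra area and packaging cost of the die-to-die links, so treat it as an upper bound on the benefit.

```python
import math

def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-(area_mm2 / 100.0) * defects_per_cm2)

def silicon_per_good_product(total_area_mm2, n_chiplets, defects_per_cm2):
    """Wafer area (mm^2) consumed per working product.

    Assumes bad chiplets are discarded individually before assembly and
    that splitting the design adds no area (an optimistic simplification).
    """
    chiplet_area = total_area_mm2 / n_chiplets
    return total_area_mm2 / poisson_yield(chiplet_area, defects_per_cm2)

D0 = 0.1  # defects per cm^2, placeholder value
for n in (1, 2, 4):
    area = silicon_per_good_product(520.0, n, D0)
    print(f"{n} die(s): {area:.0f} mm^2 of wafer per good product")
```

The saving grows with both die size and defect density, which is why the argument is strongest for big dies on newer nodes.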

maddie · Apr 9, 2022

Frenetic Pony said:
None of these rumors seem very logical. Denser nodes like N5 and chiplets are both there to shrink giant monolithic chips and so save money. A 520mm² chip is quite big, exactly the sort of target you'd want for both chiplets and denser nodes. And how would this possibly be a mid-range chip? Nothing here suggests the BOM would be much lower than that of a 6900 XT, which is probably $300 or more. Unless AMD has suddenly decided to drop almost its entire profit margin, this doesn't make a lot of sense.

Nor does the tiny bus. I still don't see how that works. Some sort of magical prefetch based on static code analysis into a giant prefetch cache is the only thing I can think of, but that would require giant bubbles of available bandwidth.

I get the N6 idea at least a little, which seems accurate based on the LinkedIn profile I guess? (Unless that was a plant, which sounds ludicrous but I'll not underestimate people with too much time on the internet). Either way, the only possible use case for one chip being on N6 is if it's tiny enough to justify monolithic in the first place.

It'd make way more sense if AMD had managed to design a single GPU compute chiplet with like 40 CUs on N5, and then linked more and more of them together (possibly with a separate IO die) like they did with their CPU chiplets. You'd only have to design one GPU chiplet, and just like with CPUs they could scale that chiplet throughout the entire lineup by just linking more together.

I'm curious. What were you thinking when the rumors of a 256-bit bus for N21 first arose?

As to lower costs, some possible BOM reductions in comparison to the 6900XT

Die itself = smaller die.
N6 vs N7
Less die area due to fewer memory controllers
Fewer shaders due to higher clocks and perf/clock

Card.
Half the memory chips needed
Lower-capacity cooling system
Less power circuitry on card
Simpler PCB (x8 PCIe)
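The "half the memory chips" item follows directly from the bus width, since each GDDR6 chip has a 32-bit interface. A quick sketch, assuming 16 Gbit (2 GB) chips and no clamshell mode, both of which are assumptions about the rumored card rather than confirmed specs:

```python
GDDR6_CHIP_INTERFACE_BITS = 32  # each GDDR6 device exposes a 32-bit interface

def memory_config(bus_width_bits, chip_capacity_gbit=16):
    """Return (chip count, total capacity in GB) for a non-clamshell layout."""
    chips = bus_width_bits // GDDR6_CHIP_INTERFACE_BITS
    capacity_gb = chips * chip_capacity_gbit / 8
    return chips, capacity_gb

for bus in (256, 128):
    chips, capacity = memory_config(bus)
    print(f"{bus}-bit bus: {chips} chips, {capacity:.0f} GB")
```

With the same chips, a 256-bit card lands at 8 chips and 16 GB while a 128-bit card lands at 4 chips and 8 GB, which matches the 8G vs 16G comparison earlier in the thread.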

I can only imagine that the SoIC tech is more difficult/expensive than I assumed. CPUs are a lot easier to connect from an efficiency standpoint. In a GPU, data movement is an order of magnitude or more greater. Check out the bandwidth of the internal connections in a GPU.

Kepler_L2 · Apr 9, 2022

jpiniero said:
Perhaps I should rephrase - people would absolutely buy a used 6900 XT for $500 rather than this. That's the fear.

I'm unconvinced AMD would spend the effort to do the entire RDNA3 IP on N6 if Navi 33 is the only product that will launch with it. AFAIK Phoenix is N5 monolithic.

Phoenix isn't RDNA3 anyway.

jpiniero · Apr 9, 2022

Kepler_L2 said:
Phoenix isn't RDNA3 anyway.

It's not? Strange. What is it, RDNA2?

Frenetic Pony · Apr 9, 2022

maddie said:
I'm curious. What were you thinking when the rumors of a 256-bit bus for N21 first arose?

As to lower costs, some possible BOM reductions in comparison to the 6900XT

Die itself = smaller die.
N6 vs N7
Less die area due to fewer memory controllers
Fewer shaders due to higher clocks and perf/clock

Card.
Half the memory chips needed
Lower-capacity cooling system
Less power circuitry on card
Simpler PCB (x8 PCIe)

I can only imagine that the SoIC tech is more difficult/expensive than I assumed. CPUs are a lot easier to connect from an efficiency standpoint. In a GPU, data movement is an order of magnitude or more greater. Check out the bandwidth of the internal connections in a GPU.

I was thinking "Oh that's cool, they solved something". I get why the LLC works: you store your current frame buffers in there, and that's the most commonly accessed data for graphics in a GPU. You need to read and write to that a ton. But you also need to read out a lot of your main memory data every frame. Developers don't fill those multiple gigs of memory up with nothing, and this is especially true of deferred rendering (which is most engines today). They need to read then copy out uncompressed g-buffers every frame, and while developers are solving this, GPU makers can't assume they have.
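One way to frame how much work the LLC has to do is a first-order model: only cache misses reach DRAM, so DRAM traffic is demand × (1 − hit rate). The demand figure below is a made-up placeholder, and the model ignores write-backs, compression, and burstiness, so it is only a sanity check on the argument.

```python
def dram_traffic_gb_s(demand_gb_s, llc_hit_rate):
    """Bandwidth that still reaches DRAM when a fraction of requests hit the LLC."""
    return demand_gb_s * (1.0 - llc_hit_rate)

def min_hit_rate(demand_gb_s, dram_bw_gb_s):
    """Smallest LLC hit rate that keeps DRAM traffic within the bus limit."""
    if demand_gb_s <= 0:
        raise ValueError("demand must be positive")
    return max(0.0, 1.0 - dram_bw_gb_s / demand_gb_s)

demand = 1024.0  # GB/s requested by the shaders, placeholder value
for bus_bw in (512.0, 256.0):
    print(f"{bus_bw:.0f} GB/s bus needs an LLC hit rate of at least "
          f"{min_hit_rate(demand, bus_bw):.0%}")
```

Under this model, halving the bus means the cache has to absorb a noticeably larger share of traffic, and data that is streamed once per frame is exactly the kind that does not hit.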

LLC and cache structure were, and are, a known quantity. The bottleneck between memory and processor is well understood and has been an optimization target for everything for quite a while now. But there's not much left to do there: everything is compressed, everything is cached up to the eyeballs, and standard pre-fetch has been around for decades now on the GPU. Unlike LLC, which was similar to something first shown off elsewhere, or delta compression, another topic that had papers out before anyone implemented it, there's no publicly known solution to how AMD could squeeze yet more out of their bus that I'm aware of. And all I can come up with is the assumption, possibly a dangerous one, that there are bubbles of bandwidth availability in those buses on all titles. Not just some, not just most, all titles, that could be used to prefetch assets such that the bubbles are smoothed out. That is literally the only physical way this could work, and it's a very overarching assumption.

The cost barely goes down either. Your assumptions are wrong; the rumor directly points to this card being nigh as big as a 6900xt. The only real savings are less RAM, and that doesn't matter. What matters is the point that it would cost less if this card were on N5 and a chiplet-based card. There is no "well it's good enough" here: if AMD can just make more money in a really obvious way, like putting the card on a better node for it, then they are nigh guaranteed to do that.

maddie · Apr 9, 2022

Frenetic Pony said:
I was thinking "Oh that's cool, they solved something". I get why the LLC works: you store your current frame buffers in there, and that's the most commonly accessed data for graphics in a GPU. You need to read and write to that a ton. But you also need to read out a lot of your main memory data every frame. Developers don't fill those multiple gigs of memory up with nothing, and this is especially true of deferred rendering (which is most engines today). They need to read then copy out uncompressed g-buffers every frame, and while developers are solving this, GPU makers can't assume they have.

LLC and cache structure were, and are, a known quantity. The bottleneck between memory and processor is well understood and has been an optimization target for everything for quite a while now. But there's not much left to do there: everything is compressed, everything is cached up to the eyeballs, and standard pre-fetch has been around for decades now on the GPU. Unlike LLC, which was similar to something first shown off elsewhere, or delta compression, another topic that had papers out before anyone implemented it, there's no publicly known solution to how AMD could squeeze yet more out of their bus that I'm aware of. And all I can come up with is the assumption, possibly a dangerous one, that there are bubbles of bandwidth availability in those buses on all titles. Not just some, not just most, all titles, that could be used to prefetch assets such that the bubbles are smoothed out. That is literally the only physical way this could work, and it's a very overarching assumption.

The cost barely goes down either. Your assumptions are wrong; the rumor directly points to this card being nigh as big as a 6900xt. The only real savings are less RAM, and that doesn't matter. What matters is the point that it would cost less if this card were on N5 and a chiplet-based card. There is no "well it's good enough" here: if AMD can just make more money in a really obvious way, like putting the card on a better node for it, then they are nigh guaranteed to do that.

I asked what you thought before we knew its performance on a 256-bit bus, back when it was only a rumor.

News to me. What rumors indicate N33 = 6900XT card size?

8 GB instead of 16 GB doesn't save money?
Less power doesn't save money?
x8 PCIe doesn't save money?
Smaller die doesn't save money?

I can't write what I think.

Kepler_L2 · Apr 9, 2022

jpiniero said:
It's not? Strange. What is it, RDNA2?

Yep

Timorous · Apr 10, 2022

jpiniero said:
That's what I mean. Doing RDNA3 on N6 makes no sense. A shrink of RDNA2 to N6 with some gutting would make more sense (and that doesn't appear to be what they are doing). But either of these isn't going to work if mining collapses.

Supply.

Splitting across nodes means more supply for the chip that will likely be the most popular desktop SKU and used in mobile.

If AMD made all RDNA3 on N5 then it would constrain supply due to Genoa and Ryzen being the better margin products.

Having mainstream N33 on N6 means there is less juggling so AMD can manufacture more units.

xpea · Apr 10, 2022

Timorous said:
Supply.

Splitting across nodes means more supply for the chip that will likely be the most popular desktop SKU and used in mobile.

If AMD made all RDNA3 on N5 then it would constrain supply due to Genoa and Ryzen being the better margin products.

Having mainstream N33 on N6 means there is less juggling so AMD can manufacture more units.

As a reminder, TSMC N7 and N6 use the same equipment and the same production lines.
N5 uses a different line that is shared with N4.

Mopetar · Apr 10, 2022

xpea said:
As a reminder, TSMC N7 and N6 use the same equipment and the same production lines.
N5 uses a different line that is shared with N4.

N6 uses some EUV equipment I believe, but it's otherwise compatible with N7 designs even though it requires new masks. It has fewer production steps so there's greater throughput.

There's no reason to make anything new on N7, since you'd save on mask cost just by moving to N6. However, for anything that's close to the end of production, it's probably cheaper to keep it on N7.
