Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,770
6,719
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (maybe to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
It's really not. And with minimal density gain, it's going to be stupidly expensive. You'd have to have a product where customers would gladly pay for the small power savings.

Power savings is projected to be 25-30% vs N3E by TSMC. Sure that's TSMC optimism numbers, but if that's "small power savings" then what on earth is large power savings?
 

Tigerick

Senior member
Apr 1, 2022
715
667
106
Oh, finally ran across that comment (which I think was removed) for the AMD engineer:

@BrockSuire75

I want to put out something about AMD GPU s. AMD is not leaving the high end market. As some have seen all over the internet about AMD to stop making high end GPU s. This is 1000% false.

I think what the engineer said has two contexts. The first is what we know about RDNA5: AMD is betting heavily on chiplet design with RDNA5; maybe Navi4c will see a successor in Navi5c, with 512-bit GDDR7 memory support if power permits.

AMD has next Gen cards being validating as we speak. I have a few engineering samples I'm evaluating. Keep dreaming, never let anyone stop you!
That was a few months ago, so I doubt he was talking about RDNA5.
The second part actually refers to RDNA4 cards; he was talking about next Gen cards (with s) without mentioning high performance, because, as I speculated, these two cards are low-end to mid-range cards.

As for release timing, we might see the cards announced as soon as CES 2024. According to RGT, AMD is going to tease Zen5 and RDNA4 at CES next year. I know his credibility sucks, but remember that Zen5 and RDNA4 are both manufactured on TSMC's N4P process, and we have already seen the first wave of N4P SoCs coming by the end of the year.

If AMD does indeed release RDNA4 cards at the beginning of 2024, then there is no chance of them using GDDR7, as shown by the Samsung roadmap below:



All I want to say is: Think Polaris, baby....

PS: He also said the cards will perform below N31 and be priced below N32; you will have to judge for yourself what to believe... Remember, AMD won't ditch the N32 cards, which were released just half a year ago....
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
I think what the engineer said has two contexts. The first is what we know about RDNA5: AMD is betting heavily on chiplet design with RDNA5; maybe Navi4c will see a successor in Navi5c, with 512-bit GDDR7 memory support if power permits.



The second part actually refers to RDNA4 cards; he was talking about next Gen cards (with s) without mentioning high performance, because, as I speculated, these two cards are low-end to mid-range cards.

As for release timing, we might see the cards announced as soon as CES 2024. According to RGT, AMD is going to tease Zen5 and RDNA4 at CES next year. I know his credibility sucks, but remember that Zen5 and RDNA4 are both manufactured on TSMC's N4P process, and we have already seen the first wave of N4P SoCs coming by the end of the year.

If AMD does indeed release RDNA4 cards at the beginning of 2024, then there is no chance of them using GDDR7, as shown by the Samsung roadmap below:

View attachment 87964

All I want to say is: Think Polaris, baby....

PS: He also said the cards will perform below N31 and be priced below N32; you will have to judge for yourself what to believe... Remember, AMD won't ditch the N32 cards, which were released just half a year ago....

With chiplets and relatively inexpensive N6 silicon for the base die, AMD can finally add a generous amount of Infinity Cache for greater bandwidth, while keeping power consumption low.

We will find out soon how well the MALL cache performs and what its best uses are, when MI300 is unveiled. Probably within a month.
 

Tigerick

Senior member
Apr 1, 2022
715
667
106
With chiplets and relatively inexpensive N6 silicon for the base die, AMD can finally add a generous amount of Infinity Cache for greater bandwidth, while keeping power consumption low.

We will find out soon how well the MALL cache performs and what its best uses are, when MI300 is unveiled. Probably within a month.
Let's hope RDNA5 with its 2nd-gen chiplet design performs better. Combining multiple MCDs into one base die definitely provides better power efficiency. The same should apply to GCX: AMD should/could combine multiple SEDs into a single GCX die, which saves lots of power rails while simplifying the packaging.

With one GCX, I would estimate a single GCX die to have around 70 CUs and a 150mm2 die size on the N3E process. Each GCX with its base die should connect to 128 bits of GDDR7 memory.

When I was searching for a Navi4C design photo, I was shocked to find a patent figure which explains the above design:-



Yeap, the design of Navi4C was definitely too complicated and power hungry; RDNA5 will most likely use the above design, with bridge chips to communicate between the dies. Too bad we won't see it until 2025, but if AMD gets it right, then AMD will have the upper hand in future GPU design, which is definitely heading towards chiplets...
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
Let's hope RDNA5 with its 2nd-gen chiplet design performs better. Combining multiple MCDs into one base tile definitely provides better power efficiency. The same should apply to GCX: AMD should/could combine multiple SEDs into a single GCX die, which saves lots of power rails while simplifying the packaging.

I was also wondering about that. The only reason for having multiple SEDs stacked on top of a single die (AID) would be if AMD had plans to somehow reuse the SEDs or to stack different numbers of them for segmentation.

Otherwise, it seems it would be cheaper to have a single bigger GCX rather than multiple SEDs.

With one GCX, I would estimate a single GCX die to have around 70 CUs and a 150mm2 die size on the N3E process. Each GCX with its base tile should connect to 128 bits of GDDR7 memory.

When I was searching for a Navi4C design photo, I was shocked to find a patent figure which explains the above design:-

View attachment 87966

Yeap, the design of Navi4C was definitely too complicated and power hungry; RDNA5 will most likely use the above design, with bridge chips to communicate between the tiles. Too bad we won't see it until 2025, but if AMD gets it right, then AMD will have the upper hand in future GPU design, which is definitely heading towards chiplets...

I don't think it is that complicated, once you get over the challenge of bridging 2 chips with a silicon bridge that uses hybrid bonding. I thought the challenge here might be thermal expansion, but if they have that figured out, then it is just a new way to put the Legos together.

With the chiplet approach, with 4 dies, they can assemble a large number of SKUs, while optimizing each piece of silicon for its best use.

Logic scaling is still continuing, so by taking analog and most of the SRAM out of the SED/GCD, AMD should achieve excellent transistor densities, while being able to add a lot of SRAM on the least expensive silicon.

As for being power hungry, all of the hybrid bond connections have extremely low power overhead. Almost like being a single die.

If AMD does add a substantial amount of cache, such as 2x the Navi 2x equivalent, that actually saves power, because the cache hits would eliminate memory accesses, which are a lot more expensive.
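
A rough way to picture the cache argument: average energy per access is the cache lookup cost plus the DRAM cost weighted by the miss rate. This is only a sketch. The function name is hypothetical and the costs (1 unit per cache lookup, 10 units per DRAM access) and hit rates are made-up placeholders, not measurements of any AMD part.

```python
def avg_energy_per_access(hit_rate, e_cache, e_dram):
    """Average energy per access, in arbitrary units.

    Assumes every access pays for a cache lookup and only misses
    additionally pay for a DRAM access.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return e_cache + (1.0 - hit_rate) * e_dram


# Placeholder costs, NOT measurements: cache lookup = 1 unit, DRAM access = 10 units.
small_cache = avg_energy_per_access(0.50, e_cache=1.0, e_dram=10.0)
large_cache = avg_energy_per_access(0.75, e_cache=1.0, e_dram=10.0)
print(small_cache, large_cache)  # 6.0 3.5
```

With these placeholder numbers, raising the hit rate from 50% to 75% cuts the average cost per access from 6.0 to 3.5 units. The real saving depends on the actual hit rates and per-access energies, which the thread does not give.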
 

Tigerick

Senior member
Apr 1, 2022
715
667
106
I was also wondering about that. The only reason for having multiple SEDs stacked on top of a single die (AID) would be if AMD had plans to somehow reuse the SEDs or to stack different numbers of them for segmentation.

Otherwise, it seems it would be cheaper to have a single bigger GCX rather than multiple SEDs.



I don't think it is that complicated, once you get over the challenge of bridging 2 chips with a silicon bridge that uses hybrid bonding. I thought the challenge here might be thermal expansion, but if they have that figured out, then it is just a new way to put the Legos together.

With the chiplet approach, with 4 dies, they can assemble a large number of SKUs, while optimizing each piece of silicon for its best use.

Logic scaling is still continuing, so by taking analog and most of the SRAM out of the SED/GCD, AMD should achieve excellent transistor densities, while being able to add a lot of SRAM on the least expensive silicon.

As for being power hungry, all of the hybrid bond connections have extremely low power overhead. Almost like being a single die.

If AMD does add a substantial amount of cache, such as 2x the Navi 2x equivalent, that actually saves power, because the cache hits would eliminate memory accesses, which are a lot more expensive.
Well, it is definitely more complicated than the new design. Anyhow, with an MI300-like design, I could see AMD working on the successor of Navi4c; let's call it Navi5c for the moment. If AMD decides to make a mega-GPU with 4 base dies, then they might need to add additional bridge chips to connect the other dies, like "X" connectivity between 4 dies... hmm, so each base die would need 2 bridge chips??

As for the amount of cache, if each base die has 32MB of cache, that would require a die size of around 70mm2 on the N6 process. So I am not sure AMD would scale up the amount of cache much???

Anyway, I can see how many technical challenges AMD is facing; that explains the delay into 2025. But the most time-consuming part would be validating the Navi5c design. If AMD manages to pull it off, then AMD should be able to launch Navi5c, N51, N52 and N53 pretty quickly.
 

Tigerick

Senior member
Apr 1, 2022
715
667
106
Unless N32 is just a limited production run.
Unless AMD wants to throw millions of dollars away and then spend millions more designing slower chips. (The monolithic die is only around 250mm2.)

Does it make business sense to you?

Look man, I may not be 100% correct, but I am pretty confident about the signs I have seen. If I am wrong, I am wrong. But if I am right, hehe
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
Well, it is definitely more complicated than the new design. Anyhow, with an MI300-like design, I could see AMD working on the successor of Navi4c; let's call it Navi5c for the moment. If AMD decides to make a mega-GPU with 4 base dies, then they might need to add additional bridge chips to connect the other dies, like "X" connectivity between 4 dies... hmm, so each base die would need 2 bridge chips??

For n base dies, there would be n-1 bridges.

So, if there were 4 base dies, that would mean 3 bridges.

+1 bridge to connect the I/O chip.
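
The counting above can be sketched in a few lines. The function names are hypothetical, and the full-mesh variant is added only for comparison with the "X" connectivity idea; nothing here reflects a confirmed AMD layout.

```python
def chain_bridges(base_dies, io_die=True):
    """Fewest bridges that keep all base dies connected (n - 1), plus one for the I/O die."""
    if base_dies < 1:
        raise ValueError("need at least one base die")
    return (base_dies - 1) + (1 if io_die else 0)


def full_mesh_bridges(base_dies):
    """Bridges needed if every pair of base dies had a direct link: n * (n - 1) / 2."""
    return base_dies * (base_dies - 1) // 2


print(chain_bridges(4, io_die=False))  # 3
print(chain_bridges(4))                # 4
print(full_mesh_bridges(4))            # 6
```

So 4 base dies need 3 bridges just to stay connected, 4 with the I/O chip, while linking every pair directly would take 6.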

As for the amount of cache, if each base die has 32MB of cache, that would require a die size of around 70mm2 on the N6 process. So I am not sure AMD would scale up the amount of cache much???

64 MB of SRAM = 36 mm2 (for Ryzen chips with V-cache)

So in theory, there would be room for a lot of SRAM on the base die.

But curiously, the size of the MALL cache AMD has used (or is rumored to be using in future products) is very low. That includes MI300, which may have > 1,000 mm2 of silicon on the base dies.

I don't have an explanation for why it is low...

Anyway, I can see how many technical challenges AMD is facing; that explains the delay into 2025. But the most time-consuming part would be validating the Navi5c design. If AMD manages to pull it off, then AMD should be able to launch Navi5c, N51, N52 and N53 pretty quickly.

I wonder if the costs by that point are low enough for all of these cards to be chiplet based, from low end to high end.

I think the odds are better than 50:50 in favor.
 

Tigerick

Senior member
Apr 1, 2022
715
667
106
For n base dies, there would be n-1 bridges.

So, if there were 4 base dies, that would mean 3 bridges.

+1 bridge to connect the I/O chip.

Yeah, didn't know we should count like that...


64 MB of SRAM = 36 mm2 (for Ryzen chips with V-cache)

So in theory, there would be room for a lot of SRAM on the base die.

But curiously, the size of the MALL cache AMD has used (or is rumored to be using in future products) is very low. That includes MI300, which may have > 1,000 mm2 of silicon on the base dies.

I don't have an explanation for why it is low...

Ryzen's V-cache consists of cache only. You need to count cache + GDDR7 memory controllers; that's why I am using the die size of N32's MCD. Each MCD consists of 16MB of IC and a 64-bit GDDR6 memory controller, with a die size of 37 mm2. Hence my calculation of 70mm2, provided that the GDDR7 memory controller won't be much larger....
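
For reference, the estimate above scales the N32 MCD figures quoted in this post (37 mm2, 16MB, 64-bit) linearly. The function name is hypothetical, and the sketch assumes a GDDR7 controller takes about the same area as the GDDR6 one, which is the post's own caveat.

```python
MCD_AREA_MM2 = 37   # N32 MCD die size quoted in the post
MCD_CACHE_MB = 16   # Infinity Cache per MCD
MCD_BUS_BITS = 64   # memory bus width per MCD


def base_die_estimate(cache_mb):
    """Scale MCD area and bus width linearly with the requested cache size."""
    mcds = cache_mb / MCD_CACHE_MB
    return mcds * MCD_AREA_MM2, int(mcds * MCD_BUS_BITS)


print(base_die_estimate(32))  # (74.0, 128)
```

Two MCDs' worth of silicon gives 74 mm2 and a 128-bit bus, which is where the "around 70mm2" figure comes from.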

I wonder if the costs by that point are low enough for all of these cards to be chiplet based, from low end to high end.

I think the odds are better than 50:50 in favor.

I have actually made a speculation table with upcoming RDNA4 & RDNA5 cards with price estimates:-


The total package of GCX + base + bridges won't be cheap, especially with GDDR7 memory. That's why RDNA5 GPUs mostly cover $500 and up. I have included the potential price of Navi5c; it should be in the range of the upcoming RTX5090 ...

As I mentioned before, RDNA4 GPUs are most likely to land in the sub-$400 market segment. And they are going to live through 2026, an approximately two-year shelf life. As with N32; they are going to be replaced by N53 two years later...

N33 will live longer than 2 years, because I am expecting the 7600's price to drop below $200

N31 may have to drop prices sooner than we think, want to guess how much???
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
Ryzen's V-cache consists of cache only. You need to count cache + GDDR7 memory controllers; that's why I am using the die size of N32's MCD. Each MCD consists of 16MB of IC and a 64-bit GDDR6 memory controller, with a die size of 37 mm2. Hence my calculation of 70mm2, provided that the GDDR7 memory controller won't be much larger....

In the N32 MCD, we don't know what percentage of the die is memory controller and what percentage is SRAM.

But we do for V-Cache: 100% SRAM. So if 64 MB is 36 mm2, the 16 MB in that MCD is 9 mm2. The rest (28mm2) of the 37mm2 is memory controller + some other overhead.

As for the base die of Navi 4c or a possible Navi 5c, the die size of the AID is probably in the 100 to 150 mm2 range.

So, if it has the equivalent of 2 MCDs, that would be 2 x 28mm2 = 56 mm2 for memory controllers and 44 to 94 mm2 for SRAM. So, theoretically, there could be 64MB to 128MB per AID.
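
The arithmetic above can be checked with a short sketch. The names are hypothetical, and it assumes the V-Cache density (64 MB in 36 mm2) carries over unchanged to the MCD and the AID, ignoring any process or layout differences.

```python
VCACHE_MB = 64
VCACHE_AREA_MM2 = 36.0
MM2_PER_MB = VCACHE_AREA_MM2 / VCACHE_MB  # 0.5625 mm2 per MB, assumed constant


def mcd_split(mcd_area=37.0, mcd_cache_mb=16):
    """Split an MCD into (SRAM area, memory controller + overhead area) in mm2."""
    sram = mcd_cache_mb * MM2_PER_MB
    return sram, mcd_area - sram


def sram_capacity_mb(aid_area, mcd_equivalents=2):
    """MB of SRAM that fit in an AID after reserving controller area for N MCD equivalents."""
    _, overhead = mcd_split()
    return (aid_area - mcd_equivalents * overhead) / MM2_PER_MB


print(mcd_split())                      # (9.0, 28.0)
print(round(sram_capacity_mb(100), 1))  # 78.2
print(round(sram_capacity_mb(150), 1))  # 167.1
```

Under these assumptions the 44 to 94 mm2 left over would hold roughly 78 to 167 MB, so the 64MB to 128MB range is on the conservative side.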

IIRC, MI300 has 64 MB of SRAM per AID - which is something I am puzzled about: why is there not a lot more in such a large AID (300+ mm2)?

I have actually made a speculation table with upcoming RDNA4 & RDNA5 cards with price estimates:-
View attachment 87995

The total package of GCX + base + bridges won't be cheap, especially with GDDR7 memory. That's why RDNA5 GPUs mostly cover $500 and up. I have included the potential price of Navi5c; it should be in the range of the upcoming RTX5090 ...

As I mentioned before, RDNA4 GPUs are most likely to land in the sub-$400 market segment. And they are going to live through 2026, an approximately two-year shelf life. As with N32; they are going to be replaced by N53 two years later...

N33 will live longer than 2 years, because I am expecting the 7600's price to drop below $200

N31 may have to drop prices sooner than we think, want to guess how much???

While Navi 4c was the only model to be using chiplets connected with hybrid bonding, I think with the Navi 5x generation it will proliferate to the entire line-up, except that maybe a single low-end model, the equivalent of Navi 33, could be monolithic. Or it may not even be worth fielding a monolithic GPU, with Strix Halo covering the same market.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
Very high bandwidth fabric and all the shoreline PHYs aren't free, bossman.

Not too long until December 6 - AMD AI day, which is also Saint Nicholas Day - when AMD will let us see what is inside the stockings - inside the AID die.

I am guessing there are going to be more PCIe lanes, maybe distributed among the 4 AIDs. But still, a lot of die area is unaccounted for in a 300+ mm2 AID.
 

adroc_thurston

Diamond Member
Jul 2, 2023
5,236
7,312
96
Not too long until December 6 - AMD AI day, which is also Saint Nicholas Day - when AMD will let us see what is inside the stockings - inside the AID die.
They hate dropping any relevant technical details outside of ISSCC so more waiting inbound.
kek.
I am guessing there are going to be more PCIe lanes, maybe distributed among the 4 AIDs. But still, a lot of die area is unaccounted for in a 300+ mm2 AID.
It's 144 HSIO lanes total for 4 AID iirc.
 