Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,770
6,719
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to land support in LLVM and amdgpu. Since RDNA2, though, they have shortened the window in which they push support for new devices, to prevent leaks.
Still, the flurry of LLVM code amounts to a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before changes get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper's problem was that no host CPU capable of PCIe 5 would be available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

Ajay

Lifer
Jan 8, 2001
16,094
8,111
136
I'm glad AMD finally kicked their (department) head out, they've really needed someone that knows how to compete with Nvidia in mindshare terms and that guy clearly could not.
Well, they need a good halo product each generation. It grabs headlines. They were shooting for that with Big RDNA4; unfortunately, they got behind the eight ball on that.

Oh, I finally ran across that comment (which I think was removed) from the AMD engineer:

@BrockSuire75

I want to put out something about AMD GPU s. AMD is not leaving the high end market. As some have seen all over the internet about AMD to stop making high end GPU s. This is 1000% false. AMD has next Gen cards being validating as we speak. I have a few engineering samples I'm evaluating. Keep dreaming, never let anyone stop you!

That was a few months ago, so I doubt he was talking about RDNA5.
 
Reactions: Tlh97

branch_suggestion

Senior member
Aug 4, 2023
610
1,323
96
Thanks for presenting Nvidia's superiority in a thread mostly about RDNA4.
I really appreciate your and the others' effort in spamming this thread with pretty much unrelated stuff.
NV astroturfers already forced the closure of the B3D HW forum. There is basically nowhere that AMD GPUs are discussed that isn't eventually harassed by the usual suspects heralding how RTG can never compete.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,600
3,160
136
Anyway, ticking it over in my head, CU counts are probably
40CU/128bit
60CU/192bit
80CU/256bit
120CU/384bit

160 is too big and 100 is too odd, and this setup matches CU to bandwidth perfectly.
You forgot about Infinity Cache, which would skew that CU/BW ratio heavily.

Effective Infinity Cache 2 bandwidth amplification:
96MB: 2304B/clock * 1.88GHz = 4332GB/s * 0.53 = 2296GB/s
76MB: 1824B/clock * 1.88GHz = 3429GB/s * 0.46 = 1577GB/s
64MB: 1536B/clock * 1.88GHz = 2888GB/s * 0.42 = 1213GB/s
48MB: 1152B/clock * 1.88GHz = 2166GB/s * 0.35 = 758GB/s
32MB: 768B/clock * 1.88GHz = 1444GB/s * 0.27 = 390GB/s
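
The arithmetic in the list above can be reproduced with a short script. This is only a sketch: the per-clock widths, the 1.88GHz clock and the multipliers are copied from the post, while the function names and the reading of the multiplier as a cache hit rate are assumptions the post does not confirm. Each per-clock width in the list works out to 24B/clock per MB of cache.

```python
# Figures copied from the post: cache size (MB) -> (bytes per clock, multiplier).
# Treating the multiplier as a hit rate is an assumption; the post does not label it.
IC_CONFIGS = {
    96: (2304, 0.53),
    76: (1824, 0.46),
    64: (1536, 0.42),
    48: (1152, 0.35),
    32: (768, 0.27),
}
IC_CLOCK_GHZ = 1.88  # clock used in the post


def peak_bandwidth_gbs(bytes_per_clock: int, clock_ghz: float = IC_CLOCK_GHZ) -> float:
    """Peak cache bandwidth in GB/s: bytes per clock times clock in GHz."""
    return bytes_per_clock * clock_ghz


def effective_bandwidth_gbs(bytes_per_clock: int, multiplier: float,
                            clock_ghz: float = IC_CLOCK_GHZ) -> float:
    """Peak bandwidth scaled by the multiplier from the post."""
    return peak_bandwidth_gbs(bytes_per_clock, clock_ghz) * multiplier


if __name__ == "__main__":
    for size_mb, (bpc, mult) in IC_CONFIGS.items():
        peak = peak_bandwidth_gbs(bpc)
        eff = effective_bandwidth_gbs(bpc, mult)
        print(f"{size_mb}MB: peak {peak:.0f}GB/s, effective {eff:.0f}GB/s")
```

Rounded to whole GB/s, this gives the same peak and effective figures as the list.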

If you wanted almost linear BW increase from IC, then you would need:
120CU -> 96MB
80CU -> 76MB
60CU -> 64MB
40CU -> 48MB
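
A quick way to check the "almost linear" claim is to divide the effective bandwidth figures quoted earlier in this post by the CU count each cache size is paired with. The sketch below does only that; the dictionary and function names are made up for illustration.

```python
# CU count -> effective Infinity Cache bandwidth (GB/s), using the pairings
# and the effective bandwidth figures given in the post.
PAIRINGS = {
    120: 2296,  # 96MB
    80: 1577,   # 76MB
    60: 1213,   # 64MB
    40: 758,    # 48MB
}


def bandwidth_per_cu(pairings: dict) -> dict:
    """Effective bandwidth available per CU for each configuration."""
    return {cu: bw / cu for cu, bw in pairings.items()}


def spread(values) -> float:
    """Ratio between the best-fed and worst-fed configuration."""
    values = list(values)
    return max(values) / min(values)


if __name__ == "__main__":
    per_cu = bandwidth_per_cu(PAIRINGS)
    for cu, bw in per_cu.items():
        print(f"{cu}CU: {bw:.1f}GB/s per CU")
    print(f"max/min spread: {spread(per_cu.values()):.3f}")
```

Every configuration lands between roughly 19 and 20GB/s per CU, so the pairing is close to linear but not exact.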

Then the next question is the clockspeed for those GPUs.
Even if you kept the clockspeed at RDNA3 level, you would need faster VRAM for everything, so GDDR6 is out of the question.
And I would like to see 24Gb modules being used.
 
Reactions: Tlh97 and Joe NYC

Kepler_L2

Senior member
Sep 6, 2020
764
3,087
136
MI300 is coming out too late.

Particularly if you look at Nvidia latest roadmap.


I suspect most of the MI300 supply this year is going to El Capitan.

Meaning most MI300 sales will happen in 2024.

This is going to be compared against Blackwell, which is being unveiled in March of 2024. Nvidia going to a yearly release cadence for datacenter gives them the ability to use the fastest memory and manufacturing tech, giving AMD brutal competition. Rumors point to Blackwell selling in high volume in the 2nd half of 2024.

This has other consequences for AMD because this type of volume and money may cause AMD to have supply issues with TSMC.

With Nvidia likely being a 100 billion revenue company next year, with net profit in the 40 to 50 billion range (a single quarter being greater than AMD's total profit over the last decade), they will have the grunt to wipe out AMD's AI data center plans through strangulation of the supply chain, developer support, and accelerated roadmaps.

Hopper is sold out through 2024, which translates into about 80 billion dollars (2 million units at 40k each). Add in other Nvidia revenue like Blackwell and gaming and it is simply a monstrous amount of revenue. This gives Nvidia the financial horsepower to produce a 3nm data center chip in 2024, which is going to be compared against the 5nm MI300.

By the time AMD gets to 3nm, Nvidia will be on 2nm. AMD is losing its position at TSMC.


This will have consequences for the rest of AMD's product roadmaps as Intel uses TSMC more and AMD loses clout at TSMC. AMD has a hard choice ahead: which division is going to be sacrificed in order for the rest of the products to succeed. I think consumer graphics is going to be the item that gets heavy cuts again, like in the past. The cancellation of Navi 41 was a prelude to this, I think.

AMD needs to be more forward-thinking. Nvidia dominated supercomputers in the past and still has 5 of the top 10 supercomputers in the world.

https://www.top500.org/lists/top500/list/2023/06/

Look at the rest of the list and it is still dominated by Nvidia. Nvidia has just moved on to bigger and better things. A couple of supercomputers from the US Government worth 1.2 billion dollars every 5 years is decent money for AMD... but for Nvidia that's soon to be the weekly sales of H100, with better margins to boot.
Cool story but AMD is the first N3E and N2 customer.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,600
3,160
136
As a continuation of my previous post, I would like to see these specs for the GPUs (N44, N43, N42, N41).
It's just speculation, and I calculated everything as chiplets.
An MCD is now 12MB + 32-bit GDDR7 paired with a 24Gb (3GB) module, just so I can make easy cut-down chips.
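
The MCD building block described above makes every memory configuration a simple multiple of the MCD count. A minimal sketch follows, assuming one 24Gb (3GB) module per MCD, which is a reading of the post rather than something it states outright; the class and constant names are hypothetical. Raw bandwidth uses the usual bus width x per-pin rate / 8 formula.

```python
from dataclasses import dataclass

MCD_CACHE_MB = 12   # Infinity Cache per MCD, from the post
MCD_BUS_BITS = 32   # GDDR7 interface width per MCD, from the post
MODULE_GB = 3       # one 24Gb module per MCD (assumed reading of the post)


@dataclass(frozen=True)
class MemoryConfig:
    mcd_count: int
    speed_gbps: float  # per-pin data rate

    @property
    def bus_width_bits(self) -> int:
        return self.mcd_count * MCD_BUS_BITS

    @property
    def infinity_cache_mb(self) -> int:
        return self.mcd_count * MCD_CACHE_MB

    @property
    def vram_gb(self) -> int:
        return self.mcd_count * MODULE_GB

    @property
    def bandwidth_gbs(self) -> float:
        # bits per transfer * transfers per second / 8 bits per byte
        return self.bus_width_bits * self.speed_gbps / 8


if __name__ == "__main__":
    for mcds, speed in [(3, 30), (4, 27), (8, 32), (12, 32)]:
        cfg = MemoryConfig(mcds, speed)
        print(f"{mcds} MCD: {cfg.bus_width_bits}-bit, {cfg.infinity_cache_mb}MB IC, "
              f"{cfg.vram_gb}GB, {cfg.bandwidth_gbs:.0f}GB/s")
```

Three MCDs at 30Gbps, for example, give 96-bit, 36MB, 9GB and 360GB/s of raw bandwidth, and twelve MCDs at 32Gbps give 384-bit, 144MB, 36GB and 1536GB/s.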

GPU | Shader Engines | CU (WGP) | ROPs | Frequency | Infinity Cache | Memory width | Memory speed | VRAM | TBP | Performance 4K (TPU)
RX 7600 | 2 | 32 (16) | 64 | 2655 MHz | 32 MB | 128-bit GDDR6 | 18 Gbps | 8 GB | 165 W | 100%
RX 8600 | 2 (N44) | 36 (18) | 72 | 3000 MHz | 36 MB | 96-bit GDDR7 | 30 Gbps | 9 GB | 125 W | ~127%
RX 8600XT | 2 (N44) | 40 (20) | 80 | 3200 MHz | 48 MB | 128-bit GDDR7 | 27 Gbps | 12 GB | 150 W | ~151%
RX 7700XT | 3 | 54 (27) | 96 | 2544 MHz | 48 MB | 192-bit GDDR6 | 18 Gbps | 12 GB | 245 W | 169%
RX 7800XT | 3 | 60 (30) | 96 | 2430 MHz | 64 MB | 256-bit GDDR6 | 19.5 Gbps | 16 GB | 263 W | 204%
RX 8700 | 3 (N43) | 54 (27) | 108 | 3000 MHz | 60 MB | 160-bit GDDR7 | 32 Gbps | 15 GB | 190 W | ~227%
RX 8700XT | 3 (N43) | 60 (30) | 120 | 3200 MHz | 72 MB | 192-bit GDDR7 | 32 Gbps | 18 GB | 225 W | ~269%
RX 7900XT | 6 | 84 (42) | 192 | 2400 MHz | 80 MB | 320-bit GDDR6 | 20 Gbps | 20 GB | 315 W | 261%
RX 7900XTX | 6 | 96 (48) | 192 | 2500 MHz | 96 MB | 384-bit GDDR6 | 20 Gbps | 24 GB | 355 W | 311%
RX 8800 | 4 (N42) | 72 (36) | 144 | 3000 MHz | 84 MB | 224-bit GDDR7 | 31 Gbps | 21 GB | 250 W | ~280%
RX 8800XT | 4 (N42) | 80 (40) | 160 | 3200 MHz | 96 MB | 256-bit GDDR7 | 32 Gbps | 24 GB | 300 W | ~332%
RX 8900 | 6 (N41) | 108 (54) | 216 | 3000 MHz | 120 MB | 320-bit GDDR7 | 31 Gbps | 30 GB | 375 W | ~420%
RX 8900XT | 6 (N41) | 120 (60) | 240 | 3200 MHz | 144 MB | 384-bit GDDR7 | 32 Gbps | 36 GB | 450 W | ~498%
Yeah, I know the combined memory + IC bandwidth is low in some cases, and the performance estimate is based only on the FLOPS increase over the previous generation, so in real life it would be less.
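
The performance column can be reproduced by scaling an RDNA3 card's 4K score by the ratio of CU count x clock. The sketch below assumes FLOPS per CU per clock is unchanged between generations, and the choice of baseline card for each speculated part is inferred from which pairing matches the posted numbers; the post does not state either.

```python
# RDNA3 reference cards from the table: (CUs, clock in MHz, 4K performance in %).
BASELINES = {
    "RX 7600": (32, 2655, 100),
    "RX 7800XT": (60, 2430, 204),
    "RX 7900XT": (84, 2400, 261),
    "RX 7900XTX": (96, 2500, 311),
}

# Speculated card -> (CUs, clock in MHz, baseline card).
# The baseline pairing is an inference, not something the post spells out.
SPECULATED = {
    "RX 8600": (36, 3000, "RX 7600"),
    "RX 8600XT": (40, 3200, "RX 7600"),
    "RX 8700": (54, 3000, "RX 7800XT"),
    "RX 8700XT": (60, 3200, "RX 7800XT"),
    "RX 8800": (72, 3000, "RX 7900XT"),
    "RX 8800XT": (80, 3200, "RX 7900XTX"),
    "RX 8900": (108, 3000, "RX 7900XTX"),
    "RX 8900XT": (120, 3200, "RX 7900XTX"),
}


def estimate_performance(cus: int, clock_mhz: int, baseline: str) -> float:
    """Scale the baseline's 4K score by the CU x clock (FLOPS) ratio."""
    base_cus, base_clock, base_perf = BASELINES[baseline]
    return base_perf * (cus * clock_mhz) / (base_cus * base_clock)


if __name__ == "__main__":
    for name, (cus, clock, baseline) in SPECULATED.items():
        est = estimate_performance(cus, clock, baseline)
        print(f"{name}: ~{est:.0f}% (scaled from {baseline})")
```

Because this is pure FLOPS scaling, it ignores bandwidth limits and any per-CU architectural change, which is why real results would likely come in lower.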
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
This has other consequences for AMD because this type of volume and money may cause AMD to have supply issues with TSMC.

With Nvidia likely being a 100 billion revenue company next year, with net profit in the 40 to 50 billion range (a single quarter being greater than AMD's total profit over the last decade), they will have the grunt to wipe out AMD's AI data center plans through strangulation of the supply chain, developer support, and accelerated roadmaps.

Of all the possible challenges AMD might face in taking on NVidia in datacenter GPUs, strangulation of the supply chain is not one that suppliers will let NVidia fool them into.

The major capacity constraint now is CoWoS, but the supply is likely going to outpace the AI hype by 2025, and by that time AMD will likely not even be using CoWoS for MI400.
 
Reactions: moinmoin

jpiniero

Lifer
Oct 1, 2010
15,904
6,391
136
As a continuation of my previous post, I would like to see these specs for the GPUs (N44, N43, N42, N41).
It's just speculation, and I calculated everything as chiplets.
An MCD is now 12MB + 32-bit GDDR7 paired with a 24Gb (3GB) module, just so I can make easy cut-down chips.

Yeah but the chiplet is dead. At least for RDNA4.

I was gonna say the faster one would be 40 CUs, 12 GB of 128-bit GDDR7, perhaps 10% slower than the 7700 XT, for $399. Even that should be very competitive with Blackwell in raster.

AMD could release new N31 before then with faster GDDR6 if it exists and call that high end.

IMO, where Blackwell will get faster is (well over) a grand.
 

Timorous

Golden Member
Oct 27, 2008
1,883
3,616
136
Yeah but the chiplet is dead. At least for RDNA4.

I was gonna say the faster one would be 40 CUs, 12 GB of 128-bit GDDR7, perhaps 10% slower than the 7700 XT, for $399. Even that should be very competitive with Blackwell in raster.

AMD could release new N31 before then with faster GDDR6 if it exists and call that high end.

IMO, where Blackwell will get faster is (well over) a grand.

Allegedly and it makes zero sense.

Dropping the MI300-style monster halo part, sure, I get that, but I don't see that approach being cheap enough to be useful in the typical high and upper mid-range part of the market. I also don't see monolithic being cheap enough to make dies bigger than 250mm² or so viable either, not for AMD at least, who can't sell them in professional products like NV can, especially with how poorly cache and IO shrink.

So that still leaves a part of the market that would be best served by chiplets.
 
Reactions: Tlh97 and Joe NYC

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
Allegedly and it makes zero sense.

Dropping the MI300-style monster halo part, sure, I get that, but I don't see that approach being cheap enough to be useful in the typical high and upper mid-range part of the market. I also don't see monolithic being cheap enough to make dies bigger than 250mm² or so viable either, not for AMD at least, who can't sell them in professional products like NV can, especially with how poorly cache and IO shrink.

So that still leaves a part of the market that would be best served by chiplets.

The pressure to go with chiplets is only going to grow. RDNA5 will likely be at least N3E, so the difference between the cost of the compute silicon on N3E and I/O + cache on N6 is going to grow further.

BTW, I wonder what the process node was supposed to be in the cancelled Navi 4c, whether by any chance it was N3B, and whether that might have been another reason to cancel it.
 
Reactions: Tlh97

branch_suggestion

Senior member
Aug 4, 2023
610
1,323
96
The pressure to go with chiplets is only going to grow. RDNA5 will likely be at least N3E, so the difference between the cost of the compute silicon on N3E and I/O + cache on N6 is going to grow further.

BTW, I wonder what the process node was supposed to be in the cancelled Navi 4c, whether by any chance it was N3B, and whether that might have been another reason to cancel it.
RDNA5 is N3P or N2, maybe a mix.
RDNA4 is likely the same deal as Zen5, originally N3B, backported to N4P.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
RDNA5 is N3P or N2, maybe a mix.

I guess it depends on the release date.

But if it follows the same approach as Navi 4c, with only a small part of the overall GPU being on the advanced node, the cost of that node is less of an obstacle.

I wonder what the next version of Strix Halo brings. Maybe a similar approach to Navi 4c, splitting the large SoC on an advanced node (N3E?) into an AID on N6 + SED on the advanced node.

RDNA4 is likely the same deal as Zen5, originally N3B, backported to N4P.

I wonder if that (having to backport to N4P) wasn't one of the contributing factors to the cancellation of Navi 4c.
 

MoogleW

Member
May 1, 2022
95
44
61
Considering the timelines of RDNA5 apparently moved over, could it be possible RDNA5 is backported from an original N2 design to N3E? I don't think the RDNA5 we will get is the same RDNA5 that would have existed in the future.

I would be interested to know if the core design is intact or if some changes will be quietly postponed to RDNA6 so as not to have too many new changes in a short time frame.
 

MoogleW

Member
May 1, 2022
95
44
61
I guess it depends on the release date.

But if it follows the same approach as Navi 4c, with only a small part of the overall GPU being on the advanced node, the cost of that node is less of an obstacle.

I wonder what the next version of Strix Halo brings. Maybe a similar approach to Navi 4c, splitting the large SoC on an advanced node (N3E?) into an AID on N6 + SED on the advanced node.



I wonder if that (having to backport to N4P) wasn't one of the contributing factors to the cancellation of Navi 4c.
Surely not, the available process should be among the first things IHVs factor into an architecture. This influences design rules, target efficiency, etc., and these are booked in advance.
 
Reactions: Tlh97 and Joe NYC

moinmoin

Diamond Member
Jun 1, 2017
5,193
8,330
136
Surely not, the available process should be among the first things IHVs factor into an architecture. This influences design rules, target efficiency, etc., and these are booked in advance.
While true, AMD is likely used to targeting multiple possible nodes at once. Remember that Zen 2 was originally intended to use GloFo's later-cancelled 7nm node before it was eventually revealed that it would use TSMC's N7, seemingly without any delay.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
N3B is awful, so I doubt RDNA4 was ever intended for it, there's a reason Apple is the only major customer for it and N3E was rushed out asap. N3E isn't terribly better, but just better enough to be worth it for the most competitive markets. It's not until N2 that TSMC will get back on track with something like a worthwhile advance.

It'd be more interesting to know how much AMD is looking at Samsung, or even (gasp) Intel! Intel right now is advancing rapidly in silicon foundry tech, not so much in design. Meteor Lake doesn't appear particularly competitive still with AMD, and this is the first one that Pat's been involved with in any meaningful way. If AMD can muscle Intel out of silicon design but use their foundry tech that would be the most optimal outcome for them.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
Considering the timelines of RDNA5 apparently moved over, could it be possible RDNA5 is backported from an original N2 design to N3E? I don't think the RDNA5 we will get is the same RDNA5 that would have existed in the future.

I would be interested to know if the core design is intact or if some changes will be quietly postponed to RDNA6 so as not to have too many new changes in a short time frame.

We don't know if and how much RDNA5 is going to be pulled in.

Maybe only RDNA4 is going to have earlier availability. Marketing would have forced AMD to hold back RDNA4 until the top SKU was ready and released. Since the top SKU was the problematic one holding RDNA4 back, cancelling it may have moved up the schedule of the other RDNA4 parts.
 
Reactions: Tlh97

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
N3B is awful, so I doubt RDNA4 was ever intended for it, there's a reason Apple is the only major customer for it and N3E was rushed out asap. N3E isn't terribly better, but just better enough to be worth it for the most competitive markets. It's not until N2 that TSMC will get back on track with something like a worthwhile advance.

It'd be more interesting to know how much AMD is looking at Samsung, or even (gasp) Intel! Intel right now is advancing rapidly in silicon foundry tech, not so much in design. Meteor Lake doesn't appear particularly competitive still with AMD, and this is the first one that Pat's been involved with in any meaningful way. If AMD can muscle Intel out of silicon design but use their foundry tech that would be the most optimal outcome for them.
Intel would have to completely divest itself of any control of its foundry (plus a couple of years of good behavior) for AMD, and any other major customers, to trust this foundry.

Samsung seems to be behind in hybrid bond packaging, but needs to catch up in a hurry in order to be able to make HBM4.

So unless / until Samsung can do 3D stacking of chiplets, Samsung would only be an option for monolithic packages...

As far as TSMC goes, as long as AMD is on par or ahead in design and continues to be ahead in chiplets, there is not much of a reason for AMD to consider an alternative.

AMD is moving to the top of the line with the #1 foundry and has a strategic relationship with TSMC. No reason for AMD to ruin this.
 
Reactions: MangoX and Tlh97

Ajay

Lifer
Jan 8, 2001
16,094
8,111
136
We don't know if and how much RDNA5 is going to be pulled in.
I suppose it's possible that the quick cancellation of Big RDNA4 allowed AMD to move more engineers onto the RDNA5 team. Still, it's really hard to pull in timelines by adding more engineers; it depends on when they are injected into the development process. The later they start, the smaller their impact. The best ways to speed up projects when I was in software were for us to work more hours (oh well) or cut out some features (even if they were half done).
 
Reactions: Tlh97 and Joe NYC

jpiniero

Lifer
Oct 1, 2010
15,904
6,391
136
N3B is awful, so I doubt RDNA4 was ever intended for it, there's a reason Apple is the only major customer for it and N3E was rushed out asap. N3E isn't terribly better, but just better enough to be worth it for the most competitive markets. It's not until N2 that TSMC will get back on track with something like a worthwhile advance.

N2 has a small quality increase but the density gain is minimal. It's looking like another N3B. AMD might be the only customer of it (and for Turin Dense only)
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
N2 has a small quality increase but the density gain is minimal. It's looking like another N3B. AMD might be the only customer of it (and for Turin Dense only)

Density appears to be done for either way, but unlike N3 (at least N3 vs N4P) the power usage at the same frequencies is projected to drop a whole lot. For most parts today that's more than good enough. Whether it's a consumer part, likely in a mobile device, or an enterprise/cloud part that's just going to go into some ridiculous chiplet package and substrate size anyway, nearly the entire spectrum can make good use of better power efficiency.
 

Orodruin

Member
Sep 30, 2020
37
5
71
When will RDNA 4 be released? Will it come in 2025?

Frankly, I think it will fall behind Nvidia again, because AMD urgently needs to make some changes and adjustments on the software side. It produces very powerful cards, but it always lags behind Nvidia in software. Under normal circumstances, the RX 7000 series should be on par with Nvidia's RTX 4000 series with technologies such as RTX or DLSS turned on. But somehow it always falls behind.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,909
4,279
106
When will RDNA 4 be released? Will it come in 2025?

Frankly, I think it will fall behind Nvidia again, because AMD urgently needs to make some changes and adjustments on the software side. It produces very powerful cards, but it always lags behind Nvidia in software. Under normal circumstances, the RX 7000 series should be on par with Nvidia's RTX 4000 series with technologies such as RTX or DLSS turned on. But somehow it always falls behind.

RDNA4 should definitely be released in 2024.
 
Reactions: Tlh97 and Kepler_L2

jpiniero

Lifer
Oct 1, 2010
15,904
6,391
136
Density appears to be done for either way, but unlike N3 (at least N3 vs N4P) the power usage at the same frequencies is projected to drop a whole lot.

It's really not. And with minimal density gain, it's going to be stupidly expensive. You'd have to have a product where customers would gladly pay for the small power savings.
 