Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,755
6,635
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around three quarters to land support in LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, that is a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

Josh128

Senior member
Oct 14, 2022
630
1,030
106
96 rdna3 cu's / 64 rdna4 cu's - clocks improvement = "nearly 50%".
Just simple, Annalena B. 360° math.
lol, just as I said. +42% vs 80 CU 7900GRE over a comparison of 30+ games with almost half of those comprised of RT comparisons, which give an additional ~20%-40% boost over straight raster. And that is with ~38% higher boost clocks.
 

Kepler_L2

Senior member
Sep 6, 2020
733
2,933
136
It's become more useful for comparisons than die size when using different process nodes, as node related transistor cost reductions have stagnated.

So designs with a lot more transistors usually rise in cost.
You pay for wafers not transistors. The fact that N48 is like 2x the transistor density of B580 doesn't change the price.
 

Jan Olšan

Senior member
Jan 12, 2017
489
924
136
Whose numbers did you base it on?
When I first saw the numbers I was trying to "plot" it against the 5070 Ti based on TPU's and ComputerBase's graphs from 5070 Ti reviews, but the 9070 XT was tracking below the 5070 Ti.

Thanks for doing the math btw.

(Edit: I have just retried it against TPU's review of the Palit card, which has the lowest OC, and it still tells me the 9070 XT will be below the 5070 Ti on average across the games TPU tested. I didn't want to put in more sources, as that would make it even less reliable.)
 

gaav87

Senior member
Apr 27, 2024
548
957
96
Whose numbers did you base it on?
When I first saw the numbers I was trying to "plot" it against the 5070 Ti based on TPU's and ComputerBase's graphs from 5070 Ti reviews, but the 9070 XT was tracking below the 5070 Ti.

Thanks for doing the math btw.
All from TPU's review of the 5070 Ti, or selected games.
RT: 5 games from TPU, 1 game from HUB; I can't be bothered to search for the rest.
 

gdansk

Diamond Member
Feb 8, 2011
3,865
6,208
136
It's become more useful for comparisons than die size when using different process nodes, as node related transistor cost reductions have stagnated.

So designs with a lot more transistors usually rise in cost.
Zen 5 is many more transistors but same size. Cost per transistor decreased. It's plausible RDNA4 is DTCO cache maxxing in a similar fashion.

But I still think it is 380mm²+.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,704
6,175
136
Zen 5 is many more transistors but same size. Cost per transistor decreased. It's plausible RDNA4 is DTCO cache maxxing in a similar fashion.

But I still think it is 380mm²+.

There are things you can do to pack in more density and make somewhat better use of a node. But there isn't an endless supply of these, so you are still faced with the flattening of the old curve.

380mm²+ is still a big die on an expensive node.
 

GTracing

Senior member
Aug 6, 2021
346
757
106
It's become more useful for comparisons than die size when using different process nodes, as node related transistor cost reductions have stagnated.

So designs with a lot more transistors usually rise in cost.
Transistor count is a terrible metric for comparing designs.

First off, transistor counts can vary depending on how they're measured. Unless we know that two dies are measured the same way, the count could be off.

Secondly, chips can have varying levels of transistor density, even on the same node.

An extreme example of this would be Zen5 VS Zen5c. They have the same transistor count, but the die area is way different.

A more applicable example for graphics cards would be how B580 is way less dense than AMD or Nvidia GPUs. The B580 has less than 40% as many transistors as the 9070 XT, but the die is ~70-77% as big.

Transistor count should never be used to compare chips imo. It has too much margin for error.
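
To put a number on the B580 example, here is a quick sketch that uses only the ratios quoted above (under 40% of the transistors, ~70-77% of the die area). The function name is just for illustration, and since 40% is an upper bound, the results are upper bounds on relative density too.

```python
def relative_density(transistor_ratio: float, area_ratio: float) -> float:
    """Density of chip A relative to chip B, given A/B transistor and area ratios."""
    return transistor_ratio / area_ratio

# Ratios taken from the post: B580 vs 9070 XT.
TRANSISTOR_RATIO_MAX = 0.40   # "less than 40% as many transistors"
AREA_RATIO_LOW = 0.70         # die is "~70-77% as big"
AREA_RATIO_HIGH = 0.77

upper = relative_density(TRANSISTOR_RATIO_MAX, AREA_RATIO_LOW)
lower = relative_density(TRANSISTOR_RATIO_MAX, AREA_RATIO_HIGH)

print(f"B580 is at most ~{lower:.0%} to ~{upper:.0%} as dense as the 9070 XT")
```

So on those figures the B580 packs roughly half the transistors per mm², which is the kind of spread that makes raw transistor count a shaky proxy for cost.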
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,704
6,175
136
Transistor count is a terrible metric for comparing designs.

First off, transistor counts can vary depending on how they're measured. Unless we know that two dies are measured the same way, the count could be off.

Secondly, chips can have varying levels of transistor density, even on the same node.

An extreme example of this would be Zen5 VS Zen5c. They have the same transistor count, but the die area is way different.

A more applicable example for graphics cards would be how B580 is way less dense than AMD or Nvidia GPUs. The B580 has less than 40% as many transistors as the 9070 XT, but the die is ~70-77% as big.

Transistor count should never be used to compare chips imo. It has too much margin for error.

It's more useful than die size, if they are on different nodes.

If they are on the same node, then you can compare die size.

The problem is when people compare die sizes across nodes, which are completely non-comparable.
 
Reactions: Mopetar

CastleBravo

Member
Dec 6, 2019
180
420
136
I didn't expect that. A clean win over the 4070 ti super in RT

I predict a shareholder coup if this thing launches for less than $700

It is only a clean win over the 4070 Ti Super if FSR4 upscaling matches DLSS4.

IMO, the right play for AMD is an "MSRP" of $550-600, and an initial retail price of $700-750 for most AIB models for as long as the 5070 Ti is unobtanium.
 

coercitiv

Diamond Member
Jan 24, 2014
7,022
15,925
136
It's more useful than die size, if they are on different nodes.
Navi 21 N7 ~27B transistors
Navi 48 N4 ~53B transistors, almost 100% increase

I agree that tracking transistor costs is useful, but using it for die cost estimates of different designs might lead to very weird results. The 6900 XT launched for $1000, and by that logic N48 would need an even higher price for similar margins. Even if we take the 6800 XT MSRP as a guideline, the math still looks bad.
 
Reactions: GTracing and marees

maddie

Diamond Member
Jul 18, 2010
5,047
5,350
136
It's more useful than die size, if they are on different nodes.

If they are on the same node, then you can compare die size.

The problem is when people compare die sizes across nodes, which are completely non-comparable.
What about libraries and their varying densities? What matters is die area, wafer cost and defect density, not simply transistor cost.
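
A rough sketch of how those three inputs combine into a cost per good die. This assumes a 300 mm wafer, the common dies-per-wafer approximation with an edge-loss term, and a simple Poisson yield model; real foundry yield models and pricing differ. Any wafer cost or defect density you plug in is your own assumption, none of it comes from this thread.

```python
import math

WAFER_DIAMETER_MM = 300.0  # assumption: standard 300 mm wafer


def dies_per_wafer(die_area_mm2: float,
                   wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> int:
    """Common approximation: gross wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return math.floor(gross - edge_loss)


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Poisson yield model: Y = exp(-A * D0), with A converted from mm^2 to cm^2."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)


def cost_per_good_die(die_area_mm2: float, wafer_cost_usd: float,
                      defects_per_cm2: float) -> float:
    """Wafer cost spread over the dies expected to be defect-free."""
    good_dies = dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, defects_per_cm2)
    return wafer_cost_usd / good_dies


# Hypothetical placeholder inputs, purely illustrative.
small = cost_per_good_die(200.0, wafer_cost_usd=10_000.0, defects_per_cm2=0.1)
large = cost_per_good_die(400.0, wafer_cost_usd=10_000.0, defects_per_cm2=0.1)
print(f"200 mm^2: ${small:.2f}   400 mm^2: ${large:.2f}")
```

Note that doubling the die area more than doubles the cost per good die under this model, because both edge loss and yield get worse with area.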
 

GTracing

Senior member
Aug 6, 2021
346
757
106
It's more useful than die size, if they are on different nodes.

If they are on the same node, then you can compare die size.

The problem is when people compare die sizes across nodes, which are completely non-comparable.
The best way to compare costs across nodes is to multiply the die area by the cost-per-wafer.

Any comparison with a 50%+ margin for error is useless.
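
As a minimal sketch of that comparison: treat "cost-per-wafer" as a cost per mm² of wafer and multiply by die area. The die areas and wafer prices below are hypothetical placeholders, not real quotes, and this deliberately ignores yield and edge loss.

```python
import math

WAFER_DIAMETER_MM = 300.0  # assumption: standard 300 mm wafer


def cost_per_mm2(wafer_cost_usd: float,
                 wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> float:
    """Wafer price divided by total wafer area."""
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    return wafer_cost_usd / wafer_area_mm2


def naive_die_cost(die_area_mm2: float, wafer_cost_usd: float) -> float:
    """Die area times cost per mm^2 (no yield or edge-loss correction)."""
    return die_area_mm2 * cost_per_mm2(wafer_cost_usd)


# Hypothetical placeholders: a bigger die on a cheaper node vs a smaller die on a pricier node.
old_node_die = naive_die_cost(die_area_mm2=500.0, wafer_cost_usd=10_000.0)
new_node_die = naive_die_cost(die_area_mm2=350.0, wafer_cost_usd=20_000.0)

ratio = new_node_die / old_node_die
print(f"new-node die costs {ratio:.2f}x the old-node die")
```

With these made-up inputs the smaller die still comes out 1.4x more expensive, since the wafer price doubled while the area only shrank by 30%.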
 
Reactions: marees