Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,770
6,720
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to land support in LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, there are a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5 would be available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

gdansk

Diamond Member
Feb 8, 2011
4,006
6,574
136
Plus 5070 Ti is the cut down part on GB203, 5080 is the full part.
I'll also add that the full GB203 would be slower than the 5080 if run at 304W instead of 360W, and possibly slower still if hobbled by equally slow memory: to be comparable, it would have only 2/3 of the memory bandwidth. I think it's close, and I did not expect that.
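For anyone wondering where the 2/3 figure comes from, here is a rough sketch of the peak-bandwidth arithmetic. The bus widths and per-pin data rates below are assumptions based on commonly reported specs (256-bit on all three cards, 20 Gbps GDDR6 on the 9070 XT, 28 Gbps and 30 Gbps GDDR7 on the 5070 Ti and 5080); none of them are stated in this thread.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: pins * per-pin rate / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# Assumed specs (not from this thread): every card on a 256-bit bus.
cards = {
    "9070 XT (GDDR6, 20 Gbps)": bandwidth_gb_s(256, 20.0),
    "5070 Ti (GDDR7, 28 Gbps)": bandwidth_gb_s(256, 28.0),
    "5080 (GDDR7, 30 Gbps)": bandwidth_gb_s(256, 30.0),
}

baseline = cards["9070 XT (GDDR6, 20 Gbps)"]
for name, bw in cards.items():
    # Print absolute bandwidth and the ratio relative to the 9070 XT.
    print(f"{name}: {bw:.0f} GB/s, {bw / baseline:.2f}x the 9070 XT")
```

Under those assumptions the 9070 XT has 2/3 of the 5080's peak bandwidth. Peak bandwidth is only an upper bound, so how much performance a memory-limited GB203 would actually lose is a separate question.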
 

techjunkie123

Member
May 1, 2024
138
292
96
I'm not sure transistor counts are as comparable as people would like.
It's 2x the transistor density of Intel's B580 on what is supposed to be a similar-density process. Is it really that many fewer transistors, or are they counting differently?
N31 overall has a density of 110 MTr/mm^2. Compared to just the N31 GCD on 5 nm, N48 has the same density (150 MTr/mm^2). But the GCD doesn't have infinity cache, memory controllers, etc. This would suggest that the density improvements are quite dramatic, presumably for logic, given only minor improvements from the node.
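A quick sketch of the density arithmetic behind these numbers. The transistor counts and die areas below are assumptions taken from commonly reported figures, not from this thread, and vendors may count transistors differently, so treat the results as ballpark only.

```python
def density_mtr_per_mm2(transistors_billion: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors_billion * 1000 / area_mm2


# Assumed (transistors in billions, die area in mm^2); not from this thread.
dies = {
    "N48": (53.9, 357.0),
    "N31 GCD only": (45.4, 304.0),
    "N31 GCD + 6 MCDs": (57.7, 529.0),
    "B580 (BMG-G21)": (19.6, 272.0),
}

densities = {name: density_mtr_per_mm2(t, a) for name, (t, a) in dies.items()}
for name, d in densities.items():
    print(f"{name}: {d:.0f} MTr/mm^2")

# Ratio behind the "2x the B580" remark.
print(f"N48 vs B580: {densities['N48'] / densities['B580 (BMG-G21)']:.2f}x")
```

With these inputs N48 and the N31 GCD land at roughly the same density, the whole N31 package comes out much lower because of the 6 nm MCDs, and N48 is about twice as dense as the B580 die.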
 

PJVol

Senior member
May 25, 2020
844
823
136
TBP, not TDP according to AMD site
Despite being technically different, they both refer to the same metric in the context of specs, i.e. the max power allowed by the board's thermal design (read: cooler capacity + thermal limit).
 

exquisitechar

Senior member
Apr 18, 2017
720
1,016
136
I'm not sure transistor counts are as comparable as people would like.
It's 2x the transistor density of Intel's B580 on what is supposed to be a similar-density process. Is it really that many fewer transistors, or are they counting differently?
Maybe they are, but either way, different designs can have massive differences in transistor counts despite having a similar die size using a similar node. “Performance per transistor” is a meaningless and irrelevant thing.
 
Reactions: TESKATLIPOKA

Novacius

Member
Apr 27, 2015
32
47
91
Something I just now realized is that AMD is now ahead of NVIDIA in terms of performance/watt and performance/area for raster. Also possibly equal in RT. (NVIDIA has more cores so I suspect that is a big reason for the gap)

This is comparing the 9070XT to the 5070 Ti.

They were in the perfect position to drop a halo product. Of course the halo product got canned.
Huh? According to AMD the XT delivers 5070 Ti performance in raster at roughly the same power, meaning perf/W is now equal. Also the Ti is GB203's salvage, you have to compare to the 5080 for perf/area. GB203 is around 6% bigger than Navi48, but the 5080 should be around 15% faster, so perf/area of GB203 is slightly better.
N31 overall has a density of 110 MTr/mm^2. Compared to just the N31 GCD on 5 nm, N48 has the same density (150 MTr/mm^2). But the GCD doesn't have infinity cache, memory controllers, etc. This would suggest that the density improvements are quite dramatic, presumably for logic, given only minor improvements from the node.
According to TPU, Navi48 uses N4C, which should be denser than N4P.
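The perf/area comparison above can be checked with a couple of lines. This sketch uses only the two rough ratios quoted in the post (GB203 about 6% bigger than Navi48, 5080 about 15% faster than the 9070 XT); both are estimates, so the result is only as good as they are.

```python
def relative_perf_per_area(perf_ratio: float, area_ratio: float) -> float:
    """Perf/area of chip A relative to chip B, given A/B ratios of perf and area."""
    return perf_ratio / area_ratio


# Ratios as quoted in the post: 5080 vs 9070 XT performance, GB203 vs Navi48 area.
gb203_vs_navi48 = relative_perf_per_area(perf_ratio=1.15, area_ratio=1.06)
print(f"GB203 perf/area relative to Navi48: {gb203_vs_navi48:.3f}x")
```

That works out to roughly an 8% perf/area advantage for GB203, which fits "slightly better". A few points of error in either estimate would shrink or grow the gap noticeably.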
 

exquisitechar

Senior member
Apr 18, 2017
720
1,016
136
Huh? According to AMD the XT delivers 5070 Ti performance in raster at roughly the same power, meaning perf/W is now equal. Also the Ti is GB203's salvage, you have to compare to the 5080 for perf/area. GB203 is around 6% bigger than Navi48, but the 5080 should be around 15% faster, so perf/area of GB203 is slightly better.

According to TPU, Navi48 uses N4C, which should be denser than N4P.
Although the 5080 is using GDDR7.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,686
2,908
136

Novacius

Member
Apr 27, 2015
32
47
91
It has 40% moar membw.

It haez 40% more membw too.
Somebody should bench the 5080 with downclocked memory then, to see how much that impacts performance.
From GameGPU: "The use of ray tracing technology makes the game even more cinematic, especially in scenes with sunsets and illuminated cities at night."

That's software RT (SVOGI); there's no hardware acceleration happening.
 

poke01

Diamond Member
Mar 8, 2022
3,316
4,569
106
Kingdom Come 2 uses voxel cone tracing, not RT. NV and Intel cards play well with this game.

Oh well at least AMD has CoD and Spider-Man.
 

Mahboi

Golden Member
Apr 4, 2024
1,057
1,969
96
At this point they've been a part of AMD so long that it doesn't make a difference.
No, the H.264 encoders in RDNA have always been butt, while Xilinx clearly had far better stuff for their Alveo transcoders. I was hoping that RDNA 4 would start using the Xilinx ones instead of the very poor AMD ones; I'm curious whether that's the case.
 
Reactions: Mopetar and Gideon

gdansk

Diamond Member
Feb 8, 2011
4,006
6,574
136
No, the H.264 encoders in RDNA have always been butt, while Xilinx clearly had far better stuff for their Alveo transcoders. I was hoping that RDNA 4 would start using the Xilinx ones instead of the very poor AMD ones; I'm curious whether that's the case.
Most of their open source software commits are from the same people as ever. I don't think there was a major change in teams, just new work.
 