Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,760
6,675
136





With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of no host CPU capable of PCIe 5 being available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.



Previous thread on CDNA2 and RDNA3 here

 

linkgoron

Platinum Member
Mar 9, 2005
2,531
1,208
136
Not for long, by the looks of it.
Let's wait and see. Blackwell is a disappointment, but if you want 5080 or 5090 levels of performance - you only have one place to go.

Also, it's not like Blackwell is Nvidia's last gen. Rubin might fix whatever issues Blackwell has, and RDNA5 still has to deliver.

Over here "in my neck of the woods", prices for the 9070 XT @ my usual online shop range between 700€ and 900€: I'd have no problem with 800€, but 900€ seems a bit too much...
The 9070 and 9070 XT are also very expensive on my side (vs MSRP), but I expect prices to go down in a month or two. People are desperate for graphics cards. I've also been itching to replace my extremely aging system, but due to a large down payment, I'm tightening the leash for the next few months. The 9950X3D is just around the corner as well.
 

GTracing

Senior member
Aug 6, 2021
410
931
106
Some musings about VRAM after the past few days of discussion here.

VRAM usage used to be decided by your monitor resolution and graphics settings. Back during the Polaris/Pascal days, the VRAM amount went up as you went up the product stack:

4GB 1050 ti ($140), 6GB 1060 ($300), 8GB 1070 ($400), 8GB 1080 ($600), 11GB 1080 ti ($700)

Lower VRAM on midrange cards like the 4GB RX 480 was largely seen as fine.

Nowadays, higher resolutions don't matter nearly as much. It still plays a small role in VRAM usage, but no one asks what resolution you're playing at to decide how much VRAM you need. Even at 1080p, 8GB is not enough. And looking at the other end of the product stack, 16GB is fine on the 9070 and 9070 XT. It's created this strange situation where Radeon's whole product lineup needs 16GB VRAM; any GPUs with less VRAM aren't worth buying.
 
Reactions: Ranulf

blckgrffn

Diamond Member
May 1, 2003
9,581
4,066
136
www.teamjuchems.com
In hindsight, I think the lack of AMD 1st party model was a warning sign of incoming fake MSRP from AMD as well. Glad I got my 4070 in saner times.

This feels like Crypto Coin and COVID nonsense all over again.
The vibe in the MC line was very much this. A lot of agitation caused by the blink-and-you-miss-it Blackwell launches, and talk of "adjusted" MSRPs inbound on the AMD side too. A lot of expectation of all MSRPs rising by hundreds of dollars.

I don’t trust prices to go down in the coming months at all, especially here in the states. Tariff yo-yoing is going to ensure that retailers keep prices as high as possible to ensure they are able to cover their own positions, even if there is no active tariff. They will want to cover the replacement cost of the good and that will be an unknown literally until it clears customs.

@DAPUNISHER said it better but this is a weird time. Maybe there are other things in common with the crypto/covid time that weren’t as obvious then either.

I think there is way more belief that it could happen this time. We lived through $1200 3080s and $900 3070s, and here we are again…
 

coercitiv

Diamond Member
Jan 24, 2014
7,066
16,221
136
This feels like Crypto Coin and COVID nonsense all over again.

Ahh, the good old mining days with a rig like the one below. Except, wait a second, why is there a server board underneath?! Why are so many memory channels populated?! No.... not.... NOT AGAIN!



Jokes aside, the image above depicts a "homelab" AI rig, with 16x 3090s allegedly purchased from an old mining business, serviced by an Epyc 7663 w/ 512 GB of RAM. People are building these as "low cost" alternatives for local AI compute. How much this affects the market is unknown, but I would wager it's definitely skewing demand for 4090 / 5090 cards.

I would also not be surprised if at some point we find out that 16GB cards were used to power cheap home labs or worse, some kind of "datacenter" use. We live in times when not even AMD engineers have direct access to their own high-end AI products, they're instead given access to instances in the cloud. Demand of this scale will force all kinds of innovation from the little guys doing research or small scale business.
 

MrTeal

Diamond Member
Dec 7, 2003
3,855
2,494
136
So, it doesn’t look as bad as I thought. Why are HUB’s performance figures lower than others, and why is its power consumption much higher than average (including his 9070 review having worse efficiency than 5070)?
HUB/Techspot's raster numbers look fine. I'd imagine the difference in performance just comes down to game selection. HUB has a 6-game RT average, and it looks like they test both Wukong and Indy, where RDNA4 still struggles. In the other RT titles, their numbers show it as faster than the 5070.

Reason #315 why big bar charts comparing 30 GPUs are not ideal.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,760
6,675
136
Ahh, the good old mining days with a rig like the one below. Except, wait a second, why is there a server board underneath?! Why are so many memory channels populated?! No.... not.... NOT AGAIN!

View attachment 119405

Jokes aside, the image above depicts a "homelab" AI rig, with 16x 3090s allegedly purchased from an old mining business, serviced by an Epyc 7663 w/ 512 GB of RAM. People are building these as "low cost" alternatives for local AI compute. How much this affects the market is unknown, but I would wager it's definitely skewing demand for 4090 / 5090 cards.

I would also not be surprised if at some point we find out that 16GB cards were used to power cheap home labs or worse, some kind of "datacenter" use. We live in times when not even AMD engineers have direct access to their own high-end AI products, they're instead given access to instances in the cloud. Demand of this scale will force all kinds of innovation from the little guys doing research or small scale business.
We would be guilty of that too!!
Not sure where tech is heading, but like most of our competitors we are using AI/ML for almost everything now: PowerPoint, requirements analysis, validation with vision models, coding, etc.

I hope one thing for the Zen 6 platform will come to fruition:
EPYC's CXL.mem support coming to DT.
Turin can already interleave CXL and DDR memory regions, so ideally a CXL GPU device attached to the root could see the host's memory.

Linux is getting patches for address translation for Zen 5. Hopefully Zen 6 goes further in this direction.

With such a setup we could install 1 TB of DDR on the CPU and let the GPU use all of that for some LLMs and other interesting use cases. It could turn your Linux PC into an LLM monster.
It would not be the most performant, but at least it could run something interesting.
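To see why 1 TB of host DDR matters, here's a back-of-envelope sketch of which model weight footprints fit in a 16GB card's VRAM versus host memory exposed over CXL.mem (the parameter counts and precisions below are illustrative assumptions, not specific products):

```python
# Rough check: which model sizes fit in 16 GB of VRAM versus
# 1 TB of host DDR made visible to the GPU via CXL.mem.
# Weights only -- ignores KV cache, activations, and overhead.

def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed just for the weights, in GB."""
    return params_billion * 1e9 * bytes_per_param / 1e9

VRAM_GB = 16       # e.g. a 9070 XT-class card
CXL_DDR_GB = 1024  # the hypothetical 1 TB of host DDR

for params, prec, bpp in [(8, "fp16", 2), (70, "fp16", 2),
                          (70, "int4", 0.5), (405, "int4", 0.5)]:
    need = weight_footprint_gb(params, bpp)
    fits = "VRAM" if need <= VRAM_GB else ("CXL DDR" if need <= CXL_DDR_GB else "neither")
    print(f"{params}B @ {prec}: ~{need:.0f} GB -> fits in {fits}")
```

Even a 70B model at fp16 (~140 GB) is hopeless on a 16GB card but trivial in 1 TB of host memory, which is the whole appeal, bandwidth penalty notwithstanding.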
I was just commenting on this same thing in the Zen 6 thread: get CXL.mem support on DT and use your UDNA GPU for some AI at home.

AMD's mistake was splitting DC and client GPUs into separate architectures, but if a unified architecture with UDNA is to fix all that, then CXL memory support could go even further, allowing more exotic ML models/algorithms to be developed and run on client GPUs.
I kind of hope it happens, otherwise it will be tough for budding AI graduates to compete with MS/big corpos for GPU shipments.
 
Reactions: coercitiv

coercitiv

Diamond Member
Jan 24, 2014
7,066
16,221
136
We would be guilty of that too!!
Not sure where tech is heading, but like most of our competitors we are using AI/ML for almost everything now. Power point, Requirements analysis, Validation with Vision Model, Coding, etc..
I wasn't pointing fingers, just giving people a heads-up that gamers might already be entering yet another dark period of availability and pricing. We'll have to see if the industry can react and fill the gap.
 

GTracing

Senior member
Aug 6, 2021
410
931
106
HUB/Techspot's raster numbers look fine. I'd imagine the difference in performance just comes down to game selection. HUB has a 6 game RT average, and it looks like they test both Wukong and Indy where RDNA4 still struggles. The other RT titles their numbers show it as faster than the 5070.

Reason #315 why big bar charts comparing 30 GPUs are not ideal.
Their ray tracing results are valid, but their power consumption tests are way off and frankly make no sense. All three of their 9070 power consumption tests have weird anomalies. In the first one, the 5070 Ti draws more power than the 5080. In the second, the 7900 GRE draws 74W(!) less than the 7700 XT. In the last one, the 9070 draws 60W more than the 5070.

I honestly don't know why they published those results. If they can't do power consumption testing right, then they should just leave it out of the review.

 

Saylick

Diamond Member
Sep 10, 2012
3,849
8,872
136
Eurogamer has an interview with Mark Cerny regarding FSR4 and PSSR:
https://www.eurogamer.net/digitalfo...-part-in-the-next-evolution-of-pssr-upscaling

"The neural network (and training recipe) in FSR 4's upscaler are the first results of the Amethyst collaboration," Cerny told us. "And results are excellent, it's a more advanced approach that can exceed the crispness of PSSR. I’m very proud of the work of the joint team!"

Explains why FSR4 is so good for AMD's first AI upscaler attempt. From what I hear, Sony's AI division is really good. Their professional cameras have been using machine learning image recognition to improve autofocus speed and accuracy for years now.
 
Reactions: Elfear

GaiaHunter

Diamond Member
Jul 13, 2008
3,673
286
126
I think that's what's going on. People just can't control themselves with the FOMO. With nVidia not producing much of anything, AMD is the only game in town.

At some point everyone is going to buy.
Their GPU died.
They bought a new higher Res/higher refresh rate monitor.
They want to play a new game and the performance of their system interferes with their fun.
They want to play a new game but it requires features their current GPU doesn't have.

I was a lot happier paying $200 for a GPU to play games.

My last upgrades were
rx480 to 5700xt - that was about double the performance for double the price.
5700xt to 9070xt - 2.5x to 3x performance (@4k since new monitor) plus additional features, for about 65% higher price.

Compared to the CPU side, where going from an i7-6700K to a 7700X got like double the cores, double the performance, and additional features for the same price, it is atrocious.
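Putting the multipliers from those upgrades (the ones quoted above, with the GPU performance gain taken at the midpoint of 2.5-3x) into perf-per-dollar terms makes the gap obvious:

```python
# Perf-per-dollar change for each upgrade, using the post's own multipliers.

def perf_per_dollar_gain(perf_mult: float, price_mult: float) -> float:
    """How much performance-per-dollar improved across an upgrade."""
    return perf_mult / price_mult

gpu_1 = perf_per_dollar_gain(2.0, 2.0)    # RX 480 -> 5700 XT: 2x perf, 2x price
gpu_2 = perf_per_dollar_gain(2.75, 1.65)  # 5700 XT -> 9070 XT: ~2.75x perf, 1.65x price
cpu   = perf_per_dollar_gain(2.0, 1.0)    # i7-6700K -> 7700X: 2x perf, same price

print(f"GPU upgrade 1: {gpu_1:.2f}x perf/$")
print(f"GPU upgrade 2: {gpu_2:.2f}x perf/$")
print(f"CPU upgrade:   {cpu:.2f}x perf/$")
```

The first GPU upgrade delivered zero value improvement (1.0x perf/$); the second managed roughly 1.67x; the CPU upgrade doubled perf/$ outright.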

If one is buying every generation primarily to play games, I hope they're buying the halo products, because these days the only reason to upgrade every generation is bragging rights.
 

Josh128

Senior member
Oct 14, 2022
681
1,170
106
Eurogamer has an interview with Mark Cerny regarding FSR4 and PSSR:
https://www.eurogamer.net/digitalfo...-part-in-the-next-evolution-of-pssr-upscaling



Explains why FSR4 is so good for AMD's first AI upscaler attempt. From what I hear, Sony's AI division is really good. Their professional cameras have been using machine learning image recognition to improve autofocus speed and accuracy for years now.
Sony has been researching and producing non-AI upscalers and upscaling techniques since the dawn of HD CRTs, back in the late '90s/early '00s. Their sets generally have better PQ than their competitors', even when they use a panel made by a competitor.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,599
3,151
136
Because based on what we know the 9060XT die will be tiny so the thing will be dirt cheap. It would offer similar margin to the 9070XT in a higher volume product.

As for the stack it should be simple

9060XT - 16GB 20gbps ram $330
9060 - 12GB 20gbps ram, cut the bus to 96bit. $250
9050 - rebrand N33 and lower clocks to hit a sub $200 price point.

Any 8GB card for more than $200 will get absolutely shredded in reviews, so there's no point in doing it. A 96-bit 12GB card that manages 6700XT-tier performance will be far better received than a 128-bit 8GB card that can sit in the gap between the 6700XT and 7700XT but then suffers from horrid performance/IQ in some of the latest titles due to VRAM limits.
N33 was also dirt cheap to make. Cheaper than N44 to be honest and we all know what the price was.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,599
3,151
136
Ah, you seem to be sure about that...

- 32CU @ 3.0 GHz with 18Gbps and 128bit -> 100% bandwidth / FLOPS
- 28CU @ 2.9 GHz with 20Gbps and 96bit -> 98% bandwidth / FLOPS
The full N44 supposedly uses 20Gbps memory, so it would be only 88.7%, and you need a cut-down version for that 96-bit bus.
But yeah, I can agree that AMD could make such models.
9060 XT: 32CU @ 3.2 GHz with 20Gbps and 128-bit, 16GB VRAM
9060: 28CU @ 2.75 GHz with 20Gbps and 96-bit, 12GB VRAM
This doesn't look half bad, actually. The problem is that for laptops you are still limited to 8GB VRAM.
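The 98% and 88.7% bandwidth-per-FLOP ratios in this exchange can be reproduced with a quick script (the card configs are the hypothetical ones discussed here, not announced SKUs):

```python
# Bandwidth-per-FLOP comparison for the hypothetical configs above.
# Compute scales with CU count x clock; bandwidth with bus width x memory speed.

def bw_per_flop(cus: int, ghz: float, bus_bits: int, gbps: float) -> float:
    bandwidth = bus_bits / 8 * gbps   # GB/s
    compute = cus * ghz               # relative FLOPS
    return bandwidth / compute

base = bw_per_flop(32, 3.0, 128, 18)  # 32 CU @ 3.0 GHz, 128-bit, 18 Gbps
cut  = bw_per_flop(28, 2.9,  96, 20)  # 28 CU @ 2.9 GHz, 96-bit, 20 Gbps
full = bw_per_flop(32, 3.0, 128, 20)  # full N44 with 20 Gbps memory

print(f"cut-down vs 18 Gbps baseline: {cut / base:.1%}")  # ~98.5%
print(f"cut-down vs 20 Gbps full N44: {cut / full:.1%}")  # ~88.7%
```

So the "98% bandwidth/FLOPS" claim holds against an 18 Gbps baseline, while against the full 20 Gbps N44 the cut-down part only carries ~88.7%, matching the reply.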
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,599
3,151
136
You can do the same extrapolation from 3d centre data and you get the same result, a 9060XT that has an uplift over the 7600XT that is the same as the uplift the 9070XT had over the 7800XT is about 7700XT tier in raster and it will be faster still in RT. Further, given that the 9060XT, 7600XT, 7800XT and 9070XT should all have 16GB of VRAM it is not like we are in a situation where the 8GB performance is overstated due to poor testing practices that hides the stutters or hides the IQ degradation.
N48 has more CUs than N32, but N44 didn't increase the count, so N44 would need additional frequency to compensate.
As such a 9060XT 16GB that performs like a 7700XT in raster should probably cost no more than $350 and ideally $330 like the 7600XT costs. Those who think it will cost a lot more I believe to be misguided and those who think it will be cheaper might have a more pessimistic view of where its performance will land.
N48 costs $100 more than N32; that's, in my opinion, more than enough proof that N44 won't cost the same as N33. My bet is $50 more.
 