Discussion RDNA4 + CDNA3 Architectures Thread

DisEnchantment · Mar 23, 2022

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around 3Qs to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, it is a lot of commits. Maybe that's because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits

History for llvm/lib/Target/AMDGPU - llvm/llvm-project (github.com)

Or Phoronix

More AMD "GFX940" Enablement Work Landing In LLVM - Phoronix (www.phoronix.com)

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 available in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.

Previous thread on CDNA2 and RDNA3 here

Question - Speculation: RDNA3 + CDNA2 Architectures Thread (forums.anandtech.com)

adroc_thurston · Dec 23, 2024

SolidQ said:
SM 3.0 was in like 2-3 games, Far Cry HDR and Splinter Cell, and that's all.

Yeah but it was Feature of the Day You Just Can't Miss and there was that.

SolidQ said:
R520 yes was bad, but R580 easily beat any G70

Yeah, R580 kicked major ass, but it also released close to G80, so gg.

inquiss said:
more a comment on the thinking that heavily discounted previous gen is the best comparison.

Yeah, like, GPU street prices have been going down close to a new gen launch for 25 years.

inquiss · Dec 23, 2024

jpiniero said:
That's why AMD letting the Fire Sales happen is so problematic.

Yes, you get it. It makes the comparisons odd and people whine that a new product doesn't make sense against a heavily discounted old product.

SolidQ · Dec 23, 2024

adroc_thurston said:
Yeah but it was Feature of the Day You Just Can't Miss and there was that.

Maybe for some people, but a lot of my friends had X800 series cards, especially the X800XL, because it was cheap. I also bought an X800XL and played Doom 3 etc.
And SM 3.0 on the 6xxx was irrelevant, because back then GPUs became obsolete much faster than today. Like 2 years for a top GPU and it was dead; today you can sit on a mid-range GPU for like 6-7 years, and on a top one even longer.

adroc_thurston · Dec 23, 2024

SolidQ said:
Maybe for some people, but a lot of my friends had X800 series cards, especially the X800XL, because it was cheap. I also bought an X800XL and played Doom 3 etc.
And SM 3.0 on the 6xxx was irrelevant, because back then GPUs became obsolete much faster than today. Like 2 years for a top GPU and it was dead; today you can sit on a mid-range GPU for like 6-7 years, and on a top one even longer.

Yeah, but remember, NV product marketing is very savvy and makes a mountain out of every molehill.

Joe NYC · Dec 23, 2024

GodisanAtheist said:
Devil's Advocate: 530mm2 of silicon on a relatively complicated packaging process only matched NV's 380mm2 die in raster, with far worse ray tracing performance and "feature set" (regardless of its value to you).

Of course, RDNA3's issues seem to stem from its architecture rather than its chiplet packaging, but it can be hard to separate the two without direct acknowledgement from AMD on what went wrong.

The problems were clock speed and power consumption (almost no increase in performance per watt), both on the main die. Neither is related to the chiplet arrangement. The chiplet approach has a bit of overhead, but it is limited, and it is not the cause of RDNA3 being unable to clock high enough at reasonable power consumption.

insertcarehere · Dec 23, 2024

adroc_thurston said:
Spoken like someone who never had CUDA setup break in horrendous ways on update.
You're talking in abstract hypotheticals and not anything related to NV devportal experience. Stop.

It does not take much experience to read Dylan's article from SemiAnalysis here, benchmarking the MI300X against NV's parts, and see how much worse the experience is on team AMD. Even with VIP custom dev builds directly from AMD themselves specifically for the test (!), this still gets worse performance than any of NV's offerings out of the box, with more bugs and other nasties. Small pricing discounts will not cut it for enterprises to tolerate these sorts of issues.

Choice quotes from the article:

The only reason we have been able to get AMD performance within 75% of H100/H200 performance is because we have been supported by multiple teams at AMD in fixing numerous AMD software bugs. To get AMD to a usable state with somewhat reasonable performance, a giant ~60 command Dockerfile that builds dependencies from source, hand crafted by an AMD principal engineer, was specifically provided for us, since the Pytorch Nightly and public PyTorch AMD images functioned poorly and had version differences. This docker image requires ~5 hours to build from source and installs dependencies and sub-dependencies (hipBLASLt, Triton, PyTorch, TransformerEngine), a huge difference compared to Nvidia, which offers a pre-built, out of the box experience and takes but a single line of code.

Although AMD’s own documentation recommends using PyTorch native Flash Attention, for a couple months this summer, AMD’s PyTorch native Flash Attention kernel ran at less than 20 TFLOP/s, meaning that a modern CPU would have calculated the attention backwards layer faster than an MI300X GPU. For a time, basically all Transformer/GPT model training using PyTorch on the MI300X ran at a turtle’s pace. Nobody at AMD noticed this until a bug report was filed following deep PyTorch/Perfetto profiling showing the backwards pass (purple/brown kernels) took up far more time than the forward pass (dark green section). Normally, the backwards section should take up just ~2x as much time as the forward pass (slightly more if using activation checkpointing).
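
For context, a back-of-the-envelope sketch of where that ~2x rule of thumb comes from. This is not from the article: `linear_flops` is a hypothetical helper, it only counts matmul FLOPs for a single dense layer, and it assumes run time roughly tracks FLOPs.

```python
def linear_flops(batch: int, d_in: int, d_out: int) -> dict:
    """Approximate matmul FLOPs for one dense layer y = x @ W (bias ignored).

    Illustrative assumption: each multiply-accumulate counts as 2 FLOPs.
    """
    forward = 2 * batch * d_in * d_out       # y  = x @ W
    grad_input = 2 * batch * d_out * d_in    # dx = dy @ W.T
    grad_weight = 2 * d_in * batch * d_out   # dW = x.T @ dy
    backward = grad_input + grad_weight      # two matmuls vs. one in forward
    return {
        "forward": forward,
        "backward": backward,
        "ratio": backward / forward,
    }


if __name__ == "__main__":
    flops = linear_flops(batch=8192, d_in=4096, d_out=4096)
    print(f"backward/forward FLOP ratio: {flops['ratio']}")
```

The backward pass needs two matmuls of the same size as the single forward matmul (one for the input gradient, one for the weight gradient), which is why a backward pass that dwarfs the forward pass in a profile is a red flag.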

adroc_thurston · Dec 23, 2024

insertcarehere said:
It does not take much experience to read Dylan's article from SemiAnalysis here, benchmarking the MI300X against NV's parts, and see how much worse the experience is on team AMD. Even with VIP custom dev builds directly from AMD themselves specifically for the test (!), this still gets worse performance than any of NV's offerings out of the box, with more bugs and other nasties. Small pricing discounts will not cut it for enterprises to tolerate these sorts of issues.

Yeah, that's training. Not what the customers asked for (for now, anyway).
Now please showcase your actual hands-on experience with CUDA.

SolidQ · Dec 23, 2024

itsmydamnation said:
because words cost nothing......

Jack's Twitter, and that's where the 40% market share came from.

itsmydamnation · Dec 23, 2024

SolidQ said:
Jack's Twitter, and that's where the 40% market share came from.

And?

You people are so odd.

Try working in the high end of design for an IT megacorp; people say stuff all the time.....

This thread just feels like FanFic........

What are AMD's actions?

When I look at AMD's actions, it looks like they are building top to bottom product stacks in every market they compete in except DGPU....

I say this as quite the happy owner of a 7900XTX.

You could probably argue the only reason to have a 200-300 watt class DGPU is to lop 10-15% clock off it to put it into a 100-150 watt laptop chassis.

Joe NYC · Dec 23, 2024

adroc_thurston said:
server table scraps. Not a purpose-build product.

AMD EPYC revenue: ~1.5 billion
NVidia client dGPU revenue: ~3.3 billion

The dGPU market is not table scraps. It is bigger than AMD's EPYC product line.

I think the bigger problem is AMD not having enough staff for datacenter GPU, and they probably raided the client GPU division.

adroc_thurston · Dec 23, 2024

Joe NYC said:
AMD EPYC revenue: ~1.5 billion

34% unit share with CPU capex majorly down due to GPU overspend.

Joe NYC said:
NVidia client dGPU revenue: ~3.3 billion

91% unit share.

Joe NYC said:
The dGPU market is not table scraps

Yeah it is, especially in a tough comp env that forces a price war.

Joe NYC said:
I think the bigger problem is AMD not having enough staff for datacenter GPU

They do.

Joe NYC said:
and probably raided the client GPU division.

Client GFX IP roadmap is wholly intact. They just surrendered the actual discrete GPUs as a market.

SolidQ · Dec 23, 2024

adroc_thurston said:
They just surrendered the actual discrete GPUs as a market.

I think they will be back in the future. Someone will set the task to create a halo GPU that beats NV, and then they will take market share, and also hire good marketing people.

adroc_thurston · Dec 23, 2024

SolidQ said:
I think they will be back in the future.

nope.

SolidQ said:
Someone will set the task to create a halo GPU that beats NV

Wasting money is not how AMD operates.

SolidQ said:
and then they will take market share, and also hire good marketing people.

never gonna happen.

SolidQ · Dec 23, 2024

adroc_thurston said:
never gonna happen.

their enemy is software and marketing teams.

Tup3x · Dec 23, 2024

SolidQ said:
SM 3.0 was in like 2-3 games, Far Cry HDR and Splinter Cell and maybe one more game

When SM 3.0 was a new thing it obviously didn't matter, but after that it became a must. When that happened, the GeForce 6 series was just too slow.

adroc_thurston · Dec 23, 2024

SolidQ said:
their enemy is software and marketing teams.

Their enemy is the rising semiconductor R&D cost.

SolidQ · Dec 23, 2024

adroc_thurston said:
Their enemy is the rising semiconductor R&D cost.

That too, but when my friend swaps a 7800XT for a 4070 Super because it's like 7 times faster in Blender, that's a lost customer, and there are a lot of people in the same situation. They need to fix Adobe/Blender etc. performance. Step by step they need to fix everything, because a monopoly is bad for all of us.

adroc_thurston · Dec 23, 2024

SolidQ said:
but when my friend swaps a 7800XT for a 4070 Super because it's like 7 times faster in Blender, that's a lost customer, and there are a lot of people in the same situation.

Niche. Average NV client customer is a kid that needs a GPU.

SolidQ said:
They need to fix Adobe/Blender etc. performance. Step by step they need to fix everything, because a monopoly is bad for all of us.

Irrelevant. They're not in the game without a halo part so any ounce of bling is just money down the drain.

SolidQ · Dec 23, 2024

adroc_thurston said:
Irrelevant. They're not in the game without a halo part so any ounce of bling is just money down the drain.

Step by step. Not everyone needs a halo product. When non-forum people hear that AMD drivers are bad, the software is bad etc., ofc you won't gain any share.

Niche.

Niche, not niche. Money is money. Share is share

adroc_thurston · Dec 23, 2024

SolidQ said:
Not everyone needs a halo product

You need a halo product to move units.

SolidQ said:
When non-forum people hear that AMD drivers are bad, the software is bad etc., ofc you won't gain any share.

It's really easy for the shills to find new ways to bash AMD GPUs with. It's a futile effort. A game long lost.

SolidQ said:
Niche, not niche. Money is money. Share is share

That only works with a chainsaw. No chainsaw and you're just drowning money for kicks.

tajoh111 · Dec 23, 2024

adroc_thurston said:
Their enemy is the rising semiconductor R&D cost.

Seems like you're agreeing with me now that they are struggling with R&D cost and that the discrete market battle isn't worthwhile to AMD considering other market opportunities.

adroc_thurston · Dec 23, 2024

tajoh111 said:
Seems like you're agreeing with me now that they are struggling with R&D cost

They're not short of cash.

tajoh111 said:
the discrete market battle isn't worthwhile to AMD considering other market opportunities.

It can be worthwhile, but it's also a very casino market and running expensive R&D programs to gamble is exactly the thing that almost killed AMD before.

DaaQ · Dec 23, 2024

linkgoron said:
I've posted this before at least twice, but I'll post it again - here's AMD's TAM strategy for Polaris the last time they were extremely behind with just a mainstream lineup (basically 480 was kind of competitive with the 1060, Nvidia was alone with 1070/1080 and of course 1080ti later on). I remember that the "cheap" 4GB (for $200) was mostly marketing and in reality it was just for the initial launch or for a very short while and then they actually stopped making them (so actual entry price was $240), but I couldn't find a source to back my memory up. When AMD is extremely behind they always claim how they're going for TAM or whatever. What are their other options? Saying that they have a weak brand? that Nvidia is pushing for RT and DLSS and that they're behind?

They could say that they've failed with their chiplet strategy for client, and how they wrecked four years of client GPUs (RDNA3/RDNA4 at least), but they don't want to do that. They've failed before with going all-in on HBM on client, and now they've made a similar mistake with chiplets.

I get it, I've had solely AMD cards since the Radeon HD 4870 (last Nvidia card was the 6600GT), but given AMD's recent behavior even if they'll have a winner, they won't significantly undercut Nvidia. They had a competitive lineup with RDNA2 and didn't really undercut Nvidia at launch MSRP-wise with the 6800xt and 6700xt. RDNA3 also (IMO) didn't really undercut Nvidia enough. They've shown this time and time again. We'll see what happens with RDNA4, but given RDNA3's pricing with the 7900xtx and 7900xt - I'm not hopeful. Only when it was very clear that RDNA3 was a dud, only then did AMD provide decently priced cards - with the 7800XT and 7900GRE.

Bold:
Looks like you only want them to undercut Nvidia to lower Nvidia pricing. You already said you are going that route.
Underlined:
Do you own a chiplet RDNA3 card? Doesn't seem a dud to me.

Shmee · Dec 23, 2024

I certainly hope that AMD does not give up the competition, they have been doing relatively well with their innovation and delivering nice things. Anyway time will tell, but I hope that we can get a top end card with RDNA5/UDNA whatever...and hopefully with a name that makes sense

Anyway, I will be happy for now with my 7900XTX from Sapphire. But I do hope that someone will continue to compete with Nvidia at the higher end, as many of us don't want the melty connectors on our cards. Without AMD Radeon to do this, who would? Intel? It would be even harder for them.

SolidQ · Dec 23, 2024

Shmee said:
will continue to compete with Nvidia at the higher end, as many of us don't want the melty connectors on our cards.

Hope they're gonna at least compete with the xx80 series; it would even be fine if they land performance between NV's 8 and 9 series.

About MI300:

https://twitter.com/x/status/1871287937268383867
