Discussion Nvidia Blackwell in Q1-2025


adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,503
96
The 5090 has 0.88x the L1 bandwidth with 1.33x the SMs.
That is about a third less bandwidth per SM (0.88 / 1.33 ≈ 0.66x), what on earth?
Well it clocks lower.
But that's a nasty regression either way.
They lobotomized the SM to not win on fmax.
Like how Maxwell lobotomized the SM in a bunch of ways relative to Kepler, but it also clocked like 30-40% faster iso-node.

I think they just missed and by quite a lot.
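
For reference, the per-SM figure can be sanity-checked from the two ratios quoted above. This is only a sketch: the 0.88x and 1.33x values are taken from the post at face value, the optional clock ratio uses the roughly 10% clock drop mentioned later in the thread, and the function name is made up for illustration.

```python
def per_sm_ratio(total_bw_ratio: float, sm_count_ratio: float,
                 clock_ratio: float = 1.0) -> float:
    """L1 bandwidth per SM (and per clock, if clock_ratio is given) vs. the older part."""
    return total_bw_ratio / sm_count_ratio / clock_ratio


# Ratios quoted in the thread, not measured here.
per_sm = per_sm_ratio(0.88, 1.33)
print(f"per SM:           {per_sm:.2f}x ({(1 - per_sm) * 100:.0f}% lower)")
# per SM:           0.66x (34% lower)

# Assumption: clocks are about 10% lower, as claimed later in the thread.
per_sm_clk = per_sm_ratio(0.88, 1.33, clock_ratio=0.90)
print(f"per SM per clock: {per_sm_clk:.2f}x ({(1 - per_sm_clk) * 100:.0f}% lower)")
# per SM per clock: 0.74x (26% lower)
```

So even after normalizing for the lower clock, the quoted numbers still imply roughly a quarter less L1 bandwidth per SM per cycle.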
 

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,503
96
With Ampere and the dual-FP32 design they doubled L1 bandwidth; it seems the return to a Maxwell/Pascal-like ALU design also brought back the older L1.
Turing also had cut-down L1 bandwidth relative to Volta.
No idea what the reason for doing it here is.
 

Grooveriding

Diamond Member
Dec 25, 2008
9,144
1,322
126
Igor's Lab seems to have posted an early RTX 5080 review and then taken it down fast.
According to people who read it, it was about 7% over the 4080s at 4K.

Pathetic. The 5080 is a disaster. They know they can get away with it though, as it’s still the third best card, and the second best is only available on the used market for more $$.
 

branch_suggestion

Senior member
Aug 4, 2023
504
1,051
96
Well it clocks lower.
But that's a nasty regression either way.
They lobotomized the SM to not win on fmax.
Like how Maxwell lobotomized the SM in a bunch of ways relative to Kepler, but it also clocked like 30-40% faster iso-node.

I think they just missed and by quite a lot.
The clock drop is only about 10%, so they clearly architected for density at the cost of associativity, hoping to get a good fmax bump to compensate, à la Maxwell.
Instead the opposite happened, with seemingly little to show in PPW.
With Ampere and the dual-FP32 design they doubled L1 bandwidth; it seems the return to a Maxwell/Pascal-like ALU design also brought back the older L1.
Sounds like a design lead quirk.
 

ToTTenTranz

Senior member
Feb 4, 2021
278
522
136
Supply is very low because this is a **** gen so the only way they can get away with launching it is by creating artificial scarcity.

With 512-bit GDDR7 bandwidth and 8x FP4 throughput, the RTX 5090 can probably run DeepSeek's (or a similarly trained) 32B model super fast.

It's going to be gobbled up by businesses that can now run GPT-o1 (ish) models on a ~$2000 GPU, which is ridiculous compared to what they needed before. Same thing with the RTX 5080 running 14B models.
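
A rough back-of-envelope for the claim above, as a sketch rather than a benchmark. The assumptions are loud ones: weights stored at a flat 4 bits each (real FP4 formats add scaling metadata), KV cache and activations ignored, decode treated as purely memory-bandwidth-bound, and the VRAM and bandwidth numbers are the published card specs as I recall them (32 GB and 1792 GB/s for the 5090, 16 GB and 960 GB/s for the 5080). The helper names are made up.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Size of the weights alone in GB (no KV cache, no activations)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def decode_ceiling_tok_s(bandwidth_gb_s: float, footprint_gb: float) -> float:
    """Upper bound on tokens/s if each token streams all weights from VRAM once."""
    return bandwidth_gb_s / footprint_gb


# name: (VRAM GB, bandwidth GB/s, model size in billions) -- assumed specs, see above
cards = {
    "RTX 5090": (32, 1792, 32),
    "RTX 5080": (16, 960, 14),
}

for name, (vram_gb, bw_gb_s, params_b) in cards.items():
    size_gb = weight_footprint_gb(params_b, bits_per_weight=4)
    ceiling = decode_ceiling_tok_s(bw_gb_s, size_gb)
    fits = "fits" if size_gb < vram_gb else "does not fit"
    print(f"{name}: {params_b}B @ 4-bit = {size_gb:.0f} GB ({fits} in {vram_gb} GB), "
          f"ceiling ~{ceiling:.0f} tok/s")
```

Under those assumptions the 32B model needs about 16 GB of weights and the 14B model about 7 GB, with single-stream decode ceilings of roughly 112 and 137 tokens/s. Real throughput will be lower once the KV cache and kernel overheads are counted.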


Just genuinely baffled at NV delivering a new SM that's just like, nothing.

Blackwell SMs were upgraded for transformers and tensor FP4, nothing else. There's really nothing there for gaming, except for the new transformer-based DLSS.
Gaming was an afterthought for Blackwell. It's an AI architecture. Be glad there are still ROPs in there.

They're going to be great cards if you want to become an AI hobbyist, though.
 

Win2012R2

Senior member
Dec 5, 2024
647
609
96
There's really nothing there for gaming
Changing the 2nd FP32 path to also support INT should have done something positive, since according to Nvidia about 1/3 of ops are integer and that was supposedly the limiting factor in dual issue. But it looks like it did diddly squat, perhaps because of an L1 bandwidth limitation that GDDR7 can't fix.

They know they can get away with it though

Can they? Last time the 4080 bombed (and the other "4080" got canned in days).
 

ToTTenTranz

Senior member
Feb 4, 2021
278
522
136
Changing the 2nd FP32 path to also support INT should have done something positive, since according to Nvidia about 1/3 of ops are integer and that was supposedly the limiting factor in dual issue. But it looks like it did diddly squat, perhaps because of an L1 bandwidth limitation that GDDR7 can't fix.

Dual-issue ALUs make very little difference if they're not increasing caches and registers accordingly. Turing introduced FP32 + INT32 and that's where the bulk of the per-SM performance increase happened. Ampere then made it FP32/INT32 + FP32 and it did almost nothing, and now it's FP32/INT32 + FP32/INT32 and it did nothing.


The truth is dual-issue ALUs seem to be super cheap transistor- and area-wise, but the performance gains are just as small.
AMD saw the same when they introduced dual-issue ALUs in RDNA3.
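
To put a number on the issue-rate side of this, here is a toy model of the three layouts. It is a deliberate oversimplification and not how the hardware schedules: two pipes, one op per pipe per cycle, perfect scheduling, no dependencies, and no register, cache, or bandwidth limits. The design labels and the function are invented for the sketch; the 1/3 integer share is the figure attributed to Nvidia earlier in the thread.

```python
def cycles_per_op(int_frac: float, design: str) -> float:
    """Average cycles per instruction for two issue pipes under perfect scheduling."""
    fp_frac = 1.0 - int_frac
    if design == "fp+int":         # Turing-like: one FP32 pipe, one INT32 pipe
        return max(fp_frac, int_frac)
    if design == "fp+fp/int":      # Ampere-like: FP32 pipe plus a shared FP32/INT32 pipe
        return max(0.5, int_frac)  # all INT work must go through the shared pipe
    if design == "fp/int+fp/int":  # both pipes accept either type
        return 0.5
    raise ValueError(f"unknown design: {design}")


int_share = 1 / 3  # integer fraction of the instruction mix, as quoted in the thread
baseline = cycles_per_op(int_share, "fp+int")
for design in ("fp+int", "fp+fp/int", "fp/int+fp/int"):
    speedup = baseline / cycles_per_op(int_share, design)
    print(f"{design:>14}: {speedup:.2f}x issue throughput vs. fp+int")
```

With a 1/3 integer mix this prints 1.00x, 1.33x, and 1.33x. In this idealized model the second INT-capable pipe only helps once integer ops exceed half the mix, and any shortfall from the ideal 1.33x on the middle layout has to come from limits the model leaves out, such as registers, caches, and bandwidth.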
 

CastleBravo

Member
Dec 6, 2019
174
405
136
I want to see performance of 5080 vs 4080 with DLSS4 transformer model turned on. If the 50 series doesn't lose ~5% FPS with transformer vs CNN like the 40 series, maybe the 5080 is at least slightly better.
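
As a sketch of what that would mean in numbers: the ~7% lead is the unverified figure reported earlier in the thread from a pulled review, the ~5% transformer-model penalty on the 40 series is the estimate from this post, and a zero penalty on the 50 series is purely hypothetical. The function name is invented for illustration.

```python
def lead_with_transformer(base_lead: float, old_penalty: float,
                          new_penalty: float) -> float:
    """Relative lead of the new card once each card pays its own FPS penalty."""
    return (1 + base_lead) * (1 - new_penalty) / (1 - old_penalty) - 1


# Hypothetical best case: the 50 series pays nothing, the 40 series loses 5%.
best = lead_with_transformer(0.07, old_penalty=0.05, new_penalty=0.0)
# Both cards lose the same 5%: the lead is unchanged.
same = lead_with_transformer(0.07, old_penalty=0.05, new_penalty=0.05)

print(f"best case: {best * 100:.1f}% lead, equal penalty: {same * 100:.1f}% lead")
# best case: 12.6% lead, equal penalty: 7.0% lead
```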
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,108
136
I want to see performance of 5080 vs 4080 with DLSS4 transformer model turned on. If the 50 series doesn't lose ~5% FPS with transformer vs CNN like the 40 series, maybe the 5080 is at least slightly better.

Totally better. 3X as many Fake Frames. What more could you ask for?
 

CastleBravo

Member
Dec 6, 2019
174
405
136
Totally better. 3X as many Fake Frames. What more could you ask for?

Not frame gen, just super resolution. Blackwell might handle the new and improved model without a performance hit, unlike Ada and Ampere. Hopefully we end up with at least a "4080 Ti" rather than just a 4080 Tie.
 

Win2012R2

Senior member
Dec 5, 2024
647
609
96
Ampere then made it FP32/INT32+FP32 and it did almost nothing
Yeah, but the explanation at the time was that since INT is about 33% of the mix in gaming, they could not keep the second FP32 path busy. Now both supposedly support it, so what's the problem: total bandwidth starvation? It matters because if they fix that on N3, then perf in the 60 series might be good, which would be a reason not to buy the 50 series.
 

ToTTenTranz

Senior member
Feb 4, 2021
278
522
136
Yeah, but the explanation at the time was that since INT is about 33% of the mix in gaming, they could not keep the second FP32 path busy. Now both supposedly support it, so what's the problem: total bandwidth starvation? It matters because if they fix that on N3, then perf in the 60 series might be good, which would be a reason not to buy the 50 series.

The 60 series is most probably coming only in 2027.

RTX 20/30 and RX 6000 users are probably going to upgrade this year, and they're not waiting another 2 years.
 