To process these datasets efficiently on GPUs, the Blackwell architecture introduces a hardware decompression engine that can decompress compressed data at scale and speed up analytics pipelines end-to-end. The engine natively supports the LZ4, Deflate, and Snappy compression formats.
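For a sense of what the engine offloads: Deflate (one of the three supported formats) can be exercised on the host with Python's standard-library zlib — a software sketch of the same transform the engine does in hardware. LZ4 and Snappy would need third-party packages, and the GPU-side API itself (e.g. via Nvidia's nvCOMP library) is not shown here.

```python
import zlib

# Repetitive, analytics-style payload; compresses well under Deflate.
payload = b"column data " * 1024

# zlib.compress produces a zlib-wrapped Deflate stream; level 6 is the default trade-off.
compressed = zlib.compress(payload, level=6)

# This decompress step is the operation Blackwell's engine performs in hardware.
restored = zlib.decompress(compressed)

assert restored == payload
print(f"{len(payload)} bytes -> {len(compressed)} bytes compressed")
```

The point of a hardware engine is that this decompress step no longer burns CPU cycles or PCIe bandwidth on already-decompressed data.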
Looks like the rumors were true: B100 is an MCM (not quite chiplet imo) composed of two reticle-limit TSMC N4P dies connected with a high-bandwidth interconnect, likely NVLink 5 C2C. Eight stacks of HBM3E memory total, or four per die. It's an engineering marvel and an absolute beast, but I'm kind of numb to all this AI stuff tbh, especially since it's the new buzzword, so it's impossible not to get reminded of it on a daily basis. This stuff doesn't impact me directly, so while it is impressive, it's largely forgettable for me, the average consumer and casual gamer.
> Fat chance anyone is working there who believes in improving old stuff for the good of mankind.

Nvidia could perhaps improve their DirectStorage implementation from previous gen for consumer cards.
https://developer.nvidia.com/blog/n...rameter-llm-training-and-real-time-inference/
I feel Nvidia trying to be better than others in some aspect of cutting-edge graphics tech is as Nvidia as Nvidia ever gets. Fat chance anyone is working there who believes in improving old stuff for the good of mankind.
Cache in general is stupidly expensive on N3E.
Question: is consumer Blackwell also going to be a multi-chip solution? If so, that would make it very expensive since the wafer yield will essentially get halved.
> Then if RDNA5 is really good or the AI market crashes, they can sell 5090 Ti later on.

We don't have any info. What will the 5090 compete with, N52 or N51?
> 512 bit GB202

Is it for real 2x the SMs compared to the 5080?
Makes sense. The AI people will pay whatever it will cost. I do think that the 5090 will get a significant cut. They can sell a 176 SM 5090 for $2000 or so. Then if RDNA5 is really good or the AI market crashes, they can sell 5090 Ti later on.
Giant buses are not coming back except as the most desperate last ditch attempt to punch above your weight.
That ain't commercial, and honestly with GDDR7 having beefier chips, I'm not sure it makes sense either.
That's exactly what I said, yes. With GDDR7 having 3 GB chips, the bus makes even less sense.
You do realise Nvidia is the stingiest company ever when it comes to VRAM? Going to 512-bit increases the memory capacity from 48 GB to 64 GB... and 3 GB chips increase it to 96 GB. That's the reason, really.
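Those figures line up if you assume the standard 32-bit interface per GDDR chip and clamshell mode (two chips per 32-bit channel), which is how the professional cards reach those capacities — a quick sanity check (the function name is mine):

```python
def capacity_gb(bus_bits: int, gb_per_chip: int, clamshell: bool = True) -> int:
    """Total VRAM from bus width, assuming 32-bit interface per GDDR chip."""
    chips = (bus_bits // 32) * (2 if clamshell else 1)  # clamshell doubles chip count
    return chips * gb_per_chip

print(capacity_gb(384, 2))  # 48 GB: 384-bit, 2 GB chips, clamshell (prev gen)
print(capacity_gb(512, 2))  # 64 GB: the 512-bit bump alone
print(capacity_gb(512, 3))  # 96 GB: 512-bit plus 3 GB GDDR7 chips
```

So the 512-bit bus buys 16 chips worth of capacity instead of 12, and the 3 GB chips do the rest.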
> 512 bit GB202

Even the Titans didn't have 512-bit buses; Nvidia hasn't made one since GT200, 15 years ago. Since then, anything that needed more than 384-bit could provide has been HBM.
I wish we would stop with this meme. Every card that did this was an incredibly expensive waste of money. Look at the Titans.
> I wish we would stop with this meme. Every card that did this was an incredibly expensive waste of money. Look at the Titans.
> Giant buses are not coming back except as the most desperate last ditch attempt to punch above your weight.

You seem to be judging it as a gaming chip. It's not. It's a professional & prosumer card. If gamers want to pay through the nose for it, that's great, but it's not what it's designed for.