Discussion: Nvidia Blackwell in Q1-2025

Page 102

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,108
136
I think this needs to end, Nvidia should stop selling flagship cards and make them available through subscription only. $200/month and you get your 6090 delivered, no more bids and wasted time. For a premium $300/month you can even get early access to new cards before the press does (24-month subscription only, terms and conditions apply)

Subscription GPUs or any hardware sub will never take off. It’s not a good business model.

NVidia already sells subscription access to their gaming GPUs and a PC to go with it.. GeForce Now...

If you can live with the latency and have the internet for it, you can stop chasing HW, and just subscribe for much less. Though it's 080 class, not 090...
 

CP5670

Diamond Member
Jun 24, 2004
5,633
733
126
This is what it takes to get one. I will hang on to my 4090, got some good resale offers on it but I don't want to do this.

Subscription GPUs or any hardware sub will never take off. It’s not a good business model.
You would think so, but the car companies all went in this direction years ago. Tesla/BMW/Merc all charge annual fees for features. Same with consoles. I could see it happening on PCs and phones at some point.
 

Josh128

Senior member
Oct 14, 2022
612
1,001
106
This is what it takes to get one. I will hang on to my 4090, got some good resale offers on it but I don't want to do this.


You would think so, but the car companies all went in this direction years ago. Tesla/BMW/Merc all charge annual fees for features. Same with consoles. I could see it happening on PCs and phones at some point.
Yeah, but those are software-locked features. The hardware is already built into the cars.
 

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,280
96
Can someone ELI5 why DeepSeek being successful would crush Nvidia prices? Is it not running on Nvidia GPUs?
Apparently, DeepSeek R1 requires only 1/25th to 1/30th of the resources. That's roughly 3-4% of the resources of older (and now possibly obsolete) models like ChatGPT. DeepSeek makes ChatGPT look like a dinosaur headed for extinction.

There are videos on YouTube showing DeepSeek running on an Intel Arc 580 at usable speeds. Imagine that! If we patiently sift through all the noise, one thing is clear: what just happened is a revolution in AI. How big a revolution? Just look at Nvidia's stock price.
 

coercitiv

Diamond Member
Jan 24, 2014
6,956
15,589
136
There are videos on YouTube showing DeepSeek running on an Intel Arc 580 at usable speeds. Imagine that!
Those videos are likely not running DeepSeek R1, but rather a distilled model that bakes R1's reasoning into a smaller Llama or Qwen base (one that fits into the Arc 580's VRAM).
Using the reasoning data generated by DeepSeek-R1, we fine-tuned several dense models that are widely used in the research community. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. Please use our setting to run these models.
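
For anyone curious what running one of those distilled checkpoints looks like in practice, here is a minimal sketch using Hugging Face transformers. Treat it as an illustration only: the repo name and settings are my assumptions, not a recipe from DeepSeek, and at FP16 even a 7B distill needs ~14 GB of VRAM, so a smaller card would want a quantized GGUF build (llama.cpp or similar) instead of this full-precision load.

# Minimal sketch: load a distilled R1 checkpoint with transformers.
# The repo name below is an assumption; pick whichever distill size fits your VRAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # assumed repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~2 bytes per weight; quantize further for small cards
    device_map="auto",          # offloads layers to CPU RAM if the GPU runs out of memory
)

prompt = "Briefly explain why memory bandwidth limits token generation speed."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(out[0], skip_special_tokens=True))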
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,653
6,108
136
Apparently, DeepSeek R1 requires only 1/25th to 1/30th of the resources. That's roughly 3-4% of the resources of older (and now possibly obsolete) models like ChatGPT. DeepSeek makes ChatGPT look like a dinosaur headed for extinction.

There are videos on YouTube showing DeepSeek running on an Intel Arc 580 at usable speeds. Imagine that! If we patiently sift through all the noise, one thing is clear: what just happened is a revolution in AI. How big a revolution? Just look at Nvidia's stock price.

I think there is a lot of exaggeration out there. From what I've read, they're just combining already-known efficiency techniques. It probably requires a lot more human labor to set up, but in China, where human labor is cheaper and AI hardware is restricted, it's a perfect fit. In the West they may keep doing more brute force.

Also, from what I'm reading, the real threat is not to NVidia but to the big proprietary AI models from Google and OpenAI, because DeepSeek has open-sourced its weights.

Also, all the efficiency gains are on the training side. The models still require the same amount of VRAM that other models require.

Full DeepSeek V3 is 671 Billion Parameters, and needs over 1500 GB of VRAM... That's not running on any desktop GPU.

You can make small cut down models just like you can for other models.

There is a cut down DeepSeek 7 Billion parameter LLM model that will fit in 16GB, just as there are other cut down models...
 
Last edited:

CakeMonster

Golden Member
Nov 22, 2012
1,575
755
136
Obviously DeepSeek is doing something right; unless it's a very elaborate fraud, it's clearly a breakthrough that gives us a jump in performance and abilities. I don't get why NVidia should tank because of it, though; clearly AI is not at some kind of end destination and won't be for a long time. Sure, if current AI offerings can be processed on cheaper and slightly different product configurations than what NVidia is selling, that's a bummer for them short term. But this market changes fast, and why would the need for hardware stop at all? I think those who should feel threatened are the entitled frauds who were on top of the model game because of luck and mostly the work of others, and who have been making ridiculous sci-fi statements. I would LOVE for those to get wrecked by competition. NVidia is a very innovative hardware (and software) designer, and although their practices in the gaming market have been questionable, there are bigger fish to fry who will feel it way before NVidia.
 

techjunkie123

Member
May 1, 2024
125
252
96
Full DeepSeek V3 is 671 Billion Parameters, and needs over 1500 GB of VRAM... That's not running on any desktop GPU.

You can make small cut down models just like you can for other models.

There is a cut down DeepSeek 7 Billion parameter LLM model that will fit in 16GB, just as there are other cut down models...
Is there a nice article that shows how performance depends on the number of parameters in the model? Like, what's the smallest reasonable model?
 

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,280
96
That's not the problem for Nvidia; the problem is (if it's true) that training that model only cost $6 million.
Nvidia is screwed. They're not walking out of this one with flying colors. Imagine all the people who spent billions building AI data centers. There's no easy way for them to make back their money. If it's Microsoft, Facebook, Google, Elon Musk, etc., I think it's kinda okay, as they can take the hit. The rest, I pray for them.

Nvidia is now in such a precarious position, there's no simple way out. Looks like the AI bubble has finally burst!
 

yottabit

Golden Member
Jun 5, 2008
1,566
656
146
Nvidia is screwed. They're not walking out of this one with flying colors. Imagine all the people who spent billions building AI data centers. There's no easy way for them to make back their money. If it's Microsoft, Facebook, Google, Elon Musk, etc., I think it's kinda okay, as they can take the hit. The rest, I pray for them.

Nvidia is now in such a precarious position, there's no simple way out. Looks like the AI bubble has finally burst!
I still really don't get it... but maybe you guys are smarter than me.

Like, if somebody invents a great new AI model, how is that going to be bad for the leader in GPUs?

Sure there's an order of magnitude efficiency gain in the training... won't they just make even bigger models then that are even better?

I don't get the problem for Nvidia. For OpenAI and Meta, I understand since this is an open model.

I think it's more just the collective shock of how much a unicorn tech bro bubble this was, that you don't have to pour 500B in to get meaningful results and the whole AI hype machine was built just as a pump and dump... maybe that I can see. But it feels like virtually anything could have triggered that realization.

Anyway, I'm happy there is a new and competitive open model, even if it wrecks my 401k.
 

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,280
96
I still really don't get it... but maybe you guys are smarter than me.

Like, if somebody invents a great new AI model, how is that going to be bad for the leader in GPUs?

Sure there's an order of magnitude efficiency gain in the training... won't they just make even bigger models then that are even better?

I don't get the problem for Nvidia. For OpenAI and Meta, I understand since this is an open model.

I think it's more just the collective shock of how much a unicorn tech bro bubble this was, that you don't have to pour 500B in to get meaningful results and the whole AI hype machine was built just as a pump and dump... maybe that I can see. But it feels like virtually anything could have triggered that realization.

Anyway, I'm happy there is a new and competitive open model, even if it wrecks my 401k.
DeepSeek enables smaller but very capable models to run on very few GPUs. Someone on X/Twitter said he now runs his own instance locally on his 4090 that does code gen. It's nothing short of a miracle for a developer. No internet connection needed. No subscription needed. And no query limits. Like it's totally free. And as capable as o1.

In essence, when the requirements for AI come down drastically (both for training and for end users), it directly hurts Nvidia's sales, as they're the predominant supplier. Of course, GPUs will always be needed for stuff like AGI. But for most common use cases, DeepSeek has literally made it free for many, which kinda breaks Nvidia's backbone.
 
Reactions: Tlh97

biostud

Lifer
Feb 27, 2003
19,070
6,004
136
I still really don't get it... but maybe you guys are smarter than me.

Like, if somebody invents a great new AI model, how is that going to be bad for the leader in GPUs?

Sure there's an order of magnitude efficiency gain in the training... won't they just make even bigger models then that are even better?

I don't get the problem for Nvidia. For OpenAI and Meta, I understand since this is an open model.

I think it's more just the collective shock of how much a unicorn tech bro bubble this was, that you don't have to pour 500B in to get meaningful results and the whole AI hype machine was built just as a pump and dump... maybe that I can see. But it feels like virtually anything could have triggered that realization.

Anyway, I'm happy there is a new and competitive open model, even if it wrecks my 401k.
If it is "good enough" then many small companies can sell AI as a commodity or or let it be part of the usual "you are the product payment" and in that case the demand for AI will still grow really fast, but the demand for chips will grow much slower, because of you can get good enough for "free" only very few will pay for really expensive models.
 

APU_Fusion

Golden Member
Dec 16, 2013
1,411
2,118
136
DeepSeek enables smaller but very capable models to run on very few GPUs. Someone on X/Twitter said he now runs his own instance locally on his 4090 that does code gen. It's nothing short of a miracle for a developer. No internet connection needed. No subscription needed. And no query limits. Like it's totally free. And as capable as o1.

In essence, when the requirements for AI come down drastically (both for training and for end users), it directly hurts Nvidia's sales, as they're the predominant supplier. Of course, GPUs will always be needed for stuff like AGI. But for most common use cases, DeepSeek has literally made it free for many, which kinda breaks Nvidia's backbone.
New 6090 offers 70% more performance for a massive value of $999.99 🤣
 

maddie

Diamond Member
Jul 18, 2010
5,029
5,306
136
I still really don't get it... but maybe you guys are smarter than me.

Like, if somebody invents a great new AI model, how is that going to be bad for the leader in GPUs?

Sure there's an order of magnitude efficiency gain in the training... won't they just make even bigger models then that are even better?

I don't get the problem for Nvidia. For OpenAI and Meta, I understand since this is an open model.

I think it's more just the collective shock of how much a unicorn tech bro bubble this was, that you don't have to pour 500B in to get meaningful results and the whole AI hype machine was built just as a pump and dump... maybe that I can see. But it feels like virtually anything could have triggered that realization.

Anyway, I'm happy there is a new and competitive open model, even if it wrecks my 401k.
Nvidia's revenue/profit projections were made on increasing sales of very expensive products, with demand growing quickly into the future. If you now need far fewer GPUs to deliver the same output, then all of those projections are worthless (worth less) by a lot.

What then is the new valuation for the company shares?

There seem, at present, to be limits to aggressively pushing larger models. Maybe further research will remove those limits; who knows.
 

coercitiv

Diamond Member
Jan 24, 2014
6,956
15,589
136
Like, if somebody invents a great new AI model, how is that going to be bad for the leader in GPUs?

Sure there's an order of magnitude efficiency gain in the training... won't they just make even bigger models then that are even better?
How much are you willing to bet on the bigger models? A few coins, a month's salary? There's a degree of hesitation in the market right now; investors had a wake-up call that was long overdue. Obviously nothing is certain and DeepSeek is riding a hype wave, but so was OpenAI until recently... it was all hype (in terms of investment confidence). If OpenAI can generate FOMO using smoke and mirrors, it turns out another player can generate FUD using the same tricks.

This video from Asianometry does a good job of presenting the recent AI landscape, with both a healthy dose of skepticism and a good dose of optimism:
 

Josh128

Senior member
Oct 14, 2022
612
1,001
106
Nvidia is screwed. They're not walking out of this one with flying colors. Imagine all the people who spent billions building AI data centers. There's no easy way for them to make back their money. If it's Microsoft, Facebook, Google, Elon Musk, etc., I think it's kinda okay, as they can take the hit. The rest, I pray for them.

Nvidia is now in such a precarious position, there's no simple way out. Looks like the AI bubble has finally burst!
Patty G disagrees wholeheartedly.

 

RnR_au

Platinum Member
Jun 6, 2021
2,205
5,271
106
Is there a nice article that shows how performance depends on the number of parameters in the model? Like, what's the smallest reasonable model?

I don't know of an article. https://old.reddit.com/r/LocalLLaMA/ is very active and if you read it for a bit you get used to the lingo and the latest models.

On performance: one of the other folk mentions coding below. There is a series of models dedicated to coding, the Qwen 2.5 Coder. They come in sizes from 0.5b up to 32b; 32b is quite good. Folk are running it on a single RTX 3090 at a decent quant (see below) to fit into the VRAM limit with some context as well.

One aspect of performance is the tokens generated per second. Inference is mostly a memory-bandwidth-limited workload: the entire model needs to be read for every token. So a 32b model at q4 (https://huggingface.co/bartowski/Qwen2.5-Coder-32B-Instruct-GGUF) needs to process roughly 20GB of weights for every token, and a rough rule of thumb is that the theoretical upper limit on performance is VRAM membw / model size = tokens/s. There is overhead of course, and then the optimisation of the actual backend you are using. There is also prompt processing (prefill), which needs to run before tokens start to be generated.
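
To make that arithmetic concrete, here is the rule of thumb as a couple of lines of Python. The bandwidth number is an illustrative assumption, not a measurement of any particular card or backend:

# Back-of-the-envelope ceiling on generation speed: roughly all of the weights
# are read once per generated token, so bandwidth / model size is an upper bound.

def tokens_per_second_ceiling(model_size_gb: float, mem_bw_gb_s: float) -> float:
    return mem_bw_gb_s / model_size_gb

# e.g. ~20 GB of q4 weights on a GPU with ~1000 GB/s of memory bandwidth
print(tokens_per_second_ceiling(20, 1000))  # ~50 tokens/s before any overhead

Real numbers land below that because of prefill, KV-cache reads and backend overhead, but it's a handy sanity check for whether a model will feel usable.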

I think there is a lot of exaggeration out there. From what I've read, they're just combining already-known efficiency techniques. It probably requires a lot more human labor to set up, but in China, where human labor is cheaper and AI hardware is restricted, it's a perfect fit. In the West they may keep doing more brute force.
Nothing to do with human labour. It's entirely due to the rumour that DeepSeek is a side project from a Chinese quant company. Lots of very, very smart math-heavy folk in the company with access to some nice GPUs that are idle during the night. The company is already profitable from their quant trading, so they don't need a business model for their AI models. The reported number of GPU hours is so low that if you were to rent from a cloud provider to train the latest DeepSeek R1, it would cost $5.5mil. But this is a rumour. Could just be a brag.

There is a cut down DeepSeek 7 Billion parameter LLM model that will fit in 16GB, just as there are other cut down models...
Commonly models are not run with the weights in BF16 format, but rather in 8 bits or lower; the weights are quantised. So a 7b model can run in 7GB of VRAM + overhead + context, or at q4 in 3.5GB of VRAM but with not-so-good quality. It depends on what you need.

The rule of thumb is that the larger the model, the smaller the quant you can run without losing quality. So 72b models are often run at q4 spread across a couple of GPUs. A 3b model at q4 may perform terribly, i.e. generate nonsense, get lost in its own narrative, or lose sight of the context and prompt the user passed in.
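
The sizing side of this is just parameters × bits per weight / 8. A tiny sketch, weight-only and ignoring KV cache, context and runtime overhead:

# Rough weight-only size estimate for a given parameter count and quant level.

def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * bits_per_weight / 8  # billions of params * bytes per weight

for params, bits in [(7, 8), (7, 4), (32, 4), (70, 4)]:
    print(f"{params}B at {bits}-bit: ~{weight_size_gb(params, bits):.1f} GB of weights")
# 7B/8-bit ~7.0, 7B/4-bit ~3.5, 32B/4-bit ~16.0, 70B/4-bit ~35.0

Actual GGUF files come out a bit larger than the pure math because of quant scales and mixed-precision layers, which is why a q4 32b file lands nearer 20 GB than 16 GB.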

DeepSeek enables smaller but very capable models to run on very few GPUs. Someone on X/Twitter said he now runs his own instance locally on his 4090 that does code gen. It's nothing short of a miracle for a developer. No internet connection needed. No subscription needed. And no query limits. Like it's totally free. And as capable as o1.
This is nothing new. Folk have been running AI models locally for a long while. To be able to run the full DeepSeek R1 at home, you would need a serious amount of hardware and then be happy with a serious power bill. Most likely they are running a smaller distilled version which doesn't have R1's full quality but is worth running. But software developers have been using AI models for a while to help them scale their own personal productivity.

I've been reading LocalLLaMA every day for the last 6 months or so, and AI models leapfrog each other every month in capability terms. China is very strong in AI, but so is the West. The Nvidia bubble was always going to pop at some point, and it seems a mere rumour of a $5.5mil training cost was enough to call capital expenditure on Nvidia hardware into question.

Seems investors are skittish. Good.
 
Last edited:

yottabit

Golden Member
Jun 5, 2008
1,566
656
146
How much are you willing to bet on the bigger models? A few coins, a month's salary? There's a degree of hesitation in the market right now; investors had a wake-up call that was long overdue. Obviously nothing is certain and DeepSeek is riding a hype wave, but so was OpenAI until recently... it was all hype (in terms of investment confidence). If OpenAI can generate FOMO using smoke and mirrors, it turns out another player can generate FUD using the same tricks.

This video from Asianometry does a good job of presenting the recent AI landscape, with both a healthy dose of skepticism and a good dose of optimism:
I think we can all agree the news is bad for Nvidia the meme stock in the short term. It remains to be seen whether it's bad for their fundamentals (revenue) in the longer term.
 
Reactions: Tlh97 and coercitiv

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,280
96
It has an even smaller moat there I think. Unless they bought patents recently.
They got on the crypto train early, and the same with AI. I'm sure they have a crack team working on the next big thing already. That's what they do.
 
Reactions: Tlh97

Mopetar

Diamond Member
Jan 31, 2011
8,201
7,027
136
I think this needs to end, Nvidia should stop selling flagship cards and make them available through subscription only. $200/month and you get your 6090 delivered, no more bids and wasted time. For a premium $300/month you can even get early access to new cards before the press does (24-month subscription only, terms and conditions apply)

Meanwhile, everyone else can buy prepaid cards with credit on them. Buy a $300 5060 with 300 hours of gaming on it and 6 months of credit life. The more you buy, the more you save!

Shhh... you might give them ideas.
 