The AI discussion thread


[DHT]Osiris

Lifer
Dec 15, 2015
16,589
15,467
146
So, you assume that whatever a PhD in astrophysics says is true. No, that wasn't you, but I have to think that since physics is in chaos, astrophysics can't be a settled realm at this point. Astrophysics uses the laws of physics. Since those are all in doubt, so is astrophysics.
Eh... macrophysics and microphysics are two different realms. We can calculate the age, size, density, energy, distribution, and makeup of the universe, galactic clusters, galaxies, star systems, and stellar bodies within a few percentage points. They all work within a realm of physics that is relatively simple and quite well understood.

Subatomic physics is another realm entirely, and is frankly voodoo bullshit.
 

KMFJD

Lifer
Aug 11, 2005
31,431
49,084
136
wired article, not in depth though



DeepSeek had to come up with more efficient methods to train its models. “They optimized their model architecture using a battery of engineering tricks—custom communication schemes between chips, reducing the size of fields to save memory, and innovative use of the mix-of-models approach,” says Wendy Chang, a software engineer turned policy analyst at the Mercator Institute for China Studies. “Many of these approaches aren’t new ideas, but combining them successfully to produce a cutting-edge model is a remarkable feat.”

DeepSeek has also made significant progress on Multi-head Latent Attention (MLA) and Mixture-of-Experts, two technical designs that make DeepSeek models more cost-effective by requiring fewer computing resources to train. In fact, DeepSeek's latest model is so efficient that it required one-tenth the computing power of Meta's comparable Llama 3.1 model to train, according to the research institution Epoch AI.
 

quikah

Diamond Member
Apr 7, 2003
4,151
710
126
DeepSeek is interesting. If all their claims pan out, this could definitely kill OpenAI. The capital-intensive AI models of OpenAI are not sustainable IMO. Will have to see how OpenAI responds. DeepSeek is open source, so possibly OpenAI could use these innovations to improve their own models.
 

misuspita

Senior member
Jul 15, 2006
660
777
136
Very interesting and prompted the biggest loss of value of a company ever, nVidia....



Even though China's DeepSeek is of course biased on Tiananmen and Taiwan, you can also see the same kind of bias in ChatGPT and western AI towards other matters
 
Reactions: Oric
Feb 4, 2009
35,741
17,281
136
Very interesting and prompted the biggest loss of value of a company ever, nVidia....



Even though China's DeepSeek is of course biased on Tiananmen and Taiwan, you can also see the same kind of bias in ChatGPT and western AI towards other matters
Such as?
 

biostud

Lifer
Feb 27, 2003
19,331
6,346
136
Very interesting and prompted the biggest loss of value of a company ever, nVidia....



Even though China's DeepSeek is of course biased on Tiananmen and Taiwan, you can also see the same kind of bias in ChatGPT and western AI towards other matters
But isn't that just the data it is trained on? If the model is open source, then anyone can use it to train on other data and build a less China-biased model?
 
Reactions: igor_kavinski

misuspita

Senior member
Jul 15, 2006
660
777
136
I admit I only know of those things indirectly from reading reddit, but googling "chatgpt censoring" gives enough results. The problem is that it's evolving, and things it didn't censor before are censored now. Breastfeeding, political things, etc. The new Trump dynasty may even prompt some changes to the responses allowed about him, his family, and his actions. I wouldn't put it past them, especially since he loves authoritarianism so much
 
Reactions: biostud

biostud

Lifer
Feb 27, 2003
19,331
6,346
136
I admit I only know of those things indirectly from reading reddit, but googling "chatgpt censoring" gives enough results. The problem is that it's evolving, and things it didn't censor before are censored now. Breastfeeding, political things, etc. The new Trump dynasty may even prompt some changes to the responses allowed about him, his family, and his actions. I wouldn't put it past them, especially since he loves authoritarianism so much
Like J6
 

[DHT]Osiris

Lifer
Dec 15, 2015
16,589
15,467
146
No current publicly accessible AI models (as far as I'm aware of) have access to live data. They're all working off a dataset that's x months or years old. It might be able to tell you the current day or maybe weather if it has hooks in it, but ask it what the most recent sub variant of COVID is and it'll give you a rough idea of how old the data it was trained on is.
 
Feb 4, 2009
35,741
17,281
136
Are any of the AI video generators not complete shit? I totally understand it’s about the prompts, but they all do multiple things wrong, such as:
Wildly vary based upon the prompt
Need multiple requests to get it moderately good
Conceal their pricing, as in how many images/videos can be made with 20 credits
Are constantly “busy” during the free trial
Need apps to function and those apps tend to be made by someone else and their review scores swing quite a lot.
 

RnR_au

Platinum Member
Jun 6, 2021
2,344
5,624
106
But isn't that just the data it is trained on? If the model is open source, then anyone can use it to train on other data and build a less China-biased model?
The model weights are open and free. The model architecture is also open and free, so open source back ends can implement it (and have), which means the weights can be run on anyone's hardware.

The training data is not free and open. You need trillions of tokens, processed during training on tens of thousands of GPUs over periods of weeks to months.

You can fine-tune the publicly available weights to remove any blind spots or censorship. This is done commonly enough, since some folks like to run AIs locally that are good at generating smut reading material according to the owner's fetishes
 

KMFJD

Lifer
Aug 11, 2005
31,431
49,084
136

lol, lmao

this is part of the reason isn't it?

https://stratechery.com/2025/deepseek-faq/

more detailed

Here’s the thing: a huge number of the innovations I explained above are about overcoming the lack of memory bandwidth implied in using H800s instead of H100s. Moreover, if you actually did the math on the previous question, you would realize that DeepSeek actually had an excess of computing; that’s because DeepSeek actually programmed 20 of the 132 processing units on each H800 specifically to manage cross-chip communications. This is actually impossible to do in CUDA. DeepSeek engineers had to drop down to PTX, a low-level instruction set for Nvidia GPUs that is basically like assembly language. This is an insane level of optimization that only makes sense if you are using H800s.

Lastly, we emphasize again the economical training costs of DeepSeek-V3, summarized in Table 1, achieved through our optimized co-design of algorithms, frameworks, and hardware. During the pre-training stage, training DeepSeek-V3 on each trillion tokens requires only 180K H800 GPU hours, i.e., 3.7 days on our cluster with 2048 H800 GPUs. Consequently, our pre-training stage is completed in less than two months and costs 2664K GPU hours. Combined with 119K GPU hours for the context length extension and 5K GPU hours for post-training, DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.
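
For anyone who wants to sanity-check the numbers in that quote, here is a rough sketch in Python. It only redoes the arithmetic from the excerpt (GPU hours, cluster size, the assumed $2 per GPU hour rental price). The constant and function names are made up for illustration, and it says nothing about costs the quote explicitly excludes, such as prior research and ablation runs.

```python
# Figures copied from the DeepSeek-V3 excerpt quoted above.
# All names below are hypothetical, chosen only for this sketch.
GPU_HOURS_PER_TRILLION_TOKENS = 180_000   # H800 GPU hours per trillion tokens
CLUSTER_GPUS = 2048                       # H800 GPUs in the cluster
PRETRAIN_GPU_HOURS = 2_664_000            # pre-training stage
CONTEXT_EXT_GPU_HOURS = 119_000           # context length extension
POST_TRAIN_GPU_HOURS = 5_000              # post-training
RENTAL_USD_PER_GPU_HOUR = 2.0             # rental price assumed in the quote


def wall_clock_days(gpu_hours: float, gpus: int = CLUSTER_GPUS) -> float:
    """Convert total GPU hours into wall-clock days on a cluster of `gpus` GPUs."""
    return gpu_hours / gpus / 24


def total_gpu_hours() -> int:
    """Sum the three training stages listed in the quote."""
    return PRETRAIN_GPU_HOURS + CONTEXT_EXT_GPU_HOURS + POST_TRAIN_GPU_HOURS


def total_cost_usd() -> float:
    """Total rental cost at the assumed price per GPU hour."""
    return total_gpu_hours() * RENTAL_USD_PER_GPU_HOUR


if __name__ == "__main__":
    print(f"days per trillion tokens: {wall_clock_days(GPU_HOURS_PER_TRILLION_TOKENS):.1f}")
    print(f"pre-training days:        {wall_clock_days(PRETRAIN_GPU_HOURS):.1f}")
    print(f"total GPU hours:          {total_gpu_hours():,}")
    print(f"total cost (USD):         {total_cost_usd():,.0f}")
```

The results line up with the quote: about 3.7 days per trillion tokens, a pre-training run of under two months, 2.788M GPU hours in total, and $5.576M at the assumed rental price.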
 

Kaido

Elite Member & Kitchen Overlord
Feb 14, 2004
49,837
6,182
136
DeepSeek is interesting. If all their claims pan out, this could definitely kill OpenAI. The capital-intensive AI models of OpenAI are not sustainable IMO. Will have to see how OpenAI responds. DeepSeek is open source, so possibly OpenAI could use these innovations to improve their own models.

ChatGPT just lost its job to AI
 