Question By 2030, we will be buying massive NPUs with a CPU and GPU attached to it.

mikegg · Mar 12, 2023

Today, NPUs such as Apple's neural engine takes less space than the CPU or GPU in a SoC. By 2030, I predict that we won't be buying "CPUs". We will all be buying NPUs with a CPU and a GPU attached to it.

NPUs will become the new CPUs.

More applications will start to make massive use of AI inference. Soon, consumers will demand that their laptops and mobile phones infer models as big as GPT or LLaMA or Stable Difusion or future large models. It has been theorized that the current iPhone 14 Pro could infer Meta's LLaMA model, though slowly and with much less accuracy.

In order to do this, chip makers will focus on making NPUs and making them huge.

We are in the beginning of a complete paradigm shift in chip requirements.

Apologies if this is the wrong place to post this. There is no NPU forum on Anandtech.

TheELF · Mar 12, 2023

If A.I. will become standard in systems then they will become accelerator modules inside CPUs and GPUs and not the other way around.
Intel already included nervana and then habana in their GPUs years ago.
The inference for apps will still be happening on servers, there is no reason for the same work to be done a billion times over instead just once.

Timmah! · Mar 12, 2023

I will be buying RTX 7090 with motherboard, CPU, RAM and whatnot plugged into it.

mikeymikec · Mar 12, 2023

In ten years' time many people will still be using processors available today. Tech adoption does not move that quickly.

TheELF · Mar 12, 2023

Timmah! said:
I will be buying RTX 7090 with motherboard, CPU, RAM and whatnot plugged into it.

Nvidia maybe not, but intel already has this, it's just a question of if they can pull it off with one of their full power ARC GPUs on there.
This thing is supposed to plug into a backboard with nothing on it other than a PCI bridge for an GPU but derbauer put it into a system.
Yo dawg...

A/// · Mar 12, 2023

Had no idea Nostra had a twin.

Mopetar · Mar 12, 2023

The NPU in your phone will actually shape your thought patterns in order to make you want a better NPU so that more of human economic activity is devoted to developing a better NPU.

If you want to prospect yourself from your future AI overlords, I suggest you purchase one of the lovely tinfoil hats I happen to have available for sale. Get one before the NPU gets you.

NostaSeronx · Mar 12, 2023

*shakes foretelling 8-ball* ah yes...indeed hmm..

I do not foresee a Neural/Inference processing unit craze. CPUs and GPUs will remain as the big features and NPU/IPU will remain as minor features. Nice to have but nothing important.

Proof of the 8-ball:

It has been foretold! *magic poof and Nosta without an R vanished*

soresu · Mar 12, 2023

Timmah! said:
I will be buying RTX 7090 with motherboard, CPU, RAM and whatnot plugged into it.

Not really viable simply because of how much IO that mobos have attached to them and how much space these connectors take up on the board.

It's possible that they could simply split some of this IO into separate daughter boards as they did decades ago - it would probably make installation less of an issue if you could split all of those connectors that are such a problem matching up once the mobo is already inside the case so that you can just plug it all into one or 2 slots on the mobo, especially on ATX12VO mobos which need extra connectors anyway.

moinmoin · Mar 12, 2023

How can one start this thread and not be aware of Lisa Su's recent talk about compute efficiency?

Her point essentially is: Energy efficiency is the primary limiter. For a target of >10.000 GF/Watt for <= 100MW Zettascale the industry will need to find ways to use AI in integral ways as efficiency shortcuts. So CPUs and GPUs fundamentally assisted by NPUs.

Shivansps · Mar 12, 2023

TheELF said:
Nvidia maybe not, but intel already has this, it's just a question of if they can pull it off with one of their full power ARC GPUs on there.
This thing is supposed to plug into a backboard with nothing on it other than a PCI bridge for an GPU but derbauer put it into a system.
Yo dawg...

That concept has been attempted since the ISA slot days, it never had success because you will only be exchaging what card is the motherboard and what card is a I/O expansion board. That said, with the upcoming PCI-E 6.0 rev im thinking it may be actually be possible to make "CPU addon cards" that will have a socket or a soldered cpu, and will just add to the system using the same system ram via the pci-e slot and NUMA nodes.

What is definatelly clear here, igpus already are a huge part of a cpu die, if they have to add a NPU to it then the cpu cores will be a very minor part. And when that happens they may actually change the name for marketing reasons (like they did with the APUs).

Doug S · Mar 13, 2023

senttoschool said:
Today, NPUs such as Apple's neural engine takes less space than the CPU or GPU in a SoC. By 2030, I predict that we won't be buying "CPUs". We will all be buying NPUs with a CPU and a GPU attached to it.

NPUs will become the new CPUs.

More applications will start to make massive use of AI inference. Soon, consumers will demand that their laptops and mobile phones infer models as big as GPT or LLaMA or Stable Difusion or future large models. It has been theorized that the current iPhone 14 Pro could infer Meta's LLaMA model, though slowly and with much less accuracy.

So you're saying "build it and they will come"?

Most inference tasks are far less latency sensitive than CPU or GPU tasks, and would work just fine via cloud for a lot of stuff where privacy is less of an issue - even where privacy is a potential concern, the fact consumers keep using products like Facebook and buying gadgets like Alexa and Ring demonstrates that they either don't care or don't understand these issues so most will not be willing to pay more to get an SoC with a giant NPU to keep those inference tasks local.

Not to mention there has to be a 'killer app' for this, something people can't live without. ChatGPT is fun to play with, but a killer app it is not.

Ranulf · Mar 13, 2023

I want a Gnu-PU or maybe a Grue-PU.

alcoholbob · Mar 13, 2023

Id be disappointed if CPUs didnt have enough cache on die to offset the unified memory of consoles, otherwise even a 5 year old console would outperform a $10,000 TOTL PC in minimum framerate.

Also, I hope micro LED is affordable by 2030…

TheELF · Mar 13, 2023

Shivansps said:
That concept has been attempted since the ISA slot days, it never had success because you will only be exchaging what card is the motherboard and what card is a I/O expansion board.

I don't get your point here, this is an all inclusive card, it only needs power.
All the I/O is on board.
If somebody would be able to come up with a mobo that would combine a bunch of these into a cloud like network it would be a thing to behold.

Shivansps · Mar 13, 2023

TheELF said:
I don't get your point here, this is an all inclusive card, it only needs power.
All the I/O is on board.
If somebody would be able to come up with a mobo that would combine a bunch of these into a cloud like network it would be a thing to behold.

Its a very old concept.

Those things have existed for a long, long time, it is just a SBC in a very innefficient format, it was used mostly for industrial uses. You can make SBCs smaller than that with more expandability. In fact LGR made a video on one of those old devices some time ago.

In fact, it actually made FAR more sence for the ISA slot because due to how it worked it allowed you to have a I/O board with multiple ISA slots. You cant really do that with PCI-E whiout adding a bridge on the I/O board.

kschendel · Mar 13, 2023

The problem I see with this notion that neural engine processing will take over, is the lack of accuracy. Wetware NPU's have been developing for millions of years, and yet we still get garbage results. (Election denial in the US, for instance.) Systems like chatgpt might be cute, but if you're using its output without a thorough vetting, you're building your house on sand, in a flood zone.

In other words, not gonna happen.

mikegg · Feb 20, 2024

itsmydamnation · Feb 20, 2024

mikegg said:
+1

Your an odd fellow....

mikegg · Feb 20, 2024

We may soon have LPUs.

Grog's Mixtral 8x7 demo is truly impressive: https://groq.com/

FlameTail · Feb 24, 2024

TheELF said:
If A.I. will become standard in systems then they will become accelerator modules inside CPUs and GPUs and not the other way around.
Intel already included nervana and then habana in their GPUs years ago.
The inference for apps will still be happening on servers, there is no reason for the same work to be done a billion times over instead just once.

There are already matrix accelerators for AI embedded in the CPU in ARM processors. (See Apple AMX and ARM SME).

Those accelerators are used for tasks where latency is more important than brute computing power, while the NPUs are used for tasks where the opposite is true.

FlameTail · Feb 24, 2024

mikeymikec said:
In ten years' time many people will still be using processors available today. Tech adoption does not move that quickly.

Not in the PC, but in the smartphone it does.

mikegg · Apr 12, 2024

M4 will be "AI focused". Will likely feature a significantly larger NPU with maybe added quant types accelerations such as INT4.

Macs to Get AI-Focused M4 Chips Starting in Late 2024

Apple will begin updating its Mac lineup with M4 chips in late 2024, according to Bloomberg's Mark Gurman. The M4 chip will be focused on...

www.macrumors.com

See OP. I believe that by 2030, the NPU will be the biggest part of the SoC. We're going to be buying NPUs with a CPU and GPU attached to it.

Doug S · Apr 12, 2024

mikegg said:
M4 will be "AI focused". Will likely feature a significantly larger NPU with maybe added quant types accelerations such as INT4.

Macs to Get AI-Focused M4 Chips Starting in Late 2024

Apple will begin updating its Mac lineup with M4 chips in late 2024, according to Bloomberg's Mark Gurman. The M4 chip will be focused on...

www.macrumors.com

See OP. I believe that by 2030, the NPU will be the biggest part of the SoC. We're going to be buying NPUs with a CPU and GPU attached to it.

I would think the future direction will be to integrate the GPU and NPU to avoid duplication of resources. There is a lot of similar floating point hardware in both, the main difference being the support for smaller formats like INT16, INT8 and even INT4.

It hasn't made sense for Apple to do this yet because the NPU is a tiny block on the overall SoC, but the bigger the NPU becomes relative to the GPU / rest of the SoC the more sense it makes.

mikegg · Apr 12, 2024

Doug S said:
I would think the future direction will be to integrate the GPU and NPU to avoid duplication of resources. There is a lot of similar floating point hardware in both, the main difference being the support for smaller formats like INT16, INT8 and even INT4.

It hasn't made sense for Apple to do this yet because the NPU is a tiny block on the overall SoC, but the bigger the NPU becomes relative to the GPU / rest of the SoC the more sense it makes.

There's a reason they don't do this today. GPUs take a substantial amount of power to run. GPUs are becoming more and more like general purpose compute. NPUs are far more like ASICs. They're for efficiency. They are optimized for low precision inference.

In 2024, the AI labs are experimenting with different quantizations. Some are trying INT4 and found that LLMs are still pretty darn good. Some are even trying single bit inference.

We don't know what is the best yet. Once we do, the NPU will evolve to focus on a particular quantization.

Question By 2030, we will be buying massive NPUs with a CPU and GPU attached to it.

Golden Member

Diamond Member

Golden Member

Lifer

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Platinum Member

Diamond Member

Diamond Member

Platinum Member

Platinum Member

Diamond Member

Diamond Member

Diamond Member

Senior member

Golden Member

Platinum Member

Golden Member

Diamond Member

Diamond Member

Golden Member

Platinum Member

Golden Member