Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,953
1,567
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 Gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices, aside from occasional slight clock-speed differences.

EDIT:



M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, H.265 (HEVC), and ProRes

M3 Family discussion here:


M4 Family discussion here:

 
Last edited:

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,754
106
Apple Rumour from The Elec via Jukanlosreve:

"Regarding that news, The Elec has uploaded a YouTube video. It’s in Korean, but I’ll translate it for you.


Samsung, in collaboration with Apple, is developing a technology to transition from the conventional LPDDR low-power DRAM packaging method to a discrete approach. This is an essential technology for enhancing on-device AI performance required by iPhones and the next generation of foldable phones.

Previously, the PoP (package-on-package) method focused on placing memory directly on top of the processor to reduce communication latency. While this method was advantageous for miniaturization and communication efficiency, it had the following limitations:

1. Limited bandwidth: Because the memory and processor were directly connected, it was difficult to increase the number of data transmission channels.

2. Overheating issues: Since the processor and memory overlapped, heat tended to accumulate, making effective cooling challenging.

The discrete packaging method improves upon the existing PoP approach by physically separating the memory from the processor, offering the following benefits:

1. Increased bandwidth: By placing the memory independently on the substrate, it becomes possible to add more data transmission channels, thereby improving data transfer speeds.

2. Improved heat dissipation: Separating the memory and processor prevents overlapping heat accumulation and enhances the device’s overall stability.

However, this approach introduces a trade-off: increasing the physical distance between the memory and CPU can lead to higher communication latency.

Apple requires this technology to implement generative AI functionalities in its foldable phones and on-device AI models rumored to launch in 2026. Because on-device AI processes computations directly on the device, high bandwidth and efficient thermal management are essential. Discrete packaging enables increased bandwidth and efficient cooling, thus meeting these critical requirements."

//

It has been speculated that Apple might be switching to LLW DRAM or Mobile HBM.

What is LLW DRAM?

What is Mobile HBM?
 
Last edited:
Reactions: Mopetar

okoroezenwa

Member
Dec 22, 2020
120
125
116
It has been speculated that Apple might be switching to LLW DRAM or Mobile HBM.

What is LLW DRAM?

What is Mobile HBM?
There was a rumour a few months ago that they’d be switching to LLW RAM in 2026, so I’m curious what the differences are between it and mobile HBM and if that rumour is correct.
 

SpudLobby

Golden Member
May 18, 2022
1,027
695
106
There was a rumour a few months ago that they’d be switching to LLW RAM in 2026, so I’m curious what the differences are between it and mobile HBM and if that rumour is correct.
I am skeptical of the LLW RAM claim. It seems niche and forfeits LPDDR6 volumes, and LPDDR6 also offers new power modes, gains from node shrinks, and more standard data-rate and bus-width options. I doubt they're going to hitch their wagon to Samsung exclusively either.

LPDDR6 with a 96-bit bus @ 13,000 MT/s would get you 156 GB/s before the metadata adjustment. Even going down to 72-bit at that speed — or assuming the metadata subtraction puts it there — it's 117 GB/s.
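For anyone who wants to sanity-check that arithmetic, a quick sketch; the bus widths and the 13,000 MT/s data rate are the hypotheticals from this post, not confirmed specs:

```python
# Peak-bandwidth arithmetic for the hypothetical LPDDR6 configs above.
# Bus widths and the 13,000 MT/s rate are this post's assumptions, not confirmed specs.

def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mts: float) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer * transfers per second."""
    return (bus_width_bits / 8) * data_rate_mts * 1e6 / 1e9

for width in (96, 72):
    print(f"{width}-bit @ 13000 MT/s ≈ {peak_bandwidth_gbs(width, 13000):.0f} GB/s")

# 96-bit @ 13000 MT/s ≈ 156 GB/s
# 72-bit @ 13000 MT/s ≈ 117 GB/s   (before any metadata/ECC overhead is subtracted)
```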
 

SpudLobby

Golden Member
May 18, 2022
1,027
695
106
I just don't see it for the LLW RAM. Too specialized. We'll see though.

What I could possibly see is the iPhone A and A Pro chips bifurcating on bus width or data rates at the same generation of LPDDR, but they haven't done that so far within a generation of A-chips anyway. They did use different RAM (LPDDR5 vs LPDDR4X) with the A16 versus the older A15 chips they kept around, but they did not with the A18 Pro vs A18 — both are on LPDDR5X. Seems like they decided the volume was worth using the same memory for both.
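For a rough sense of what those data-rate choices mean in practice, a small sketch; the 64-bit bus and the listed speed grades are generic phone-class assumptions, not confirmed figures for any particular A-chip:

```python
# What common LPDDR speed grades work out to on a 64-bit phone-class bus.
# The grades (4266 / 6400 / 8533 MT/s) are standard bins, not confirmed
# figures for any specific A-series chip.

BUS_BITS = 64

for name, mts in [("LPDDR4X-4266", 4266), ("LPDDR5-6400", 6400), ("LPDDR5X-8533", 8533)]:
    gbs = (BUS_BITS / 8) * mts * 1e6 / 1e9
    print(f"{name:13s} ~{gbs:.0f} GB/s peak on a 64-bit bus")

# LPDDR4X-4266  ~34 GB/s peak on a 64-bit bus
# LPDDR5-6400   ~51 GB/s peak on a 64-bit bus
# LPDDR5X-8533  ~68 GB/s peak on a 64-bit bus
```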
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,754
106
Mobile HBM is interesting.

Regular HBM is made by stacking standard DRAM dies. Mobile HBM is made by stacking LPDDR dies instead.

Even if the iPhone doesn't get Mobile HBM, Apple might deploy it in one of their future M series chips.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,018
2,455
136
Given Apple's volumes on these devices, things that don't seem to make sense on the surface may make sense when you consider the volume and the amortization over that number of units for whatever gain they see. At this point, not much would shock me with them.
 
Reactions: johnsonwax

johnsonwax

Member
Jun 27, 2024
118
195
76
Given Apple's volumes on these devices, things that don't seem to make sense on the surface may make sense when you consider the volume and the amortization over that number of units for whatever gain they see. At this point, not much would shock me with them.
Yep. Apple is its own volume if they choose to ship across their product lines. And this would fit how they operate. They generally have two operating modes:
1) Own the IP and vertically integrate, and use the IP moat to make it difficult for competitors to keep up.
2) Prepay for tech they don't control the IP for, and buy their economies of scale by becoming a short-term monopsony (usually for about 3 years).

The latter is what they do with TSMC - buying all the volume on the leading node, making it effectively an Apple exclusive. I could easily see them picking a supplier, dumping a few billion on them to build out capacity, treating that as a zero-interest loan that the supplier pays back in components, with a contract that locks up all the production. Apple has gone so far as to own the equipment in the factory to ensure that. Competitors would then need to create the economies of scale without Apple and Apple's supplier.

Not only should it not shock you, you should expect something like this. It just depends on whether the tech buys them enough advantage for a long enough period of time to be worth it.
 
Reactions: gregoritsch

The Hardcard

Senior member
Oct 19, 2021
300
386
106
I am skeptical of the LLW RAM claim. It seems niche and forfeits LPDDR6 volumes, and LPDDR6 also offers new power modes, gains from node shrinks, and more standard data-rate and bus-width options. I doubt they're going to hitch their wagon to Samsung exclusively either.

LPDDR6 with a 96-bit bus @ 13,000 MT/s would get you 156 GB/s before the metadata adjustment. Even going down to 72-bit at that speed — or assuming the metadata subtraction puts it there — it's 117 GB/s.
As others have noted, Apple simply choosing a technology inherently gives it volumes. The issue is power. Something has to be done. If the tech industry successfully gets people to adopt AI models running on device, it's going to slam battery life. The parameters of a model have to be constantly cycled through the processor, keeping the memory bus fully lit.

The figures I found claim that LPDDR4 uses 5 pJ/bit and LPDDR5 uses 4 pJ/bit. I haven't found anything for LPDDR6 yet, but 1.2 pJ/bit would be a big energy savings for someone relying heavily on an on-device language model.
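To put those pJ/bit figures in perspective, a back-of-the-envelope sketch; the 4 GB weight footprint, the 20 tokens/s decode rate, and the 1.2 pJ/bit number are assumptions, not measurements:

```python
# Back-of-the-envelope DRAM energy for memory-bound LLM decode, where roughly
# all of the weights stream from DRAM once per generated token.
# The 4 GB footprint, 20 tokens/s rate, and 1.2 pJ/bit figure are assumptions.

WEIGHTS_BYTES = 4e9
TOKENS_PER_SECOND = 20
BITS_PER_TOKEN = WEIGHTS_BYTES * 8

for name, pj_per_bit in [("LPDDR4", 5.0), ("LPDDR5", 4.0), ("LLW (claimed)", 1.2)]:
    joules_per_token = BITS_PER_TOKEN * pj_per_bit * 1e-12
    watts = joules_per_token * TOKENS_PER_SECOND
    print(f"{name:14s} ~{watts:.1f} W just moving weights")

# LPDDR4         ~3.2 W just moving weights
# LPDDR5         ~2.6 W just moving weights
# LLW (claimed)  ~0.8 W just moving weights
```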
 

SpudLobby

Golden Member
May 18, 2022
1,027
695
106
As others have noted, Apple simply choosing a technology inherently gives it volumes.
This is true, but it has limits, and we've seen this over the years.
The issue is power.
Yes, LPDDR does this well
Something has to be done. If the tech industry successfully gets people to adopt AI models running on device, it's going to slam battery life.
On-device AI and other parallel workloads (like Apple's on-device voice memo transcription, natural-language indexing in Photos, or video and photo features) are useful in certain cases, but they will not be the go-to for the bleeding-edge models you converse with like in sci-fi.

Apple is already investing in co-designed servers with Broadcom for this reason, and as of iOS 18.2 Apple Intelligence also integrates ChatGPT.

Batched inference is more hardware-efficient anyway, and the "muh latency" argument is a weak one: latency will be lower at the same quality threshold with a fast datacenter and setup — try Claude 3.5 or ChatGPT.

What on-device AI will be used for has only so much overlap with what the hype caucus believes.

You might have, for example, on-screen-context-aware multimodal models that can take very basic commands and execute them, or excellent on-device transcription, or Siri adapting to your routines (these are already here today), but LLW isn't going to move the needle on making any kind of substantial multimodal LLM usable on-device that would justify running, say, 4-8 GB of weights day in, day out on a phone — yes, maybe something like that for a laptop, but people here are going full moron about AI in both directions and it's tiresome.

The AMD bum camp is ticked that Nvidia is making a killing, and the gravy train isn't stopping anytime soon; AI is legit, temporary bubble or not.


But then we also have some people, I think, buying into the hype about local AI too much. Local AI is great for capabilities that hit a useful quality threshold without benefiting from server instances, or where the cost to the consumer matters (one way or another, someone will pay for it).

So you'll still see a proliferation of local models, and yes LLMs, but a lot of this industry hype about trying to find a way to constantly run the most cutting-edge models possible locally will end poorly. Apple's server plans and its existing local AI models show you where this fault line will land, and LLW won't make a lick of difference to the end game here, because capacity and bus width are a separate issue on that note.
The parameters of a model have to be constantly cycled through the processor, keeping the memory bus fully lit.

The figures I found claim that LPDDR4 uses 5 pJ/bit and LPDDR5 uses 4 pJ/bit. I haven't found anything for LPDDR6 yet, but 1.2 pJ/bit would be a big energy savings for someone relying heavily on an on-device language model.
 

The Hardcard

Senior member
Oct 19, 2021
300
386
106
This is true but has limits and we’ve seen this over the years

Not any limits that would stop Apple. The only limit for Apple is a supplier's ability to scale.

Yes, LPDDR does this well

If 4 pJ/bit for LPDDR5 is accurate, then unless LPDDR6 cuts that in half, LLW is absolutely a consideration; roughly 70 percent less power than LPDDR5 for RAM transfers is huge.

Outside of a worldwide economic and sociopolitical collapse, on-device AI is happening. People keep looking at the current limitations and problems of LLMs and other generative AI as if they are permanent. They are not.

Not only are the current models going to become significantly more capable, reliable, performant, and efficient, but new models and algorithms are incoming. The gap between 2025 Apple Intelligence and Gemini and what will be available in 2035 will be greater than the gap between the Apple II and the M4 Pro Mac Mini.

Cloud AI will always be more powerful and efficient, but the lack of trust and cost structure will make that a corporate service by and large. Not that there won’t be heavy individual use of cloud AI, but it won’t beat on device as the primary AI use for the mainstream public.
 
Last edited:

lopri

Elite Member
Jul 27, 2002
13,310
687
126
I traded in my old mini for the new M4 mini, and while I can feel the speed difference, macOS is still too bloated. The only upgrade I went for was 10GBase-T Ethernet. For some reason my mouse hack (SaneSideButtons) doesn't work, which probably aggravates the situation. macOS's native mouse handling is criminal.
 
Reactions: poke01

poke01

Diamond Member
Mar 8, 2022
3,036
4,010
106
macOS's native mouse handling is criminal.
Oh definitely. I use an app called Scroll Reverser when using a USB mouse on macOS. Natural scrolling sucks.
For some reason mouse hack (SaneSideButtons) does not work
Have you tried v1.2.0?
 

retnuh

Member
Mar 3, 2004
33
6
71
Oh definitely. I use an app called Scroll Reverser when using a USB mouse on macOS. Natural scrolling sucks.

Have you tried v1.2.0?
For the USB mouse, did they take the scroll-direction config option away? You should be able to turn "Natural scrolling" off for mice in System Settings -> Mouse and not need an extra app. It's been a minute since I got too used to the trackpad, but for scroll wheels on mice I 1000% agree that setting needs to be flipped.
 

pj-

Senior member
May 5, 2015
499
276
136
i have 2, maybe 3, apps running on my macbook to 'fix' macos mouse behavior. i set them up a while ago so i don't remember exactly what they do, but i believe they disable scroll acceleration and cursor acceleration and something else i'm forgetting

it's pretty wild what they won't let you control
 
Reactions: retnuh

name99

Senior member
Sep 11, 2010
565
463
136
The people wondering whether and how "AI" might influence "the real OS", and what relevant APIs might look like should view the very recent
which answers these questions in the context of Google.

Many examples are given, but the sort of thing that's of immediate relevance is a memory allocator which uses input variables (in particular a hash of the call stack) to decide in which of several heaps of varying lifetime a new object should be allocated. This works well to very well, substantially reducing the memory footprint of some code, and is apparently already shipping on the Pixel 6.
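To make the idea concrete, here's a toy sketch of the technique as I understand it (not Google's actual implementation; the real system sits inside a C/C++ allocator and uses a learned model, whereas here the "model" is just a lookup table):

```python
# Illustrative sketch of lifetime-aware allocation keyed on a call-stack hash.
# Not the production design: the arenas and the predicted-lifetime table here
# are stand-ins for real heaps and a learned model.

import hashlib
import traceback
from collections import defaultdict

# Hypothetical lifetime classes, each standing in for a separate heap/arena.
ARENAS = {"short": [], "medium": [], "long": []}

# Stand-in for the learned mapping from call-stack hash -> predicted lifetime.
PREDICTED_LIFETIME = defaultdict(lambda: "medium")

def call_stack_hash() -> str:
    """Hash the current call stack (file:line per frame) into a stable key."""
    frames = traceback.extract_stack()[:-1]          # drop this helper's frame
    sig = ";".join(f"{f.filename}:{f.lineno}" for f in frames)
    return hashlib.sha1(sig.encode()).hexdigest()

def smart_alloc(obj):
    """Place obj in the arena predicted for this allocation site's lifetime."""
    lifetime = PREDICTED_LIFETIME[call_stack_hash()]
    ARENAS[lifetime].append(obj)
    return obj
```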
 
Reactions: igor_kavinski
Mar 11, 2004
23,410
5,823
146
For the usb mouse, did they take the scroll direction config option away? Should be able to turn “Nature scrolling” off for mice in system settings -> mouse and not need an extra app. It’s been a minute as I got too used to the trackpad but scroll wheels on mice 1000% agree that setting needs to be flipped.

They likely intend people to use the Magic Mouse, where you'd be swiping like on a trackpad and that behavior feels natural.
 

name99

Senior member
Sep 11, 2010
565
463
136
Let me follow up the above with two more references.

The first is

You don't have to buy into the argument being made for why AI matters economically. What matters is that plenty of people DO buy into the argument. Which means, as I see it, there are two takeaways relevant to recent discussions:

1- if energy usage is going to grow as rapidly as expected, those with performance advantages in respect of inferences/joule will have a substantial advantage. This would appear to work to Apple's favor, both in terms of (we expect) being able to offload more inference locally [which may still mean higher energy usage, but Apple isn't paying for it] AND in terms of Apple probably being able to provide the highest inferences/joule, even at scale.
This latter is not certain, but seems likely given Apple's obsessive (in the past and still) concern with reducing energy anywhere and everywhere. One could imagine that new architectures designed from the ground up for inference might be more efficient, but I've not yet seen an indication of such.

Which suggests that things like Apple clusters, and Apple-sold compute services, have perhaps a more promising future (in terms of cheaper TCO) than it seems right now. Remember, our concern is, say, half a decade out: not just today's LLMs, but the (possible? ridiculous?) future in which LLMs are no longer just a cute trick but the equivalent of the spreadsheet or the compiler, the tool that defines the work (and output, and compensation) of various professionals...

2- the talk includes a slide 13 minutes in that I have not seen elsewhere giving the amount of energy used in the US by the largest data warehouse companies. The interesting item I see there is that Apple comes in at 2GW - substantially behind Google and MS, but 2/3 of Amazon, or the same size as Meta/Facebook (and twice the size of X/Twitter).

People have frequently scoffed that Apple's native data center footprint is insignificant (or, more sensibly, have wondered what it is). This gives us elements of an answer - it's as large as Meta's, and not too different from Amazon's.
Which in turn suggests that if it makes business sense for those companies to develop various chips (eg Meta's inference server chip, or Graviton+Trainium+Nitro) it makes as much sense for Apple to do so -- REGARDLESS of issues of whether these "server" chips are sold externally... Apple may be slightly smaller but their server chip development is probably also cheaper given the infrastructure they can reuse. And Apple's footprint may grow rapidly, not just once Private Cloud Compute takes off, but also if/as they see value in moving other Apple services off AWS storage or Trainium training or whatever else they currently outsource.

My second recommendation link is

Again, you don't have to buy into my love of Mathematica; that's not the point. The point is that Mathematica is a vast system, ridiculously powerful but also, as a consequence, difficult [or at least slow, in terms of constant lookups] to use as soon as you move out of the area in which you usually work. This provides an extremely powerful and useful tool for improving that situation. I've not used things like Copilot for, say, C++, but this feels to me like not just what I'd hope for from Copilot but a whole lot more in terms of handling optimization, refactoring, providing quick solutions, and so much more.

Now imagine something similar for other tools that are too complex for one person to fully understand - Photoshop, or Blender, or even Linux system management, or (and these may well exist as prototype internal tools) "assistants" for working on the Apple code base, or the MS code base -- tools that make use of the company-wide conventions, can easily point you to possibly already-written versions of the function you want, and can at least provide a first pass at possible performance, security, or obsolescence issues, etc. Presumably most of the training that went into the Wolfram Assistant (internal documentation, stackoverflow posts, code repositories, etc) is available in more or less similar form inside Apple or MS.

It's with this in mind that the first part of my comment, I think, might make more sense. Look, sure, it's possible that in 2024 we have gone as far as this particular set of ideas will take us: that Wolfram Assistant's successes (and failures), like ChatGPT-4o-whateverItIsTheseDays, are as good as it gets for this kind of interactive chat, and that nVidia's or Google's chip-layout experiments are also as good as it gets. But it seems foolish to assume that given the past two years.
Meaning that even IF you don't see anything that excites you in the current crop of LLM assistants, all that *probably* means is that someone hasn't yet created one for your particular interests.
But 2025 might be the year such an assistant is released for Windows sysadmins... Or for Linux kernel coders... Or for Star Wars fan fiction writers... Or...

Wolfram basically has everything in place to do this "first". Well, sure, maybe Copilot was first, but Wolfram is an "independent developer" in a way that possibly suggests, to people who are not Microsoft or Apple or Google, some combination of "hey, I could do that" and "OMG, if we don't do this but our competitors do".
The other tricky thing is that Wolfram has a history of charging for its products, so no-one is surprised (or ranting all over the internet) that this is an add-on cost. The same idea is *possible* for organizations that work with open source (for example Blender could be free but charge a monthly fee for an assistant, likewise for Ubuntu. Even Google could offer a Google Premium [cf X/Twitter Premium] that gives you ad-free search, a much more powerful AI element to the search, and various other things - some amount of image generation or video generation? Summaries of web sites based on knowledge of what interests you?).

Would these all then back down in the face of the usual mindless screams and rants from the masses? Hmm. We have a once-in-a-generation chance to restructure after the known issues of the free (i.e., ad-supported) web...
 
Reactions: Vattila

LightningZ71

Platinum Member
Mar 10, 2017
2,018
2,455
136
On AI, the only thought I keep coming back to is that utility scale positive energy balance nuclear fusion plants are only 20 years away, and have been for the last 40 or so...
 

Doug S

Diamond Member
Feb 8, 2020
3,005
5,167
136
Not exactly on-topic, but:


Commonwealth Fusion has been saying for a while now that they'll have a reactor by 2031 or so. And I think General Fusion has made the same claim.

It says nothing about producing power in any remotely economic way. They are just saying they're ready to build something at scale, but it will undoubtedly require more energy than it generates. It isn't going to power any homes, not in the early 2030s at least.
 