Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,752
1,285
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s
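
A rough sanity check of the GPU figures above, as a sketch only. The 8 ALUs per execution unit and the 2 FLOPs per ALU per clock (one fused multiply-add) are assumptions that are not in the list, and the clock is merely what those assumptions imply, not a figure from Apple.

```python
# Sanity check of the M1 GPU figures listed above.
# ASSUMPTIONS (not in the list): 8 ALUs per execution unit, and one fused
# multiply-add (counted as 2 FLOPs) per ALU per clock.
EXECUTION_UNITS = 128
MAX_THREADS = 24576
PEAK_FLOPS = 2.6e12
ALUS_PER_EU = 8          # assumption
FLOPS_PER_ALU_CLOCK = 2  # assumption: FMA = 2 FLOPs

threads_per_eu = MAX_THREADS // EXECUTION_UNITS
total_alus = EXECUTION_UNITS * ALUS_PER_EU
implied_clock_ghz = PEAK_FLOPS / (total_alus * FLOPS_PER_ALU_CLOCK) / 1e9

print(f"threads per execution unit: {threads_per_eu}")
print(f"total ALUs: {total_alus}")
print(f"implied GPU clock: {implied_clock_ghz:.2f} GHz")
```

Under those assumptions the listed numbers work out to 192 threads per execution unit and a clock of roughly 1.27 GHz.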

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from the number of GPU cores). Basically, Apple is taking the same approach with these chips as they do with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from occasional slight clock speed differences).

EDIT:



M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.265, ProRes
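
The 100 GB/s memory figure in the M2 list above can be roughly reproduced from bus width and transfer rate. This is a sketch: the 128-bit bus and the 6400 MT/s rate are assumptions not stated in the list, and `peak_bandwidth_gb_s` is just an illustrative helper name.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec / 1e9

# ASSUMPTIONS: 128-bit bus, 6400 MT/s. Neither number appears in the list above.
m2_bandwidth = peak_bandwidth_gb_s(128, 6400e6)
print(f"{m2_bandwidth:.1f} GB/s")
```

Under those assumptions the theoretical peak is 102.4 GB/s, which rounds to the quoted 100 GB/s.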

M3 Family discussion here:


M4 Family discussion here:

 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
Not much to find in there except this:

"Apple uses more specialized caches instead of trying to optimize one cache for both the CPU and GPU. M1 implements a 12 MB L2 cache within the Firestorm CPU cluster, which fills a similar role to Intel’s L3 from the CPU’s perspective. A separate 8 MB system level cache helps reduce DRAM bandwidth demands from all blocks on the chip, and acts as a last stop before hitting the memory controller. By dividing up responsibilities, Apple can tightly optimize the 12 MB L2 for low latency to the CPU cores. Because the L2 is large enough to absorb the bulk of CPU-side requests, the system level cache’s latency can be higher in order to save power.

M1 still has a bit of room for improvement. Its cache bandwidth to compute ratio could be a touch higher. Transfers between the CPU and GPU could take full advantage of the system level cache to improve bandwidth. But these are pretty minor complaints, and overall Apple has a pretty solid setup."

Not sure whether M2 has already addressed those last two suggestions.
 
Jul 27, 2020
17,933
11,697
116

Search this string: Apple GPUs have much smaller data caches than other vendors.

And an interesting quote on Apple's weird design choice regarding cache sizes:

One strange pattern, is smaller GPUs having larger cache sizes. The M1 has much more L2 than the M1 Pro, and the A14 has much larger L3 than M1. Extra cache helps to make up for lower bandwidth. Apple likely had to trade off between losing energy efficiency to L2 thrashing, and losing energy efficiency to extra static L2 power.

Giving the M1 Pro smaller cache size is just...WRONG! It costs a ton of money!
 

LightningZ71

Golden Member
Mar 10, 2017
1,660
1,945
136
With the considerably higher RAM bandwidth of the Pro and Max, it makes a certain amount of sense for the GPU to have a smaller cache. I don't know if they have the exact same latency between the regular and larger parts, but if the smaller GPU cache allows it to have a lower latency, the larger RAM bandwidth should easily cover for the smaller size. As for the A14, given its mobile target, saving as much power as possible on memory accesses is important. Getting data from the L3 is cheaper from a power standpoint than going to main memory.
 
Jul 27, 2020
17,933
11,697
116
For workloads that can fit data inside the M1's cache but not the M1 Pro's cache, I wonder how bad things get for the M1 Pro. Worst case, developers may have to create separate codepaths for the different cache sizes to minimize cache misses on the M1 Pro.
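
A minimal sketch of what "separate codepaths for the different cache sizes" could look like in practice: picking a tile size per chip so the working set stays inside the cache. The cache sizes, the 50% occupancy budget, and the names `pick_tile_elems` and `HYPOTHETICAL_CACHE_BYTES` are all made up for illustration and are not measured Apple values.

```python
def pick_tile_elems(cache_bytes: int, elem_bytes: int = 4, buffers: int = 2,
                    occupancy: float = 0.5) -> int:
    """Largest power-of-two tile (in elements) whose buffers fit the cache budget."""
    budget = int(cache_bytes * occupancy)      # leave room for other cache users
    max_elems = budget // (elem_bytes * buffers)
    if max_elems < 1:
        raise ValueError("cache too small for even one element per buffer")
    return 1 << (max_elems.bit_length() - 1)   # round down to a power of two

# HYPOTHETICAL cache sizes, chosen only to mirror a 3:1 ratio.
HYPOTHETICAL_CACHE_BYTES = {
    "large_cache_gpu": 768 * 1024,
    "small_cache_gpu": 256 * 1024,
}
tiles = {name: pick_tile_elems(size) for name, size in HYPOTHETICAL_CACHE_BYTES.items()}
print(tiles)
```

The same kernel would then be launched with a different tile size on each chip instead of maintaining two fully separate implementations.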
 

Doug S

Platinum Member
Feb 8, 2020
2,486
4,050
136
Giving the M1 Pro smaller cache size is just...WRONG! It costs a ton of money!

I imagine Apple knows what they are doing, and made these decisions based on simulation rather than pulling them out of thin air. The M1 Pro has more memory bandwidth (which is what GPUs really need), so the caches wouldn't need to be as big.

They kept the 1/3 sized cache in M2 Pro, and they probably had enough lead time from early M1 Pro silicon to change that decision for M2 Pro. If not, they certainly would have for M3 Pro so if they do it there as well it will be hard for the peanut gallery to argue with.
 

Mopetar

Diamond Member
Jan 31, 2011
8,005
6,453
136
Apple has larger L1 caches, so the dependency on L2 is already a lot less than on other chips: 192 KB for instructions and 128 KB for data. You don't need a massive L2 cache when you can fit a lot of the working set in the L1 cache.
 

Doug S

Platinum Member
Feb 8, 2020
2,486
4,050
136
M3 spotted: https://www.engadget.com/apples-m3-pro-chipset-could-feature-12-cpu-cores-205959150.html

The M2 Pro already offered 12 cores so I wonder if this was the "lower end" version. Also skeptical of the even split - the M2 Pro with 12 cores was 8+4, a 6+6 would be a step down even with faster cores and a 7+7 or even 8+8 wouldn't be much of a boost. Unless the A17/M3 little core is getting a much bigger jump than the big core (i.e. like Intel's where it is 50% of the big core performance instead of 33%)
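
The core-count comparison above can be put into rough numbers with a deliberately crude model: total multithreaded throughput in "big core equivalents". The helper name `relative_throughput` is illustrative, and the model ignores clock changes, per-core gains on the new generation, and scaling losses.

```python
def relative_throughput(p_cores: int, e_cores: int, e_ratio: float) -> float:
    """Throughput in big-core equivalents; e_ratio = little core speed / big core speed."""
    return p_cores + e_cores * e_ratio

configs = [(8, 4), (6, 6), (7, 7), (8, 8)]
for ratio in (0.33, 0.50):
    for p, e in configs:
        score = relative_throughput(p, e, ratio)
        print(f"{p}+{e} with little core at {ratio:.0%} of big: {score:.2f}")
```

With the little core at 33% of the big core, 8+4 scores 9.32 against 7.98 for 6+6. At 50%, 6+6 rises to 9.00, still short of 8+4 unless the cores themselves get faster.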

The 36 GB of memory is interesting. I'm not sure how you'd get that with the current 256-bit-wide setup: that would imply 18 Gbits per LPDDR5/5X channel, which I'm not aware of any DRAM OEM selling (correct me if I'm wrong). Is it possible it is 384 bits wide instead? And if so, is the Max 768, and the Ultra and "Extreme" 1536 & 3072? That would be another way to increase capacity in the Mac Pro, as well as deliver even more bandwidth to the larger number of GPU cores - 18 in this presumably "low end" version of the M3 Pro. Thus you'd expect at least 20 in the top tier, and therefore as many as 160 in the Mac Pro. These will be the cores that were supposed to go in A16 but didn't make the cut - or maybe even more advanced - which were supposed to deliver 20-30% more performance.
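
The per-channel arithmetic behind the 18 Gbit figure, as a quick sketch. It assumes 16-bit LPDDR5 channels, and `gbit_per_channel` is just an illustrative helper name.

```python
LPDDR5_CHANNEL_BITS = 16  # assumption: each LPDDR5/5X channel is 16 bits wide

def gbit_per_channel(total_gb: int, bus_width_bits: int) -> float:
    """Density each channel must provide to reach total_gb on the given bus."""
    channels = bus_width_bits // LPDDR5_CHANNEL_BITS
    return total_gb * 8 / channels

for width in (256, 384):
    print(f"{width}-bit bus: {gbit_per_channel(36, width):g} Gbit per channel")
```

A 256-bit bus needs 18 Gbit per channel to reach 36 GB, while a 384-bit bus needs only 12 Gbit.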
 

smalM

Member
Sep 9, 2019
63
66
91
The M2 Pro already offered 12 cores so I wonder if this was the "lower end" version.
So this goes from 6+4 & 16 in the low-end M2 Pro to 6+6 & 18 in the low-end M3 Pro. The question is, did Apple add a second efficiency cluster (and use binning on all clusters) or did they just add 2 cores to the efficiency cluster? If the latter, could we see a 2+6 configuration in an A-SoC?

The 36 GB of memory is interesting.
There are 18 GB packages out there on a 64-bit bus. SK Hynix started production two years ago.
I don't know what die configuration is used in these packages, but there are 12 Gbit dies, which have been in production since 2019.
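
To see which package sizes would be needed for 36 GB at various bus widths, here is a small sketch. It takes the 64-bit package width from the post above; the list of bus widths is arbitrary, `packages_for` is an illustrative name, and the die count assumes the 12 Gbit dies are what sits inside an 18 GB package, which is not confirmed.

```python
PACKAGE_BUS_BITS = 64  # from the post above: packages on a 64-bit bus

def packages_for(total_gb: int, bus_width_bits: int):
    """Return (package count, GB per package), or None if it is not a whole number of GB."""
    count = bus_width_bits // PACKAGE_BUS_BITS
    if count == 0 or total_gb % count:
        return None
    return count, total_gb // count

for width in (128, 192, 256, 384):
    print(f"{width}-bit bus:", packages_for(36, width))

# ASSUMPTION: an 18 GB package built purely from 12 Gbit dies.
dies_per_18gb_package = 18 * 8 // 12
print("12 Gbit dies per 18 GB package:", dies_per_18gb_package)
```

So 36 GB is reachable with two 18 GB, three 12 GB, four 9 GB, or six 6 GB packages, depending on the bus width.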
 

mikegg

Golden Member
Jan 30, 2010
1,815
445
136
The M2 Pro already offered 12 cores so I wonder if this was the "lower end" version. Also skeptical of the even split - the M2 Pro with 12 cores was 8+4, a 6+6 would be a step down even with faster cores and a 7+7 or even 8+8 wouldn't be much of a boost. Unless the A17/M3 little core is getting a much bigger jump than the big core (i.e. like Intel's where it is 50% of the big core performance instead of 33%)
It wouldn't surprise me if M3 had more cores but it makes a lot of sense to me that M3 would have the same number of cores as M2.

M1: New node, new architecture
M2: More cores
M3: New node, new architecture
M4: More cores
M5: New node, new architecture
M6: More cores
 

richardskrad

Member
Jun 28, 2022
55
61
61
Apple assembled the Avengers of the chip industry and made the M1, which changed the game. The chip made such a big impact in the tech world, and it's still fresh in the minds of consumers and tech enthusiasts. However, the M2 really hurt Apple's goodwill. It has noticeably worse performance per watt than the M1, it's more expensive for Apple to manufacture, and it's a much hotter chip.

In some ways, the M1 family chips are still Apple's best, 2.5 years later. Apple shouldn't have released the M2 family of chips, IMO. They should've just waited and jumped to the M3. The recent Apple quarterly earnings showed that the M2 did nothing to increase Mac sales.

Now, a lot hangs on the M3. We shall see whether Apple losing their Avenger-level chip designers has really impacted the org or not.
 

Doug S

Platinum Member
Feb 8, 2020
2,486
4,050
136
It wouldn't surprise me if M3 had more cores but it makes a lot of sense to me that M3 would have the same number of cores as M2.

M1: New node, new architecture
M2: More cores
M3: New node, new architecture
M4: More cores
M5: New node, new architecture
M6: More cores

I think the increases would have more to do with what the process allows, and N3 is a fairly decent step in density (maybe not so much for cache, but certainly for logic). Of course they might use the area they are gaining for bigger (i.e. more complex / more transistors) CPU or GPU cores, or put more resources elsewhere on the SoC, for instance more memory controllers.
 
Reactions: A///

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
Let's hope it works out better than most post-MCU movies.
Adults my age bitched about CGI in the '90s and 2000s when it became more prevalent due to increasing processing power, but the same 50-60 year olds now go gaga over the MCU and DC films coming out, even the poorly FX'd ones, where the young ones are whining more.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
I think the increases would have more to do with what the process allows, and N3 is a fairly decent step in density (maybe not so much for cache, but certainly for logic). Of course they might use the area they are gaining for bigger (i.e. more complex / more transistors) CPU or GPU cores, or put more resources elsewhere on the SoC, for instance more memory controllers.
Adding more cores would increase power use per core, but it would need to show a greater processing power lead over M1 and M2, mostly M1, because M2 was not very impressive to begin with! I can see more proprietary accelerators used by the OS to perform functions faster than a normal processing core could. Sad that nothing will be shown at WWDC.
 

Doug S

Platinum Member
Feb 8, 2020
2,486
4,050
136
Adding more cores would increase power use per core, but it would need to show a greater processing power lead over M1 and M2, mostly M1, because M2 was not very impressive to begin with! I can see more proprietary accelerators used by the OS to perform functions faster than a normal processing core could. Sad that nothing will be shown at WWDC.

Why would more cores increase power use per core? Switching to N3 will reduce the power use per core, though that would be counterbalanced by the cores being more complex or clocked higher. So it depends on whether the power savings of the process are a larger factor than the increases in the number of cores and their power consumption.
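
The trade-off described here can be sketched with the textbook dynamic power relation P ~ C * V^2 * f. Every ratio below is hypothetical and chosen only to show how a process gain can be eaten by a bigger, faster core. None of them are N3 or Apple figures, and `relative_core_power` is an illustrative name.

```python
def relative_core_power(capacitance: float, voltage: float, frequency: float) -> float:
    """Per-core dynamic power relative to a baseline core (inputs are ratios vs. baseline)."""
    return capacitance * voltage ** 2 * frequency

# HYPOTHETICAL ratios for illustration only.
# Case 1: same core design on a newer node (less switched capacitance, lower voltage).
shrink_only = relative_core_power(capacitance=0.80, voltage=0.95, frequency=1.00)
# Case 2: newer node, but the core is 20% larger and clocked 10% higher at baseline voltage.
shrink_plus_bigger_faster = relative_core_power(capacitance=0.80 * 1.20, voltage=1.00, frequency=1.10)

print(f"shrink only:            {shrink_only:.3f}x baseline per core")
print(f"shrink + bigger/faster: {shrink_plus_bigger_faster:.3f}x baseline per core")
```

Note that core count does not appear in the per-core figure at all. It only multiplies the total.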

Plus the CPU usage is only part of the picture: you have the GPU, which is often active at the same time depending on the app, and the memory controllers, which may be more numerous (no rumors I'm aware of to indicate that, just my puzzlement at the 36 GB of installed RAM in the M3 prototype) but will draw less power if they switch to LPDDR5X.

They seem to have a lot of cooling headroom designed into some products like the Studio, which tells me either they are expecting higher power draw in a future 'Ultra' or that the Studio may support an "Extreme" someday. Because they use the same layout for the Pro & Max, and the Max is the building block for Ultra & Extreme, they may have cores that are never activated in products with cooling constraints like a MacBook Pro. They are already binning on cores; that would just take it a step further - and there are other possible uses (i.e. the persistent hope/belief that Apple will build their own servers for internal use) where power budgets are higher than in a laptop with a near silent fan.
 

smalM

Member
Sep 9, 2019
63
66
91
[...] and the memory controllers, which may be more numerous (no rumors I'm aware of to indicate that, just my puzzlement at the 36 GB of installed RAM in the M3 prototype) but will draw less power if they switch to LPDDR5X.
RAM packages do not need to come in powers of 2.
Get over it...

Because they use the same layout for the Pro & Max, and the Max is the building block for Ultra & Extreme, they may have cores that are never activated in products with cooling constraints like a MacBook Pro.
The Max was the building block for Ultra only once, and the Max die is already too big. Apple will have to split it at the latest when TSMC starts using EXE:5200 systems. I expect the M design to become much more modular.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
Why would more cores increase power use per core? Switching to N3 will reduce the power use per core, though that would be counterbalanced by the cores being more complex or clocked higher. So it depends on whether the power savings of the process are a larger factor than the increases in the number of cores and their power consumption.
I tuned out halfway through my post and was thinking of Intel and AMD pushing their processors to the brink. AMD and Intel will use any headroom to catapult their chips into the lead, and with ASUS's help they may get there with a sudden boom of power.
 

Glo.

Diamond Member
Apr 25, 2015
5,762
4,666
136
M3 spotted: https://www.engadget.com/apples-m3-pro-chipset-could-feature-12-cpu-cores-205959150.html

The M2 Pro already offered 12 cores so I wonder if this was the "lower end" version. Also skeptical of the even split - the M2 Pro with 12 cores was 8+4, a 6+6 would be a step down even with faster cores and a 7+7 or even 8+8 wouldn't be much of a boost. Unless the A17/M3 little core is getting a much bigger jump than the big core (i.e. like Intel's where it is 50% of the big core performance instead of 33%)

The 36 GB of memory is interesting. I'm not sure how you'd get that with the current 256-bit-wide setup: that would imply 18 Gbits per LPDDR5/5X channel, which I'm not aware of any DRAM OEM selling (correct me if I'm wrong). Is it possible it is 384 bits wide instead? And if so, is the Max 768, and the Ultra and "Extreme" 1536 & 3072? That would be another way to increase capacity in the Mac Pro, as well as deliver even more bandwidth to the larger number of GPU cores - 18 in this presumably "low end" version of the M3 Pro. Thus you'd expect at least 20 in the top tier, and therefore as many as 160 in the Mac Pro. These will be the cores that were supposed to go in A16 but didn't make the cut - or maybe even more advanced - which were supposed to deliver 20-30% more performance.
You won't get that on a 256-bit setup.

Yes, a 384-bit bus is possible, and 36 GB of RAM appears to confirm this.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
You won't get that on a 256-bit setup.

Yes, a 384-bit bus is possible, and 36 GB of RAM appears to confirm this.
Still, if the 384-bit bus theory is correct, why not just increase RAM by 50%? It's not like it's expensive right now. Why use some bizarre fractional increase? Apple could always clock down the LPDDR5 on laptops to manage power.
 

Glo.

Diamond Member
Apr 25, 2015
5,762
4,666
136
Still, if the 384-bit bus theory is correct, why not just increase RAM by 50%? It's not like it's expensive right now. Why use some bizarre fractional increase? Apple could always clock down the LPDDR5 on laptops to manage power.
Two possibilities.

Either memory chip capacity in 2024 will be 6 GB per 64-bit memory chip and higher, and Apple simply does not want to get caught with their pants down in their supply chain, or 4 GB per 64-bit memory chips are on the verge of going extinct right now.

Both come down to the supply chain and the timeframe in which the M3 and M3 Pro will be released.

P.S. A 384-bit memory bus on the M3 Pro makes for a very interesting configuration discussion for the M3 SoC. Is it 192-bit, or... 256-bit?
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
That, and there is always the chance that the info is bogus. The system dump could have reported incorrectly. It could be that a prototype had a weird configuration. Leaks drive me crazy sometimes.
 