Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24,576 concurrent threads
2.6 teraflops
82 gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options: 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from occasional slight clock speed differences).
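For anyone wondering where the GPU's 2.6 teraflops figure in the spec list above comes from, it lines up with the usual ALUs × clock × 2 (FMA) arithmetic. A minimal sketch in Python, assuming 8 FP32 ALUs per execution unit and a GPU clock of roughly 1.28 GHz (both third-party estimates, not Apple-published figures):

```python
# Back-of-the-envelope FP32 throughput for the 8-core M1 GPU.
# Assumed (not Apple-published): 8 FP32 ALUs per EU, ~1.28 GHz GPU clock.
gpu_cores = 8
eus_per_core = 16                  # 8 x 16 = 128 execution units (Apple's figure)
alus_per_eu = 8                    # assumption
clock_hz = 1.28e9                  # assumption
flops = gpu_cores * eus_per_core * alus_per_eu * 2 * clock_hz   # 2 ops per FMA
print(f"{flops / 1e12:.2f} TFLOPS")                             # ~2.62 TFLOPS
```

The same arithmetic with 7 active GPU cores lands around 2.3 TFLOPS for the binned MacBook Air variant.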

EDIT:



M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second-generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, H.265 (HEVC), and ProRes
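The 100 GB/s unified memory figure above is consistent with a 128-bit LPDDR5-6400 interface; a quick sketch, where the bus width and speed grade are widely reported assumptions rather than Apple's own spec:

```python
# Peak bandwidth for M2's unified memory, assuming a 128-bit LPDDR5-6400 bus
# (bus width and transfer rate are assumptions, not Apple-published).
bus_width_bits = 128
transfers_per_second = 6.4e9          # LPDDR5-6400 = 6400 MT/s
peak_bytes_per_second = (bus_width_bits / 8) * transfers_per_second
print(f"{peak_bytes_per_second / 1e9:.1f} GB/s")   # 102.4 GB/s, marketed as 100 GB/s
```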

M3 Family discussion here:


M4 Family discussion here:

 

FlameTail

Diamond Member
Dec 15, 2021
And then what? Half of the entire SoC will be NPU? How pointless.
I just don't understand all this hype about AI. Call me if it can really do something useful.
I think you guys are overestimating how much die area the NPU takes.

Idk about Meteor Lake and Phoenix, but on the Apple A17 Pro, the NPU has 35 TOPS and is a mere 5 mm².


And M3 has half of that at 17 TOPS.
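To put those numbers in perspective, here's a quick sanity check on area share and compute density, assuming a total A17 Pro die size of roughly 105 mm² (a third-party estimate, not an Apple figure):

```python
# NPU area share and compute density for the A17 Pro.
# The ~105 mm^2 total die size is an assumed third-party estimate.
npu_area_mm2 = 5.0
npu_tops = 35.0
die_area_mm2 = 105.0                                                  # assumption
print(f"NPU share of die: {npu_area_mm2 / die_area_mm2:.1%}")         # ~4.8%
print(f"Compute density:  {npu_tops / npu_area_mm2:.0f} TOPS/mm^2")   # 7 TOPS/mm^2
```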

Similar story in Qualcomm chips:
 
Reactions: Schmide and Mopetar

Doug S

Platinum Member
Feb 8, 2020
It can be. Max chips are around 430 mm² right now. All Apple has to do is make it slightly smaller. It seems like Apple has been designing the Max chip in anticipation of this limit all along.

Not really. The M1 Max was claimed to be around that size based on die photos Apple had supplied. Then someone actually looked at one and noticed all the extra space at the bottom, which turns out to be for connecting two Max dies together and had been cropped out of Apple's die photo. The actual size was around 500 mm².

That's not to say they couldn't release a more compact version by the time high-NA comes around. The caches may not shrink much (though BSPDN may help there), but logic will, even if not as much as it used to. So Apple could probably produce one on TSMC's A14 or A10 with more cores of everything than today that still fits the smaller reticle. And while they may want to add more CPU/GPU/NPU cores, they aren't going to need to increase the number of memory controllers, TB ports, displays supported, and so forth. So to the extent that stuff can shrink, they will claw back some area.
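Whether a shrunk Max-class die would actually fit under the ~429 mm² high-NA field depends on how much of the ~500 mm² is logic versus SRAM and analog/IO. A toy estimate, where the area split and per-node scaling factors are purely illustrative assumptions:

```python
# Toy shrink estimate for a ~500 mm^2 Max-class die against the ~429 mm^2
# high-NA reticle limit. Area split and scaling factors are assumptions.
die_mm2 = 500.0
area_split   = {"logic": 0.55, "sram": 0.25, "analog_io": 0.20}   # assumed
shrink_ratio = {"logic": 0.70, "sram": 0.95, "analog_io": 0.90}   # assumed per new node
shrunk_mm2 = sum(die_mm2 * area_split[k] * shrink_ratio[k] for k in area_split)
print(f"~{shrunk_mm2:.0f} mm^2 vs. ~429 mm^2 limit")              # ~401 mm^2
```

With numbers in that ballpark a monolithic Max stays under the limit, but adding more cores eats back into the headroom, which is the trade-off described above.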
 
Reactions: SpudLobby

FlameTail

Diamond Member
Dec 15, 2021
"The timeline for N2 involves risk production in 2025, leading to volume ramp in the second half of the year.

TSMC will offer a version of N2 with BSPDN later in the cycle - approximately six months later."

- https://morethanmoore.substack.com/p/tsmc-oip-forum-fabs-n3n2bspn

Based on this timeline, I don't think Apple A19 will be using N2.

A19 is a 2025Q3 chip, and N2 is ramping in 2025H2. The timing does not align.

So,

2023 = A17 = N3B
2024 = A18 = N3E
2025 = A19 = N3P
2026 = A20 = N2 (+BSPDN?)
2027 = A21 = N2P (+BSPDN?)
 

SpudLobby

Senior member
May 18, 2022
Eventually Apple's M Max chips will also have to use chiplets.

In the generation that begins to use High-NA EUV.

High-NA EUV halves the reticle size to 429 mm².

M2 Max and M3 Max are well over 430 mm² I believe. So future Max chips cannot be monolithic.
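For reference, the 429 mm² number is just today's full EUV exposure field halved along one axis:

```python
# Standard EUV field vs. the halved high-NA (0.55 NA) exposure field.
full_field_mm2    = 26 * 33     # 858 mm^2
high_na_field_mm2 = 26 * 16.5   # field halved along the scan axis = 429 mm^2
print(full_field_mm2, high_na_field_mm2)   # 858 429.0
```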
A good question is: what kind of advanced low power chiplet tech is on the table? There are a lot of different ways to do this afaict. My hope is Apple, MediaTek, Nvidia, Qualcomm find ways to do chiplets without utterly screwing idle power and active power.

Idk there’s so many d*mn packaging technologies from AMD’s cheap BS to Intel’s EMIB and Foveros Omni stuff.

But looking at Lunar Lake Intel apparently thinks the 4 tile stuff was too far — and apparently Panther Lake is more similar to Lunar Lake than MTL/ARL.

I think a CPU and GPU tile, with SOC and IO function siphoned into one of those, and on some kind of very very low power interposer might have a lot of potential going forward for mobile APUs or desktop. It’s the only way we’ll see some beefy M Pro/Max type setup from Nvidia or Intel etc.
 

mikegg

Golden Member
Jan 30, 2010
Not really. The M1 Max was claimed to be around that size based on die photos Apple had supplied. Then someone actually looked at one and noticed all the extra space at the bottom, which turns out to be for connecting two Max dies together and had been cropped out of Apple's die photo. The actual size was around 500 mm².

That's not to say they couldn't release a more compact version by the time high-NA comes around. The caches may not shrink much (though BSPDN may help there), but logic will, even if not as much as it used to. So Apple could probably produce one on TSMC's A14 or A10 with more cores of everything than today that still fits the smaller reticle. And while they may want to add more CPU/GPU/NPU cores, they aren't going to need to increase the number of memory controllers, TB ports, displays supported, and so forth. So to the extent that stuff can shrink, they will claw back some area.
Interesting. I did not know the actual size was that big. Then separate dies seem to be the only logical choice.
 

FlameTail

Diamond Member
Dec 15, 2021
A good question is: what kind of advanced low power chiplet tech is on the table? There are a lot of different ways to do this afaict. My hope is Apple, MediaTek, Nvidia, Qualcomm find ways to do chiplets without utterly screwing idle power and active power.

Idk there’s so many d*mn packaging technologies from AMD’s cheap BS to Intel’s EMIB and Foveros Omni stuff.

But looking at Lunar Lake Intel apparently thinks the 4 tile stuff was too far — and apparently Panther Lake is more similar to Lunar Lake than MTL/ARL.

I think a CPU and GPU tile, with SOC and IO function siphoned into one of those, and on some kind of very very low power interposer might have a lot of potential going forward for mobile APUs or desktop. It’s the only way we’ll see some beefy M Pro/Max type setup from Nvidia or Intel etc.
Yes.

Tile A: CPU, NPU, ISP, Security Processor, I/O, Display Engine

Tile B: GPU, Media Engine

With RAM buses and SLC on both tiles.

This allows for good product segmentation too, as you can mix and match CPU and GPU tiles to create several SoCs.
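A minimal sketch of that mix-and-match idea, with hypothetical tile names and core counts invented purely for illustration (none of these are real Apple configurations):

```python
# Hypothetical tile mix-and-match: tile names and core counts are invented.
from itertools import product

cpu_tiles = {"CPU-A": "8P + 4E", "CPU-B": "12P + 4E"}
gpu_tiles = {"GPU-16": "16-core GPU", "GPU-32": "32-core GPU", "GPU-48": "48-core GPU"}

for (cpu, cores), (gpu, shaders) in product(cpu_tiles.items(), gpu_tiles.items()):
    print(f"{cpu} + {gpu}: {cores}, {shaders}")
# Two CPU tiles and three GPU tiles give six SoC variants from five unique dies.
```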
 

FlameTail

Diamond Member
Dec 15, 2021

What's taking up all the unlabeled area in the M3?

One would assume the CPU, GPU, NPU would take up much of the die area in such an SoC. But looking at this die shot, there's a lot of unlabeled area. What's in all that?

I can think of the Media Engine, Secure Enclave and Image Signal Processor. What else?
 

FlameTail

Diamond Member
Dec 15, 2021
"The timeline for N2 involves risk production in 2025, leading to volume ramp in the second half of the year.

TSMC will offer a version of N2 with BSPDN later in the cycle - approximately six months later."

- https://morethanmoore.substack.com/p/tsmc-oip-forum-fabs-n3n2bspn

Based on this timeline, I don't think Apple A19 will be using N2.

A19 is a 2025Q3 chip, and N2 is ramping in 2025H2. The timing does not align.

So,

2023 = A17 = N3B
2024 = A18 = N3E
2025 = A19 = N3P
2026 = A20 = N2 (+BSPDN?)
2027 = A21 = N2P (+BSPDN?)
Will BSPDN even be available on time for A20?
 

GC2:CS

Member
Jul 6, 2018
It can draw cat pics. What more do you want? You can has all the cat pics you could ever want.
Can it do a 10/10 loaf? If not, then the real cat will always reign supreme over the AI cat.

Will there also be a BDSM version?

Those abbreviations are getting ridiculous. Somebody was explaining to me how BSPDN is amazing and is going to change the semi landscape in a year. And I was like, hmm… that sounds like some old backside power delivery stuff. Let's call it dual flip for double the jam!
A good question is: what kind of advanced low power chiplet tech is on the table? There are a lot of different ways to do this afaict. My hope is Apple, MediaTek, Nvidia, Qualcomm find ways to do chiplets without utterly screwing idle power and active power.

Idk there’s so many d*mn packaging technologies from AMD’s cheap BS to Intel’s EMIB and Foveros Omni stuff.

But looking at Lunar Lake Intel apparently thinks the 4 tile stuff was too far — and apparently Panther Lake is more similar to Lunar Lake than MTL/ARL.

I think a CPU and GPU tile, with SOC and IO function siphoned into one of those, and on some kind of very very low power interposer might have a lot of potential going forward for mobile APUs or desktop. It’s the only way we’ll see some beefy M Pro/Max type setup from Nvidia or Intel etc.
Nah, I am pretty sure you don't need EUV to pattern the die-to-die interconnects. That is some aluminium stuff. It would be an Apple way of cheating the reticle limit by a bit.
 

FlameTail

Diamond Member
Dec 15, 2021
So the M3 Pro and Max ditched the 4 core CPU cluster and migrated to a 6 core CPU cluster.

4 cores/16 MB L2 -> 6 cores/24 MB L2

What are the pros/cons of this change?

For one, I guess single threaded performance would benefit as one core has access to 24 MB L2.

But then what is the effect on MT performance? The amount of L2 per core is the same in both the 4C/16 MB and 6C/24 MB clusters.
 

Doug S

Platinum Member
Feb 8, 2020
But then what is the effect on MT performance? The amount of L2 per core is the same in both the 4C/16 MB and 6C/24 MB clusters.

It would hurt MT performance, because you have six cores snooping on that cache rather than four. For parallel tasks that don't talk to each other (e.g. Cinebench-type stuff) it wouldn't make a difference, but where the cores are competing to access the same cache lines it would have a measurable impact.
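A crude way to see the contention effect is to model N cores each touching one of a small set of hot cache lines per step and counting how often two or more land on the same line. This is a toy illustration of sharing pressure, not a model of Apple's actual L2, and the number of hot lines is an arbitrary assumption:

```python
# Toy model of shared-cache contention: each core touches one of `hot_lines`
# cache lines per step; a step "conflicts" if two or more cores pick the same
# line and have to arbitrate. Illustrative only, not a model of Apple's L2.
import random
from collections import Counter

def conflict_rate(cores: int, hot_lines: int = 32, steps: int = 100_000) -> float:
    conflicts = 0
    for _ in range(steps):
        picks = Counter(random.randrange(hot_lines) for _ in range(cores))
        if any(n > 1 for n in picks.values()):
            conflicts += 1
    return conflicts / steps

for cores in (4, 6):
    print(f"{cores} cores sharing: {conflict_rate(cores):.0%} of steps conflict")
# Going from 4 to 6 sharers roughly doubles how often cores collide on hot lines.
```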
 

soresu

Platinum Member
Dec 19, 2014
I wonder why they have to call it backside and not underside.
Probably it's related to how they architect image sensors with 'backside illumination' for greater sensitivity because the wiring has moved from the front side (top) of the sensor stack to the back side (bottom).

Honestly, I'm surprised, given how old BSI sensor tech is, that it's taken semiconductor logic this long to start using the same technique.

Perhaps the benefits weren't as clear prior to GAA/MBCFET/Nanosheet/RibbonFET.
 
Reactions: igor_kavinski

smalM

Member
Sep 9, 2019
I wonder why they have to call it backside and not underside.
Depending on how the die is bonded, the underside may be the frontside or the backside of the die.

So the M3 Pro and Max ditched the 4 core CPU cluster and migrated to a 6 core CPU cluster.
They all use the same shared uncore logic; the M3 is just missing two cores.
I wouldn't be surprised if it's an 8-port cluster unit, so they are more flexible in their chip design.
 

FlameTail

Diamond Member
Dec 15, 2021
3,150
1,800
106
It would hurt MT performance, because you have six cores snooping on that cache rather than four. For parallel tasks that don't talk to each other (e.g. Cinebench-type stuff) it wouldn't make a difference, but where the cores are competing to access the same cache lines it would have a measurable impact.
In that case, I think Apple should have stuck with 4-core clusters. Comparing M3 vs M3 Pro/Max, the ST performance difference is minimal, which shows that the extra cache isn't helping that much.

But more importantly, it would have allowed them to do this:

M3: 4P + 4E
M3 Pro : 8P + 4E
M3 Max : 12P + 4E

How neat is that!
 

FlameTail

Diamond Member
Dec 15, 2021
Reactions: Mopetar and Eug