Discussion Apple Silicon SoC thread

Eug · Nov 10, 2020

M1
5 nm
Unified memory architecture - LP-DDR4
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-effiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core number). Basically, Apple is taking the same approach which these chips as they do the iPhones and iPads. Just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:

Page 78 - Discussion - Apple Silicon SoC thread

Page 78 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

M1 Ultra discussion here:

Page 109 - Discussion - Apple Silicon SoC thread

Page 109 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

M2 discussion here:

Page 127 - Discussion - Apple Silicon SoC thread

Page 127 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.264, ProRes

M3 Family discussion here:

Page 215 - Discussion - Apple Silicon SoC thread

Page 215 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

M4 Family discussion here:

Page 263 - Discussion - Apple Silicon SoC thread

Page 263 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

poke01 · Dec 7, 2024

gdansk said:
And yet it doesn't matter. No one but Apple has made a core competitive even with the flaming piles of x86 garbage. Weird for something so inherently better.

In some specific measurements it may have advantages but in the end SPECint still shows x86 is good enough despite all these weird limitations. And in the server market no one is even on the performance per watt level of Zen 5. Strange for something inherently better. The only one who will make ARM look great here is Apple.

Commodity ARM designers aren't doing any better than commodity x86 designers.

Are we just ignoring Qualcomm?

gdansk · Dec 7, 2024

poke01 said:
Are we just ignoring Qualcomm?

No, they simply haven't even beat HX 370 nor M3 except in penny pinching. They must be throwing away the inherent advantages hard.

poke01 · Dec 7, 2024

gdansk said:
No, they simply haven't even beat HX 370 nor M3 except in penny pinching. They must be throwing away the inherent advantages hard.

I'll wait till their X Elite 2 to see if their team is competent in the PC space

Doug S · Dec 7, 2024

FlameTail said:
A19 Pro
2P + 6E
8-core GPU

A19
2P + 4E
6-core GPU

Apple doesn't care about winning MT pissing contests on a phone, they aren't going to add more E cores. I'm also highly skeptical they can bump the number of GPU cores by 33%. What good does it do to add them if they will only be forced to thermal throttle more quickly?

Qualcomm can afford to be a bit more "aggressive" power wise with their phone SoCs because the Android OEMs do care about winning benchmark pissing contests and some of them offer phones tailored for gaming with more aggressive cooling solutions that can use more power to drive more CPU/GPU cores before they begin throttling.

FlameTail · Dec 7, 2024

Doug S said:
What good does it do to add them if they will only be forced to thermal throttle more quickly?

What is this silly statement?
More cores means a better efficiency curve

(A) 8 GPU cores @ 1.2 GHz
(B) 6 GPU cores @ 1.6 GHz

A and B delivery the same performance theoretically, but A will consume less power due to the nature of V/F curves.

Doug S said:
Qualcomm can afford to be a bit more "aggressive" power wise with their phone SoCs because the Android OEMs do care about winning benchmark pissing contests and some of them offer phones tailored for gaming with more aggressive cooling solutions that can use more power to drive more CPU/GPU cores before they begin throttling.

Agreed. Even ordinary flagship Android phones have sophisticated cooling solution with vapour chambers and the like.

FlameTail · Dec 7, 2024

poke01 said:
Are we just ignoring Qualcomm?

gdansk said:
No, they simply haven't even beat HX 370 nor M3 except in penny pinching. They must be throwing away the inherent advantages hard.

The efficiency of 1st generation Oryon CPU in the Snapdragon X Elite was actually subpar;

Worse than even Apple M1 in SPEC INT!

Then Qualcomm released the 2nd generation Oryon CPU in Snapdragon 8 Elite, just 5 months after X Elite debuted.

A massive 2x performance-per-watt uplift.

The node upgrade (N4P -> N3E) alone cannot explain this huge improvement. It means that much of this improvement is coming from architectural fixes. That validates speculation that there is some design flaw in the X Elite/1st gen Oryon, which is the reason for it's subpar efficiency. This may have been caused to due to rushed development (Qualcomm acquired Nuvia only 3 years ago, in 2021), and troubles due to the ARM lawsuit (court filings indicate that Qualcomm had to redesign the Phoenix core several times).

3rd party testing confirm the 2x PPW uplift;

Efficiency of 2nd gen Oryon CPU is comparable to Apple A17/M3 (slightly worse in INT, slightly better in FP).

So Qualcomm has the necessary technology in their arsenal to blow Intel/AMD out of the water (where efficiency is concerned).

Sadly, 2nd gen Oryon CPU won't be coming to laptops. But at the end of next year, Qualcomm will reveal X Elite Gen 2 with the 3rd generation Oryon CPU.

3rd gen Oryon CPU will build on the foundation of 2nd gen Oryon CPU, and scale performance and efficiency to new heights.

How will Intel/AMD compete against this?

poke01 · Dec 7, 2024

FlameTail said:
How will Intel/AMD compete against this?

simple AMD has nothing. Intel has Panther Lake

SpudLobby · Dec 8, 2024

FlameTail said:
Rumour: Apple M5 will be on TSMC 3nm

Apple Orders M5 Chips From TSMC Ahead of Late 2025 Production

Apple has ordered M5 chips from TSMC as the company begins production development of its next-generation processor for future devices, reports...

www.macrumors.com

going to be funny when the M5 is on N3P because it’s the right balance of volume and cost at the timetables they want, and Apple has like a + 5% frequency tweak and 2% IPC gain without pushing power up more (maybe E Core arch gains or something) but still stays so far ahead of Intel/AMD, while Qualcomm narrows the ST perf/W @ high perf gap more than anything else.

Pretty likely at this point IMO.

Not even dogging Apple here at all, because they’re still doing fantastic and we all know AMD and Intel will ship some sloppy tweak of their own and the deficits in energy, idle/PMIC management will remain (or even return for Lunar -> Panther).

But I seriously doubt we see another M3 -> M4 jump without N2P and another modest arch change. They won’t push power much beyond this.

And I do believe QC will narrow the gap massively.

FlameTail · Dec 8, 2024

Even with the minor node upgrade (N3E -> N3P), I think Apple engineers will try to achieve 5 GHz, just for the M5GHz memes

SpudLobby · Dec 8, 2024

FlameTail said:
Even with the minor node upgrade (N3E -> N3P), I think Apple engineers will try to achieve 5 GHz, just for the M5GHz memes

Uhh, doubt this. Will go ahead and bet against it.

M4 is already at about 10-11W total power (idle normal) in SpecInt measured properly by Geekerwan in the iPad or Mac Mini. From software readouts of just the CPU (again stupid way to cross-compare but potentially useful between generations of Apple) you can also tell they pushed power up with the CPU nearly doubling. Is it still the most efficient in the industry for the performance? Yes especially through the curve. But they don’t have room for this, biggest Mac frequency jump Apple has made was M2 to M3 when the node jump was there with some other tweaks. They did another 10% with M3 to M4 and while the A17 to A18 didn’t show the kind of blowup this did, there’s just no reason to believe a +5% frequency gain ISO power will result in Apple taking a 12-14% boost instead especially at this stage and given they don’t offer granular control over frequency and scheduling. It would make energy efficiency worse.

Also it really doesn’t matter what Apple does re: Qualcomm in laptops that much. If they reduce the gap but Apple has a + 10-15% perf iso power lead somehow still, it’s whatever. On a cluster map they’d be the same thing with the kind of numbers AMD and Intel put up.

DZero · Dec 8, 2024

FlameTail said:
How will Apple A19 Pro fare against this?

Apple Qualcomm
2021 A15 Bionic

5 cores
Family 8 8 Gen 1

1024 ALUs
Adreno 7
2022 A16 Bionic

5 cores
Family 8 8 Gen 2

1536 ALUs
Adreno 7
2023 A17 Pro

6 cores
Family 9 8 Gen 3

1536 ALUs
Adreno 8
2024 A18 Pro

6 cores
Family 9 8 Elite

1536 ALUs
Adreno 8
2025 A19 Pro

8 Elite Gen 2

2304 ALUs
Adreno 8

It seems likely that A19 will introduce Apple's next generation Family 10 GPU architecture.

(Speculation)

A19 Pro
2P + 6E
8-core GPU

A19
2P + 4E
6-core GPU

Oh boy, now there would be a marked difference between the Pro and non Pro models.

igor_kavinski · Dec 8, 2024

FlameTail said:
Even with the minor node upgrade (N3E -> N3P), I think Apple engineers will try to achieve 5 GHz, just for the M5GHz memes

I wonder if there's a paper about frequency vs. transistor performance that shows anything approaching 5+ GHz is just too much for any transistor to keep it from leaking electrons

SpudLobby · Dec 8, 2024

FlameTail said:
How will Apple A19 Pro fare against this?

Apple Qualcomm
2021 A15 Bionic

5 cores
Family 8 8 Gen 1

1024 ALUs
Adreno 7
2022 A16 Bionic

5 cores
Family 8 8 Gen 2

1536 ALUs
Adreno 7
2023 A17 Pro

6 cores
Family 9 8 Gen 3

1536 ALUs
Adreno 8
2024 A18 Pro

6 cores
Family 9 8 Elite

1536 ALUs
Adreno 8
2025 A19 Pro

8 Elite Gen 2

2304 ALUs
Adreno 8

It seems likely that A19 will introduce Apple's next generation Family 10 GPU architecture.

(Speculation)

A19 Pro
2P + 6E
8-core GPU

A19
2P + 4E
6-core GPU

This is way too detailed speculation lol. We can be reasonable about some inferences of course as we have been (see Oryon V2 improvement) but this is just fan fiction basically.

And neither of these will be broken or made by a 20% GPU performance difference between them at this stage anyway because QC’s is good now as is Immortalis (really good actually) and Apple is also still great but has superior software and support from developers. They’re all fairly and neck anyway on paper tests of efficiency where it counts and this will continue roughly speaking (give or take 20-30% some years I’m sure), but at this stage software support for hardware features will be the differentiator absent someone really falling off.

CPU is even less important at this stage at least between Apple and QC in phones given the gap is going to likely get so much narrower and no one is going to switch from an iPhone for better battery life from an Android phone that has an Oryon V3 and the usual Android bigger battery (though to be fair that would be a win). It would just make switching less icky for other reasons.

SpudLobby · Dec 8, 2024

Also I doubt they will throw 6 cores into the Pro SoC and make that the distinguishing feature on the CPU for a phone or tablet SoC SKU division. It’s antithetical to the whole enterprise, “more E core throughput” could very well improve efficiency if you can run more of a gluttony of background tasks at a more efficient speed instead of: faster with more context switching to compensate for having fewer cores, but the effect is likely to be negligible at this level of horsepower from Apple’s P & E cores and on iOS and it really depends on what modeling of iOS use cases look like. Eventually will they do it?

Yes of course, and probably for that reason and when any extra cluster power overhead of it is meaningless as well (which it probably has been now with node gains) to where it’s just yesteryear’s background performance at much lower power from node gains + more cores at even lower voltage.

But this sell isn’t really about throughput at all. As such unless the entire CPU changes too and they don’t share designs anymore, I really doubt we will see them add 2 more cores and advertise that in the Pro model for an iPhone. What are they going to do, show people GB6 MT curves?

SpudLobby · Dec 8, 2024

They’ll just do it for the whole lineup at once imo. I almost want to say we’re more likely to see Apple start artificially frequency binning for the Pro vs Non-Pro stuff than end up with like

Pro Mobile

2 big cores and 8 littles!

Regular

2 + 4 slop!!

The different caches are a bit more like the frequency binning than the extra e core idea in some sense tbh

SpudLobby · Dec 8, 2024

Btw. Do we have any good idea of why Apple did 8/12 MB L2/SLC for the A18, and 16/24MB L2/SLC for the A18 Pro that isn’t just “marketing” (most people don’t know that or even see the slide about it, it’s an entirely different die too).

The efficiency gain in SpecInt is minimal, like maybe + 5-6% performance/W or something iirc and inconsistently so.

But I wonder if under pressure and with more background stuff going in your working sets or fetched from DRAM to the SLC, you can start seeing more substantial efficiency differences (say twice that or more).

Alternatively, photography and videographers might be the key here.

It isn’t just the SLC for the GPU or whatever before someone says that, remember the L2 is doubled too.

The Hardcard · Dec 8, 2024

FlameTail said:
How will Intel/AMD compete against this?

Intel and AMD are worse, but not yet gapped by Qualcomm. I don’t see Oryon v3 as creating the needed gap. Remember, Intel and AMD don’t need to take anything. They have it. Qualcomm is the one who has to do all the taking. They have an even bigger challenge than Apple. Macs had several million sales per year before Apple Silicon.

Betamax was better than VHS. But they weren’t able to get the mainstream feeling that they were really missing out on anything.

Apple Silicon enabled all day thin and lights when thin and light x86 were hard pressed to last half a day. Everyone is at that barrier now. Apple and Qualcomm are chasing 2 day thin and light use. But that is not as impactful.

Apple Silicon offers fanless, silent computing. I noticed no mention of that yet with Oryon v3. While M4s with fans are bringing noise, that noise comes with near full size desktop performance. Oryon is just even on multicore with AMD. And probably Arrow Lake mobile will be about even as well.

In the AI battle, software stacks are more impactful than hardware grunt and I see Apple, Intel, and AMD pushing harder than Qualcomm, since they all have datacenter funding to compliment the building of their AI stack, which Qualcomm doesn’t have.

The Hardcard · Dec 8, 2024

SpudLobby said:
Btw. Do we have any good idea of why Apple did 8/12 MB L2/SLC for the A18, and 16/24MB L2/SLC for the A18 Pro that isn’t just “marketing” (most people don’t know that or even see the slide about it, it’s an entirely different die too).

The efficiency gain in SpecInt is minimal, like maybe + 5-6% performance/W or something iirc and inconsistently so.

But I wonder if under pressure and with more background stuff going in your working sets or fetched from DRAM to the SLC, you can start seeing more substantial efficiency differences (say twice that or more).

Alternatively, photography and videographers might be the key here.

It isn’t just the SLC for the GPU or whatever before someone says that, remember the L2 is doubled too.

Video and photography are the key. It’s not about CPU performance. Both phone chips have more SLC than the M4. What can their CPUs do better than the M4?

SpudLobby · Dec 8, 2024

The Hardcard said:
Video and photography are the key. It’s not about CPU performance. Both phone chips have more SLC than the M4. What can their CPUs do better than the M4?

But they have double the L2 as well. 8 to 16MB.

johnsonwax · Dec 8, 2024

FlameTail said:
The efficiency of 1st generation Oryon CPU in the Snapdragon X Elite was actually subpar;

Worse than even Apple M1 in SPEC INT!

Then Qualcomm released the 2nd generation Oryon CPU in Snapdragon 8 Elite, just 5 months after X Elite debuted.

A massive 2x performance-per-watt uplift.

The node upgrade (N4P -> N3E) alone cannot explain this huge improvement. It means that much of this improvement is coming from architectural fixes. That validates speculation that there is some design flaw in the X Elite/1st gen Oryon, which is the reason for it's subpar efficiency. This may have been caused to due to rushed development (Qualcomm acquired Nuvia only 3 years ago, in 2021), and troubles due to the ARM lawsuit (court filings indicate that Qualcomm had to redesign the Phoenix core several times).

3rd party testing confirm the 2x PPW uplift;

Efficiency of 2nd gen Oryon CPU is comparable to Apple A17/M3 (slightly worse in INT, slightly better in FP).

So Qualcomm has the necessary technology in their arsenal to blow Intel/AMD out of the water (where efficiency is concerned).

Sadly, 2nd gen Oryon CPU won't be coming to laptops. But at the end of next year, Qualcomm will reveal X Elite Gen 2 with the 3rd generation Oryon CPU.

3rd gen Oryon CPU will build on the foundation of 2nd gen Oryon CPU, and scale performance and efficiency to new heights.

How will Intel/AMD compete against this?

Didn't X Elite perform far worse at launch than Qualcomms early promises suggested? If I'm remembering that correctly, why would anyone put stock in any future promises?

johnsonwax · Dec 8, 2024

The Hardcard said:
Intel and AMD are worse, but not yet gapped by Qualcomm. I don’t see Oryon v3 as creating the needed gap. Remember, Intel and AMD don’t need to take anything. They have it. Qualcomm is the one who has to do all the taking. They have an even bigger challenge than Apple. Macs had several million sales per year before Apple Silicon.

Betamax was better than VHS. But they weren’t able to get the mainstream feeling that they were really missing out on anything.

Apple Silicon enabled all day thin and lights when thin and light x86 were hard pressed to last half a day. Everyone is at that barrier now. Apple and Qualcomm are chasing 2 day thin and light use. But that is not as impactful.

Apple Silicon offers fanless, silent computing. I noticed no mention of that yet with Oryon v3. While M4s with fans are bringing noise, that noise comes with near full size desktop performance. Oryon is just even on multicore with AMD. And probably Arrow Lake mobile will be about even as well.

In the AI battle, software stacks are more impactful than hardware grunt and I see Apple, Intel, and AMD pushing harder than Qualcomm, since they all have datacenter funding to compliment the building of their AI stack, which Qualcomm doesn’t have.

I would qualify that somewhat as it appears that Microsoft is at least aspirationally invested in ARM as a platform they have more influence in. They wanted AI PCs and ARMs is how they delivered on that. As such, they don't need a performance gap as they have a control gap.

That was the same reasoning for Apple, btw. They didn't need to be faster, they just needed the ability to make the thing they actually needed, not the thing that Intel wanted to build.

Meteor Late · Dec 8, 2024

johnsonwax said:
Didn't X Elite perform far worse at launch than Qualcomms early promises suggested? If I'm remembering that correctly, why would anyone put stock in any future promises?

The only promise here is that next X Elite chip will use 3rd gen Oryon, no other promise beyond that.

Gideon · Dec 8, 2024

johnsonwax said:
Didn't X Elite perform far worse at launch than Qualcomms early promises suggested? If I'm remembering that correctly, why would anyone put stock in any future promises?

They failed to reach their targets (just like RDNA 3 did). Happens with new architectures (AMD also fixed loads of stuff with Zen+ and Zen 2)
But Oryon V2 pretty much fixes that. It's all but guaranteed V3 delivers much better than V1 in that regard

Doug S · Dec 8, 2024

SpudLobby said:
Btw. Do we have any good idea of why Apple did 8/12 MB L2/SLC for the A18, and 16/24MB L2/SLC for the A18 Pro that isn’t just “marketing” (most people don’t know that or even see the slide about it, it’s an entirely different die too).

The efficiency gain in SpecInt is minimal, like maybe + 5-6% performance/W or something iirc and inconsistently so.

Yes. Cost.

If you're going to do a different die on the exact same process, the only reason for it is cost since the non-Pro models sell for less than the Pro models. And don't forget the A18 'non'-P is going into the SE next spring, which will have a significantly lower price than the base iPhone 16.

The SE coming out next spring is probably the only reason they did separate dies for A18. It makes sense when they're jumping to a new process in the Pro line (like when they go N2 for Pro, the non-Pro will still be N3) but otherwise it is a fairly small difference. At least by chopping down the cache they recover some area, which is nowadays taking up a constant percentage of the die area instead of shrinking like cache did before.

mvprod123 · Dec 8, 2024

M4 Max core-to-core latency

https://twitter.com/x/status/1865783520217432487

https://twitter.com/x/status/1865787843659899052

	Apple	Qualcomm
2021	A15 Bionic 5 cores Family 8	8 Gen 1 1024 ALUs Adreno 7
2022	A16 Bionic 5 cores Family 8	8 Gen 2 1536 ALUs Adreno 7
2023	A17 Pro 6 cores Family 9	8 Gen 3 1536 ALUs Adreno 8
2024	A18 Pro 6 cores Family 9	8 Elite 1536 ALUs Adreno 8
2025	A19 Pro	8 Elite Gen 2 2304 ALUs Adreno 8

	Apple	Qualcomm
2021	A15 Bionic 5 cores Family 8	8 Gen 1 1024 ALUs Adreno 7
2022	A16 Bionic 5 cores Family 8	8 Gen 2 1536 ALUs Adreno 7
2023	A17 Pro 6 cores Family 9	8 Gen 3 1536 ALUs Adreno 8
2024	A18 Pro 6 cores Family 9	8 Elite 1536 ALUs Adreno 8
2025	A19 Pro	8 Elite Gen 2 2304 ALUs Adreno 8

Discussion Apple Silicon SoC thread

Lifer

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Golden Member

Diamond Member

Golden Member

Senior member

Lifer

Golden Member

Golden Member

Golden Member

Golden Member

Senior member

Senior member

Golden Member

Member

Member

Senior member

Golden Member

Diamond Member

Senior member