Discussion Apple Silicon SoC thread

Eug · Nov 10, 2020

M1
5 nm
Unified memory architecture - LP-DDR4
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-effiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core number). Basically, Apple is taking the same approach which these chips as they do the iPhones and iPads. Just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:

Page 78 - Discussion - Apple Silicon SoC thread

Page 78 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

M1 Ultra discussion here:

Page 109 - Discussion - Apple Silicon SoC thread

Page 109 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

M2 discussion here:

Page 127 - Discussion - Apple Silicon SoC thread

Page 127 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.264, ProRes

M3 Family discussion here:

Page 215 - Discussion - Apple Silicon SoC thread

Page 215 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

M4 Family discussion here:

Page 263 - Discussion - Apple Silicon SoC thread

Page 263 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

forums.anandtech.com

Eug · Oct 20, 2021

Red_m said:
Apple likely did not want to push clocks as that more heat and more power draw for minimal speed increase.

Keep in mind thats OpenCL. OpenCL had been depreciated in macOS and is rotting.

Metal benchmarks are the ones we need to look out for.

OK, I found one M1 Max Metal score. There are no others, so I don't know how representative this is:

MacBookPro18,2 - Geekbench

Benchmark results for a MacBookPro18,2 with an Apple M1 Max processor.

browser.geekbench.com

M1 non-Pro Mac mini gets around 21000-22000.

Doug S · Oct 20, 2021

Red_m said:
Apple likely did not want to push clocks as that more heat and more power draw for minimal speed increase.

Plus they didn't need the extra speed to reach their launch goal, which was to beat all the PC laptop competition and blow away the existing x86 Macbook Pro models.

Apple has always highly valued power efficiency. They switched from PPC to x86 over it, when PPC became more workstation focused once IBM was the only serious chipmaker in that market. They had to pressure Intel at the CEO level for a line of lower TDP laptop CPUs to allow them to develop Macbook Air.

Squeezing out a bit more performance at the cost of a louder fan, hotter lap and shorter battery life isn't worth it to them. For what, so it could beat that 10 pound DTR that requires two power supplies some dork was whining about being a little faster in the comments on Andrei's M1 Pro/Max article?

Roland00Address · Oct 21, 2021

igor_kavinski said:
Apple is in a truly unique position to launch a Nintendo Switch killer with 20 hours battery life.

The Nvidia chip in the Nintendo switch is 118 mm² on TSMC 20nm (it has since gotten a die shrink to 16nm) with only 2 billion transistors.

The Apple M1, not the M1Pro or M1Max is 119 mm² on TSMC 5nm EUV process with 33.7 billion transistors. I hope Apple with 16x the transistor budget and 5 years plus 10 months of process shrinks can make a much better chip.

Regardless with 5nm process and this die size, even with 100% yields (which the yields will not be this high) Apple will not be wanting to ship a $200 dollar device. Maybe in a few years but not now. When Nintendo was finally shipping Nintendo Switches (2017, the chip came out in 2015) Apple was shifting from 16nm to 10nm.

insertcarehere · Oct 21, 2021

tempestglen said:
New GFX bench scores: Aztec Ruins High Tier off screen 310 fps! closed to 3080 mobile!

View attachment 51689

I do wonder what Apple's motivation is for integratimg such a powerful GPU in the SoC. It's clear that gaming isn't a market they care a lot about, so presumably any GPU would be for acceleration purposes in applications. A lot of the same applications could also (theroetically) be served well with smaller dedicated fixed-function units.

Eug · Oct 21, 2021

insertcarehere said:
I do wonder what Apple's motivation is for integratimg such a powerful GPU in the SoC. It's clear that gaming isn't a market they care a lot about, so presumably any GPU would be for acceleration purposes in applications. A lot of the same applications could also (theroetically) be served well with smaller dedicated fixed-function units.

For example, After Effects runs natively on M1 now. Same with Premiere Pro. Same with Blender. Etc. These support Metal.

Apple joins Blender Development Fund — blender.org

Apple has joined the Blender Development Fund as a Patron Member to support continued core development for Blender.

www.blender.org

And of course there are Apple's own applications like Final Cut.

igor_kavinski · Oct 21, 2021

https://www.facebook.com/permalink.php?story_fbid=2146412825593223&id=100006735798590

Seems to me that Jobs gathered up like minded people at Apple, or at least, Tim Cook has enough of Jobs in him to disregard gaming as important. Maybe the only reason they have a potent GPU is so it looks good in benchmarks and avoids the embarrassment of an Apple product being at the bottom of a chart comparing it with other devices. I really wish Carmack would get obsessed with creating something groundbreaking that really works well on the M1. That would get other major developers on the bandwagon.

Pix12 · Oct 21, 2021

Eug said:
OK, I found one M1 Max Metal score. There are no others, so I don't know how representative this is:

MacBookPro18,2 - Geekbench

Benchmark results for a MacBookPro18,2 with an Apple M1 Max processor.

browser.geekbench.com

View attachment 51690

M1 non-Pro Mac mini gets around 21000-22000.

Seeing this and the other results, I wonder whether what we are seeing is the M1 Max with 24 cores, at least in Geekbench The numbers would suggest so.

Test	M1 8 cores	M1 Max (24 cores?)
Geekbench (OpenCL)	~20'000	~60'000 (~x3)
Geekbench (Metal)	~22'000	~68'800 (~x3)

It is true that Geekbench in the OpenCL test lists the M1 Max as having 32 "compute units", but could very well be a software mistake, especially since if we check the M1 results there is no "7 compute units" (please correct me if I'm wrong).

MB Air (probably 7 GPU cores): https://browser.geekbench.com/v5/compute/3558391

MB Air (probably 8 GPU cores): https://browser.geekbench.com/v5/compute/3558359

But they are all shown as 8 core units.

So my guess is that these scores are from a 24 core M1 Max GPU, which would indicate really good scaling. If this is true, we should get a 80'000-ish result for OpenCL and 90'000-ish Metal score soonish.

Eug · Oct 21, 2021

Pix12 said:
Seeing this and the other results, I wonder whether what we are seeing is the M1 Max with 24 cores, at least in Geekbench The numbers would suggest so.

Test M1 8 cores M1 Max (24 cores?)
Geekbench (OpenCL) ~20'000 ~60'000 (~x3)
Geekbench (Metal) ~22'000 ~68'800 (~x3)

It is true that Geekbench in the OpenCL test lists the M1 Max as having 32 "compute units", but could very well be a software mistake, especially since if we check the M1 results there is no "7 compute units" (please correct me if I'm wrong).

MB Air (probably 7 GPU cores): https://browser.geekbench.com/v5/compute/3558391

MB Air (probably 8 GPU cores): https://browser.geekbench.com/v5/compute/3558359

But they are all shown as 8 core units.

So my guess is that these scores are from a 24 core M1 Max GPU, which would indicate really good scaling. If this is true, we should get a 80'000-ish result for OpenCL and 90'000-ish Metal score soonish.

That’s what people on other sites are saying, but as I said elsewhere, that assumes three things:

1. Apple sent out gimped 24-core models to reviewers for one of the Mac’s most important product launches in computing history
2. Geekbench is misidentifying 24-core chips as 32-core
3. 24-core is scaling better than 300% vs 8-core at the same clock speed.

I wouldn’t say that’s impossible but it does seem unlikely. But we shall see.

igor_kavinski said:
Maybe the only reason they have a potent GPU is so it looks good in benchmarks and avoids the embarrassment of an Apple product being at the bottom of a chart comparing it with other devices.

Uh what?!?

Pix12 · Oct 21, 2021

Eug said:
That’s what people on other sites are saying, but as I said elsewhere, that assumes three things:

1. Apple sent out gimped 24-core models to reviewers for one of the Mac’s most important product launches in computing history
2. Geekbench is misidentifying 24-core chips as 32-core
3. 24-core is scaling better than 300% vs 8-core at the same clock speed.

I wouldn’t say that’s impossible but it does seem unlikely. But we shall see.

1. is true, I have no explanation for this. Unless these are some guidelines for some reason ("saving the best for last"), or these scores are from Apple themselves. But I clearly have no idea.
2. As I said in my post, this happened for the 8-core M1 and 7-core M1 as well. At least I did not find a result with M1 reported as having 7 "compute units".
3. 66k vs 68k is clearly not very significantly better, especially with a so little sample size.

But yeah, I agree, the first point is kinda suspicious. The rest I see them as non-issues tbh.

igor_kavinski · Oct 21, 2021

Eug said:
Uh what?!?

Why would someone go to the trouble of designing an awesome GPU with first-of-its-kind 512-bit LPDDR5 and then not proudly show it off with a AAA gaming title? It's like having the most beautiful chick in the world as your wife and NOT sleeping with her! Who else in the industry has a powerful GPU specifically not marketed for gaming?

insertcarehere · Oct 21, 2021

Eug said:
3. 24-core is scaling better than 300% vs 8-core at the same clock speed.

Its 3x the compute resources but ~6x the bandwidth going from a M1 to a (gimped) M1 Max, is the geekbench metal benchmark reliant on memory bandwidth?

igor_kavinski said:
Maybe the only reason they have a potent GPU is so it looks good in benchmarks and avoids the embarrassment of an Apple product being at the bottom of a chart comparing it with other devices.

Given Tim Cook is accountable to public shareholders in an immensely valuable company, somehow I doubt that "embarrassment avoidance" is a reason that'd sit well with the board when the decision will have such significant cost implications, both in manufacturing and packaging, down the line.

IntelUser2000 · Oct 21, 2021

insertcarehere said:
I do wonder what Apple's motivation is for integratimg such a powerful GPU in the SoC.

The scores are very good, but you gotta divide those numbers by 1.3 or so because the mobile parts have advantages of running on FP16 on those benchmarks.

That's why you see the M1 beating the 560X to a pulp in GFXBench but losing slightly to it in Rise of the Tomb Raider.

For CPU, Alderlake on the mobile parts will also be significantly faster than Tigerlake and come pretty close to the M1 Pro and Max.

jeanlain · Oct 21, 2021

Doug S said:
I think everyone is reading way too much into benchmarks with a sample size of 1 on zero day hardware.

We now have 3 scores, all close to 60000, for the M1 Max with 32 "compute units".
Its 1.5x better than the M1 Pro score, which is 2x better than the M1 score.
It's puzzling.

EDIT: if you compare subtests, there is at least one that does not show any improvement from M1 Pro to M1 Max. It seems there's a problem somewhere.

jeanlain · Oct 21, 2021

IntelUser2000 said:
The scores are very good, but you gotta divide those numbers by 1.3 or so because the mobile parts have advantages of running on FP16 on those benchmarks.

The M1 doesn't have double FP16 rate. It's the same rate as FP32. That's a difference with the A14. The A15 is like the M1 on that regards.

IntelUser2000 · Oct 21, 2021

jeanlain said:
The M1 doesn't have double FP16 rate. It's the same rate as FP32. That's a difference with the A14. The A15 is like the M1 on that regards.

Ok. I stand corrected.

Still you see that gap. Anandtech's test also shows that GFXBench doesn't use the CPU core, which is unrealistic in an actual game. There are mobile specific optimizations done that AMD and Nvidia won't bother with, and something Intel did(at least back in those days) when they were trying to enter the mobile space. The Bay Trail graphics were doing noticeably better in mobile benches than they did on games compared to Ivy Bridge graphics despite the same base uarch.

igor_kavinski · Oct 21, 2021

insertcarehere said:
Given Tim Cook is accountable to public shareholders in an immensely valuable company, somehow I doubt that "embarrassment avoidance" is a reason that'd sit well with the board when the decision will have such significant cost implications, both in manufacturing and packaging, down the line.

Why isn't the same board questioning Tim Cook about his lack of interest in promoting their hardware as a viable gaming platform to enable them to gain a sizeable slice of the $100 billion gaming pie? They are already an entertainment company with Apple TV+. How come their definition of entertainment not involve videogames? Casual games don't count because they can work great even without hardware accelerated 3D and people play them for their addictiveness, not their realistic graphics.

nxre · Oct 21, 2021

I doubt they would send a 24 core part for reviewers unless they are having some big problem with the yields on the 32 core part. The scaling from M1s 8-core to M1s Pro 16-core seems almost perfect based on those scores. It's just the 16-core to 32-core that is less than expected(and advertised).

Pix12 · Oct 21, 2021

nxre said:
I doubt they would send a 24 core part for reviewers unless they are having some big problem with the yields on the 32 core part. The scaling from M1s 8-core to M1s Pro 16-core seems almost perfect based on those scores. It's just the 16-core to 32-core that is less than expected(and advertised).

Honestly, while I also think this is strange, to me it's much more likely to be the case than having a terrible scaling of 3x (instead of 4x) AND having Apple advertise it. I understand the scaling cannot always be perfect, I would have expected 3.6x-3.8x or something in that neighborhood. At this point, it would make no sense to advertise 4x instead of 3x.

My money is still on the fact that this is the 24 cores, the numbers match too well. But we will see very soon! As a tech nerd, I'm just too excited.

tempestglen · Oct 21, 2021

insertcarehere said:
I do wonder what Apple's motivation is for integratimg such a powerful GPU in the SoC. It's clear that gaming isn't a market they care a lot about, so presumably any GPU would be for acceleration purposes in applications. A lot of the same applications could also (theroetically) be served well with smaller dedicated fixed-function units.

I guess the answer is VR, especially when you install chipest on the headset, it must be efficient and high performance. Assume Oculus Quest 2 with M1 pro...

tempestglen · Oct 21, 2021

igor_kavinski said:
Why would someone go to the trouble of designing an awesome GPU with first-of-its-kind 512-bit LPDDR5 and then not proudly show it off with a AAA gaming title? It's like having the most beautiful chick in the world as your wife and NOT sleeping with her! Who else in the industry has a powerful GPU specifically not marketed for gaming?

Yes, Apple is going to sleep with VR.

The Hardcard · Oct 21, 2021

IntelUser2000 said:
The scores are very good, but you gotta divide those numbers by 1.3 or so because the mobile parts have advantages of running on FP16 on those benchmarks.

That's why you see the M1 beating the 560X to a pulp in GFXBench but losing slightly to it in Rise of the Tomb Raider.

For CPU, Alderlake on the mobile parts will also be significantly faster than Tigerlake and come pretty close to the M1 Pro and Max.

FP16 was the result of benchmarks based in OpenGL ES. The GPUs had FP32. All Mac Metal benchmarks are FP32.

Semel · Oct 21, 2021

Well, it's an impressive piece of hardware, and I have no doubt it will be great for video editing and such..however... gaming will still suck... unfortunately..

And there are two major reasons for that.

1) Metal API.

There are two major APIs out there: DX12 and Vulkan.

They are both low-level API so they much better utilize hardware than the old and abandoned OpenGL and the not so abandoned but dying out DX11.

DX12 works only on Windows and it's proprietary., so it's not really an option for any OS other than Windows

Vulkan supports all major OS and is cross-platform. And it supports all major features of DX12 if not more.

Thanx to Vulkan Linux users can now use Steam Proton or Wine+dxvk and play 80+% of all native Windows games on Linux with little to no performance loss. And Valve is pushing hard for Linux (or rather Linux+Proton) support especially now with their Steam Deck.

Apple decided not to adopt Vulkan and invented their own proprietary API- Metal.

It's worth noting though that Metal was released several years earlier than Vulkan's first release. However, later on nothing prevented Apple from embracing Vulkan when it got much better than Metal especially after seeing what wonders it did for Linux gaming, but Apple just didn't..

Keeping going Metal route instead of Vulkan was a huge mistake.

I'll let myself quote one guy from Reddit:

Metal is missing much of the newer functionality available in Vulkan and DX12. Vulkan isnt the easiest to kick-off in but it's one of the best APIs out there for multicore + added Ray tracing support now. Now mind you MoltenVK doesn't support any of these RT extensions as you can't really translate BVH intersections from one api to another on the fly (it's an intense process anyways). So could Apple put some skin in the game and add vulkan support ? Maybe. Will they do it to kill off their own Metal? eff no. So unless they add hardware Ray tracing accelerators to their Silicon, there is no effin way software ray tracing will render 40-60 fps on M1, M2, whatever in the next few years.

2) They decided to make a drastic move and change architecture from x86 to ARM.

Metal API was issue enough but now ARM? Developers of console\PC games just don't want to bother with this. The tiny market of Macbook fans who can afford a $4-5k laptop and even tinier % of these people who wanna play AAA games on these $4-5k laptops just isn't worth the hassle..

No wonder we didn't see any REAL GPU intensive applications tested..

And another thing... You can't compare TFLOPS directly when using different architecture. It's nonsensical.
3090 - 35.6 TFLOPS, 6900xt -20.6 TFLOPS. Almost 40% difference yet the real performance difference is 3-10% depending on a game\task.

igor_kavinski · Oct 21, 2021

Trail Redefines the Future of Gaming with Zero-friction Gaming Platform for the Browser | Business Wire

We need more competition in this space to make games and even applications, truly cross platform, with the browser essentially becoming the OS for all our needs, running on top of whatever OS the computer manufacturer chooses due to available drivers and other factors etc. I wouldn't mind a game running at 3 year old speeds on the latest hardware with 3 year old graphics fidelity as long as I can play that game essentially anywhere a browser is able to run in a performant manner.

Mopetar · Oct 21, 2021

Eug said:
???

The M1 Mac mini has the same clock speed as the M1 MacBook Pro and even M1 MacBook Air. Same goes for the 24" iMac.

All are 3.2 GHz. In fact, even the iPad Pro M1 is 3.2 GHz.

For some reason I had thought it had a higher clock speed, but after looking at it I was clearly wrong about that. Perhaps I was thinking of the different binning between the two in terms of GPU cores or something.

Thanks for the clarification.

Heartbreaker · Oct 21, 2021

Mopetar said:
For some reason I had thought it had a higher clock speed, but after looking at it I was clearly wrong about that. Perhaps I was thinking of the different binning between the two in terms of GPU cores or something.

Thanks for the clarification.

The only performance difference between any of the M1 machines, is that the fanless MBA, will heat up and throttle at higher loads. Otherwise they all perform the same.

Discussion Apple Silicon SoC thread

Lifer

Lifer

Diamond Member

Platinum Member

Senior member

Lifer

Lifer

Junior Member

Lifer

Junior Member

Lifer

Senior member

Elite Member

Member

Member

Elite Member

Lifer

Member

Junior Member

Member

Member

Senior member

Junior Member

Lifer

Diamond Member

Diamond Member