Discussion Apple Silicon SoC thread

Page 97

Eug

Lifer
Mar 11, 2000
23,752
1,285
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as they do with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:



M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, HEVC, ProRes

M3 Family discussion here:


M4 Family discussion here:

 

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136

If you read into the details, it's down to the fact that the M1 has a unified memory architecture. For this workflow, not shuttling data back and forth across the PCIe bus is a huge win. This is what AMD have been saying since the Llano days.

Hopefully this will light a fire under PC manufacturers to take integrated graphics more seriously, and for AMD and Intel to actually bring those products to market.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
If you read into the details, it's down to the fact that the M1 has a unified memory architecture. For this workflow, not shuttling data back and forth across the PCIe bus is a huge win.

Still rather amazing to be beating such a total monster of a dGPU

This is what AMD have been saying since the Llano days

Hopefully this will light a fire under PC manufacturers to take integrated graphics more seriously, and for AMD and Intel to actually bring those products to market.

It seems really, really hard though. AMD have been making the console chips for a while now so there's obvious theoretical capability but still nothing 'real' coming out of it. Never mind getting the software to play along.
 

Hitman928

Diamond Member
Apr 15, 2012
5,603
8,806
136
If you mis-remember then it is your duty to remember it properly and not make misleading claims.


These are the flags used by Andrei:

Code:
-Ofast -fomit-frame-pointer
-march=x86-64
-mtune=core-avx2
-mfma -mavx -mavx2

Unless something has changed within the last year or two, for x264 those flags won't turn on SSE/AVX. If you turn off ASM then there is no code path to enable SSE/AVX; those instruction sets are built into the ASM code path and are left out if ASM is disabled, no matter how your other flags are set. So no, what I said the second time wasn't misleading. You can download and compile the software yourself and check if you want to.

Edit: Most likely those flags are fine for the rest of the software but I'm not as familiar with all of the sub-test software so I didn't want to speak on them specifically and don't know what specific flags/optimizations may be used in the release builds versus Anandtech's builds.
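
A rough way to run that check yourself, sketched in Python: disassemble the built binary (for example with `objdump -d`) and count the lines that touch SIMD registers. The function name `count_simd_lines` and the sample disassembly below are made up for illustration and are not taken from any actual x264 build. Note that x86-64 compilers use xmm registers for ordinary scalar floating-point math too, so only the ymm count is a reasonably strong hint that AVX code was emitted.

```python
import re

# Register operands as they appear in disassembly text (AT&T "%ymm0" or Intel "ymm0").
YMM = re.compile(r"\bymm\d+\b")
XMM = re.compile(r"\bxmm\d+\b")


def count_simd_lines(disassembly: str) -> dict:
    """Count disassembly lines that reference xmm or ymm registers.

    A line that uses a ymm register is counted once under "ymm";
    otherwise a line that uses an xmm register is counted under "xmm".
    """
    counts = {"xmm": 0, "ymm": 0}
    for line in disassembly.splitlines():
        if YMM.search(line):
            counts["ymm"] += 1
        elif XMM.search(line):
            counts["xmm"] += 1
    return counts


# Hypothetical three-line excerpt in objdump's default (AT&T) syntax.
sample = """\
  401000: c5 fc 58 c1    vaddps %ymm1,%ymm0,%ymm0
  401004: 0f 58 c1       addps  %xmm1,%xmm0
  401007: 48 01 d8       add    %rbx,%rax
"""

print(count_simd_lines(sample))
```

If the ymm count stays at zero with ASM disabled, that build contains no 256-bit AVX code; a non-zero count would point to compiler auto-vectorization.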
 

Hitman928

Diamond Member
Apr 15, 2012
5,603
8,806
136
I am not cherry picking. I am quoting Andrei Frumusanu's general assessment of the competitive landscape regarding power and performance efficiency. Has it occurred to you that people like him are able to weigh benchmark results not only based on some average means but also by their technical knowledge and experience? And has it occurred to you that you are the one cherry picking when he clearly states that CB is a negative outlier for the M1 Max?
The amount of ignorance you are showing tells me that you are not really interested in a grown-up discussion but instead will try to grasp at straws until the end of days.

Appeal to authority and ad hominem attacks don't mean much to me.

There is a lot I could discuss about why you are cherry picking, but I'll just stick with the initial argument and ask where are Andrei's numbers for Cezanne showing the M1 with a 2.5-3x efficiency lead since that is what was actually being discussed?
 
Jul 27, 2020
17,956
11,706
116

A guy with his own renderer and some pure CPU power comparisons. Unfortunately, it includes few modern CPUs; however, the comparisons against the Xeon W-3245, i7 9750, and Threadripper 3990X are very interesting.
The battery results for M1 Max are excellent. Only at 4K rendering does it slow down slightly when battery powered.
 

Hitman928

Diamond Member
Apr 15, 2012
5,603
8,806
136

A guy with his own renderer and some pure CPU power comparisons. Unfortunately, it includes few modern CPUs; however, the comparisons against the Xeon W-3245, i7 9750, and Threadripper 3990X are very interesting.

Really impressive showing. I do wish he had something that is an actual modern competitor, but it is a great result nonetheless.
 

biostud

Lifer
Feb 27, 2003
18,401
4,965
136
The 5950X is ~10bn transistors on 12nm I/O + 7nm CCX, and a 3080 is ~28bn on 8nm Samsung; the M1 Max is 57bn transistors on 5nm TSMC. Wouldn't it be pretty bad if it wasn't far more powerful and efficient than a 5950X+3080? Also, how can Apple create such a huge monolithic chip? I wonder how many they have to scrap.
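
Putting the quoted figures side by side, as a quick sketch (the numbers are the approximate ones from this post, not independently checked):

```python
# Approximate transistor counts in billions, as quoted in the post above.
ryzen_5950x_bn = 10
rtx_3080_bn = 28
m1_max_bn = 57

# Combined budget of the CPU + discrete GPU pairing.
combined_bn = ryzen_5950x_bn + rtx_3080_bn

# How much larger the M1 Max transistor budget is than the pairing.
ratio = m1_max_bn / combined_bn

print(f"5950X + 3080: ~{combined_bn}bn, M1 Max: {m1_max_bn}bn, ratio: {ratio:.2f}x")
```

On those figures the M1 Max has about one and a half times the transistor budget of the pairing, on a newer node.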
 

The Hardcard

Member
Oct 19, 2021
132
183
86
Really impressive showing. I do wish he had something that is an actual modern competitor, but it is a great result nonetheless.

He said it’s his hobby renderer. He could probably make a few bucks if he wanted by allowing other people to run it with those scenes. They are several orders of magnitude more complex than Cinebench, a real test of what current and near future architectures can do with rendering.
 

biostud

Lifer
Feb 27, 2003
18,401
4,965
136
Appeal to authority and ad hominem attacks don't mean much to me.

There is a lot I could discuss about why you are cherry picking, but I'll just stick with the initial argument and ask where are Andrei's numbers for Cezanne showing the M1 with a 2.5-3x efficiency lead since that is what was actually being discussed?
Especially how close the Threadripper was to the m1 in energy efficiency in 4k rendering.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
The 5950X is ~10bn transistors on 12nm I/O + 7nm CCX, and a 3080 is ~28bn on 8nm Samsung; the M1 Max is 57bn transistors on 5nm TSMC. Wouldn't it be pretty bad if it wasn't far more powerful and efficient than a 5950X+3080? Also, how can Apple create such a huge monolithic chip? I wonder how many they have to scrap.

It's a sea of SRAM and GPU units that are pretty much redundant. Being able to sell cutdown versions works wonders for yield as well.
 

Viknet

Junior Member
Nov 14, 2020
9
10
51
Especially how close the Threadripper was to the m1 in energy efficiency in 4k rendering.
But this Threadripper 3990X has similar energy efficiency only at 70% of the M1 Max's per-core performance.
The Apple M1 in the MacBook Air loses about 20% when throttling from 15W to 7W, so I would expect the M1 Max to be 2-3x more efficient than the Threadripper at the same per-core performance.
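
The arithmetic behind the throttling figure, as a small sketch: if performance drops about 20% while power drops from 15 W to 7 W, performance per watt goes up by roughly 1.7x. The helper name `relative_efficiency` is made up, and the inputs are just the rough numbers quoted above.

```python
def relative_efficiency(perf_fraction: float, power_before_w: float, power_after_w: float) -> float:
    """Performance per watt after throttling, relative to before throttling.

    perf_fraction is the share of the original performance that remains
    (0.80 means a 20% loss).
    """
    power_fraction = power_after_w / power_before_w
    return perf_fraction / power_fraction


# M1 in the MacBook Air, per the post: ~20% slower going from 15 W to 7 W.
gain = relative_efficiency(0.80, 15.0, 7.0)
print(f"perf/W gain when throttled: {gain:.2f}x")
```

This only covers the M1's own gain from running at lower power; the 2-3x expectation for the M1 Max against the Threadripper is a further extrapolation.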
 

naukkis

Senior member
Jun 5, 2002
782
636
136
Unless something has changed within the last year or two, for x264 those flags won't turn on SSE/AVX. If you turn off ASM then there is no code path to enable SSE/AVX; those instruction sets are built into the ASM code path and are left out if ASM is disabled, no matter how your other flags are set. So no, what I said the second time wasn't misleading. You can download and compile the software yourself and check if you want to.

Edit: Most likely those flags are fine for the rest of the software but I'm not as familiar with all of the sub-test software so I didn't want to speak on them specifically and don't know what specific flags/optimizations may be used in the release builds versus Anandtech's builds.

That's because it is SPEC. SPEC is meant to benchmark differences in hardware, not in software, so they offer source code without hand-tuned assembly. And no, nobody could use any hand-tuned assembly in any subset of the SPEC tests. The compiler is, though, free to auto-vectorize what it can, so SSE/AVX isn't turned off; only the hand-tuned assembly parts don't exist in SPEC testing.


To put it shorter: as SPEC is a CPU benchmark designed to cross-benchmark CPUs with different instruction sets, there's absolutely no way that any subset of SPEC includes any kind of assembly.
 

Hitman928

Diamond Member
Apr 15, 2012
5,603
8,806
136
That's because it is SPEC. SPEC is meant to benchmark differences in hardware, not in software, so they offer source code without hand-tuned assembly. And no, nobody could use any hand-tuned assembly in any subset of the SPEC tests. The compiler is, though, free to auto-vectorize what it can, so SSE/AVX isn't turned off; only the hand-tuned assembly parts don't exist in SPEC testing.


To put it shorter: as SPEC is a CPU benchmark designed to cross-benchmark CPUs with different instruction sets, there's absolutely no way that any subset of SPEC includes any kind of assembly.

Yes, in my previous post I agreed that it makes sense to disable ASM. A couple of years ago, when I was running some of these tests myself, the one that really stuck out to me was x264: once I turned off ASM, whether I had SSE/AVX and Zen flags set or not made no difference in performance. Now, this was a couple of years ago with Zen+, where I was matching the compiler Anandtech was using at the time, so it is possible that their move to a more modern version allows for better auto-vectorization on the Zen family. If I have time this week I'll try to get the LLVM compiler set up to match theirs and try it again. I am pleased to see that Anandtech is using LLVM for x86, which at least is a better match than GCC when comparing against Apple's LLVM-based compiler.
 

Doug S

Platinum Member
Feb 8, 2020
2,493
4,060
136
Here's a review of an x86 mobile APU:


Look at all of the benchmarks in that article. Now look back at their article on the M1 Pro/Max. See any differences?


I see them running a bunch of Windows specific applications, plus some cross platform stuff like GIMP which may or may not have a macOS port and a lot of the Windows stuff probably lacks a Windows/ARM port too. What are you expecting Anandtech to do, port a bunch of Windows stuff to the Mac so they can give you more benchmarks?
 

Doug S

Platinum Member
Feb 8, 2020
2,493
4,060
136
Interesting how the AT review states that Apple hasn't defined any max TDP for their SoC and it just goes as far as it can until thermal conditions prevent going any further. Someone needs to test an M1 Max in Alaska in minus temperatures. This also suggests that the Mac Pro with water cooling could be formidable.


I think you're misunderstanding what he's saying. There is no turbo, so it won't run faster with better cooling. It will only throttle less, and he notes that it is really hard to reach a situation where that happens: see his comment that the "high power mode" is only useful for something like running an overnight render and would have made no difference in the results he reported.
 

jeanlain

Member
Oct 26, 2020
159
136
86
I see them running a bunch of Windows specific applications, plus some cross platform stuff like GIMP which may or may not have a macOS port and a lot of the Windows stuff probably lacks a Windows/ARM port too. What are you expecting Anandtech to do, port a bunch of Windows stuff to the Mac so they can give you more benchmarks?
++
To compare CPUs with the same ISA on the same OS, you have a lot of apps and benchmark tools at your disposal. Performance differences should mostly reflect hardware differences.
Comparing CPUs with different ISAs running different OSes is another matter entirely. Different APIs (DX vs Metal for instance), different degrees of optimisation, etc.
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,830
877
126
++
To compare CPUs with the same ISA on the same OS, you have a lot of apps and benchmark tools at your disposal. Performance differences should mostly reflect hardware differences.
Comparing CPUs with different ISAs running different OSes is another matter entirely. Different APIs (DX vs Metal for instance), different degrees of optimisation, etc.

I think you guys are missing the point here. It doesn't have to be a like for like comparison. If they use a different API, so be it. I'm not interested in just the hardware being tested, but how the hardware using said software performs against Windows based laptops.

I strongly disagree that reviews should "mostly reflect hardware differences". On the contrary, they should reflect the ecosystem the hardware is based in just as much as the hardware itself.
 
Reactions: Tlh97 and scannall

Eug

Lifer
Mar 11, 2000
23,752
1,285
126
Teardown:


Ports appear to be modular, swappable parts, not soldered to the logic board. If true, colour me surprised. If it's not true, then it looks like those ports are extremely heavily reinforced at least.



Also, the batteries are no longer glued in. They are stuck on with iPhone-style adhesive tape with regular pull tabs to ease removal.

The fans are not on top of the logic board. Instead, the logic board is shaped like a W with curved cutouts in which the fans sit.



M1 Max package is giant of course.
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Especially how close the Threadripper was to the m1 in energy efficiency in 4k rendering.

The guy did a back-of-the-napkin calculation using maximum TDP figure * time run, and assumed the M1 Max has a 60 W TDP. So this efficiency comparison only makes sense if TDP == power consumption.
- We know that the M1 Max (including DRAM) consumes nowhere near 60 W in CPU tasks.
- The other chips in the comparison also have increasingly tenuous relationships between listed TDP and power consumption.
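
To show how much that assumption matters, here is a minimal sketch of the TDP * time estimate next to the same estimate using a measured average power. Only the 60 W figure comes from the comparison being discussed; the render time and the measured power are placeholders picked for illustration.

```python
def energy_wh(power_w: float, seconds: float) -> float:
    """Energy in watt-hours for a constant power draw over a duration."""
    return power_w * seconds / 3600.0


render_seconds = 600.0   # hypothetical render time
assumed_tdp_w = 60.0     # TDP assumed for the M1 Max in the comparison
measured_power_w = 35.0  # hypothetical measured average package power

tdp_estimate = energy_wh(assumed_tdp_w, render_seconds)
measured_estimate = energy_wh(measured_power_w, render_seconds)

# Factor by which the TDP-based figure overstates the energy used.
overstatement = tdp_estimate / measured_estimate

print(f"TDP-based: {tdp_estimate:.2f} Wh, measured-based: {measured_estimate:.2f} Wh")
print(f"TDP-based estimate is {overstatement:.2f}x the measured-based one")
```

The error scales directly with the gap between listed TDP and real draw, and it can go in either direction for each chip in the comparison.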
 

yottabit

Golden Member
Jun 5, 2008
1,375
240
116
So any speculation on how they will handle desktop Mac Pro? Starting with "will there be one"?

Next, would it be of an SOC design? Or separated higher core count (16-24) M1 family CPU with PCIe bus for NVMe drive and discrete graphics?

It seems like it would be a shame to give up the benefits of the unified memory from what we have seen so far. But at the same time I can't imagine them making a die much larger than the m1 max.

I could see an SOC with similar die space but a higher power, higher frequency design. But what about expandability people would expect from Mac Pro?

Could it be something expandable with additional m1 "compute modules" on some proprietary slot?

It makes me excited to armchair speculate on this one since it all seems so wild to me, and I really don't know what to expect. I personally think an M1 CPU with some on-package memory and PCIe slot expandability is most likely.

And something like a Mac Mini Pro with zero traditional expandability beyond (maybe) drive bays is second most likely.
 

Eug

Lifer
Mar 11, 2000
23,752
1,285
126
I thought this test was interesting since it includes a score that caps power utilization for an Intel mobile chip.

V-Ray CPU running through Rosetta on 10-core M1 Pro, compared against reported scores for i9-11980HK running at full tilt and i9-11980HK capped at 45 Watts.



 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
Teardown:
Wow my brain is still processing this.

Part of me, days earlier, was wishing for more ports, and we probably could jam some extra ports in there with some redesign. But my brain is still processing how Apple made everything mostly modular and replaceable, so much so that my earlier wish no longer registers; I just feel happiness.
 

Eug

Lifer
Mar 11, 2000
23,752
1,285
126
Wow my brain is still processing this.

Part of me, days earlier, was wishing for more ports, and we probably could jam some extra ports in there with some redesign. But my brain is still processing how Apple made everything mostly modular and replaceable, so much so that my earlier wish no longer registers; I just feel happiness.
I guess that is one reason why the machine is thicc and heavy.

BTW, the modular ports are definitively confirmed. Here is the W shaped motherboard, with big SoC + RAM package, with the ports removed. The 1 yuan coin is 2.5 cm (1 inch) across. Note the fan cutouts in the mobo.



(That top part in the pic is the removed heat sink / heat pipe flipped upwards.)

 
Reactions: lightmanek

Doug S

Platinum Member
Feb 8, 2020
2,493
4,060
136
I think you guys are missing the point here. It doesn't have to be a like for like comparison. If they use a different API, so be it. I'm not interested in just the hardware being tested, but how the hardware using said software performs against Windows based laptops.

I strongly disagree that reviews should "mostly reflect hardware differences". On the contrary, they should reflect the ecosystem the hardware is based in just as much as the hardware itself.

If you use a specific app then by all means compare how that app performs on various hardware, and if some hardware is saddled with a poor quality port or has to run under emulation that's a disadvantage you want to know about - you should get what performs best for you.

If however you are trying to compare two platforms against one another, doing your comparisons in that way makes the results much less useful. If you and I had a contest to see how many pushups we could do in a minute, and you do proper military style pushups with someone watching you and only counting the ones you do with proper form, and I'm just doing whatever I consider a pushup and counting them myself, would you consider that fair? That's basically what you're suggesting by "it should reflect the ecosystem" - so in our bet if we do them differently well that's too bad for whoever is following stricter form.
 