Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,797
1,370
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s
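
As a rough cross-check of the GPU figures above, here is a minimal Python sketch. The thread, FLOPS and fill-rate numbers come from the list; the 8 FP32 lanes per execution unit and 2 FLOPs per lane per clock (one fused multiply-add) are assumptions that are not stated in this post, so the implied clock is only an estimate.

```python
# Figures taken from the spec list above.
EXECUTION_UNITS = 128
MAX_CONCURRENT_THREADS = 24576
PEAK_FLOPS = 2.6e12          # 2.6 Teraflops
TEXELS_PER_SECOND = 82e9     # 82 Gigatexels/s
PIXELS_PER_SECOND = 41e9     # 41 gigapixels/s

# Assumptions, NOT from the post: lane count per execution unit and
# one fused multiply-add (2 FLOPs) per lane per clock.
ASSUMED_FP32_LANES_PER_EU = 8
ASSUMED_FLOPS_PER_LANE_PER_CLOCK = 2


def threads_per_execution_unit() -> int:
    """Concurrent threads each execution unit would have to hold."""
    return MAX_CONCURRENT_THREADS // EXECUTION_UNITS


def implied_clock_ghz() -> float:
    """GPU clock implied by the peak FLOPS figure under the assumptions above."""
    flops_per_clock = (EXECUTION_UNITS
                       * ASSUMED_FP32_LANES_PER_EU
                       * ASSUMED_FLOPS_PER_LANE_PER_CLOCK)
    return PEAK_FLOPS / flops_per_clock / 1e9


def texels_per_pixel() -> float:
    """Ratio of texture fill rate to pixel fill rate."""
    return TEXELS_PER_SECOND / PIXELS_PER_SECOND


if __name__ == "__main__":
    print(threads_per_execution_unit())
    print(round(implied_clock_ghz(), 2))
    print(texels_per_pixel())
```

Under those assumptions the listed numbers hang together: the thread count divides evenly across the execution units, and the texel rate is exactly twice the pixel rate.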

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core number). Basically, Apple is taking the same approach with these chips as they do with the iPhones and iPads. Just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:



M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, HEVC (H.265), ProRes

M3 Family discussion here:


M4 Family discussion here:

 

The Hardcard

Member
Oct 19, 2021
199
288
106
That maybe freezer test vs an almost-certainly freezer iPhone 15 PM test: https://browser.geekbench.com/v6/cpu/compare/7735922?baseline=7747963

Interesting to note the spread of losses and gains amongst the ST tests. Wonder what they tweaked this time. A few of the tests have straight up regressions in IPC.

The MT results imply that the efficiency cores did not get SME but still continue the trend of getting faster at a faster pace.

Doing some napkin math while focusing on the Clang subtest in MT (clang/llvm subtests' characteristics supposedly better resemble day-to-day user tasks), seems like E cores are about 40% faster.
On the M4, SME is there, but the unit is significantly less capable. One important thing to remember about SME is that it is on a coprocessor. There is only one unit per cluster, so multicore SME is only two units. This also means that A18 is equal to M4 and M4 Pro, iso frequency.
 

name99

Senior member
Sep 11, 2010
490
379
136
I don’t know and can’t speak factually, but I got the impression that the issue wasn’t about meeting the standards, but how to operate a modem efficiently. There were claims that at one point Apple had an oversized modem that was half the size of the iPhone logic board and required far more than practical power to run. Nothing about not handling the standards.
Is this an example of a clueless journalist?
HTF do people think you design a modem? Just write up a chip in verilog, send it to TSMC, and hope it comes back working?

EVERY Apple product - watch, iphone, airpods, and yes, modem starts off life as some huge board (or multiple boards) with multiple FPGAs, and gets refined over time as one element after the next is perfected on those boards.
 

name99

Senior member
Sep 11, 2010
490
379
136
Searched on the GB scores for the various A18/Pro results. They seem to be quite all over the place which I guess isn’t too unexpected so early on.
Did find this gem though, a 16 Pro:
3429 ST, 8790 MT. My guess is this will be near as high as it can go.
Maybe not!
a) SME code gen can probably be improved. (My understanding is the existing SME code in GB6 is handwritten, and it's possible that some of the other benchmarks exhibit some matrix manipulation that could be auto-detected and SME'd. Like autovectorization, how well this works depends on how cleanly the code is written...)
Even the handwritten code present in GB6 may not make optimal use of tricks that allow for, eg, rapid matrix transpose or the very newest SME two- and four-wide instructions.

b) We still don't know whether A18 is "just" the M4 cores or if the time between M4 and A18 was used to improve those cores. Small improvements (eg another fusion target) can be targeted by the compiler giving slight performance boosts, but the big possibilities are SVE and a performant (as opposed to barely functioning) SSVE. If those are present but not being exploited we have another 10% or more available.

Do we have DEFINITIVE word yet on SVE or not?
 

The Hardcard

Member
Oct 19, 2021
199
288
106
Is this an example of a clueless journalist?
HTF do people think you design a modem? Just write up a chip in verilog, send it to TSMC, and hope it comes back working?

EVERY Apple product - watch, iphone, airpods, and yes, modem starts off life as some huge board (or multiple boards) with multiple FPGAs, and gets refined over time as one element after the next is perfected on those boards.
No, the comments came from industry insiders. The description is the metrics of the final chip. It wouldn’t be FPGA work and not likely ever physical. Once you lay out a design based on a node’s PDK, the EDA tools are designed to give a close estimate of the power, performance, and area (PPA) of the final silicon.

To be sure, again these are unverified claims, but there’s no point in talking about anything but the PPA characteristics of final silicon estimates from a node specific layout.
 
Reactions: Jan Olšan

GC2:CS

Member
Jul 6, 2018
31
19
81
Geekerwan interviewed Johny Srouji, who is in charge of Apple Silicon !!!


I am sorry. But this just reinforces me in a

*puts on a tinfoil hat*

conspiracy that Geekerwan basically does testing and reports on Apple's command. They get the hardware and possibly help from the SW side as well to get the results Apple WANTS to let out - my guess is to make it look a bit worse so that competition that is too greedy to do their own testing rests comfortable.

Can you imagine anybody from Apple doing this at AnandTech? Or, I don't know, Notebookcheck or Gamers Nexus? Or any website that does reviews with its own money? If they did, the Apple representative would be torn to pieces and have to stay quiet against criticism he is not allowed to answer.

I did not see the video, but I expect it to be Johny talking and Wan just agreeing with everything with a smiley face. All very friendly.

The mission is - kill any site doing independent in-depth silicon reviews - (this is not only on Apple - all big tech is guilty of this).
And they are having resounding success.

*puts down the aluminium pan*
 

poke01

Platinum Member
Mar 8, 2022
2,004
2,542
106
Ahh you are ruining the impression that I understand everything perfectly, by pointing out some little imperfections.
It’s an Apple interview it will be like a political interview and I went in expecting that. Still it was far better than any interview with Tim Cook and John Ternus.

As for your theory, let it remain a theory cause it ain’t true. Geekerwan does make fun of Apple in their videos, and as long as you don’t go full Louis Rossmann you are fine with having contacts within Apple PR.
 
Reactions: mvprod123

Doug S

Platinum Member
Feb 8, 2020
2,711
4,602
136
So looking at this A17P die shot, you can easily pick out one GPU core, one 12 MB SLC, and one 8 MB L2, which are the known differences between A18 and A18P. That makes it easy to calculate how much die space they'd save if A18 were its own die instead of being a cut-down A18P. Just eyeballing it, it looks like that would save roughly 8% of the die area, or perhaps 8.5 mm^2 assuming it is a bit larger than A17P.

Would that be worth doing a separate design? Perhaps. The amount of recovery they could do from otherwise good A18P dies by cutting off one GPU core, half of the P core L2 and half of the SLC is pretty small (and they could still recover/use those dies as A18s). But it is still a pretty small savings, and probably not worth the effort unless this is just the first step along the path to potential greater future differences between the designs (i.e. true functional differences, not just differing in cores/cache).
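
The eyeballed numbers above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the 8% and 8.5 mm^2 figures are the estimates from this post, and the dies-per-wafer gain is a first-order approximation that assumes die count scales inversely with area (ignoring wafer edge loss, scribe lines and yield).

```python
def implied_die_area_mm2(saved_mm2: float, saved_fraction: float) -> float:
    """Full (A18P-sized) die area implied by 'saved_mm2 is saved_fraction of the die'."""
    return saved_mm2 / saved_fraction


def smaller_die_area_mm2(saved_mm2: float, saved_fraction: float) -> float:
    """Area of a hypothetical dedicated A18 die after removing the saved area."""
    return implied_die_area_mm2(saved_mm2, saved_fraction) - saved_mm2


def extra_dies_per_wafer_pct(saved_fraction: float) -> float:
    """First-order gain in dies per wafer, assuming count is proportional to 1/area."""
    return (1.0 / (1.0 - saved_fraction) - 1.0) * 100.0


if __name__ == "__main__":
    saved_mm2 = 8.5        # estimate from the post
    saved_fraction = 0.08  # estimate from the post
    print(round(implied_die_area_mm2(saved_mm2, saved_fraction), 2))
    print(round(smaller_die_area_mm2(saved_mm2, saved_fraction), 2))
    print(round(extra_dies_per_wafer_pct(saved_fraction), 1))
```

With those inputs the full die works out to a bit over 106 mm^2, and a dedicated smaller die would yield somewhat under 9% more candidates per wafer, which is the "pretty small savings" being weighed against the cost of a separate design.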

 

name99

Senior member
Sep 11, 2010
490
379
136
I wonder how half the L2 and half the SLC affect the performance.
Are those claims of half based on reality, or based on your hypothesis?

A large fraction of die area is made up of caches, and most of those caches (eg CPU L2, GPU L2, and SLC) consist of a number of ways. Suppose, for example that the CPU L2 consists of 12 ways.
It would be a reasonable design procedure to manufacture the L2 with 13 ways, and a flexibility for each set (either by a runtime test or by fusing) to kill one of the ways, so that all dies that suffer a few defects in this L2 (but the defects only hit one way of a set) are still usable, with the L2 cache controller and everything designed to use only 12 ways. This has been standard practice for years.
The next evolution could be to do the same as above, but tweak the cache controller so that it can be programmed to handle either 12 or 11 ways per set. Now you can also recover dies that have the occasional two ways of a set broken.

With a scheme like this, you now define two tiers: one has 5 GPU cores, 11 L2 ways and (I don't know) 22 of 24 SLC ways working; the next tier has 6 GPU cores, 12 L2 ways and 24 (of 25 or 26 manufactured) SLC ways working. By doing this you can maybe boost your yield from (again, I don't know, making up numbers)
50% (no backup ways in L2 and SLC) to
70% (one backup way in L2 and SLC) to
95% (still able to harvest dies with a bad GPU or with the occasional two ways dead in an L2 and/or SLC set)

Point is, if we are die harvesting, my guess is the difference between these two cache sizes is ~10%. If it's a factor of 2x, then this is not being driven by die harvesting but by market segmentation decisions.
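
To make the spare-way argument concrete, here is a toy yield model in Python. Everything numeric in it is made up for illustration, in the same spirit as the percentages above: it assumes each way of each set fails independently with a fixed probability, which real defect distributions do not follow, and it models only one cache (no GPU cores, no SLC).

```python
from math import comb


def set_ok_probability(ways: int, max_bad_ways: int, p_bad: float) -> float:
    """Probability that a single set has at most max_bad_ways defective ways.

    Assumes each way in the set is defective independently with
    probability p_bad (a simplifying assumption, not measured data).
    """
    return sum(
        comb(ways, k) * p_bad**k * (1.0 - p_bad) ** (ways - k)
        for k in range(max_bad_ways + 1)
    )


def cache_yield(sets: int, ways: int, max_bad_ways: int, p_bad: float) -> float:
    """Probability that every set in the cache is still usable."""
    return set_ok_probability(ways, max_bad_ways, p_bad) ** sets


if __name__ == "__main__":
    SETS = 8192    # hypothetical number of sets
    P_BAD = 1e-5   # hypothetical per-way defect probability

    # No backup way: 12 manufactured, all 12 must work.
    print(cache_yield(SETS, ways=12, max_bad_ways=0, p_bad=P_BAD))
    # One backup way: 13 manufactured, one per set may be fused off.
    print(cache_yield(SETS, ways=13, max_bad_ways=1, p_bad=P_BAD))
    # Controller also accepts 11 working ways: two of 13 may be dead.
    print(cache_yield(SETS, ways=13, max_bad_ways=2, p_bad=P_BAD))
```

The shape of the result matches the argument: a single spare way recovers most of the dies lost to isolated defects, and tolerating a second dead way per set mops up the remainder, at the cost of a slightly smaller cache in the lower tier.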
 

FlameTail

Diamond Member
Dec 15, 2021
3,771
2,224
106
Interesting that both Performance Cores and Efficiency Cores are different in the A18 and A18 Pro: Tahiti vs Tupai.
It would seem that there are more differences than simply the L2 cache and SLC.
Tahiti and Tupai are the SoC codenames. Apple doesn't expose core codenames (such as Avalanche, Sawtooth, Lightning etc..) anymore. For example, M4's SoC codename was Donan, and the P-core was named Donan-P, and the E-core was named Donan-E.

The P-core in Tahiti is called Tahiti-P, and the E-core in Tahiti is called Tahiti-E. Same for Tupai.

That doesn't mean they are different cores. Most probably they are the same cores.
 

The Hardcard

Member
Oct 19, 2021
199
288
106
L2 is 8MB and SLC is 12MB on A18 and double on A18 Pro.

There is a lot of interesting information on this Chinese microbenching site. I can’t tell if this is Geekerwan; the font looks like the one he uses in his review videos.

If he’s pulling accurate information, it is another data point showing a new direction for the Apple Silicon design team. It would match the example in the Apple Silicon CPU Optimization Guide.

The first thing here is that not only did they stop naming the performance and efficiency cores separately, but they also appear to give each chip a bespoke CPU and GPU design. Srouji references this in the Geekerwan interview. What caught my eye was that the A18 and the A18 Pro have different names for the CPU cores. It would appear, though, that the only difference is the size of the L2 cache for the performance cores.

It would be great if Apple would release an updated guide shortly after the release of the fall Mac hardware.

A18 Tahiti

A18 Pro Tupai

A17 Pro Coll

M4 Donan

M3 Ibiza

M3 Pro Lobos

M3 Max Palma

I wonder how half the L2 and half the SLC affect the performance.

The difference in performance probably won’t matter for people who are buying the A18. The cache is needed most in the scenarios that would drive users to choose one of the iPhone 16 Pros: gaming, compute-intensive video effects, etc. For basic compute workloads they appear to run the same - the top scores for Geekbench are nearly identical. That would probably be representative of the performance for people who are not buying the Pros.
 

The Hardcard

Member
Oct 19, 2021
199
288
106
Tahiti and Tupai are the SoC codenames. Apple doesn't expose core codenames (such as Avalanche, Sawtooth, Lightning etc..) anymore. For example, M4's SoC codename was Donan, and the P-core was named Donan-P, and the E-core was named Donan-E.

The P-core in Tahiti is called Tahiti-P, and the E-core in Tahiti is called Tahiti-E. Same for Tupai.

That doesn't mean they are different cores. Most probably they are the same cores.
I think they are the same here, just that they don’t have to be in general. From this naming scheme I infer that Apple now makes decisions about each chip without worrying about what is in the others released at the same time.

They will put the same design in two different chips if it suits their goals for each one independently. If they have a different target for another chip, they will change the design accordingly.
 

name99

Senior member
Sep 11, 2010
490
379
136
What? Did you not see the image posted by...
Um, am I missing something?
I watched the Geekerwan interview w/ Srouji and saw zero NUMBERS mentioned.

OK, I see, it's the successor post.
But question is why should I trust that post? My point is, EVERY FSCKING YEAR we get hysterical web sites putting together pages that look definitive, but they are simply collections of rumors.

Are those claims based on any actual MEASUREMENTS???

(And yeah, I didn't see the original posts bcs @poke01 is on my ignore list -- can't remember why, but quite possibly for posting far too many "rumors claimed as fact"...)
 

The Hardcard

Member
Oct 19, 2021
199
288
106
Um, am I missing something?
I watched the Geekerwan interview w/ Srouji and saw zero NUMBERS mentioned.
He is referring to the image and link posted by poke01 directly above the one you replied to. The image in that post shows the charts for A18 and A18 Pro. The link has charts featuring claimed information about all Apple Silicon.

Specifically it claims to have the (measured?) cache sizes for each chip including the just released phone chips.
 

poke01

Platinum Member
Mar 8, 2022
2,004
2,542
106
That site has provided accurate information even down to clocks but for newly released products I would be cautious till others like Geekerwan get actual measurements. Treat them as reference until we get confirmation from multiple sources.
 
Reactions: FlameTail

digitaldreamer

Junior Member
Mar 23, 2007
9
3
81
I also heard we may have ports on the front, too. Yay. One rumor was that only the Pro would have the 5 ports, with the non-pro having only 3 ports.

Either way would be a welcome change from the past. Plugging and unplugging from the back of iMac or Mac mini is a PITA when you don't have a hub.
 

Doug S

Platinum Member
Feb 8, 2020
2,711
4,602
136
I also heard we may have ports on the front, too. Yay. One rumor was that only the Pro would have the 5 ports, with the non-pro having only 3 ports.

Either way would be a welcome change from the past. Plugging and unplugging from the back of iMac or Mac mini is a PITA when you don't have a hub.

Jony Ive took his views on design purity too far after Jobs died and there was no one who could tell him "no".
 