Discussion Mediatek SoC thread


FlameTail

Diamond Member
Dec 15, 2021
4,097
2,469
106
Important screenshots from Geekerwan's Dimensity 9400 review:

D9300 vs D9400 architecture

SPEC Scores and Power Curves:



Cortex X925 has a bigger uplift in FP than in INT.

Despite being the same A720 core uarch, the A720 in Mediatek's D9400 has improved efficiency, probably thanks in large part to the N3E process.

Geekbench 6 CPU Power Curve:

3DMark Steel Nomad Light GPU Power Curve:

Dimensity 9400 brings huge improvements in the lower end of the power curve. This will be very meaningful for phones.
 

FlameTail

Diamond Member
Dec 15, 2021
4,097
2,469
106

You can slash the D9400's power by half, and it still retains 90% of the performance. That means they doubled the power to get 10% more performance!

This kind of behaviour is common in desktop/laptop CPUs, where the cores are often pushed beyond 5 GHz.
But I am surprised to see such scaling in a smartphone chip, which maxes out at only 3.6 GHz.
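
To put rough numbers on that scaling, here is a minimal Python sketch using only the normalized figures from the post (half the power retaining 90% of the performance). The function and variable names are illustrative, and the two operating points are taken from the claim in the post, not from measured data. Strictly speaking, going from 90% back up to 100% is closer to an 11% relative gain than 10%.

```python
def perf_per_watt(power, perf):
    """Efficiency at a single operating point (normalized units)."""
    return perf / power


def relative_gain(perf_low, perf_high):
    """Fractional performance gained when moving from the low to the high point."""
    return perf_high / perf_low - 1.0


# Normalized operating points assumed from the post:
# full power = 1.0 -> performance 1.00, half power = 0.5 -> performance 0.90.
HALF_POWER, HALF_PERF = 0.5, 0.90
FULL_POWER, FULL_PERF = 1.0, 1.00

eff_half = perf_per_watt(HALF_POWER, HALF_PERF)
eff_full = perf_per_watt(FULL_POWER, FULL_PERF)
gain = relative_gain(HALF_PERF, FULL_PERF)

print(f"perf/W at half power: {eff_half:.2f}")
print(f"perf/W at full power: {eff_full:.2f}")
print(f"performance gained by doubling power: {gain:.1%}")
```

Under these assumed numbers, the half-power point is considerably more efficient per watt than the full-power point, which is the diminishing-returns behaviour described above.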
 
Reactions: jdubs03

MS_AT

Senior member
Jul 15, 2024
311
691
96
Is Geekerwan still using clang 14.0.6 without any march flags? If so, just keep in mind that for ARM64 <-> x64 comparisons in SPEC, x64 will not use anything better than SSE while ARM will use NEON [as it cannot use anything better anyway]; in other words, AVX+ will be unused on x64 targets, which will affect the scores to some degree. Now, I am not saying his methodology is wrong, just keep this in mind, as it is a consequence of the setup he is using.
 
Reactions: FlameTail

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
Is Geekerwan still using clang 14.0.6 without any march flags? If so, just keep in mind that for ARM64 <-> x64 comparisons in SPEC, x64 will not use anything better than SSE while ARM will use NEON [as it cannot use anything better anyway]; in other words, AVX+ will be unused on x64 targets, which will affect the scores to some degree. Now, I am not saying his methodology is wrong, just keep this in mind, as it is a consequence of the setup he is using.
SSE is 128-bit like NEON, so in a way it's like-for-like, but I agree with you that he should use native flags when compiling, and switch to a more recent version of clang. The question is: does he still have access to the CPUs he makes comparisons against?
 

FlameTail

Diamond Member
Dec 15, 2021
4,097
2,469
106
The GPU isn't going to be the only stellar thing about the Nvidia-Mediatek ARM SoC for PCs.

The Media Engine is also going to be superb if it is based on Nvidia's NVENC/NVDEC.

Nvidia, Apple and Intel have the best media engines in this industry.
 

MS_AT

Senior member
Jul 15, 2024
311
691
96
SSE is 128-bit like NEON, so in a way it's like-for-like, but I agree with you that he should use native flags when compiling, and switch to a more recent version of clang. The question is: does he still have access to the CPUs he makes comparisons against?
On the throughput front yes, but if AVX2 is available then you get 2x the throughput with the same number of instructions, something that will help x64 if the compiler decides to use it. But of course this then depends on whether you want to even the playing field or show the best side of each CPU. I wish that, when presenting SPEC results, reviewers would include a brief paragraph about the consequences and motivation of their compiler choices, leaving their audience more aware of what is going on.
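
As a rough illustration of the "2x the throughput" point, the sketch below counts how many elements fit in one vector register at each width (128 bits for SSE and NEON, 256 bits for AVX2). It is only a lane-count calculation, assuming the compiler vectorizes the loop at all; real speedups depend on the workload, memory bandwidth, and clocks. The helper name lanes is hypothetical.

```python
# Register widths in bits for the SIMD extensions discussed in the thread.
VECTOR_BITS = {"SSE": 128, "NEON": 128, "AVX2": 256}


def lanes(vector_bits, element_bits):
    """Number of elements of a given size processed by one vector instruction."""
    return vector_bits // element_bits


# Single-precision (32-bit) and double-precision (64-bit) floats.
for name, bits in VECTOR_BITS.items():
    print(f"{name}: {lanes(bits, 32)} x float32, {lanes(bits, 64)} x float64 per instruction")

# Ratio of AVX2 to SSE lanes for 32-bit elements (peak, per instruction).
avx2_vs_sse = lanes(VECTOR_BITS["AVX2"], 32) / lanes(VECTOR_BITS["SSE"], 32)
print(f"AVX2 vs SSE lanes per instruction: {avx2_vs_sse:.0f}x")
```

This is why a build without native flags treats SSE and NEON as equals on paper, while leaving the wider x64 vectors unused.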
 

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
On the throughput front yes, but if AVX2 is available then you get 2x the throughput with the same number of instructions, something that will help x64 if the compiler decides to use it. But of course this then depends on whether you want to even the playing field or show the best side of each CPU. I wish that, when presenting SPEC results, reviewers would include a brief paragraph about the consequences and motivation of their compiler choices, leaving their audience more aware of what is going on.
You're preaching to the choir.

In CPU design we often have to rely on older benchmarks built with obsolete compilers, because what we're interested in are improvements over previous generations, not best absolute performance. From time to time these benchmarks get updated, but the frequency is not high as it does cost a lot to run them on previous designs.

Geekerwan might be in a similar position with the added difficulty that possibly he doesn't have access to older devices anymore, so he can't just recompile with a newer compiler/flags combo and make comparisons (and if he does, then the results are trash).

Andrei did a good job of presenting his choices and results. I'm not sure all reviewers do (perhaps David Huang?).
 
Reactions: Tlh97 and MS_AT

DZero

Member
Jun 20, 2024
193
74
61
The A720 "small" core is the star of this one. Makes me wonder whether they are ready to replace all the small cores with low-power big cores.
 

Raqia

Member
Nov 19, 2008
61
30
91
Yes Arm had to up the game, though it has to be seen how high X925 can be clocked in devices. IPC is not the whole story.

IMHO Arm upped the game starting with Apple's first 64-bit chips, which caught everyone by surprise. This can be seen, for instance, in David Huang's perf-per-clock results on SPEC for various Arm chips, many of which precede Qualcomm's chip. Just remember a CPU needs several years of development.
It was a massive jolt to the industry:


That Apple released its implementation so soon after the ARMv8 spec announcement started a bit of continuous shock-and-awe. It's likely Apple submitted its implementation as the 64-bit standard working closely with ARM and kept running with that lead for years. ARM's own implementations in the A57 were jokes in comparison and were a disaster for Qualcomm in the 810/808 era. (One almost wonders why Apple didn't just keep the ISA for itself and its ecosystem...)

Qualcomm isn't so different here in that its initial implementations for cellular air interfaces are submitted to the 3GPP as the standards for generations of cellular networks. (Except that designing and implementing these designs across net-connected towers and mobile subscription stations is far more sprawling, difficult, and capital intensive to this day. CPU design is relatively much more self-contained and much more available, even for high-performance designs.) It is a good thing that Qualcomm has submitted these standards and is bound by SEP licensing terms here, or cellular network deployments would be in a more fractured and primitive shape, or in the hands of those with politically questionable motives (to some).

Apple's lawsuit and withholding of cellular royalties from Qualcomm was a real kick to the nuts, and its acquisition of Intel's modem unit was a bit of a parting shot on the way out the door when the lawsuit failed. Qualcomm is now getting its shot back, after the contested departure of lead Apple CPU designers, with its still hotly contested acquisition of Nuvia (a very questionable and public lawsuit by ARM; if it cared to truly expand its ecosystem, why not quieter arbitration?). With Qualcomm an essential enabling force for Android SoCs, which are often foundationally cheaper due to their better integration of modems (I wouldn't rule out Qualcomm helping Android ecosystem partners with modem and RF implementation), I think this is much more of an active, if cold, war than people suspect.
 

FlameTail

Diamond Member
Dec 15, 2021
4,097
2,469
106
The Mali GPU in the 9400 wasn't much changed in cluster count and has the same number of compute units as in the 9300 with more cache.
That's what makes the performance and performance-per-watt uplift of the Dimensity 9400's GPU all the more remarkable.
 

Doug S

Platinum Member
Feb 8, 2020
2,836
4,823
136
It was a massive jolt to the industry:


That Apple released its implementation so soon after the ARMv8 spec announcement started a bit of continuous shock-and-awe. It's likely Apple submitted its implementation as the 64-bit standard working closely with ARM and kept running with that lead for years. ARM's own implementations in the A57 were jokes in comparison and were a disaster for Qualcomm in the 810/808 era. (One almost wonders why Apple didn't just keep the ISA for itself and its ecosystem...)

Qualcomm isn't so different here in that its initial implementations for cellular air interfaces are submitted to the 3GPP as the standards for generations of cellular networks. (Except that designing and implementing these designs across net-connected towers and mobile subscription stations is far more sprawling, difficult, and capital intensive to this day. CPU design is relatively much more self-contained and much more available, even for high-performance designs.) It is a good thing that Qualcomm has submitted these standards and is bound by SEP licensing terms here, or cellular network deployments would be in a more fractured and primitive shape, or in the hands of those with politically questionable motives (to some).

Apple's lawsuit and withholding of cellular royalties from Qualcomm was a real kick to the nuts, and its acquisition of Intel's modem unit was a bit of a parting shot on the way out the door when the lawsuit failed. Qualcomm is now getting its shot back, after the contested departure of lead Apple CPU designers, with its still hotly contested acquisition of Nuvia (a very questionable and public lawsuit by ARM; if it cared to truly expand its ecosystem, why not quieter arbitration?). With Qualcomm an essential enabling force for Android SoCs, which are often foundationally cheaper due to their better integration of modems (I wouldn't rule out Qualcomm helping Android ecosystem partners with modem and RF implementation), I think this is much more of an active, if cold, war than people suspect.


I don't buy for a second that "Apple submitted its implementation as the 64 bit standard". In fact, this is provably false. ARM announced ARMv8 / AArch64 in October 2011, and Apple shipped their first 64 bit ARM implementation in the iPhone 5S in September 2013, almost two years later. The tapeout deadline would have meant its design was complete a year after ARM's official announcement. What Apple taped out pretty much the same time as ARM's announcement was the A6, which included Apple's first custom core design.

Apple did not have an "implementation" to submit prior to October 2011 - and it would have been well prior to that because you can't just turn around the volume of documentation ARM provides in a week or a month. It is well known that Apple worked with ARM to define AArch64, so they would have had early knowledge of what it contained, allowing them to use that information as A7's design was underway. They were aggressive in pushing it out as early as they reasonably could have, that's all. Not sure why people think that ARM would allow a third party to define something as important for their future as their 64 bit ISA. For an optional feature like SME, sure, I could buy that Apple designed AMX and then told ARM "hey do you want this to be a part of the standard" and it was tweaked a bit, had its name changed, and became SME. But not for AArch64, not a chance!

The reason everyone else was left behind is because while ARM's designers could have used the early information to work on a 64 bit design like Apple did, they chose not to do so. Supposedly people in ARM itself were caught flat footed by how quickly Apple shipped their first 64 bit design, so I presume Apple kept that secret as long as they could. I believe there is some requirement for architectural licensees to pass conformance tests, but Apple could have waited until a couple months before the A7/iPhone 5S was released, and ARM may have assumed what was being submitted to them was an early prototype of something shipping in a year rather than a month.
 

Raqia

Member
Nov 19, 2008
61
30
91
I don't buy for a second that "Apple submitted its implementation as the 64 bit standard". In fact, this is provably false. ARM announced ARMv8 / AArch64 in October 2011, and Apple shipped their first 64 bit ARM implementation in the iPhone 5S in September 2013, almost two years later. The tapeout deadline would have meant its design was complete a year after ARM's official announcement. What Apple taped out pretty much the same time as ARM's announcement was the A6, which included Apple's first custom core design.

Apple did not have an "implementation" to submit prior to October 2011 - and it would have been well prior to that because you can't just turn around the volume of documentation ARM provides in a week or a month. It is well known that Apple worked with ARM to define AArch64, so they would have had early knowledge of what it contained, allowing them to use that information as A7's design was underway. They were aggressive in pushing it out as early as they reasonably could have, that's all. Not sure why people think that ARM would allow a third party to define something as important for their future as their 64 bit ISA. For an optional feature like SME, sure, I could buy that Apple designed AMX and then told ARM "hey do you want this to be a part of the standard" and it was tweaked a bit, had its name changed, and became SME. But not for AArch64, not a chance!

The reason everyone else was left behind is because while ARM's designers could have used the early information to work on a 64 bit design like Apple did, they chose not to do so. Supposedly people in ARM itself were caught flat footed by how quickly Apple shipped their first 64 bit design, so I presume Apple kept that secret as long as they could. I believe there is some requirement for architectural licensees to pass conformance tests, but Apple could have waited until a couple months before the A7/iPhone 5S was released, and ARM may have assumed what was being submitted to them was an early prototype of something shipping in a year rather than a month.
Former iOS kernel programmer:


I think it's fair to say that Apple had a very big part to play in the ISA design, modeling and prototyping. They may very well have had a u-Arch already well underway prior to discussing with ARM the details of how a compatible ISA would look.
 

gdansk

Diamond Member
Feb 8, 2011
3,110
4,840
136
Former iOS kernel programmer:


I think it's fair to say that Apple had a very big part to play in the ISA design, modeling and prototyping. They may very well have had a u-Arch already well underway prior to discussing with ARM the details of how a compatible ISA would look.
Can you paste the Twitter quotes? That site seldom works for me and I can't view it.
 

Raqia

Member
Nov 19, 2008
61
30
91
Replying to David Kanter's post:

The things that make the M1 good have nothing to do with ARM. That cache is impressive.

Shac Ron said in this thread:

1) The premise here is wrong. arm64 is the Apple ISA, it was designed to enable Apple’s microarchitecture plans. There’s a reason Apple’s first 64 bit core (Cyclone) was years ahead of everyone else, and it isn’t just caches.

2) Arm64 didn’t appear out of nowhere, Apple contracted ARM to design a new ISA for its purposes. When Apple began selling iPhones containing arm64 chips, ARM hadn’t even finished their own core design to license to others.

3) ARM designed a standard that serves its clients and gets feedback from them on ISA evolution. In 2010 few cared about a 64-bit ARM core. Samsung & Qualcomm, the biggest mobile vendors, were certainly caught unaware by it when Apple shipped in 2013.

4) Apple planned to go super-wide with low clocks, highly OoO, highly speculative. They needed an ISA to enable that, which ARM provided. M1 performance is not so because of the ARM ISA, the ARM ISA is so because of Apple core performance plans a decade ago.
 

gdansk

Diamond Member
Feb 8, 2011
3,110
4,840
136
Replying to David Kanter's post:

"The things that make the M1 good have nothing to do with ARM. That cache is impressive."
They called it the Apple ISA but then contradict themselves in the very next points?

If I'm reading that right... "Apple contracted ARM" to design a 64 bit ISA extension. And then "ARM designed a standard."
 

Raqia

Member
Nov 19, 2008
61
30
91
If I'm reading that it sounds like Apple contracted ARM to design a 64 bit ISA extension. And that they did. Not that Apple designed ARM v8.
The point: it is clear that ARM's 64-bit v8 ISA adhered closely to the u-Arch Apple was developing internally, if those posts are to be believed. Apple was far from a standard 3rd party in this relationship. Standardizing and polishing an instruction set / binary interface against that u-arch design for ecosystem compatibility wasn't the heavy lift, but having such clout in shaping the ARMv8 spec certainly set Apple up for performance domination in the years to come.

ARM's and other vendors' u-arch implementations of this ISA for the Android side were never quite as wide as Apple's each product year and contained questionable structures like micro-op caches that were eventually removed and never saw the light of day in Apple's implementations. They were clearly not allowed to re-implement Apple's designs for the broader ecosystem, and were also restricted by the smaller die space Android vendors could dedicate to a CPU cluster, as most chose to implement modems on die as well. Apple, in choosing to go so wide and big with its CPU, had a good way to keep some kind of parity of bang per die area with competitors, and wasn't required to generate the SoC-level profit margins that merchant silicon vendors need, which necessitated relatively parsimonious CPU block sizes in Android SoCs for years; Apple just needs a fat margin on its end product.
 

poke01

Platinum Member
Mar 8, 2022
2,352
3,081
106
ARM's and other vendors' u-arch implementations of this ISA for the Android side were never quite as wide as Apple's each product year and contained questionable structures like micro-op caches that were eventually removed and never saw the light of day in Apple's implementations. They were clearly not allowed to re-implement Apple's designs for the broader ecosystem, and
It’s not that others weren’t allowed to implement Apple’s designs; it’s that, for the scope of a phone, those cores were enough and ARM didn’t care to make beefy cores. Apple’s whole goal was to replace Intel on the Mac, so it needed beefy cores and laid out a plan.

ARM also went 10-wide decode before Apple did. Cortex X4 was 10-wide but A17/M3 were 9-wide; it wasn’t until the A18/M4 that Apple went 10-wide. I think ARM’s customers now wanted a PC-level ARM core, hence X4 and X925 were created. Nvidia is using X925 in its AI PC launch next year.
 
Reactions: gdansk

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
Former iOS kernel programmer:


I think it's fair to say that Apple had a very big part to play in the ISA design, modeling and prototyping. They may very well have had a u-Arch already well underway prior to discussing with ARM the details of how a compatible ISA would look.
Being a developer, iOS kernel or not, doesn't give you the full picture, far from it. He might be partly right. Or not.

People involved in the matter are not allowed to talk. The others are just guessing, or risking a professional fault and being fired for disclosure.
 

Raqia

Member
Nov 19, 2008
61
30
91
Being a developer, iOS kernel or not, doesn't give you the full picture, far from it. He might be partly right. Or not.

People involved in the matter are not allowed to talk. The others are just guessing, or risking a professional fault and being fired for disclosure.
I think it's much more plausible than not, much more consistent w/ the unusual course of events, and no one has presented anything convincing to disprove this assertion.

Most notably:
Apple's implementation came much earlier than even ARM's own.

Apple's implementation of ARMv8 beat the pants off of ARM, Qualcomm, Samsung, and nVidia's implementation for years as they continued to ride this lead making steady and incremental improvements on the original, Haswell-sized design. ARM's own implementations were much more choppy in their pace of improvements; compare A72 -> A73 vs A75 -> A76 or X4 -> X925. ARM's designs are also much more heterogeneous in structure with various lineages eg. (A57->A72) (A73->A75) (A76->X3 u-op caches) etc.
 

poke01

Platinum Member
Mar 8, 2022
2,352
3,081
106
It’s no surprise Apple was ahead; they had the right people, plan and acquisitions, PA Semi being one of the notable ones.

Some still work at Apple. Look, I understand some here like to promote grand theories, but Apple just had good engineers, architects and a sustainable roadmap for the last 10 years. Similar to how Nvidia is successful now because of the road they paved nearly 15 years ago on the GPU side of things.
 

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
I think it's much more plausible than not, much more consistent w/ the unusual course of events, and no one has presented anything convincing to disprove this assertion.
I agree with you that it looks more or less plausible (but I disagree with some of his claims).

Most notably:
Apple's implementation came much earlier than even ARM's own.
That part is so obvious (just read the official release announcements) that it's not even worth mentioning, and it doesn't prove that the rest of the iOS dev's message is correct. No need to be an iOS kernel dev to know.

Apple's implementation of ARMv8 beat the pants off of ARM, Qualcomm, Samsung, and nVidia's implementation for years as they continued to ride this lead making steady and incremental improvements on the original, Haswell-sized design. ARM's own implementations were much more choppy in their pace of improvements; compare A72 -> A73 vs A75 -> A76 or X4 -> X925. ARM's designs are also much more heterogeneous in structure with various lineages eg. (A57->A72) (A73->A75) (A76->X3 u-op caches) etc.
Again this doesn't prove the Twitter/X message you quoted is correct.

1) The premise here is wrong. arm64 is the Apple ISA, it was designed to enable Apple’s microarchitecture plans. There’s a reason Apple’s first 64 bit core (Cyclone) was years ahead of everyone else, and it isn’t just caches.
Unproven.

2) Arm64 didn’t appear out of nowhere, Apple contracted ARM to design a new ISA for its purposes. When Apple began selling iPhones containing arm64 chips, ARM hadn’t even finished their own core design to license to others.
First part unproven. Second part is obvious, read official announcements.

3) ARM designed a standard that serves its clients and gets feedback from them on ISA evolution. In 2010 few cared about a 64-bit ARM core. Samsung & Qualcomm, the biggest mobile vendors, were certainly caught unaware by it when Apple shipped in 2013.
Obvious: Arm is here to sell to its customers, Apple is just one of them. Before Apple released a 64-bit CPU, no one cared, so Arm didn't have to push it as quickly.

4) Apple planned to go super-wide with low clocks, highly OoO, highly speculative. They needed an ISA to enable that, which ARM provided. M1 performance is not so because of the ARM ISA, the ARM ISA is so because of Apple core performance plans a decade ago.
One could do that with 32-bit Arm.

My whole point is that either that guy really knows things and then he should just keep his mouth shut, or he's just speculating, correctly or not.
 