Question Zen 6 Speculation Thread


soresu

Diamond Member
Dec 19, 2014
3,303
2,567
136
and ARM threw nominal PPA out of the window with X5/925
This was telegraphed years back in the Cortex X1 announcement where they outright stated PPA was no longer a goal for this lineage of cores.

They started off slower moving to 5 wide with X1, then got progressively chonkier from there.
 

soresu

Diamond Member
Dec 19, 2014
3,303
2,567
136
On the other hand they will have higher chance to get free performance from intel pushing AVX10/256 as AMD's own software attempts are... well not worth mentioning
Eh?

AVX10/256 just gives parity to AVX512 for 256 bit instructions, plus whatever new instructions the 512 bit version is adding most likely.

If anything AMD putting AVX512 in Zen4 while Intel goofed it with Alder Lake has kept interest for it in some circles.
 

FlameTail

Diamond Member
Dec 15, 2021
4,191
2,537
106
This was telegraphed years back in the Cortex X1 announcement where they outright stated PPA was no longer a goal for this lineage of cores.

They started off slower moving to 5 wide with X1, then got progressively chonkier from there.
Didn't you guys lecture me about using acronyms properly? PPA = Performance-Power-Area, not Performance-Per-Area.
FYI, the acronym PPA already stands for Power-Performance-Area. It's probably best not to use it for something so similar unless you're looking to confuse people.
@adroc_thurston @soresu You both agreed to the convention by liking GTracing's comment. What is this hypocrisy!?
 

Thibsie

Senior member
Apr 25, 2017
899
1,011
136
Disagree with Apple and ARM only making small improvements YoY. CAGR is what matters.

Geekbench 6 Single Core
Vendor     2022                    2023                    2024
ARM        2000 (Cortex X3, 8G2)   2300 (Cortex X4, 8G3)   2900 (Cortex X925, D9400)
Apple      2600 (M2)               3200 (M3)               4000 (M4)
Qualcomm   -                       -                       3200 (Oryon-L, 8E)
Intel      3100 (13900K)           3200 (14900K)           3400 (285K)
AMD        3000 (7950X)            -                       3500 (9950X)

If this trend continues, ARM vendors will outpace x86 sooner rather than later.

I used Geekbench 6 for convenience, but I am sure this trend can be seen in SPEC2017 too.
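For anyone who wants to check the CAGR claim against the scores quoted above, here is a minimal Python sketch. It assumes the first and last score for each vendor are two years apart (2022 to 2024); the function name `cagr` and the `SCORES` dictionary are made up for illustration, and Qualcomm is left out because only one data point is listed.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.10 == 10% per year)."""
    return (end / start) ** (1.0 / years) - 1.0


# Geekbench 6 single-core scores quoted in this post: (first score, last score).
# Assumption: both endpoints are two years apart (2022 -> 2024).
SCORES = {
    "ARM (X3 -> X925)": (2000, 2900),
    "Apple (M2 -> M4)": (2600, 4000),
    "Intel (13900K -> 285K)": (3100, 3400),
    "AMD (7950X -> 9950X)": (3000, 3500),
}

if __name__ == "__main__":
    for name, (first, last) in SCORES.items():
        print(f"{name}: {cagr(first, last, years=2):.1%} per year")
```

With these inputs the ARM and Apple figures come out at roughly 20% or more per year, while Intel and AMD land in the single digits, which is the gap the post is pointing at.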

Sure, ARM CPUs having 20% faster single-core performance isn't going to lead to them taking 50% of the market overnight. X86 CPUs are protected by a moat of advantages such as compatibility and modularity. But what about the long term? Those advantages won't last in perpetuity. App compatibility on ARM is improving day by day, and one day there will be socketable desktop ARM CPUs.

X86 vendors will have to quicken their pace and/or deliver larger improvements with each generation, to keep up.

Once (if) ARM SoCs end up in an industry-standard socket / mobo, things will move a LOT slower. If partners don't have time to validate, manufacture, sell, and support products, you're at a dead end.
 

yuri69

Senior member
Jul 16, 2013
566
1,005
136
For Zen 6, there's been just a single combination listed - the Morpheus core used by the Monarch CCD. Besides, they can't economically replace cheapish 8-16 P-core desktop SKUs with anything mobile. The same goes for mobile - cheapo 8 P-core mobile SKUs shouldn't be composed of multiple chips. Gimme some examples of the "desktop-mobile convergence", thx.

Cutting AVX-512 width? Sure why not, but is that it?
 

Thibsie

Senior member
Apr 25, 2017
899
1,011
136
Well, full AVX512 in mobile doesn't make much sense. If desktop is based on mobile, desktop won't have full AVX512.

Moreover, if those desktop CPUs indeed are based on mobile, they won't use the same CCDs as server CPUs. So no full AVX512, again.

We don't even know if mobile will use CCDs or only monolithic implementations.
 
Reactions: marees

StefanR5R

Elite Member
Dec 10, 2016
6,010
9,024
136
There are several segments in mobile. Maybe AMD is going to target, among else, a segment in which spending transistors for wide SIMD does make sense.

A peripheral thought: Does >5.1 GHz f_max make sense in mobile? If desktop is based on mobile, won't desktop have >5.1 GHz f_max?
 
Reactions: Thibsie

soresu

Diamond Member
Dec 19, 2014
3,303
2,567
136
Maybe AMD is going to target, among else, a segment in which spending transistors for wide SIMD does make sense
For HPC it definitely does.

They have definitely made inroads in supercomputer contracts over the last few years.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,722
5,437
96
Dunno where you got 8% from.
M3 to M4 is a ~8% SIR bump, and M3 was a regression of about 2% before that.
This was telegraphed years back in the Cortex X1 announcement where they outright stated PPA was no longer a goal for this lineage of cores.
Everything up to X4 was still reasonable. X5 is boom boom, you can clearly see it on the D9400 die shot.
 

moinmoin

Diamond Member
Jun 1, 2017
5,117
8,156
136
This IPC talk is meaningless, ARM players (including Apple and Qualcomm) have been rapidly increasing frequency in this timeframe, what I care about is performance improvement, not just IPC improvement. If they hit a frequency wall, they will put more effort into IPC than they have been doing now. The fact is that they are improving at a faster pace than AMD and Intel, AMD needs a faster cadence.
Frequency used to be x86's advantage, which let it more than make up for its IPC weakness in total performance. With ARM players now quickly closing the frequency gap, x86 players should be forced to close the IPC gap to keep their performance advantage. But so far that doesn't seem to be working out so well for them.
 

Meteor Late

Member
Dec 15, 2023
93
80
51
N3e is not that good honestly, we know Apple has shifted to a high performance library instead of high density library, and they got from 4 to 4.5GHz, not that impressive considering this fact. Intel is clocking at 5.7GHz with N3B, so yeah, it won't be any speed beast of a process node.
 

poke01

Platinum Member
Mar 8, 2022
2,470
3,248
106
Anything is fast on N3e 3-2 FF.
Are you sure X925 is on N3E and can’t even clock past 3.8GHz. The X Elite reached 4.2GHz on N4P. M3 reached 4.0GHz on N3B. Maybe, just maybe, FinFlex is overhyped. M2 -> M3 was from 3.5GHz to 4.05GHz. No FinFlex there.
 

MS_AT

Senior member
Jul 15, 2024
347
755
96
Eh?

AVX10/256 just gives parity to AVX512 for 256 bit instructions, plus whatever new instructions the 512 bit version is adding most likely.

If anything AMD putting AVX512 in Zen4 while Intel goofed it with Alder Lake has kept interest for it in some circles.
I meant software adoption. And I mentioned that AMD will get it for free, as Intel is actively driving SIMD library efforts, which can hardly be said of AMD. They cannot even ensure their CPUs have reasonable support in mainstream compilers 3 months past launch...
 

CouncilorIrissa

Senior member
Jul 28, 2023
574
2,245
96
Are you sure X925 is on N3E and can’t even clock past 3.8GHz. The X Elite reached 4.2GHz on N4P.
The X1E-84-100 is a unicorn SKU tbf. It exists in what, 1 laptop, the Galaxy Book? That should tell you how hard it is to get to that clock on N4P. The vast majority of chips are X1E-78-100 with a 3.4 GHz boost.

M2 -> M3 was from 3.5GHz to 4.05GHz. No FinFlex there.
They gimped L1$ latency to achieve it.
 

Doug S

Platinum Member
Feb 8, 2020
2,864
4,866
136
This IPC talk is meaningless, ARM players (including Apple and Qualcomm) have been rapidly increasing frequency in this timeframe, what I care about is performance improvement, not just IPC improvement. If they hit a frequency wall, they will put more effort into IPC than they have been doing now. The fact is that they are improving at a faster pace than AMD and Intel, AMD needs a faster cadence.

There's no real difference between improving performance 10% per year or 20% every other year. Waiting a couple years for Zen 6 doesn't matter if it makes a big enough performance gain. Having Zen 6 coming out a year after Zen 5 and Zen 7 coming out a year after that doesn't help you if they're getting only 5% higher performance per year as a consequence of spreading their teams too thin doing a brand new architecture every year.

Apple ships 300+ million SoCs a year across iPhone/iPad/Mac to amortize that cost - more than Intel & AMD's yearly shipments combined.
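To put numbers on the cadence comparison above, a small Python sketch of the compounding arithmetic follows. The helper name `compound` is made up for illustration; the 10%, 20% and 5% rates are the ones from the post.

```python
def compound(gain_per_release: float, releases: int) -> float:
    """Total speedup after `releases` generations, each `gain_per_release` faster."""
    return (1.0 + gain_per_release) ** releases


if __name__ == "__main__":
    # All three cadences compared over the same two-year window.
    yearly_10 = compound(0.10, 2)    # 10% every year
    biennial_20 = compound(0.20, 1)  # 20% every other year
    yearly_5 = compound(0.05, 2)     # 5% every year (teams spread too thin)
    print(f"10%/year:        {yearly_10:.4f}x")
    print(f"20%/two years:   {biennial_20:.4f}x")
    print(f"5%/year:         {yearly_5:.4f}x")
```

Two 10% steps compound to 1.21x, so the yearly cadence is marginally ahead of a single 20% step, but the two are close, while two 5% steps only reach about 1.10x.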
 
Reactions: poke01

adroc_thurston

Diamond Member
Jul 2, 2023
3,722
5,437
96
Would be nice to have a similar shortcut to increasing IPC, eh.
yes it's called 3DV$.
N3e is not that good honestly, we know Apple has shifted to a high performance library instead of high density library, and they got from 4 to 4.5GHz, not that impressive considering this fact
No, it's definitely good at exactly that specific thing.
Are you sure X925 is on N3E and can’t even clock past 3.8GHz
It's also the size of a skyscraper with a very anemic SIMD implementation.
The X Elite reached 4.2GHz on N4P.
It in fact did not.
M2 -> M3 was from 3.5GHz to 4.05GHz
That's the IPC regression core with relaxed timings.
 
Reactions: poke01

Doug S

Platinum Member
Feb 8, 2020
2,864
4,866
136
They gimped L1$ latency to achieve it.

At some point higher frequencies require increasing cache latencies. The latency in absolute time for a given L1 size/associativity/etc. is relatively fixed (and no longer improved much, if at all, by newer processes), so if you wanted to, for example, double the clock rate, a cache with the same latency in wall-clock time would need double the latency measured in cycles.

The goal is maximizing performance after all, so it isn't "gimping" L1 if that change allows other changes that lead to higher overall performance. Apple is leading right now because of the combination of IPC and frequency. If you're afraid to make any changes that will reduce IPC (which increasing L1 latency clearly does) your ability to increase frequency will be greatly diminished.
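The cycles-versus-wall-clock point above can be sketched in a few lines of Python. The 0.75 ns access time is a hypothetical number picked purely for illustration (it is not a measured figure for any real core), and `latency_cycles` is a made-up helper name; the 3.5 GHz and 4.05 GHz clocks are the M2 and M3 figures quoted earlier in the thread.

```python
import math


def latency_cycles(access_time_ns: float, clock_ghz: float) -> int:
    """Whole clock cycles needed to cover a fixed access time.

    ns * GHz gives cycles; round up because a load cannot
    complete partway through a cycle.
    """
    return math.ceil(access_time_ns * clock_ghz)


if __name__ == "__main__":
    ACCESS_NS = 0.75  # hypothetical fixed L1 access time, illustration only
    for ghz in (3.5, 4.0, 4.05, 8.0):
        print(f"{ghz:>5} GHz -> {latency_cycles(ACCESS_NS, ghz)} cycles")
```

Under that assumed access time, doubling the clock from 4.0 to 8.0 GHz doubles the latency from 3 to 6 cycles, and even the smaller step from 3.5 to 4.05 GHz is enough to tip it from 3 cycles to 4.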
 
Reactions: moinmoin and poke01

gdansk

Diamond Member
Feb 8, 2011
3,187
5,010
136
Waiting a couple years for Zen 6 doesn't matter if it makes a big enough performance gain.
It isn't though. It's 10%. After 2 years. Not gonna cut it except in the server market where they can spam more cores to make up for it and they're still not at the frequency wall.
 

MS_AT

Senior member
Jul 15, 2024
347
755
96
To be fair that's pretty unusual for them.
I observed the same situation with Zen 4. Actually, this time it was better, as the initial enablement in GCC landed 6 months ahead of release iirc, but it contained some misinformation about CPU capabilities... Then, after the CPUs were released, GCC maintainers ran some benchmarks to provide znver5-specific tunings. In clang, they added support a month after the GR release, almost missing the 19.1.0 release window, which raised questions from maintainers about why they did not bring the support sooner, since GCC got it half a year earlier [the issue was that the support patches landed when the release branch was supposed to accept bug fixes only]. Oh, and the patch did not contain any tunings, so znver5 in clang is basically znver4. This is even funnier since AMD's own compiler is based on clang/LLVM... MSVC does not contain any chip-specific tunings or enablement, so there was nothing to enable there.

The contrast with Intel is stark, where Diamond Rapids is getting enablement patches right now.
 

CouncilorIrissa

Senior member
Jul 28, 2023
574
2,245
96
At some point higher frequencies require increasing cache latencies. The latency in absolute time for a given L1 size/associativity/etc. is relatively fixed (and no longer improved much, if at all, by newer processes), so if you wanted to, for example, double the clock rate, a cache with the same latency in wall-clock time would need double the latency measured in cycles.

The goal is maximizing performance after all, so it isn't "gimping" L1 if that change allows other changes that lead to higher overall performance. Apple is leading right now because of the combination of IPC and frequency. If you're afraid to make any changes that will reduce IPC (which increasing L1 latency clearly does) your ability to increase frequency will be greatly diminished.
All true, poor phrasing on my part.
 