Question Zen 6 Speculation Thread

Page 49

inquiss

Senior member
Oct 13, 2010
267
383
136
As @Hitman928 stated, the profiling of Zen 5 done so far has revealed that frontend bandwidth (decoders busy, uOp busy) is NOT the bottleneck. Frontend latency (an L1i cache miss, a fetch after a mispredicted branch, ITLB misses, etc.) often is.

So doubling the decoder width won't help that much.

The slide contains marketing speak since the slide was not internal but presented under NDA. This means they definitely do not need to sandbag or anything like that.

* For the then-existing gens, they listed the real marketing IPC figure.
* For Zen 5 they listed a 10-15+% IPC goal.
* For Zen 6 they listed a 10+% IPC goal.

Compare Zen 5 with Zen 6. If they knew the IPC goal would be higher than 15%, they would present it as such. Higher is better. But they did not.
Would items like reduced latency from the new packaging be included in the IPC number though? That's what I'm curious about. Core could be 10% but benefits from IOD and packaging impact on performance might not be...
 

OneEng2

Senior member
Sep 19, 2022
385
590
106
Note: This problem only applies to x64 in the near term. Apple and ARM will keep delivering decent improvements every year until they hit the power and frequency wall. Probably only a few years later but still.
Physics and economics apply to ARM as well. I disagree.
With greatly diminishing probability for each +1%, whereas 10%+ makes it very likely to be in the range of 10-15% (the midpoint is only 12.5%), but a lot less so to be in 10-20%.

Since Zen 5 underperformed (I reckon the dual branch predictor is not working as intended, so in effect it's less wide than Zen 4) - if that's fixable and gets fixed in Zen 6, then maybe we will see a better improvement than on that slide.

I'll be happy if they add Intel APX to it
Indeed, doubling the GPRs would help in many situations and mostly across the board.

It was my understanding that x86 had mostly mitigated this through the use of extended registers and register renaming?
On the plus side - if they do so well with a 4-wide decoder and another one disabled in non-SMT mode, then imagine how much better it will be with just 5-wide decoders? Maybe the primary (most likely branch) decoder should be 5-6 wide and the secondary 4-wide...

From the (granted limited) publicly available profiling done on Zen 5, the decoder width was seldom the bottleneck for the Zen 5 core.
I think that a new IOD and faster memory will open up some use cases so that fixing the decoder will actually help.

I still think 10-15% is a good bet.
 
Reactions: Tlh97

MS_AT

Senior member
Jul 15, 2024
449
971
96
Indeed, doubling the GPRs would help in many situations and mostly across the board.

It was my understanding that x86 had mostly mitigated this through the use of extended registers and register renaming?
Register renaming solves a different problem. The compiler does not know your register file has a hundred entries, as this is an implementation detail; it will spill as soon as you run out of architectural GPRs (16 for x64, 32 for aarch64, 32 for x64 + APX).
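
To make the spilling point concrete, here is a minimal sketch in Python. It is not how any real compiler allocates registers; it only counts how many values are live at once and compares that against the architectural GPR count. The function names and the 20-value example are hypothetical, and it assumes every architectural GPR is usable, which real ABIs do not allow (the stack pointer, for one, is reserved).

```python
def peak_live_values(live_ranges):
    """Return the largest number of values that are live at the same time.

    live_ranges is a list of (start, end) instruction indices, end exclusive.
    """
    events = []
    for start, end in live_ranges:
        events.append((start, 1))   # value becomes live
        events.append((end, -1))    # value dies
    # Sort so that a death at index i is processed before a birth at index i.
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def values_spilled(live_ranges, architectural_gprs):
    """Lower bound on values that cannot stay in registers at the peak."""
    return max(0, peak_live_values(live_ranges) - architectural_gprs)


# Hypothetical loop body that keeps 20 values live at once.
ranges = [(0, 100)] * 20
for name, gprs in [("x64", 16), ("x64 + APX", 32)]:
    print(name, values_spilled(ranges, gprs))
```

Under these assumptions the 16-register case has to push at least 4 values to the stack while the 32-register case needs none, regardless of how large the physical register file behind it is.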
 

Bigos

Member
Jun 2, 2019
170
424
136
Zen 2 and 4+ can rename simple memory write/read pairs, possibly alleviating the lack of architectural registers a bit (at the cost of unnecessary memory instructions, which take cache space and require load/store resources).
 
Jul 27, 2020
22,300
15,557
146
Would items like reduced latency from the new packaging be included in the IPC number though? That's what I'm curious about. Core could be 10% but benefits from IOD and packaging impact on performance might not be...
Yeah, I think MT throughput doesn't figure into IPC calculations. IPC is usually related to the ST throughput. Reduced latency and higher internal bandwidth could allow the cores to work even better together.
 

Win2012R2

Senior member
Dec 5, 2024
647
609
96
But less immediately when their performance per watt is 3x AMD and Intel. How else did Apple go from merely matching Zen 3 with M1 to eclipsing Zen 5 by 15% (or more) with M4?
N3E is lovely

Maybe in mobile phones ARM is 3x; in laptops that's not the case, and certainly not in server space where wattage also matters. Also, Apple isn't ARM - it's a very special sauce, for a very special very high price. If only AMD could sell you a laptop where they could charge $600 for a couple of TBs of subpar NAND...

Just look at Ampere chips (at their reviews, since they ain't actually easily buyable) - they matched core counts, but with all that PCIE5 etc. their wattage is up there with AMD and the price isn't exactly "dirt cheap enough to port code to a diff arch". It basically did not go anywhere other than maybe Oracle, who (I think) funds it in the first place.

P.S. I am a very happy Apple M4 user.
It was my understanding that x86 had mostly mitigated this through the use of extended registers and register renaming?

Having lots of hidden registers to rename isn't exactly a cheap thing to have - perhaps explicit support might help reduce the necessity to have such a large number. It's certainly a LOT better for the programmer.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,557
2,218
136
Register renaming solves a different problem. The compiler does not know your register file has a hundred entries, as this is an implementation detail; it will spill as soon as you run out of architectural GPRs (16 for x64, 32 for aarch64, 32 for x64 + APX).

Register renaming allows the compiler to just keep a live set, and offload finding ILP to the CPU. Modern compilers are absolutely assuming renaming and hundreds of registers. Before renaming was common, compilers were designed to try to extract ILP in ways that increased register pressure, through aggressive unrolling and interleaving and the like.
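
A minimal sketch of the renaming idea, assuming a toy instruction format of (destination, sources) and an unlimited pool of physical registers named p0, p1, ... (both are illustrative assumptions, not a model of any real core). Every write gets a fresh physical register, so a program that reuses one architectural register for two unrelated values no longer carries a false dependency between them.

```python
def rename(instructions):
    """Map architectural registers to fresh physical registers.

    instructions: list of (dest, [sources]) using architectural names.
    Returns the same program expressed in physical register names.
    Sources never written inside the program keep their architectural
    name; they stand for live-in values.
    """
    mapping = {}          # architectural name -> current physical name
    next_phys = 0
    renamed = []
    for dest, sources in instructions:
        # Sources read whichever physical register currently holds the value.
        phys_sources = [mapping.get(s, s) for s in sources]
        # Every write gets a new physical register, so reusing an
        # architectural name no longer creates a WAR/WAW dependency.
        phys_dest = f"p{next_phys}"
        next_phys += 1
        mapping[dest] = phys_dest
        renamed.append((phys_dest, phys_sources))
    return renamed


program = [
    ("rax", ["rbx"]),   # chain A: rax = f(rbx)
    ("rcx", ["rax"]),   # chain A consumer
    ("rax", ["rdx"]),   # chain B reuses rax for an unrelated value
    ("rsi", ["rax"]),   # chain B consumer
]
for line in rename(program):
    print(line)
```

After renaming, chain A lives in p0/p1 and chain B in p2/p3, so the two chains can execute in parallel even though the compiler only spent one architectural name on both.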
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136
N3E is lovely
Possibly, but like I said: cost concerns impact ARM less because the phone market justifies designing the core for the latest processes. Physics are the same for both of them but their design allows them to avoid some of the worst of it (for now).
 

Win2012R2

Senior member
Dec 5, 2024
647
609
96
cost concerns impact ARM less because they have the phone market
I disagree - the mobile market is slowing down, new stuff just isn't offering a lot better features, and it's got REALLY expensive too - not just Apple phones, Samsung is also at the same high level. The whole mobile market most likely peaked - it's already switched to a model of "sell old stuff that retained value OKish". That's like the used car market, and we might see the same with GPUs going forward.

ARM is getting a very tiny amount per unit anyway, even with their price hikes (apart from the desire to take a %-tage of the final price - but I think that went nowhere). They have a problem getting into new markets, and in server space the new stuff is GPUs, which get all the money for the moment.
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136
I disagree - mobile market is slowing down
Go plot the GB6 1T or SPECint scores of A and X series. It really isn't slowing down like x64. ARM might suck at doing it in a decent area but their "partners" are doing better.

Zen 6 is a 10% generation so I'm pretty sure AMD won't even catch up to M4 by 2026. And by that fall Cortex X will be ahead of them in performance and performance per watt (probably not on area, however, which matters for server).
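
The arithmetic behind this claim can be sketched as follows, using only figures quoted in this thread (M4 roughly 15% ahead of Zen 5 in 1T, Zen 6 at about +10%). These are forum estimates, not measurements, and the sketch assumes the gains simply multiply and that clock speed changes are already folded into the uplift. The function name is made up for illustration.

```python
def remaining_gap(lead_now, follower_uplift, leader_uplift=0.0):
    """Fractional lead left after both sides apply a generational uplift."""
    return (1 + lead_now) * (1 + leader_uplift) / (1 + follower_uplift) - 1


# M4 ~15% ahead of Zen 5, Zen 6 gaining ~10%, M4 held fixed.
gap = remaining_gap(lead_now=0.15, follower_uplift=0.10)
print(f"{gap:.1%}")  # prints 4.5%
```

With the leader held still, a 10% uplift leaves about a 4.5% deficit; closing it entirely would take the full 15%, and any uplift on the other side in the meantime widens it again.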
 
Reactions: exquisitechar

Win2012R2

Senior member
Dec 5, 2024
647
609
96
It really isn't slowing down like x64
The market is slowing down - people can't drop a grand and a half every year on a new toy which isn't much different from the one they've got. Purchase cycles have increased, and the "AI" stuff isn't working yet to get people buying.
Go plot the GB6 1T or SPECint scores of A and X series
Yeah, it's amazing what one can do when selling 200 mln+ premium devices per year - personally I'd prefer if they got battery life in my iPhone to 7 days.
 
Jul 27, 2020
22,300
15,557
146
Zen 6 is a 10% generation so I'm pretty sure AMD won't even catch up to M4 by 2026.
Most x86 users won't care. At most, they will get an ARM device to "feel" the snappy performance of the latest ARM SoC from whoever. The number of x86 power users who will ditch their existing x86 device and go full ARM is going to be minuscule, especially if they are not mobile warriors.

For ARM to cause the premature death of x86, it needs to offer at least 90% emulated x86 performance. Some may even be OK with 50% emulated performance, as long as the emulation is so robust and solid that it runs almost any x86 executable flawlessly, save for the odd ones.
 
Jul 27, 2020
22,300
15,557
146
P.S. I am a very happy Apple M4 user.
What's your use case?

I have an M1 and so far I'm only using it for viewing movies/TV shows. Don't see the need to upgrade coz I can't think of running anything on it that would fill me with utter, inexplicable bliss.
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136
Most x86 users won't care. At most, they will get an ARM device to "feel" the snappy performance of the latest ARM SoC from whoever. The number of x86 power users who will ditch their existing x86 device and go full ARM is going to be minuscule, especially if they are not mobile warriors.

For ARM to cause the premature death of x86, it needs to offer at least 90% emulated x86 performance. Some may even be OK with 50% emulated performance, as long as the emulation is so robust and solid that it runs almost any x86 executable flawlessly, save for the odd ones.
I didn't say anything about the death of x64. Just an argument that ARM vendors will not slow down until later while the x64 vendors already have:

To reiterate:

I think we need to start getting used to smaller improvements and longer times between generations. Damn Physics!
Note: This problem only applies to x64 in the near term. Apple and ARM will keep delivering decent improvements every year until they hit the power and frequency wall. Probably only a few years later but still.

I.e. a better ISA and more money from an unassailable, captive and massive phone market mean they can keep improving for longer.
 
Jul 27, 2020
22,300
15,557
146
Just that the argument that ARM vendors will not slow down until later while the x64 vendors already have:
They can't keep going forever. They will hit a plateau sooner or later but yes, there is a good chance that they may leave the x86 players quite far behind, and then those players will spend quite some time catching up. It's actually good for x86 because ARM is showing them that more performance is within reach. Lunar Lake's performance is miraculous compared to Meteor Lake and we probably wouldn't have seen it materialize without M1.
 
Jul 27, 2020
22,300
15,557
146
If Zen 6 goes 24C/48T or even 20C/40T, is the ARM camp gonna have a competitively priced MT monster to challenge it? This is what I'm interested in seeing and both Qualcomm and Nvidia better be ready to take the challenge or go home.
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136

Come on. That's anything but competitively priced. Apple has no interest in competing. They go directly for the "I don't know how but I keep getting richer" crowd.
It includes a GPU of about 400 mm² of leading edge silicon with raytracing and AI acceleration. How much will that cost in 2026/2027? Jensen's Pricing Law suggests $1500 to $2000 alone.
 
Jul 27, 2020
22,300
15,557
146
It includes a GPU of about 400 mm² of leading edge silicon with raytracing and AI acceleration. How much will that cost in 2026/2027? Jensen's Pricing Law suggests $1500 to $2000 alone.
That's mostly of interest to developers at this point. Apple isn't even trying seriously to entice studios to port their games to Apple Silicon. Like maybe offering to cover 50% of the porting costs for only AAA games, for a start. They have the money but don't want to spend it because they are haunted by Steve Jobs' ghost at night if they try to think seriously about targeting the games industry.
 
Reactions: Joe NYC

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136
That's mostly of interest to developers at this point. Apple isn't even trying seriously to entice studios to port their games to Apple Silicon. Like maybe offering to cover 50% of the porting costs for only AAA games, for a start. They have the money but don't want to spend it because they are haunted by Steve Jobs' ghost at night if they try to think seriously about targeting the games industry.
Not the subject of this thread nor do I care.
  1. Zen 6 will need to surpass AMD's own expectations to match a 5.1mm thin tablet from 2024 in 1T performance in 2026.
  2. Commodity ARM cores will offer as much 1T performance as Zen 6 in 2026 at lower power. May be costly.
  3. People who want workstation MT will have it and (in whole system analysis) in a price competitive way.
Whether people switch to ARM or not, I do not care. Personally I'd rather stay on x64 because I have old programs which currently fail to run under WoA emulation. But it's like living in Detroit in the 60s/70s watching everything turn to crap.
 
Reactions: exquisitechar

Win2012R2

Senior member
Dec 5, 2024
647
609
96
What's your use case?
Typical tablet - browsing, video watching, works great, potentially it can do 10x perf editing videos and stuff, but I don't use it for that - would have preferred to cut those bits off and save on battery life!

Zen 6 is a 10% generation so I'm pretty sure AMD won't even catch up to M4 by 2026.

I am ok with that. x86 has been around far longer than the current crop of ARMs and it has to deal with backwards compatibility going back to the 80s; obviously it's easier to achieve uplifts in a CPU arch where you control the whole stack vertically and can jettison old stuff if necessary. As far as I am concerned they've won the mobile market (phone, tablet), but for a proper laptop, desktop and especially servers it's x86 all the way. But if I were a hyperscaler I'd also use ARM as leverage to prevent Intel/AMD from charging me their outrageous full list prices...

The thing that worries me most about x86 is COST - not in client mind you, that's very reasonable, but in servers it's totally outrageously too expensive - like 15 grand for a single bloody chip? Pricing is way out of whack, power usage is also going up to crazy levels, so again - cost, but ongoing.
 
Reactions: Thunder 57