Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


naukkis

Senior member
Jun 5, 2002
768
633
136
I think you need to look back at Excavator's front end, including the op cache, before being too surprised at Zen's current IPC despite only 4 decoders. Pushing any more IPC out of it, i.e. Zen 5, without going wider? Yeah, that will start to get interesting.

Zen's main L1i cache is effectively the micro-op cache. Decoder width is pretty much irrelevant: when executing code that totally misses the quite large micro-op cache, it's pretty much guaranteed that the data caches will miss too. So 4-wide decode is probably way more than needed; the core probably won't even come close to reaching 1 IPC in those situations. Apple cores are totally different, as they don't have a micro-op cache at all and need a decode width equal to their core's execution width. It's more beneficial to use that 4-wide decode more efficiently, by trying to predecode branches earlier. I remember seeing some speculation that Zen 5 tries to do that by combining its branch target buffer with the decoders.
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
So why did Intel add 6 decoders? Couldn't they have done 2 complex + 3 simple instead of 1 complex + 5 simple?

Why grow the ROB to as many as 512 entries when you can get a similar IPC with fewer resources? Could it be poor design?


I hope we won't have to wait long for the leaks to be checked against reality, so we can compare Zen 5 with Lion Cove.
 

naukkis

Senior member
Jun 5, 2002
768
633
136
So why did Intel add 6 decoders? Couldn't they have done 2 complex + 3 simple instead of 1 complex + 5 simple?

Complex instructions have to be decoded for compatibility; when used, those instructions will destroy performance anyway. Simple instructions are what matter. And they go wider if they can't go smarter. So when the BTB directs the front end to decode instructions that are not in the micro-op cache, Intel probably misses the L1i cache too, but after that L2, L3 or memory latency it can decode 6 instructions per cycle, at high power usage. Going smarter instead, the BTB could be used to fetch and decode uncached instructions before they are needed, and to cache those micro-ops for use when the BTB hits. To be fair to Intel, that scheme is actually partially used in their E-cores, which can also decode 6 instructions, but 3 for the current stream and the other 3 for a possible branch target. But being able to predecode branches into the micro-op cache is an order of magnitude more sophisticated, and with that scheme decoding 4 instructions per clock is more than enough, while going only 4-wide gives much better energy efficiency.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,854
3,404
136
So why did Intel add 6 decoders? Couldn't they have done 2 complex + 3 simple instead of 1 complex + 5 simple?

Why grow the ROB to as many as 512 entries when you can get a similar IPC with fewer resources? Could it be poor design?


I hope we won't have to wait long for the leaks to be checked against reality, so we can compare Zen 5 with Lion Cove.
Intel's complex decoders are for microcoded instructions plus a few exceptions; almost all instructions are "simple".
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,854
3,404
136
Reeks of an unfinished design. It's like they fired the lead or took the project away from one team and handed it to another. Probably from USDC to IDC.
It's not just one core where Intel has been like this.
I would say anything post-Skylake has been spending big on resources for not much in terms of IPC gains.
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
Cluelessly shooting in the dark have they been? Don't they use simulators to figure out if their idea is worth pursuing?
They do. But the thing is that for a very long time their engineers were very specialized in the specifics of individual blocks. They were also designing so close to the metal that big changes were very difficult to make. I don't know where they stand now, but I hope they have moved away from that culture.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,854
3,404
136
Seems that's their problem. Keller said huge teams are basically unmanageable. The intra communication overhead becomes too great.
It's more than that: in a high-performance team/culture your "average" engineer delivers above-average output; in a low-performance team/culture an above-average engineer will deliver average output. Especially in fields that require lots of problem solving and perseverance.

In all the original Zen 1 fluff AMD put out, you can hear them talking about exactly these kinds of things, as they had to fight really hard to keep IPC from dropping as the core became more mature/complete.
 
Jul 27, 2020
17,712
11,499
106
It's a monster core and it should be phenomenal in AVX-512.

The only people with a predicament are the ones not on AM5 coz they may get very tempted to switch platforms and it would hurt their savings account.

And 4090 owners coz it's so woefully CPU limited.
 

CakeMonster

Golden Member
Nov 22, 2012
1,426
530
136
I figure most people here would like to see ST improve, given that currently Intel has the lead in some applications. Who knows if Intel is able to improve further on that... but it seems sensible to me that AMD would have at least wanted to improve ST by now even as they designed Z5 probably 5+ years ago.
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
I have a feeling granite ridge is only about 15% - 20% > zen 4/rpl
I think that for Zen 5 an average of +15-20% across the entire spectrum of the IPC growth curve is a reasonable and safe assumption. Of course, AVX-512 will show the biggest gains at the top end of the curve, as it did for Sunny Cove and Golden Cove, although for the latter AVX-512 was ultimately disabled.

If there is more growth, just be happy.

Edit:
The question is what about the clock speed. I hope it stays at least at the same level as Zen 4.
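
To make the IPC-versus-clock point concrete, here is a minimal sketch of how the two combine into a single-thread uplift. It assumes the usual first-order model that single-thread performance scales as IPC times clock frequency; the function name and the clock deltas tried are illustrative assumptions, not leaked figures.

```python
# First-order model (an assumption): single-thread performance ~ IPC * clock.
# All inputs are fractional gains, e.g. 0.15 means +15%.

def single_thread_uplift(ipc_gain: float, clock_gain: float) -> float:
    """Combined single-thread gain from an IPC gain and a clock gain."""
    return (1.0 + ipc_gain) * (1.0 + clock_gain) - 1.0


if __name__ == "__main__":
    # The +15% and +20% IPC figures come from the post above;
    # the clock deltas are hypothetical scenarios.
    for ipc in (0.15, 0.20):
        for clk in (-0.05, 0.0, 0.05):
            st = single_thread_uplift(ipc, clk)
            print(f"IPC {ipc:+.0%}, clock {clk:+.0%} -> single thread {st:+.1%}")
```

Under this model a flat clock passes the IPC gain straight through, while any clock regression eats into it multiplicatively.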
 

inf64

Diamond Member
Mar 11, 2011
3,759
4,212
136
Single thread. I think mlid has really good source this time
Well he stated 10-15% IPC improvement (not ST) according to that slide he leaked last year. I don't know what else he claimed since his range is absolutely degenerate (as always) - this is how he can claim he is spot on.

My guess is that Zen 5 will have similar ST jump as Zen 4 (vs Zen 3), so 27+%. MT will likely be lower though, and I expect ~15% on average.
 

Goop_reformed

Senior member
Sep 23, 2023
231
295
96
Well he stated 10-15% IPC improvement (not ST) according to that slide he leaked last year. I don't know what else he claimed since his range is absolutely degenerate (as always) - this is how he can claim he is spot on.

My guess is that Zen 5 will have similar ST jump as Zen 4 (vs Zen 3), so 27+%. MT will likely be lower though, and I expect ~15% on average.
No this is different. I believe mlid is still hiding the slides. Gotta milk the viewers.

Also RGT is doubling down here:


I think the quotes are from here

I’ve had more than 1 person tell me that CB R23 1T is >=2800 which is a >=40% increase. If I had to take a geomean of Zen 5 leakers it’d probably be a 40% 1T perf increase overall.

I don’t personally believe that but it’s at least consistent. The only person saying it’s not hype™ is MLID, whose track record is spotty. Although he seemed to get the details about the Zen 5 delay and 800 series chipset right .. so ymmv.
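
As a sanity check on the quoted numbers, here is a small sketch of the arithmetic behind ">=2800 is a >=40% increase". It only uses the two figures from the quote; the function names are made up for illustration, and the baseline it prints is what the quote implies, not a measured Zen 4 score.

```python
# Arithmetic behind the quoted claim: a score of 2800 being a 40% increase
# implies a particular baseline score. Function names are hypothetical.

def implied_baseline(new_score: float, uplift: float) -> float:
    """Baseline score implied by a new score and a fractional uplift."""
    return new_score / (1.0 + uplift)


def uplift_from_scores(new_score: float, old_score: float) -> float:
    """Fractional uplift of new_score over old_score."""
    return new_score / old_score - 1.0


if __name__ == "__main__":
    base = implied_baseline(2800, 0.40)
    print(f"Implied baseline 1T score: about {base:.0f}")
    print(f"Round trip uplift: {uplift_from_scores(2800, base):.0%}")
```

If the baseline someone has in mind is lower than the implied one, the same 2800 score corresponds to more than 40%, which is consistent with the ">=" in the quote.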

Circle jerking all around Xd
 