Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


poke01

Platinum Member
Mar 8, 2022
2,004
2,542
106
Any ideas why the 9600X consumes more power than the 7600X in gaming in HUB tests?






 


Timorous

Golden Member
Oct 27, 2008
1,748
3,239
136
I don’t think anyone can be saltier than Zen3 owners like me, who abstained from upgrading their 5950x because of all the rumors Zen5 would be such a monster and the most exciting dream Mike Clark ever had. I remember thinking “wow a ~40% performance upgrade with Zen4 would be great, but combined with another 32% from Zen5 would be awesome, almost doubling my current performance so I should just wait”
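As a rough sanity check on the "almost doubling" arithmetic in the quoted post: generational uplifts compound multiplicatively rather than add. The sketch below uses the rumored ~40% and 32% figures from the post as inputs; the function name is purely illustrative, and the figures are the poster's remembered rumors, not measured results.

```python
def compound_uplift(*gains):
    """Combine successive fractional gains (0.40 == +40%) multiplicatively."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total


# Rumored figures quoted in the post: ~40% (Zen 4) followed by 32% (Zen 5).
combined = compound_uplift(0.40, 0.32)
print(f"{combined:.3f}x")  # 1.848x, i.e. close to but short of doubling
```

Under those assumed inputs the combined factor is about 1.85x, which matches the "almost doubling" wording.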

Part of my brain knew it couldn’t be such a big jump 2 generations in a row since Zen4 really knocked it out of the park, and Zen5 would be a similar node. But the stupid neanderthal part of my brain gave in to the rumor hype. I wish I could take a time machine back and buy a 7950x on launch

From a pure techie perspective though the new architecture is exciting, and I’m interested to see what it can do with the memory bw uncorked (Turin, STX Halo, Zen5 Threadripper if there is one) and we’ll see regarding the X3D chips but my expectations are pretty low

On the bright side you managed to wait for DDR5 prices to come down and for motherboards to come down in price so the cost of entry to Zen 4 is a lot lower than it was at launch.

On a separate point, I hope V-cache does something interesting beyond just allowing the cores to clock the same as the standard 9700X. A 128MB slab of cache for a core that does seem to want bandwidth (the memory tests of high-speed 2:1 vs. maxing out 1:1 for latency will be interesting) could make it decently better than the 7800X3D.
 

Timmah!

Golden Member
Jul 24, 2010
1,510
824
136
The biggest game changer would be if V-Cache covered the whole CPU, and AMD switched to Wafer over Wafer packaging. This would turn V-Cache models into mainstream, high volume parts. Also, covering the whole CPU die could double the V-Cache size.
The biggest change would be if V-cache covered the whole motherboard 😁
 

CakeMonster

Golden Member
Nov 22, 2012
1,493
653
136
About the V-cache, I'm actually more interested in AMD eliminating the need for scheduling shenanigans than in them updating the cache specs/size. Even with the vanilla versions of the 9000 series disappointing, I'm still really not too keen on a heterogeneous 2 CCX X3D part...
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,474
1,966
136
I'm not expecting anything fancy, already 2nd gen Vcache was a lot better than 1st gen. It's probably a few extra percent, 2-3% tops. Already the amount of cycles lost by using V cache was tiny, I think someone mentioned just 3-4 cycles. There's just not a massive headroom there IMO.

The supposed difference is in clocks. According to AMD statements, clocks on current-gen vCache chips are not limited by power or thermals, but by voltage. They dare not raise the voltages on the vCache chips because of degradation (with Intel recently demonstrating why...), and they do not have a separate voltage domain for the added cache, so they have to limit voltage, and thus clocks, for the whole thing.

I see two possible ways this could have been improved with Zen5X3D:

1. Zen5 runs at similar clocks as Zen4, but at slightly lower voltage. This means that even with similar restrictions, they can maintain peak clocks higher on the vCache parts.

2. They implemented a separate voltage domain for the cache. This would likely mean that the latency in cycles of accessing the vCache goes up, but it would allow pushing the cores to a higher clock, possibly all the way to matching the non-X3D parts.

If either option is true it would mean that the performance difference between 9700X and 9800X3D can be greater than between the 7700X and 7800X3D.

Soo... hype train back on tracks?
 

CouncilorIrissa

Senior member
Jul 28, 2023
521
2,002
96
Looks like techpowerup had the same results in gaming. Applications benefit from the lower TDP much more than games with Zen5. Pack your bags gamers and hop on next train to Zen5 X3D.
Was always going to be the case with games tbh. X3D has made the vanilla chips obsolete for the public that uses PCs for browsing and gaming (i.e. most of the population)

Soo... hype train back on tracks?
Please don't, this thread is aneurysm-inducing enough.
 

Det0x

Golden Member
Sep 11, 2014
1,231
3,876
136
From Scatterbencher OC guide, another very weird decision of AMD that complicates non-static OC:
I think I already responded on Discord about this, but with a +200 MHz offset you're at 5950/5650 MHz.
You will pretty much never hit these PBO clock speeds without LN2, so this is not a clock speed limitation for 99.9% of users.
5950/5650 MHz would net you ~51.5k points in Cinebench R23 MT.

If you really want to bench SuperPi at PBO 6 GHz+, it's very easy to use baseclock1 at 101 MHz for a net fmax of 6010 MHz, as you know.
But then again, if you're hunting for high scores, it's better to set a static 6 GHz OC on a single affinity-locked core, from inside Windows, when you're ready to bench.
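The base clock arithmetic in this post can be sketched as follows. This assumes the effective core clock scales linearly with base clock relative to a 100 MHz reference, which is the usual rule of thumb and not something the post states; the function and constant names are illustrative only.

```python
REFERENCE_BCLK_MHZ = 100.0  # assumed stock base clock


def effective_fmax_mhz(fmax_mhz, bclk_mhz):
    """Scale a clock limit linearly with base clock (assumed relationship)."""
    return fmax_mhz * bclk_mhz / REFERENCE_BCLK_MHZ


# 5950 MHz limit from the post, with baseclock1 raised to 101 MHz.
print(effective_fmax_mhz(5950, 101))  # 6009.5
```

Under that assumption, 5950 MHz at a 101 MHz base clock comes out to 6009.5 MHz, consistent with the roughly 6010 MHz figure quoted in the post.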
 

marees

Senior member
Apr 28, 2024
373
435
96
The interesting thing is that there are now multiple reviews that call the 9600X the best budget gaming CPU. It's weird how some reviewers are calling the release a flop. According to some, AMD will have the best value gaming CPU and, once the X3D parts arrive, the best gaming CPU bar none.
PC World delayed their review because of unexpected multi-core performance when compared to single core gains

 

JustViewing

Senior member
Aug 17, 2022
216
382
106
So this is another typical AMD launch. A couple of users overhype the product. Others fall for the hype. When the product is actually released, everyone feels disappointed. For me, the performance meets expectations from an architectural perspective.
It is well known that it is very difficult to increase integer IPC. The number of general purpose registers is a bottleneck. More read/write ports will help, but they may also increase power usage. As I said before, we need to wait for an implementation of the APX instruction set before we see a huge IPC increase.
Having said that, there is still lots of potential left in AVX. With AVX512 they can probably go over 16 execution units.

My real disappointment is there is no 24/32 core AM5 Zen5 CPU.
 

tsamolotoff

Member
May 19, 2019
176
306
136
You will pretty much never hit these PBO clockspeeds without LN2, so this is not a clockspeed limitation for 99.9% of the user
In single-threaded situations? How so? In any case, you know that this affects the whole curve, so it just needlessly complicates tuning. Why even implement such a limit if in 'normal' circumstances it is never reached by CCD1 (unless you BCLK OC and the CPU crashes at idle)? As I tried to explain on Discord, if your CCD1 is bad, this differential in fmax creates issues (for me), but we'll see, I guess. Not that I'd use the boost system if fixed OC were available with X3D; something like 5.5 GHz at 1.2 V-ish would suit me just fine.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
What would you do if your engineering teams are developing something exciting and turns out to be a turd, like Zen 5 for example?

I am wondering if this is the reason why David Suggs has no longer been at AMD for the past one and a half years.

I wonder if they realized early on that Z5 was going to suck, but they were already 4 years into development.
He was chief architect of Zen 2 and Zen 5.
Z3 and Z4 seem OK; Z4 especially got helped by clocks a lot.

Z6 is going to suffer the same fate, being a derivative architecture.

More theory crafting ...

If Z4 got delayed to accommodate CXL (as per Forrest) and COVID played some part, that would leave Z5 with a very long dev time.
It could have been that they were trying hard to polish this turd to not regress so much like BD.

However, they could have done something in the uncore and address the BW and latency shortcomings and shore up the perf a bit.
 

Thunder 57

Platinum Member
Aug 19, 2007
2,960
4,493
136
What would you do if your engineering teams are developing something exciting and turns out to be a turd, like Zen 5 for example?

I am wondering if this is the reason why David Suggs has no longer been at AMD for the past one and a half years.

I wonder if they realized early on that Z5 was going to suck, but they were already 4 years into development.
He was chief architect of Zen 2 and Zen 5.
Z3 and Z4 seem OK; Z4 especially got helped by clocks a lot.

Z6 is going to suffer the same fate, being a derivative architecture.

Saying it sucks is a bit harsh and premature considering the whole lineup isn't even out yet. The 9 series may fare better with more traditional TDPs.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
Saying it sucks is a bit harsh and premature considering the whole lineup isn't even out yet. The 9 series may fare better with more traditional TDPs.
Well, it is not exactly stellar. Even calling it a mild improvement is too generous considering the time frame involved.

I am mostly looking at Alexander Yee's blog to make this statement.

Other than AVX512 there is not much improvement
 

CouncilorIrissa

Senior member
Jul 28, 2023
521
2,002
96
What would you do if your engineering teams are developing something exciting and turns out to be a turd, like Zen 5 for example?

I am wondering if this is the reason why David Suggs has no longer been at AMD for the past one and a half years.

I wonder if they realized early on that Z5 was going to suck, but they were already 4 years into development.
He was chief architect of Zen 2 and Zen 5.
Z3 and Z4 seem OK; Z4 especially got helped by clocks a lot.

Z6 is going to suffer the same fate, being a derivative architecture.

More theory crafting ...

If Z4 got delayed to accommodate CXL (as per Forrest) and COVID played some part, that would leave Z5 with a very long dev time.
It could have been that they were trying hard to polish this turd to not regress so much like BD.

However, they could have done something in the uncore and address the BW and latency shortcomings and shore up the perf a bit.
To me this release feels like a consequence of misreading the room when the development of Zen 5 started, which is 5-6 years ago realistically.

At the time, Intel had a big lead in FP and vector throughput in HEDT/server with SKL-X and then followed it up by bringing it to client with ICL and TGL.

To me it feels like AMD decided to match them in this respect no matter what and dedicated the bulk of the resources to FP throughput (L1 -> FP PRF doubled, doubled the FP register file, went for the most overkill AVX-512 implementation known to man).

Little did they know that Intel would ditch the thing and ARM would become a major threat with their ultra-wide OOO machines with ridiculous integer throughput.

Couple that with Suggs' propensity for large FP units and bean counters reverting the Zen 5 to N4P, and you have a perfect storm for the lowest gen-on-gen INT gain.

I'd also add that Zen 3 was more than OK, it was a goated gen-on-gen jump. 16 months after Zen 2, minuscule area increase, massive improvement in INT throughput.
 

yuri69

Senior member
Jul 16, 2013
531
951
136
To me this release feels like a consequence of misreading the room when the development of Zen 5 started, which is 5-6 years ago realistically.

At the time, Intel had a big lead in FP and vector throughput in HEDT/server with SKL-X and then followed it up by bringing it to client with ICL and TGL.

To me it feels like AMD decided to match them in this respect no matter what and dedicated the bulk of the resources to FP throughput (L1 -> FP PRF doubled, doubled the FP register file, went for the most overkill AVX-512 implementation known to man).

Little did they know that Intel would ditch the thing and ARM would become a major threat with their ultra-wide OOO machines with ridiculous integer throughput.

Couple that with Suggs' propensity for large FP units and bean counters reverting the Zen 5 to N4P, and you have a perfect storm for the lowest gen-on-gen INT gain.

I'd also add that Zen 3 was more than OK, it was a goated gen-on-gen jump. 16 months after Zen 2, minuscule area increase, massive improvement in INT throughput.
Intel did not ditch exotic and expensive stuff. Intel server chips still keep pushing AVX512, AMX, accelerators, etc. The goal for server SKUs has been set to match those instructions.

In the case of AVX512, they provided a very balanced implementation with Zen 4. However, they somehow felt the need to jump to full speed with the following design, unlike Zen 3, which kept the vector width.

Zen 5 feels like another big bold design being great at niches but not being an all-rounder.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
To me this release feels like a consequence of misreading the room when the development of Zen 5 started, which is 5-6 years ago realistically.

At the time, Intel had a big lead in FP and vector throughput in HEDT/server with SKL-X and then followed it up by bringing it to client with ICL and TGL.


You have an interesting angle.
Couple that with Suggs' propensity for large FP units and bean counters reverting the Zen 5 to N4P, and you have a perfect storm for the lowest gen-on-gen INT gain.
He is no longer at AMD. They knew at least a couple of years earlier that it would turn out this way.

One of the weirder things I heard during some of the interviews with Mike Clark was that the int schedulers were unified so that you can make do with a smaller int PRF.
They are counting their registers there but doubled the FP PRF.

On the other hand, while Z6 would also be a minor iterative core architecturally, it is going to benefit from clocks being on N3E.
So I think the physical implementation team would be able to come to their rescue here. They would have had enough time.
I think there is potential uplift from improving the uncore too which can help.
 

LightningZ71

Golden Member
Mar 10, 2017
1,783
2,139
136
Keep in mind that, while Zen5 appears to be a typical server-first core design, this is the first time we're seeing a tangible difference between the base core of the server chip, and a client-only part (Strix Point). We have heard in the past that there was going to be a divergence between the server parts and client parts with respect to the cores coming, and that by Zen6, it was going to be notable. What we are likely to see, going forward, as has been said in the past, is that client becomes based on the family line from Strix's Zen5 and server continues the progression of the desktop/server CCDs with full dress Zen5.

In a more academic sense, client is always going to be more memory bandwidth constrained than server, and putting a pair of giant, full throughput, AVX-512 units in the client cores is just a tremendous waste in the vast majority of cases (though I can certainly see where limited dataset size tasks might absolutely fly on 9Xx0X3D parts). Switching to half throughput AVX-512 like mobile Zen5 and maybe picking AVX-10.2 from Intel on client would seem to make a lot more sense going forward.

That split will allow the client core to focus more on improvements on client centric tasks and server cores to continue to focus on what they need to do better at.
 

CouncilorIrissa

Senior member
Jul 28, 2023
521
2,002
96
Intel did not ditch exotic and expensive stuff. Intel server chips still keep pushing AVX512, AMX, accelerators, etc. The goal for server SKUs has been set to match those instructions.
They did on client, though. Whereas AMD with their "one size fits all" approach ended up with a core that dedicates a large portion of its area for stuff that's almost irrelevant.

On the other hand, while Z6 would also be a minor iterative core architecturally, it is going to benefit from clocks being on N3E.
So I think the physical implementation team would be able to come to their rescue here. They would have had enough time.
I think there is potential uplift from improving the uncore too which can help.
We'll have our preview with STX Halo soon enough I guess.
The uncore is just pure cope at this point.
 