Compared to Apple at least, AMD's CAGR in IPC is around twice Apple's (I get a 5.5-6.23% IPC CAGR for Apple and 11.8% for AMD).
Edit: Starting in 2020 with Zen 3 and A14.
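For anyone wanting to reproduce this kind of number, CAGR is just the annualized geometric growth rate. A minimal sketch, using a hypothetical IPC index rather than the actual measured values:

```python
def ipc_cagr(ipc_start, ipc_end, years):
    """Compound annual growth rate of IPC, in percent."""
    return ((ipc_end / ipc_start) ** (1 / years) - 1) * 100

# e.g. a hypothetical +56% total IPC gain over 4 years:
print(f"{ipc_cagr(1.00, 1.56, 4):.1f}%")  # ~11.8%
```

Note that the same total gain spread over more years gives a much smaller CAGR, which is why the choice of start year matters so much in these comparisons.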
Yep, that's right. That's both impressive and sad, lol. Impressive that Apple still holds the IPC lead after the slowdowns, and sad that AMD still can't beat Apple despite executing better than Apple.
> I don't even know if V-Cache with higher clocks/voltages is a good thing. Its biggest strength, the low power consumption, is literally because of those limitations. So maybe Zen 5 X3D will be another ~5% faster but lose its power-consumption advantage? That doesn't sound like a good tradeoff to me. Also, both CCDs with V-Cache is unnecessary.

It will still be more efficient with the cache, as there will be fewer requests to DRAM. At the halo end I think it makes sense to have the capability. Being the outright best means you can charge a premium for it. The end user interested in the extra efficiency can turn the clocks down if they wish.
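The "fewer requests to DRAM" point can be made concrete with a toy average-memory-access-time model. The hit rates and latencies below are made-up illustrative numbers, not measured Zen figures:

```python
def amat(l3_hit_rate, l3_latency_ns, dram_latency_ns):
    """Average memory access time for requests reaching L3:
    hits are served at L3 latency, misses pay the DRAM penalty."""
    return l3_hit_rate * l3_latency_ns + (1 - l3_hit_rate) * dram_latency_ns

# Suppose tripling L3 capacity with V-Cache lifts the hit rate 0.70 -> 0.85:
base = amat(0.70, 10, 80)    # ~31.0 ns average
vcache = amat(0.85, 12, 80)  # ~22.2 ns, even with slightly slower L3
```

Because a DRAM access also burns far more energy than an L3 hit, the same arithmetic explains the efficiency win as well as the latency win.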
> https://patents.google.com/patent/US11435798B2/en In this new patent from 2021, Apple introduced adaptive DPE, which compares the estimated value with the physically measured value and adjusts the weights according to the error. So the quality of the estimated number is quite good (especially for newer chips).

I think these internal measurements, be it powermetrics or the internal API Geekerwan used, are fine for measuring directional shifts in CPU power, and just maybe, but overall they're not that valuable. For one, we don't actually know they use this simply because they have a patent for it. In 2021, Andrei Frumusanu measured M1 Max platform power against external polling minus idle and got inconsistent results; he maintains powermetrics itself is still not fantastic.
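The adaptive scheme the patent describes, comparing estimates against physical measurements and adjusting weights in proportion to the error, is essentially an LMS-style online regression. A minimal sketch, with hypothetical counter values and learning rate (not Apple's actual model):

```python
def update_weights(weights, counters, measured_power, lr=0.01):
    """One LMS-style step: estimate power as a weighted sum of event
    counters, then nudge each weight in proportion to the error."""
    estimated = sum(w * c for w, c in zip(weights, counters))
    error = measured_power - estimated
    new_weights = [w + lr * error * c for w, c in zip(weights, counters)]
    return estimated, new_weights

# Underestimating (est. 1.0 vs measured 2.0) pushes each weight up, to ~0.51:
est, new_w = update_weights([0.5, 0.5], [1.0, 1.0], measured_power=2.0)
```

Run repeatedly against real measurements, a loop like this would converge toward weights that minimize the squared estimation error, which is consistent with the small errors reported for newer chips.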
> This gen-on-gen comparison is very convenient for AMD, since it's every 2 years, unlike Intel, Apple, and Arm, which release every year.

IPC gains are all fun for a discussion here in the forum, but to me it makes more sense to compare IPC + clock improvements/regressions when comparing generations and their success.
AMD’s IPC improvements YoY are mediocre.
Zen 3 -> Zen 4: around 22 months, ~6.5% YoY, ~13% total. A max-clock increase as well.
Zen 4 -> Zen 5: around 22 months, ~8% YoY. ~16% total. No max clock increase.
Apple's done a ~7% IPC improvement in around 7 months, M3 -> M4, plus a clock increase. The last four years have been slower for Apple, though; let's see if they improve.
I expected better from AMD, since they are a proper chip company, but it looks like after the OG Zen, only Zen 3 stands out.
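For what it's worth, the ~6.5%/~8% YoY figures look like the totals simply divided by two years; compounding over the actual 22-month gaps gives slightly different rates. A quick sketch using the totals quoted above:

```python
def annualized(total_gain, months):
    """Convert a total gain over `months` into a compound per-year rate."""
    return (1 + total_gain) ** (12 / months) - 1

print(f"{annualized(0.13, 22):.1%}")  # Zen 3 -> Zen 4, ~6.9%
print(f"{annualized(0.16, 22):.1%}")  # Zen 4 -> Zen 5, ~8.4%
```

Either way the conclusion is the same; compounding just makes the 22-month cadence look marginally better than straight division.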
Can we, like, wait for tests? I mean, in tests without AVX-512 it won't be that impressive. We already know the CBR23 and CB 2024 scores for Strix Point. Nothing groundbreaking.
> Those numbers are chosen to make AMD look relatively better. Go back to 2017 and you'd see a different story.
> It would be interesting to compare what Apple's growth rate was when their IPC was where Zen 4 is. I'm sure they were doing a lot better than they are now, because it is more difficult to increase IPC the higher it gets.
CPU architecture updates never aim to maximize IPC. Instead, they aim for a target area within a space with several optimization dimensions: IPC, single-core performance, SoC performance, performance per power, performance per area, feature set, security…
For Apple for instance, increasing IPC gets more difficult not only because they are now on a high IPC level, but also because they are increasing clocks too. Making a high IPC core is easier if it doesn't have to clock high at the same time.
(For Apple, increasing IPC perhaps also gets more difficult because persons involved in the former large IPC advances are no longer working at Apple.)
Edit:
The CPU makers are not playing in exactly the same market segments. Thus, they don't have exactly the same optimization goals. This is one of the reasons why they end up advancing IPC at different rates. It's not simply "X increased IPC more than Y = X performs better than Y". Nor, "Z increased IPC by n in 2 years = they under-perform".
PPS: and it's not really IPC anyway, but iso-clock performance. ;-P
The goal wasn't to make AMD look good at all, but simply to compare. I chose 2020 for two reasons. First, because the discussion was about recent performance, and 2020 lined up well with it being recent history, with both having major core releases right around the same time that year. Also, in 2020 after M1, there were a lot of people plotting graphs from the past and claiming x86 was hopelessly behind and over.
> Also in 2020 after M1, there were a lot of people plotting graphs from the past and claiming x86 was hopelessly behind and over.

That's interesting, to show people that making predictions is error-prone. The same applies to people claiming Intel or AMD are back in the game. They barely caught up with Apple.
Not a bad idea to basically confront those predictions with reality. I like the part where people looked down on "stupid GHz" (looking at Mr. Masters) and assumed Apple would not raise clocks in the future either. The path Apple's processors took from then on is kind of amusing in that context.
> That's interesting, to show people that making predictions is error-prone. The same applies to people claiming Intel or AMD are back in the game. They barely caught up with Apple.

100x speedup incoming:
Now it will be interesting to see how things evolve for all the players.
I place my bet as I already did previously: we're getting dangerously close to the point where large IPC increases can't be done at reasonable area and power costs. So things will move more slowly in the coming years, or at least the major players will have similar IPC.
As with all predictions, I will of course be proven wrong, and I hope I am, as I want CPUs to get faster.
> 100x speedup incoming:
> Did startup Flow Computing just make CPUs 100x faster? Here's the white paper and FAQs
> Read for yourself. www.theverge.com

I'm talking about all-purpose CPUs, not specialized units.
> I don't even know if V-Cache with higher clocks/voltages is a good thing. Its biggest strength, the low power consumption, is literally because of those limitations. So maybe Zen 5 X3D will be another ~5% faster but lose its power-consumption advantage? That doesn't sound like a good tradeoff to me. Also, both CCDs with V-Cache is unnecessary.

It's rare for me to disagree with every point of a post, but I do. You can just adjust Ryzen Master or UEFI if you want higher efficiency. I'd love dual-CCD V-Cache for certain productivity workloads. It would help compensate for the 16 cores being memory-bandwidth constrained by dual-channel memory. A lot of folks like myself use the 5950X/7950X as an affordable prosumer/workstation chip. For me the extra clocks would be worth it, and dual-CCD V-Cache would also remove thread-scheduling concerns, even if it doesn't boost all workloads.
> I was perplexed by that when I read the endnotes. Shouldn't the result be the same if they compared 7950X vs. 9950X instead of 7700X vs. 9950X?

It realistically shouldn't make a difference, but it also means the results don't necessarily correlate to 1T performance, as SMT is included when you do it this way. Plus, a 7700X is less likely to be memory-bandwidth constrained than a 9950X.
> +150MHz, to be exact. 7950X FMax is 5.85GHz.

There was a rumour that Zen 5's FMax is 6.1GHz. That could very well turn out to have been true given the clocks/voltages here, because 5.7GHz at ~1.25V for relatively low-end silicon implies that 5.7GHz at 1.2V may be possible for the well-binned stuff. And in case you're wondering why 1.2V matters: that's V-Cache die territory.
100x speedup incoming:
Did startup Flow Computing just make CPUs 100x faster? Here’s the white paper and FAQs
Read for yourself. www.theverge.com
> What was the issue with V-Cache's inability to go on both CCDs? Voltage, thermals, or both? I think they may have fixed it by now; it's just a matter of time.

Because the cache is not shared between CCDs (or at least not without a huge performance penalty).
300MB of L3 cache?? 😲😲😲 You can store entire stuff on the CPU itself, and with 2x the bandwidth at the new cache speeds...!!!!!
People don't seem to have got what I meant, so I'll elaborate with a warning: this is speculation. But if top bin Zen 5 CCDs are capable of hitting 5.7GHz at 1.2v, then it's theoretically possible for a potential 9950X3D to also clock both CCDs at a cap of 5.7GHz.
More importantly than us talking about a very fast X3D part: it nullifies the weird scheduling business needed on Zen 4 X3D parts. If both CCDs can hit the same frequency, then the V-Cache CCD (aside from some extreme niche cases) will always be the faster CCD. That means AMD just needs to rely on regular old CPPC to handle allocating workloads to cores; no need for the Game Bar nonsense we had last time.
It also means there's no need for a 9950X to have V-Cache on both CCDs.
But again, this is just a theory. Could very easily be wrong.
Because the cache is not shared between CCDs (or at least not without huge performance penalty)
Otherwise, it would be interesting if the 3D parts had the same clock speeds as the non-3D parts. That would mean the 9800X3D would be ~20% faster than the 7800X3D.
> Well, there is the 16% IPC + (potential) 10% clock-speed advantage of the 9800X3D vs. the 7800X3D.

I couldn't remember the clock difference.
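IPC and clock gains combine multiplicatively, not additively, since performance is IPC times frequency. A quick check using the 16% and 10% figures quoted above:

```python
def combined_speedup(ipc_gain, clock_gain):
    """Total speedup from independent IPC and clock gains: perf = IPC x freq."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1

# 16% IPC plus 10% clocks compounds to a bit more than 26%:
print(f"{combined_speedup(0.16, 0.10):.1%}")  # prints 27.6%
```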
> Because the cache is not shared between CCDs (or at least not without huge performance penalty)

I can only think it would still be a net win vs. going to main memory.
> Also in 2020 after M1, there were a lot of people plotting graphs from the past and claiming x86 was hopelessly behind and over.
> Not a bad idea to basically confront those predictions with reality. I like the part where people looked down on "stupid GHz" (looking at Mr. Masters) and assumed Apple would not raise clocks in the future either. The path Apple's processors took from then on is kind of amusing in that context.

It goes both ways. It's not like the x86 gang on Twitter didn't make silly statements this year about Apple's future too.
> Because the cache is not shared between CCDs (or at least not without huge performance penalty)

At a minimum, it is shared via cache coherency protocols.
> For one, we don't actually know they use this simply because they have a patent for it. In 2021, Andrei Frumusanu measured M1 Max platform power vs. external polling minus idle and got inconsistent results; he maintains powermetrics itself is still not fantastic.

It has been proven, by measuring directly from the pins of the SoC, that the error between the "real power" and the estimated one is less than 5%.
> It has been proven, by measuring directly from the pins of the SoC, that the error between the "real power" and the estimated one is less than 5%.

TIL. Geekerwan found that?