Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
805
1,394
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging looks to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!
 
Reactions: richardllewis_01

inf64

Diamond Member
Mar 11, 2011
3,764
4,222
136
Zen4 is not a die shrink of Zen3 with AVX512 and more cache. As other people have shown, AMD spent 50% more transistors on this thing and it was not only for AVX and cache.
The 18% IPC number seems a bit high given how low the "ST" R23 uplift they showed was, but we have already discussed this before. R23 has seen diminishing IPC returns across Zen1->Zen2->Zen3, and we have no idea what ST boost clock their sample ran at, as they didn't disclose that information. R23 could very well be at the very low end of the IPC scale, just like it was in Zen3's case.

AMD finally has a core that can work well above 5 GHz, and IPC is definitely improved by more than 5%. I find it funny that people tend to think Intel can tweak the ADL core so easily and somehow magically get up to 10% (!) IPC improvement (like it was an easy thing to do), all on the same 10nm node where they are power and die-space restricted.
At the same time they are skeptical of AMD getting that same amount, hence the stupidly low claims of 5%-7% ST IPC, and all this while AMD moved to the 5nm process node and pushed 50% more transistors into the core.
 

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
18% IPC with that clock increase is much more than the 15% performance that AMD said, Zen 4 would be a monstrous uplift over Zen 3. No way.

I can see it. CB R23 does not take advantage of DDR5 or the L2 improvements, whereas other workloads will.

We already see some workloads gaining 15% or so by going from DDR4 to DDR5 on ADL and Zen 4 also has L2 changes.

Depending on the mix of workloads I can see 18%. I presume they are using a similar mix to what reviewers will use so they don't get accused of cherry picking and AMD have been pretty good at that.
 
Reactions: Tlh97 and ftt

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
If we account for AVX512, the bigger cache and the lower density of the cores' critical parts to raise frequency, then about a 1.35x density improvement is left for IPC improvement, which points to something like a 14% improvement at worst.

That being said, "only" a +15% improvement in CB ST is possible, but that sounds underwhelming given that there should be a massive upgrade in FP execution resources for SSE2/SSE4.2/AVX128/AVX256 by virtue of the AVX512 unit being decoupled into two 256b units.
 

RTX2080

Senior member
Jul 2, 2018
322
511
136
I'll be very surprised if Zen4 is just a die shrink that brings ~7% IPC from a larger L2 and TLB and some other cache changes. In other words, it's impossible to get that much IPC from a process shrink alone. The CB score doesn't care about L2 size and memory. Skylake-X quadrupled the L2 (256 KB to 1 MB) compared to non-AVX512 Skylake and saw nearly 0% IPC gain.

So you have to at least slightly revise the core. That's why I call the information in hand too fishy, whether the gain is ~7% or ~5%, and we don't even need to take TSMC N5 density into account, since Zen3 is only 14% bigger than Zen2.

"18% ipc"

Though in a later tweet he states being skeptical himself.

"Several of my sources, though I remain skeptical."

From Greymon's statement, I guess one of his "18% IPC" sources is the same one I heard from months ago, which was from China. But everything is uncertain until launch or an official AMD statement.
 
Reactions: lightmanek

Vope45

Member
Oct 4, 2020
114
168
86
I'll be very surprised if Zen4 is just a die shrink and bring ~7% IPC by larger L2 and TLB and some other cache changes. In other words it's impossible if you just shrink process and get IPC that much. CB score doesn't care about the L2 size and memory. Skylake-X has doubled the L2 compared to non-avx512 Skylake and nearly 0% IPC gain.

so you have to at least slightly revise the core,,,,, that's why I call the in hand information too fishy, no matter ~7% or ~5% gain, and we don't even need to take TSMC N5 density into account, since Zen3 is only 14% bigger than Zen2.



From Greymon's statement, I guess his one of the "18%IPC" source is the same as I heard months ago which was from China. But Everything is uncertain until launch or by AMD official statement.
Now he is saying Raptor will win in both ST and MT





These leakers gave me a headache
 

RTX2080

Senior member
Jul 2, 2018
322
511
136
Now he is saying raptor will win in both st and mt


These leakers gave me a headache


'Leakers' are just speculating, and they don't have anything in hand or they'd be breaking NDA. You'd better stay away from them because I know where they came from: Greymon, Raichu, etc., many of them are Chinese.
Neither Raptor nor Raphael was even at QS stage before Computex, as confirmed by Ian Cutress and AMD officials. Which means the final silicon from both sides is still not released.
The credibility of anyone who speculates based on ES is doubtful.

From what I heard, ES Raptor consumes ~260 W and has PCIe bugs, while there is no news from AMD. I would be very cautious when someone claims he saw or was told the AMD ES perf.
 

eek2121

Diamond Member
Aug 2, 2005
3,051
4,273
136
18% IPC with that clock increase is much more than the 15% performance that AMD said, Zen 4 would be a monstrous uplift over Zen 3. No way.

AMD did not say "15%". They said "> 15%". It could well be anywhere from 15.01% to 90% for all we know. We will know when further details are released.
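A rough sketch of the arithmetic behind this exchange, assuming single-thread performance scales as IPC times clock. The function name and the clock values are hypothetical and chosen only for illustration; they are not figures from AMD.

```python
def st_uplift(ipc_gain: float, base_clock_ghz: float, new_clock_ghz: float) -> float:
    """Single-thread uplift, assuming performance = IPC x frequency."""
    clock_ratio = new_clock_ghz / base_clock_ghz
    return (1.0 + ipc_gain) * clock_ratio - 1.0


# Hypothetical boost clocks, for illustration only (not vendor figures).
BASE_CLOCK_GHZ = 5.0
NEW_CLOCK_GHZ = 5.5

for ipc in (0.05, 0.10, 0.18):
    uplift = st_uplift(ipc, BASE_CLOCK_GHZ, NEW_CLOCK_GHZ)
    print(f"IPC +{ipc:.0%} with +10% clock -> ST +{uplift:.1%}")
```

Under these assumptions the two gains compound rather than add, so an 18% IPC gain with a 10% clock gain lands near 30%, which is consistent with a "> 15%" claim but far above the 15% floor.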

Now he is saying raptor will win in both st and mt





These leakers gave me a headache

I fail to see how Intel can double the number of small cores without hampering the multicore performance of the big cores unless they have completely reworked power management. I also fail to see how they will magically have higher clocks. I could see a 100-200 MHz boost possibly, but the 12900KS could not even reliably hit 5.5 GHz. Note that with no clock/IPC changes, doubling the small cores would only add something like 25% to multicore performance. As it stands, 16 E-cores would take 100W of the 230-240W PL2. AMD will have fewer cores competing for similar power limits.
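The ~25% figure can be reproduced with a simple weighted core count. This is only a sketch: the per-E-core weight of one third of a P-core is an assumption picked because it reproduces the number in the post, not a measured value, and the function name is hypothetical.

```python
def mt_gain_from_more_e_cores(p_cores: int, e_before: int, e_after: int,
                              e_weight: float) -> float:
    """Relative multicore gain when only the E-core count changes.

    e_weight is the throughput of one E-core relative to one P-core
    (assumed constant, i.e. no clock or IPC changes and no power limits).
    """
    before = p_cores + e_before * e_weight
    after = p_cores + e_after * e_weight
    return after / before - 1.0


# Assumption: one E-core is worth about 1/3 of a P-core in MT throughput.
E_WEIGHT = 1.0 / 3.0

gain = mt_gain_from_more_e_cores(p_cores=8, e_before=8, e_after=16, e_weight=E_WEIGHT)
print(f"8P+8E -> 8P+16E: MT +{gain:.0%}")
```

The model ignores the power budget entirely, so in practice the gain would be lower if the extra E-cores push the P-cores to lower clocks within the same PL2.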

I also think that if Raptor Lake ends up being faster, AMD will follow up with something even faster.
 

Vope45

Member
Oct 4, 2020
114
168
86
'Leakers' are just speculating, and they don't have anything in hand or they'll break NDA. You'd better stay away from them because I know where they came from, Greymon, Raichu, etc, many of them are Chinese.
Raptor/Raphael, don't even in QS stage before Computex, confirmed by Ian Cutress and AMD officials. Which means the final silicon from both side is still not released.
Anyone who speculate by using ES, credibility is doubtful.

From I heard, ES Raptor consumes ~260watts and PCI-E bugged, and while nothing news from AMD. I would be very cautious when someone claimed he saw/being told the AMD ES perf.

And how do you know of that 260w figure?
 

RTX2080

Senior member
Jul 2, 2018
322
511
136
I fail to see how Intel can double the number of small cores without hampering the multicore performance of the big cores unless they have completely reworked power management. I also fail to see how they will magically have higher clocks. I could see a 100-200 MHz boost possibly, but the 12900KS could not even reliably hit 5.5 GHz. Note that with no clock/IPC changes, doubling the small cores would only add something like 25% to multicore performance. As it stands, 16-e cores would take 100W of the 230-240W PL2. AMD will have less cores competing for similar power limits.

I also think that if Raptor Lake ends up being faster, AMD will follow up with something even faster.

Sounds like these "leakers" are simultaneously repeating the "Raptor GB ST score > 2300 possible" rumor on twitter.

But I'm not going to talk about Raptor; what I wonder is how they could get the perf numbers of an AMD ES and even make a comparison, which is hilarious.

And how do you know of that 260w figure?
From a guy called "Enthusiast Citizen" who is active in the Chinese community, but he only made a vague statement. He has leaked many ES parts like Rocket Lake before. You can search his track record on Google. But it turned out that not everything he said was accurate.
The 260 W statement could also be referring to Sapphire Rapids, which was confirmed today to be postponed. But I and many others guess it's Raptor Lake, especially since the SPR TDP is way more than 260 W.

But no matter, everything is just rumors.
 

Timmah!

Golden Member
Jul 24, 2010
1,463
729
136
If the difference in speed is single digit percentage, I could choose either CPU but then I would look at price, platform longevity and power consumption.

Same here. Or better said, even if Raptor Lake is faster by a few percent, I would go with the 7950X, as I would rather have a less power-hungry / easier-to-cool CPU this time around. I would make an exception, however, for that rumored Intel HEDT Sapphire Rapids CPU or a 24C TR based on Zen4.

Anyway, I expect RPL and Zen4 to be more or less matched; if they weren't, AMD would not stick with just 16 cores. Their rumored high clocks are surely their way to level the field against the higher core count on RPL, and they have to believe it's a legit solution to achieve that if they are going that way.
Unless they were truly sandbagging and lying about the core count and will actually introduce a 24C chip, which IMO is unlikely.
 

Mopetar

Diamond Member
Jan 31, 2011
8,005
6,449
136
IPC really comes down to what software is being run and how much it bottlenecks any part of the chip. AMD certainly could have picked something that hits closer to the low end of the range, but anything that can take advantage of either the added L2 cache or the additional memory bandwidth that DDR5 provides is going to see a larger bump.

Consider the following result from TPU's comparison of DDR4/5 for Alder Lake:



30% gain in IPC just due to DDR5.

Obviously, not everything is going to do that well (Blender only saw a ~1.5% improvement, so it would hide extra IPC from DDR5), but I also wonder if AMD will get more of a bump than Intel just because they only have to worry about DDR5 support. I would imagine that simplifies the design.

15% average seems high based on what we know right now, but I suspect it'll be closer to 10% than it will be to 5%, especially if you mix games in considering they'll like the added L2 cache more than most applications.
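To illustrate how much the "average" depends on the workload mix, here is a minimal sketch using a geometric mean of per-workload speedups. Only the 30% and 1.5% figures come from the post above; the other entries and both mixes are made up for illustration.

```python
import math


def geomean_uplift(uplifts: list[float]) -> float:
    """Geometric-mean uplift across workloads (uplifts given as fractions)."""
    log_sum = sum(math.log(1.0 + u) for u in uplifts)
    return math.exp(log_sum / len(uplifts)) - 1.0


# 0.30 and 0.015 are the DDR5 gains quoted in the post; the rest are hypothetical.
memory_heavy_mix = [0.30, 0.15, 0.10, 0.015]
compute_heavy_mix = [0.015, 0.02, 0.05, 0.30]

print(f"memory-heavy mix:  +{geomean_uplift(memory_heavy_mix):.1%}")
print(f"compute-heavy mix: +{geomean_uplift(compute_heavy_mix):.1%}")
```

The same set of individual results can support quite different headline numbers depending on how many memory-sensitive workloads are in the mix, which is the point being made about R23 versus a broader suite.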
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,428
2,914
136
@eek2121
If Intel would put the E-cores on a separate power rail, they could save a lot of power on them.
I don't know how difficult this is to achieve, but with 16 E-cores at 0.9-1.0 V @ 3.6 GHz they would make significant inroads wrt efficiency and not cripple the P-cores.

Set voltage to 1V, and you have additional 8 E-cores for free.
By separating the power rail, I expect, they will save a lot of power by lowering voltage in desktop.
The problem is mobile. I don't expect the same gains there, and 5nm will help AMD a lot.
ADL-S has the advantage in cores compared to Rembrandt, and so it wins, for example, in CBR23.
If Intel can't increase the number of e-cores for Raptor, then the rumored 16C32T Raphael-H will be faster.
It will be interesting to see how Phoenix will fare against Intel with only 8c16T.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
I think originally the E-cores were meant to operate from the same voltage plane as their L2 cache, with speed limited to 3.3 GHz or so. That is where things make a lot of sense and efficiency is great. I think the 30A AUX VCC socket spec, and the fact that they are increasing it to 40A in Raptor Lake, makes a lot of sense, as an additional ~18W of power would be enough to feed 8 more E-cores in that very efficient region and below.
Except of course a disaster called Intel's marketing happened and asked for a 5950X "slayer", so they are now fed from a voltage plane shared with the cache/P-cores and are destined to be area-efficient power burners, beaten in efficiency by P-cores when they are at lower clocks.
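The ~18W figure follows from P = I x V. A small sketch, assuming the AUX rail sits at about 1.8 V; that voltage is inferred from the 10 A and 18 W numbers in the post and is an assumption here, not a quoted spec value.

```python
def rail_power_w(current_a: float, voltage_v: float) -> float:
    """Power delivered by a rail: P = I * V."""
    return current_a * voltage_v


# Assumption: AUX rail voltage of ~1.8 V, inferred from the post's numbers.
AUX_VOLTAGE_V = 1.8

extra_w = rail_power_w(40.0, AUX_VOLTAGE_V) - rail_power_w(30.0, AUX_VOLTAGE_V)
per_e_core_w = extra_w / 8  # spread over the 8 additional E-cores

print(f"extra rail power: {extra_w:.1f} W, about {per_e_core_w:.2f} W per added E-core")
```

A bit over 2 W per core is only plausible at the low-voltage, roughly 3.3 GHz operating point described above, not at the clocks the shipping parts run.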
 
Reactions: ryan20fun and Tlh97

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
If they did, they did it for the sake of the server market, not for desktops.

Some folks seem to keep forgetting this.

Desktops are an afterthought. Talk on a stage is cheap.

Server/HPC is where the money is. We get whatever easy adjustments they can make to tweak it for a market that cares more about clocks than TDP.
 
Reactions: Tlh97

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
18% IPC with that clock increase is much more than the 15% performance that AMD said, Zen 4 would be a monstrous uplift over Zen 3. No way.

Agreed.

No way, simply because a PLC cannot hide an advantage like that. It'd be a massive lawsuit by any investor who sold between Computex and launch.
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,151
136
Agreed.

NO way - simply because a PLC cannot hide advantage like that. It'd be a massive law suit by any investor who sold between computex and launch.
Nah, they clearly said greater than 15% ST, so I think it would be hard to sue them if they hit it out of the gate. Zen 1 had 40% claimed IPC uplift (not greater than, it was claimed as is) but achieved 52%, but no lawsuit came about that.

Also, if Nvidia can get away with duping investors regarding crypto revenue and only paying a 5.5 mil settlement, even if AMD got sued, it wouldn't be much.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,235
136
Nah, they clearly said greater than 15% ST, so I think it would be hard to sue them if they hit it out of the gate. Zen 1 had 40% claimed IPC uplift (not greater than, it was claimed as is) but achieved 52%, but no lawsuit came about that.

Also, if Nvidia can get away with duping investors regarding crypto revenue and only paying a 5.5 mil settlement, even if AMD got sued, it wouldn't be much.
No need to put any kind of explanation behind that.

The starting slide of that presentation says it all.

 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
AMD finally has a core that can work well above 5Ghz and IPC is definitely improved by more than 5%. I find it funny that people tend to think intel can tweak ADL core so easily and somehow magically get up to 10% (!) IPC improvement ( like it was an easy thing to do), all on the same 10nm node where they are power and die space restricted.
At the same time they are skeptical of AMD getting that same amount ,hence the stupidly low claims of 5%-7% ST IPC, and all this while AMD moved to 5nm process node and pushed 50% more transistors into the core.

It's important to note that Intel already has a very strong base to work from: they have just moved to a 5-ALU architecture, and the initial iteration of Alder Lake was probably meant for mobile CPUs only. So it is a very powerful and wide core married to a weak memory subsystem. We don't need Raptor Lake to know that it scales wonderfully with improved memory latency/speed and L3 cache speeds.
So 5-10% IPC is not "magic", but rather reality of improved memory subsystem helping very wide core to perform. More and faster L3, improved IMC and 2MB of L2 go a long way to help IPC.
Think about it as Skylake evolution, where 6700K with DDR4 2133 would have substantially lower ST PPC than Comet Lake with DDR4 3200.

On the other hand, AMD is at the end of what started with the Zen core; they've reworked the FPU/VEC part of the chip, but it is the same basic architecture as Zen3. So we are comparing a new core that has lots of low-hanging fruit with a matured core, where performance increases are hard to find.
Even if we look at the individual improvements that are known, things like L2 TLB coverage are more meant to help server chips with 3D L3, less so on desktop chips where coverage was already good. L2 increase was done without increasing ways - not full perf increase and they might have had to give up some latency. It's of course important, but given AMD's excellent L3, it's less impact than similar change would have on Intel.
 
Reactions: igor_kavinski

inf64

Diamond Member
Mar 11, 2011
3,764
4,222
136
It's important to note that Intel already has a very strong base to work from: they have just moved to a 5-ALU architecture, and the initial iteration of Alder Lake was probably meant for mobile CPUs only. So it is a very powerful and wide core married to a weak memory subsystem. We don't need Raptor Lake to know that it scales wonderfully with improved memory latency/speed and L3 cache speeds.
So 5-10% IPC is not "magic", but rather reality of improved memory subsystem helping very wide core to perform. More and faster L3, improved IMC and 2MB of L2 go a long way to help IPC.
Think about it as Skylake evolution, where 6700K with DDR4 2133 would have substantially lower ST PPC than Comet Lake with DDR4 3200.

On the other hand, AMD is at the end of what started with Zen core, they've reworked FPU/VEC part of chip, but it is the same basic architecture as Zen3. So we are comparing new core that has lots of low lying fruit versus matured core, where performance increases are hard to find.
Even if we look at the individual improvements that are known, things like L2 TLB coverage are more meant to help server chips with 3D L3, less so on desktop chips where coverage was already good. L2 increase was done without increasing ways - not full perf increase and they might have had to give up some latency. It's of course important, but given AMD's excellent L3, it's less impact than similar change would have on Intel.
I think that you are expecting way too much from Raptor Lake. I expect it to be just like Ice Lake-> Tiger Lake evolution, very minor IPC improvement on average with tweaks here and there (in this case tweaks to the E-cores and not the P-cores, except the cache).

Zen4, on the other hand, has a massive 50% more transistors and was built on a whole new node, giving AMD a lot more room to invest in expanded core structures. Just widening the load/store engine usually nets a great improvement in IPC (historically, see Core->Core2, K8->K10), not to mention all the other parts that AMD can tweak now that they had the transistor budget and time to do it. Anything less than 10% (average) for ST IPC going from Zen3 would mean that AMD severely underperformed versus the previous Zen iterations. Zen1 brought 52% (on a new node), Zen2 around ~15% (on a new node), Zen3 ~19% (on the same 7nm node!).

Zen4 should get at least 10% as a bare minimum, especially looking at the competitive landscape and the new node AMD had at its disposal. If history is anything to go by, we should give AMD the benefit of the doubt.
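For reference, the generational figures above compound multiplicatively. A minimal sketch using the Zen2 and Zen3 numbers from this post, with the 10% "bare minimum" for Zen4 plugged in as a hypothetical:

```python
def compound_gain(gains: list[float]) -> float:
    """Cumulative gain from a sequence of per-generation gains (as fractions)."""
    total = 1.0
    for g in gains:
        total *= 1.0 + g
    return total - 1.0


ZEN2_IPC = 0.15   # from the post
ZEN3_IPC = 0.19   # from the post
ZEN4_IPC = 0.10   # hypothetical: the "bare minimum" argued for above

print(f"Zen1 -> Zen3: +{compound_gain([ZEN2_IPC, ZEN3_IPC]):.1%}")
print(f"Zen1 -> Zen4: +{compound_gain([ZEN2_IPC, ZEN3_IPC, ZEN4_IPC]):.1%}")
```

With these inputs, Zen3 sits roughly 37% above Zen1 in IPC, and a 10% Zen4 step would bring the cumulative figure to about 50%.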
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
On the other hand, AMD is at the end of what started with Zen core, they've reworked FPU/VEC part of chip, but it is the same basic architecture as Zen3. So we are comparing new core that has lots of low lying fruit versus matured core, where performance increases are hard to find.
Intel suddenly is not the latter anymore?

With Zen nobody was foreseeing how much Zen 2 would expand on it despite being built around the same core ported to a smaller node. Zen 3, while going by the same specs as Zen 2, is a fundamental redesign of the actual core, and again we don't yet know to what extent expandability was already being prepared for Zen 4.

That the low hanging fruits are easier to see with ADL this round speaks more to how badly balanced ADL's design is rather than that Zen is a "mature core" with no way forward.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Just widening the load/store engine usually nets a great improvement to IPC (historically looking Core->Core2, K8->K10), not to mention all other parts that AMD can tweak now that they had transistor budge and time to do it. Anything less than 10% (average) for ST IPC going from Zen3, would mean that AMD severely underperformed versus the previous Zen iterations. Zen1 brought 52% (on a new node), Zen2 around ~15% (on a new node), Zen3 ~19% (on the same 7nm node!).

Core2 and K10 widened more than just "load/store"; in the case of Core2 it was the first 3-ALU machine and the core in general was widened to 4 uOPs of throughput, and it was evolved in performance up to Sandy Bridge.
I find the "widening" of load/store a funny concept. I think Intel since Ice Lake, which is a 2015-6 era design, has been able to load 1024 bits / store 512 bits. And that was exactly double Skylake. With plenty of other improvements to the core it managed to gain what? <20% in IPC?
The Zen core's rate of advancement does not mean anything either, as their baseline was a junk CPU of their own. When we align Zen family performance to the state of the art of ~2013 in Skylake, it is obvious that out of those improvements only Zen3 stands out.

That the low hanging fruits are easier to see with ADL this round speaks more to how badly balanced ADL's design is rather than that Zen is a "mature core" with no way forward.

The fun thing is that, judging from AMD's PPT increase and the very high clocks at the same time, it seems that ADL has caught them with their pants down with an uninspired and unambitious design; hopefully they will do better with Zen 5. Maybe their V/F curve is not hockey-stick shaped, who knows, but most likely they are about to come up with a highly "balanced" marketing SKU themselves.
 