Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
805
1,394
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3). The system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).
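To make the "same package, bigger compute complex" point concrete, here is a rough sketch of the arithmetic. The `Package` class and its field names are made up for illustration. The Zen 2 figures (two 4-core complexes per die, 16 MB L3 each) are the known Rome layout, and the Milan entry assumes only the 32 MB lower bound from the slide, not the speculated 64 MB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical helper: core-side layout of a 9-die (8 CCD + 1 IOD) package."""
    name: str
    ccds: int            # compute dies (the ninth die is the IO die)
    ccx_per_ccd: int     # compute complexes per compute die
    cores_per_ccx: int   # cores sharing one L3 slice
    l3_mb_per_ccx: int   # L3 visible to each core in that complex
    smt: int = 2         # threads per core (no SMT-4, per the slides)

    @property
    def cores(self) -> int:
        return self.ccds * self.ccx_per_ccd * self.cores_per_ccx

    @property
    def threads(self) -> int:
        return self.cores * self.smt

    @property
    def total_l3_mb(self) -> int:
        return self.ccds * self.ccx_per_ccd * self.l3_mb_per_ccx


# Zen 2 "Rome": 8 CCDs, each with two 4-core complexes and 16 MB L3 per complex.
rome = Package("Rome (Zen 2)", ccds=8, ccx_per_ccd=2, cores_per_ccx=4, l3_mb_per_ccx=16)
# Zen 3 "Milan": assumption of one 8-core complex per CCD with 32 MB L3 (slide says "32+ MB").
milan = Package("Milan (Zen 3, 32 MB case)", ccds=8, ccx_per_ccd=1, cores_per_ccx=8, l3_mb_per_ccx=32)

for p in (rome, milan):
    print(f"{p.name}: {p.cores}C/{p.threads}T, "
          f"{p.l3_mb_per_ccx} MB L3 per core's complex, {p.total_l3_mb} MB total")
```

Under those assumptions the core count, thread count and total L3 stay the same, while the L3 any single core can reach directly doubles.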

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!
 
Reactions: richardllewis_01

DrMrLordX

Lifer
Apr 27, 2000
21,797
11,143
136
And looks like Intel is doing something similar, splitting the server roadmap, but arriving to market fashionably late, ~2 years behind Bergamo (adding usual delays):

Kind of the wrong thread for that but

woof

2024 for Granite Rapids? Intel is in serious trouble.

More memory and PCIe lanes?
But with EPYC launching with 96 cores, why not +50% in desktop?

It sounds great in theory, but in practice, AMD isn't even selling us Milan as Threadripper yet, so it's a bit premature to expect Genoa.
 
Reactions: Joe NYC

DrMrLordX

Lifer
Apr 27, 2000
21,797
11,143
136
I don't think there is any better thread than this to compare Granite Rapids with its potential competition, because we don't have Zen 5 and Zen 6 threads yet.

There is a generic Intel thread here:


where that slide hasn't been posted (yet, to my knowledge). It really belongs over there.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,324
2,929
106
There is a generic Intel thread here:

I was just kidding. But seriously, Intel has nothing to compete against Genoa and Bergamo, probably for 2 years.

And it increasingly looks like the Sapphire Rapids ramp is going to coincide with the Genoa ramp, while Intel has desperately tried to find a couple of benchmarks where SPR can beat Milan, and had to use the even later-ramping SPR+HBM to compare with Milan-X...
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
And it increasingly looks like the Sapphire Rapids ramp is going to coincide with the Genoa ramp, while Intel has desperately tried to find a couple of benchmarks where SPR can beat Milan, and had to use the even later-ramping SPR+HBM to compare with Milan-X...

Well if Alder Lake is anything to go by, Intel can probably muster bigger bars on the power draw tests. I'm not sure the data center customers will be so enthused, but a win is a win right?

Jokes aside it doesn't matter too much since AMD can't supply 100% of the market at this time and Intel is going to be able to pick up sales for anyone who can't get their hands on Genoa or from customers who want Intel for whatever reason they might have.
 
Reactions: Tlh97 and Joe NYC

Joe NYC

Platinum Member
Jun 26, 2021
2,324
2,929
106
Jokes aside it doesn't matter too much since AMD can't supply 100% of the market at this time and Intel is going to be able to pick up sales for anyone who can't get their hands on Genoa or from customers who want Intel for whatever reason they might have.

Yup, and Intel has been very lucky that demand for server chips was extremely high over the last couple of quarters.

But imagine a scenario, say a year from now, in Q1 2023, when AMD is able to supply 30% of the market, and there is a recession and the demand drops 20%. All of that drop would come out of Intel's share, because the only people buying Intel CPUs would be customers who could not buy AMD.
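A quick sketch of that scenario's arithmetic. It assumes the pre-recession market is normalised to 100 units, that AMD is supply-constrained and still sells all 30 of its units after the drop, and that Intel gets whatever is left. The function name and dictionary keys are made up for illustration.

```python
def share_shift(market_units: int, amd_units: int, demand_drop_pct: int) -> dict:
    """Toy model: AMD sells everything it can supply, Intel absorbs the whole demand drop."""
    new_market = market_units * (100 - demand_drop_pct) // 100
    intel_before = market_units - amd_units
    intel_after = new_market - amd_units
    return {
        "new_market": new_market,
        "intel_before": intel_before,
        "intel_after": intel_after,
        # Percentage fall in Intel's own unit volume.
        "intel_decline_pct": 100 * (intel_before - intel_after) / intel_before,
        # AMD's share of the smaller market.
        "amd_share_after_pct": 100 * amd_units / new_market,
    }


result = share_shift(market_units=100, amd_units=30, demand_drop_pct=20)
for key, value in result.items():
    print(f"{key}: {value}")
```

In this toy model a 20% drop in total demand becomes a drop of roughly 29% in Intel's unit volume, and AMD's share rises from 30% to 37.5% without AMD shipping a single extra chip.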

And on top of that, Intel has to heavily discount its 1600 mm2 SPR, whose 400 mm2 tiles are not yielding very well.

That would be a perfect storm for Intel, which was not in any of the rosy predictions that Intel presented today to financial analysts.
 

gruffi

Member
Nov 28, 2014
35
117
106
They'd either need 12-core chiplets or an IO die that can connect up to three chiplets. Not sure if they'd make such an IO die if the majority of the Ryzen lineup is 1 and 2 chiplet processors.
The IOD for Zen 4 desktop is new anyway. I guess it is capable of connecting at least 3 CCDs.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,684
6,227
136
The IOD for Zen 4 desktop is new anyway. I guess it is capable of connecting at least 3 CCDs.
Here is a thought...

Zen 4 IOD may be an intermediate step.
For Z5 Granite Ridge, Z4D may reside in the IOD rather than in a CCD. Z5 CCD on N3 and Z4D in IOD using cheap N6 (by then).
During background work, only the IOD will be active with its Z4D cores and Z5 CCDs completely powered down.
So AMD is likely not going to design a new small core, but may add a Z4 CCX to IOD designed with high density libraries and power optimized, using a port of Z4 Mobile to N6.

I mean literally you could adapt an RMB CPU design to convert it to IOD, cutting some GPU, add GMI interfaces, improve PCIe etc and there is your big.LITTLE.
The amazing thing about Infinity Fabric is that it does not care where the cores/CCXs are located. They could be on die, off die or off package. It will glue them all just fine.

Btw, there are commits in EDAC to support asymmetric cores (but same ISA), with asymmetric EDAC bank layouts support added to Linux kernel. Seems Borislav Petkov was also surprised about it.
 

Kepler_L2

Senior member
Sep 6, 2020
460
1,895
106
Here is a thought...

Zen 4 IOD may be an intermediate step.
For Z5 Granite Ridge, Z4D may reside in the IOD rather than in a CCD. Z5 CCD on N3 and Z4D in IOD using cheap N6 (by then).
During background work, only the IOD will be active with its Z4D cores and Z5 CCDs completely powered down.
So AMD is likely not going to design a new small core, but may add a Z4 CCX to IOD designed with high density libraries and power optimized, using a port of Z4 Mobile to N6.

I mean literally you could adapt an RMB CPU design to convert it to IOD, cutting some GPU, add GMI interfaces, improve PCIe etc and there is your big.LITTLE.
The amazing thing about Infinity Fabric is that it does not care where the cores/CCXs are located. They could be on die, off die or off package. It will glue them all just fine.

Btw, there are commits in EDAC to support asymmetric cores (but same ISA), with asymmetric EDAC bank layouts support added to Linux kernel. Seems Borislav Petkov was also surprised about it.
Granite Ridge does not have small cores...
 
Reactions: Joe NYC

DisEnchantment

Golden Member
Mar 3, 2017
1,684
6,227
136
Granite Ridge does not have small cores...
Right, that would be STX. Since STX is chiplet, that would mean the SoC die is on N5 and CCD on N3?
If so then what I wrote about the Z4D on SoC/IO tile would still make some sense (btw I read about IOD based small core somewhere else)
Question would be how AMD is going to win Cinebench without small cores
 

ryanjagtap

Member
Sep 25, 2021
110
132
96
Here is a thought...

Zen 4 IOD may be an intermediate step.
For Z5 Granite Ridge, Z4D may reside in the IOD rather than in a CCD. Z5 CCD on N3 and Z4D in IOD using cheap N6 (by then).
During background work, only the IOD will be active with its Z4D cores and Z5 CCDs completely powered down.
So AMD is likely not going to design a new small core, but may add a Z4 CCX to IOD designed with high density libraries and power optimized, using a port of Z4 Mobile to N6.

I mean literally you could adapt an RMB CPU design to convert it to IOD, cutting some GPU, add GMI interfaces, improve PCIe etc and there is your big.LITTLE.
The amazing thing about Infinity Fabric is that it does not care where the cores/CCXs are located. They could be on die, off die or off package. It will glue them all just fine.

Btw, there are commits in EDAC to support asymmetric cores (but same ISA), with asymmetric EDAC bank layouts support added to Linux kernel. Seems Borislav Petkov was also surprised about it.
This is a nice thought, even if they have 2C/4T integrated into the I/O die with nuclear display they can sell that I/O die as a stand alone chip for low end systems like chromebooks et al. Then they can add CCX onto that as required. The only problem I see in this is the L3 cache, it's pretty big part of performance improvements in ryzen and adding that much cache on I/O die will increase its size and power requirements.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
The IOD for Zen 4 desktop is new anyway. I guess it is capable of connecting at least 3 CCDs.

One other problem with the idea is what AM5 itself was designed for. An extra chiplet (or even 2!) won't be as effective in many circumstances if it still has the same number of memory channels.

I'm guessing that Zen 4 tops out at 16C/32T again. Even with future Intel CPUs receiving more E-cores, AMD gaining an IPC edge and having more full cores (with SMT) should allow them to maintain superiority in heavy MT workloads.

Anything they might lose on with Ryzen is going to scale well enough that Threadripper should dumpster both AMD's and Intel's desktop chips. Why worry about trying to sell people a 24c Ryzen CPU when you can sell them a 64c Threadripper instead?
 
Reactions: scineram

HurleyBird

Platinum Member
Apr 22, 2003
2,725
1,342
136
This is a nice thought, even if they have 2C/4T integrated into the I/O die with nuclear display they can sell that I/O die as a stand alone chip for low end systems like chromebooks et al. Then they can add CCX onto that as required. The only problem I see in this is the L3 cache, it's pretty big part of performance improvements in ryzen and adding that much cache on I/O die will increase its size and power requirements.

Probably wouldn't have much L3 at all on the base IOD. Maybe could support stacked cache, which would then act as L4 for all CCXes whether on the IOD or on CCDs. That would be a rather neat setup.

You could also invert it so that the "performance" cores are on the IOD while the "density" cores are on the CCDs. That might not make as much sense economically (unless you also have a more economical IOD on an older node that is used on most products), but makes a lot of sense otherwise. The performance cores benefit more from reduced latency and don't need to pay the die hopping power consumption tax. The density cores are more for raw throughput with lower latency sensitivity. You get the advantages of being monolithic and the advantages of using chiplets.
 
Reactions: Tlh97 and Vattila

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
I suspect you might see a Zen 4+4c product if the timing works out.

I'll admit that using Zen 4c chiplets for a Ryzen product is a possibility that I hadn't considered. However, why not just have 2x Zen 4c chiplets in that case? If you really care about having more cores, then you probably don't care about having a chiplet with fewer cores that can clock slightly higher.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
This is a nice thought, even if they have 2C/4T integrated into the I/O die with nuclear display they can sell that I/O die as a stand alone chip for low end systems like chromebooks et al. Then they can add CCX onto that as required. The only problem I see in this is the L3 cache, it's pretty big part of performance improvements in ryzen and adding that much cache on I/O die will increase its size and power requirements.

Yeah, I think you want IO/Compute/Cache all on their own dies, all on their own processes right for them. But then... the C cores are also small so uhm.. Hmm. You'd get some sort of power savings not going across whatever die links you have. But if you're doing efficiency cores anyway, low power low perf, then you're not spending a lot of power on bandwidth as well.

Who knows what AMD's strategy for building and updating their uhh, C cores is. Think we can guess the "big chiplets" will be 12 normal (N) cores or 16 C cores. I guess maybe a 28 core high end consumer desktop CPU incoming next year? But... I dunno how I'd do them on small SOCs. An IO die for laptops might not even make sense on the smaller ones. 128bit bus (even the M1 Max doesn't saturate its 256bit bus so...) and 2N6C cores might be small enough to be monolithic anyway for 9-15 watts. And a 25 watt+ 256bit bus might just have video stuff on the IO die, then you can attach normal 12N core die and a separate GPU die? I dunno, it all feels messy.
 

jpiniero

Lifer
Oct 1, 2010
14,831
5,444
136
I'll admit that using Zen 4c chiplets for a Ryzen product is a possibility that I hadn't considered. However, why not just have 2x Zen 4c chiplets in that case? If you really care about having more cores, then you probably don't care about having a chiplet with fewer cores that can clock slightly higher.

The max clock speed gap could be enough to make it not that appealing for gaming compared to the regular Zen 4. So if you do 4+4c you'd beat 2x4 in MT while still being comparable in gaming.
 

jamescox

Senior member
Nov 11, 2009
640
1,104
136
I'll admit that using Zen 4c chiplets for a Ryzen product is a possibility that I hadn't considered. However, why not just have 2x Zen 4c chiplets in that case? If you really care about having more cores, then you probably don't care about having a chiplet with fewer cores that can clock slightly higher.
That would be great for applications where high core count is a big boost, but the low power optimized core of Zen 4c will likely not be able to clock anywhere near as high as the regular Zen 4 cores, so it likely isn’t a good fit for the desktop market. A single 16 core Zen 4c might be nice for a high end mobile device though. Although, it would kind of be a big.LITTLE implementation if they paired one high clock regular Zen 4 part with a much more power efficient Zen 4c part for 24 cores (up to 8 performance plus up to 16 efficiency).

The other issue is that if Bergamo uses stacking with silicon bridge chiplets, then it might not be able to use the same IO die as the regular Zen 4 chips. It would require a different IO die with TSVs connections unless they do something weird and include both in a single IO die. From a design standpoint, it may not be that difficult to include both. The TSVs would bypass the PHY for the IO die link to the chiplet. If Zen4c is meant to only be used as a stacked device, then it just would not have a serdes link at all. The TSV area would likely be very small compared to the regular IO pads so it may fit. This seems unlikely unless/until there is a regular Zen 4 die made to be stacked also.
 

jamescox

Senior member
Nov 11, 2009
640
1,104
136
This is a nice thought, even if they have 2C/4T integrated into the I/O die with nuclear display they can sell that I/O die as a stand alone chip for low end systems like chromebooks et al. Then they can add CCX onto that as required. The only problem I see in this is the L3 cache, it's pretty big part of performance improvements in ryzen and adding that much cache on I/O die will increase its size and power requirements.
What you guys are describing is basically an APU that can connect to an extra cpu chiplet. I guess that could be a very good solution for mobile, especially if the extra cpu chiplet is a Zen 4c die. You could have up to 8 performance cores and then make a high end device by stacking up to another 16 “efficiency” cores. Combine that with more cache in a silicon bridge die, and it would make a spectacular mobile device. I think PC laptop makers will want something that more favorably compares to Apple ARM based macs. If they combined an APU with a second cpu chiplet and some on package HBM, then that might do it. I don’t know how likely such a device is though. I have been expecting Bergamo to be a stacked device, which means making it compatible with a desktop IO die would be difficult. If it uses TSVs, then both chips would need to be made at TSMC. It is hard to speculate on what products are plausible when a lot of stacking could be used. The next gen GPUs are supposedly two die with silicon bridge chiplets with cache connecting them together, so perhaps such stacking will go mainstream.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
The max clock speed gap could be enough to make it not that appealing for gaming compared to the regular Zen 4. So if you do 4+4c you'd beat 2x4 in MT while still being comparable in gaming.

You'd just wind up making a product that's trying to do too many things, but isn't particularly good at any of them. If you want a great gaming chip then presumably Zen 4D is what's going to be your best bet. If you want something that can utilize a lot of cores, you'd want a hypothetical Zen 4c product assuming for some reason you're not just going to get a Threadripper.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
The other issue is that if Bergamo uses stacking with silicon bridge chiplets, then it might not be able to use the same IO die as the regular Zen 4 chips. It would require a different IO die with TSVs connections unless they do something weird and include both in a single IO die. From a design standpoint, it may not be that difficult to include both. The TSVs would bypass the PHY for the IO die link to the chiplet. If Zen4c is meant to only be used as a stacked device, then it just would not have a serdes link at all. The TSV area would likely be very small compared to the regular IO pads so it may fit. This seems unlikely unless/until there is a regular Zen 4 die made to be stacked also.
Well, I could imagine the following:
That Genoa/Raphael keep the IFOP was pretty much confirmed by the Gigabyte leak. Bergamo on the other hand could be their first adventure into the EFB world CPU-wise paired with a separate IOD. That would obviously mean that they could not mix and match both CCDs in this generation.
But if they switch to EFB for the big cores with Zen 5 that would be in line with the rumour that they would pair Zen 5 with Zen 4c.

This way Bergamo would be a product to learn and iron out the packaging and introducing it in the main line one generation later.

Also they now are in danger of being overtaken by Intel on the chiplet/advanced packaging front. They had a huge head start with Rome/Matisse. But Meteor Lake could leapfrog them next year if they don't start making any progress.
 

eek2121

Diamond Member
Aug 2, 2005
3,045
4,266
136
One other problem with the idea is what AM5 itself was designed for. An extra chiplet (or even 2!) won't be as effective in many circumstances if it still has the same number of memory channels.

Going to disagree. A single stick of DDR5-6400 should provide bandwidth equivalent to 2 sticks of DDR4-3200. 2 sticks of DDR5 are effectively equal to quad-channel DDR4.
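A back-of-the-envelope check of that claim, as a minimal sketch. It only compares theoretical peak numbers, assuming a 64-bit data path per DIMM (for DDR5 that is two 32-bit subchannels) and one DIMM per channel on the DDR4 side. The function name is made up for illustration.

```python
def peak_mb_per_s(transfers_mt_s: int, bus_width_bits: int) -> int:
    """Theoretical peak bandwidth in MB/s: transfers per second times bytes per transfer."""
    return transfers_mt_s * bus_width_bits // 8


DIMM_BITS = 64  # assumption: 64 data bits per DIMM (DDR5 splits this into 2 x 32-bit subchannels)

one_ddr5_6400 = peak_mb_per_s(6400, 1 * DIMM_BITS)
two_ddr4_3200 = peak_mb_per_s(3200, 2 * DIMM_BITS)
two_ddr5_6400 = peak_mb_per_s(6400, 2 * DIMM_BITS)
four_ddr4_3200 = peak_mb_per_s(3200, 4 * DIMM_BITS)

print(f"1x DDR5-6400: {one_ddr5_6400 / 1000:.1f} GB/s")
print(f"2x DDR4-3200: {two_ddr4_3200 / 1000:.1f} GB/s")
print(f"2x DDR5-6400: {two_ddr5_6400 / 1000:.1f} GB/s")
print(f"4x DDR4-3200: {four_ddr4_3200 / 1000:.1f} GB/s")
```

On paper the peak figures match; sustained bandwidth and latency under load will depend on timings and the memory controller, which this sketch ignores.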
 
Reactions: Mopetar and Thibsie