Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
805
1,392
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging looks to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!
 

DisEnchantment

Golden Member
Mar 3, 2017
1,659
6,101
136
Thanks. And again, this has been around for a while. Perhaps AMD has been able to incorporate new technologies that improve the process.
What is new is more automation of the entire process using newer methods augmented by ML/AI for example.
This part of the design process follows specific patterns which the AI can "learn" from similar human-designed chips, unlike high-level architectures, which depend more on ideas and creativity and which the AI cannot really learn easily (at least not for now, or unless you are designing a similar chip).
Synopsys has products already. Google is working hard on this and Mike Clark did hint at something as well.

Given an RTL design, many tools can generate the final physical design to get the GDSII output, but the result is far from optimal.
This usually means physical design engineers step in to convert the RTL to silicon for best performance.
What AI can do is find the best ways to get GDSII output from RTL by learning from the GDSII data produced by physical design engineers.
Give it a thousand designs to learn from and it will get better very fast.
Sounds super simple, but if we are talking about 100 Billion transistors the complexity is staggering. Thankfully the concept of SoC partitioning (using IP blocks and chiplets etc) is helping.
 

DrMrLordX

Lifer
Apr 27, 2000
21,770
11,089
136
Even then full optimization did not happen until Zen3 when AMD was able to extract almost 5 GHz from a process which was not intended to run beyond 4.2 GHz. (from N7 Shmoo plot)

Assuming that is the case - that is, that N7 was never meant to clock over 4.2 GHz - then by simplistic analysis of TSMC's own literature regarding N5, N5 should top out around 4.8 GHz without design-specific optimizations. That may be an oversimplification. TSMC offers uLVT, eLVT, and HP node variations.
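
As a rough sketch of that simplistic analysis: take the 4.2 GHz N7 ceiling quoted above and scale it by the roughly 15% same-power speed gain that TSMC's public N5 material cites over N7. Both inputs are assumptions here, the names are made up for illustration, and the estimate ignores library choice (uLVT, eLVT, HP) and any design-specific tuning.

```python
# Assumption: N7 tops out around 4.2 GHz (the figure quoted in this thread).
N7_CEILING_GHZ = 4.2
# Assumption: N5 offers roughly a 15% speed gain over N7 at the same power.
N5_SPEED_GAIN = 0.15


def scaled_ceiling(base_ghz: float, speed_gain: float) -> float:
    """Scale a frequency ceiling by a node-level speed gain."""
    return base_ghz * (1.0 + speed_gain)


n5_estimate = scaled_ceiling(N7_CEILING_GHZ, N5_SPEED_GAIN)
print(f"Naive N5 ceiling: {n5_estimate:.2f} GHz")
```

That lands at about 4.83 GHz, which is where the ~4.8 GHz figure comes from.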
 

moinmoin

Diamond Member
Jun 1, 2017
4,993
7,763
136
Just considering that planning for Zen 1-3 started in 2012, and how much compute capability has exploded since, it should be clear that the possibilities for further automation are vast for future Zen gens.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,659
6,101
136
Assuming that is the case - that is, that N7 was never meant to clock over 4.2 GHz - then by simplistic analysis of TSMC's own literature regarding N5, N5 should top out around 4.8 GHz without design-specific optimizations. That may be an oversimplification. TSMC offers uLVT, eLVT, and HP node variations.
Hitting the target frequency is not only device design though, but also metal layers, controlling parasitics etc much more than forum dwellers are privy to. uLVT, eLVT are all device related.
Not everything is free though, if you are willing to pay for additional info I can dm you the links.

Sorry, edited out some info which is paywalled
 

Kedas

Senior member
Dec 6, 2018
355
339
136
Since the 24core rumor seems to surface again.
Do you think AM5 may be max 4 or even 5 dies total, (including iGPU and IO)
Is there enough room? I saw that intel had things very close together so you could think why not.
On AM4 there is a lot of free space.
 

DrMrLordX

Lifer
Apr 27, 2000
21,770
11,089
136
Hitting the target frequency is not only device design though, but also metal layers, controlling parasitics etc much more than forum dwellers are privy to. uLVT, eLVT are all device related.
Not everything is free though, if you are willing to pay for additional info I can dm you the links.

Sorry, edited out some info which is paywalled

It's okay. Some things are meant to be proprietary. We will just have to wait and see how the end product clocks and draw conclusions as forum dwellers are wont to do.
 

Ajay

Lifer
Jan 8, 2001
15,981
8,071
136
Since the 24core rumor seems to surface again.
Do you think AM5 may be max 4 or even 5 dies total, (including iGPU and IO)
Is there enough room? I saw that intel had things very close together so you could think why not.
On AM4 there is a lot of free space.

No, unless AMD has managed to shrink the CCDs a lot, and/or uses more advanced packaging technology.
The IOD will still be fairly large as I/O won't shrink as much as logic and will have an integrated GPU.
 

soresu

Platinum Member
Dec 19, 2014
2,896
2,107
136
The IOD will still be fairly large as I/O won't shrink as much as logic and will have an integrated GPU.
Unless the GIOD is partially stacked maybe?

Not to mention the IO for the gfx will probably make up part of the IOD anyways if it is well optimised to purpose, so it may not increase the area much even if it is all one unstacked die.
 

Ajay

Lifer
Jan 8, 2001
15,981
8,071
136
Unless the GIOD is partially stacked maybe?

Not to mention the IO for the gfx will probably make up part of the IOD anyways if it is well optimised to purpose, so it may not increase the area much even if it is all one unstacked die.
Uh, the current IOD doesn't have any I/O for an iGPU, as there isn't one - so there will be no saving on that front.
 

eek2121

Diamond Member
Aug 2, 2005
3,032
4,222
136
Zen4 therefore is very interesting in this regard, it is going to give an idea how high clocking designs will look like in terms of efficiency/density with upcoming nodes.
This slide is therefore very interesting, N7-->N5 (efficiency with perf gain) while 14LPP-->N7 (efficiency at same perf)
View attachment 53918
That slide along with similar information provided by AMD is why I expect Zen 4 to have a significant performance uplift over Zen 3.

Since the 24core rumor seems to surface again.
Do you think AM5 may be max 4 or even 5 dies total, (including iGPU and IO)
Is there enough room? I saw that intel had things very close together so you could think why not.
On AM4 there is a lot of free space.
Not likely. I also think that even if they decide to offer a 24 core SKU, it won’t launch until later.
Uh, the current IOD doesn't have any I/O for an iGPU, as there isn't one - so there will be no saving on that front.
If they manage to stack the GPU or other components it won’t take up much space at all.
 

Ajay

Lifer
Jan 8, 2001
15,981
8,071
136
That slide along with similar information provided by AMD is why I expect Zen 4 to have a significant performance uplift over Zen 3.


Not likely. I also think that even if they decide to offer a 24 core SKU, it won’t launch until later.

If they manage to stack the GPU or other components it won’t take up much space at all.
The chiplets would have to be small to get three across the right side of the chip. I suppose if all the dice are small enough (stacked GPU), then AMD could use a 2x2 arrangement.
Supposedly AM5 has ~1700 pins. This doesn't seem, to me, to be enough for 3 CCDs; that is only ~400 more pins than AM4, and they would have to serve an added iGPU as well as a third chiplet.
 

moinmoin

Diamond Member
Jun 1, 2017
4,993
7,763
136
Since the 24core rumor seems to surface again.
Do you think AM5 may be max 4 or even 5 dies total, (including iGPU and IO)
Is there enough room? I saw that intel had things very close together so you could think why not.
On AM4 there is a lot of free space.
As long as the standard setup stays 1-2x CCD + 1x cIOD, the only way I see it achieving more than 16 cores is CCDs with more than 8 cores. That may well happen with Bergamo.
 

Saylick

Diamond Member
Sep 10, 2012
3,361
7,059
136
IOW they put one Zen 4 chiplet and up to one Zen 4c chiplet.
Isn't that the rumor about how AMD will go with their hybrid approach for the Zen 5 generation? One 16C chiplet with Zen 4C cores and one 8C chiplet with Zen 5 cores for a BIG.bigger configuration?
 

Abwx

Lifer
Apr 2, 2011
11,143
3,840
136
Zen4 therefore is very interesting in this regard, it is going to give an idea how high clocking designs will look like in terms of efficiency/density with upcoming nodes.
This slide is therefore very interesting, N7-->N5 (efficiency with perf gain) while 14LPP-->N7 (efficiency at same perf)
View attachment 53918

This slide says that they have no room to improve perf and perf/watt using frequency, and that they have to rely solely on IPC improvement.
 

Ajay

Lifer
Jan 8, 2001
15,981
8,071
136
Isn't that the rumor about how AMD will go with their hybrid approach for the Zen 5 generation? One 16C chiplet with Zen 4C cores and one 8C chiplet with Zen 5 cores for a BIG.bigger configuration?
This is a rumor I don't understand. Makes sense in a mobile context - but not for desktop PCs, particularly with ever finer grained power gating in modern CPUs.
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
Since the 24core rumor seems to surface again.
Do you think AM5 may be max 4 or even 5 dies total, (including iGPU and IO)
Is there enough room? I saw that intel had things very close together so you could think why not.
On AM4 there is a lot of free space.

On AM5 there is enough space for one IOD die and 4 CCD. This actually explains the special metal cap of AM5 which has gaps opened for the capacitors.

 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
On AM5 there is enough space for one IOD die and 4 CCD. This actually explains the special metal cap of AM5 which has gaps opened for the capacitors.

View attachment 53988

Alternatively they could use a 16 core Zen4c CCD (The one supposedly used for 128 core Bergamo)

For 24 cores: Combine an 8 core Zen4 CCD with a 16 core Zen4c CCD.
For 32 cores: Use two 16 core Zen4c CCD's

But this would come much later since Bergamo is a 2023 product.
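
The combinations above can be enumerated with a quick sketch. It assumes the rumoured 16-core Zen4c CCD, the existing 8-core Zen4 CCD, and a limit of two CCDs per desktop package; none of this is confirmed, and the names are purely illustrative.

```python
from itertools import combinations_with_replacement

# Assumption: cores per CCD type; the 16-core Zen4c figure is only a rumour.
CCD_CORES = {"Zen4": 8, "Zen4c": 16}
# Assumption: a desktop package carries at most two CCDs.
MAX_CCDS = 2


def core_configs(max_ccds: int = MAX_CCDS) -> dict:
    """Map each total core count to the CCD mixes that produce it."""
    configs = {}
    for count in range(1, max_ccds + 1):
        for combo in combinations_with_replacement(sorted(CCD_CORES), count):
            total = sum(CCD_CORES[name] for name in combo)
            configs.setdefault(total, []).append(combo)
    return configs


for total, mixes in sorted(core_configs().items()):
    print(total, "cores:", mixes)
```

Under those assumptions, 24 cores only falls out of the mixed Zen4 + Zen4c pairing, and 32 cores needs two Zen4c CCDs.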
 

jamescox

Senior member
Nov 11, 2009
640
1,104
136
Alternatively they could use a 16 core Zen4c CCD (The one supposedly used for 128 core Bergamo)

For 24 cores: Combine an 8 core Zen4 CCD with a 16 core Zen4c CCD.
For 32 cores: Use two 16 core Zen4c CCD's

But this would come much later since Bergamo is a 2023 product.
I have been assuming that Bergamo is a stacked package. If that is the case, then Zen 4c may not be compatible with a standard Zen 4 desktop IO die. That said, it may be plausible to put more than one type of interface on the die. Area for stacked interfaces would be very small, so they might be able to put the stacked interface along one edge and still put pads for non-stacked packaging. There could also be a stacked IO die in the works for mobile and some desktop parts. A Zen 4c might make an excellent mobile chip.

SP5 seems like it is going to be very expensive for Threadripper products, so they actually could use a higher end product made on AM5. It would be nice if they actually could make a half Epyc package / socket, but that seems unlikely unless the IO die itself is a multi-chip device. I don’t know if I would believe 4 die though. The desktop IO die has usually been 1/4 of the Epyc IO die. If Epyc has only 3 cpu connections and 3 DDR5 per quadrant, then I would expect that desktop products would be limited to 3 and 3 also.
 

tomatosummit

Member
Mar 21, 2019
184
177
116
SP5 seems like it is going to be very expensive for Threadripper products, so they actually could use a higher end product made on AM5. It would be nice if they actually could make a half Epyc package / socket, but that seems unlikely unless the IO die itself is a multi-chip device. I don’t know if I would believe 4 die though. The desktop IO die has usually been 1/4 of the Epyc IO die. If Epyc has only 3 cpu connections and 3 DDR5 per quadrant, then I would expect that desktop products would be limited to 3 and 3 also.
Zen 4c does not seem like a consumer core to me. AMD stated it's for cloud providers to get as many cores in as area allows, which does not read to me as something useful for desktop CPUs, especially if they can jump to 24 cores for Raphael. If its changes reduce its absolute single-core performance for any reason, be it cache changes or a reduced target frequency, then it's probably a no-go for a Ryzen product.
3D stacked cache is in a similar situation: it's targeted at HPC-style workloads but, importantly, will give improved gaming performance, so it can happily go onto consumer desktop.

As for 3 CCDs and 3 memory channels, I'd welcome that. Exposing 32 PCIe lanes wouldn't go amiss either.
It would finally give a good reason to have a top-level chipset over Bx50 motherboards and reintroduce relatively cheaper HEDT workstations after TR's price hike.
A 1700 pin count might be too few for all of that though. My guess is still 128-bit memory.
 