Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
805
1,394
136
Except for the details of the microarchitectural improvements, we now know pretty well what to expect from Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least) and will come with a new platform (SP5) and new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!
 
Last edited:
Reactions: richardllewis_01

Doug S

Platinum Member
Feb 8, 2020
2,470
4,026
136
In the future they'll put the data centers in space. Good access to cheap solar power and no worries about cooling. That and no one can hear the poor techs that have to manage it scream.


Cooling is a big problem in space, because a vacuum transfers no heat by conduction or convection, so you can only shed heat via blackbody (thermal) radiation. A datacenter in space would not work very well without a huge radiator area facing away from the sun through which to dump its heat.
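
To put a rough number on it, here is some napkin math (the emissivity, radiator temperature and 1 MW heat load are assumed values, not a real design):

# Rough radiative-cooling estimate. In shadow, a radiator sheds heat only
# via the Stefan-Boltzmann law: P = e * sigma * A * T^4.
# All inputs below are illustrative assumptions, not a real design.
SIGMA = 5.670e-8            # Stefan-Boltzmann constant, W/(m^2 K^4)
emissivity = 0.9            # assumed radiator emissivity
T_radiator = 320.0          # assumed radiator temperature, K (~47 C)
heat_load_w = 1_000_000     # assumed 1 MW datacenter heat load

flux = emissivity * SIGMA * T_radiator**4     # W radiated per m^2
area_m2 = heat_load_w / flux
print(f"{flux:.0f} W/m^2 -> ~{area_m2:.0f} m^2 of radiator for 1 MW")
# ~535 W/m^2 -> roughly 1,900 m^2 of radiator, and that ignores any sunlight
# or Earth IR the radiator absorbs, so the real number would be even bigger.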
 

Doug S

Platinum Member
Feb 8, 2020
2,470
4,026
136
For a while I have been asking myself whether anyone harvests that thermal energy and converts it back into electricity. I mean, aside from doing the processing, a chip basically wastes 99% of its energy input as heat. Even if they could harvest only 10% to 20% of that as electricity, it would be worthwhile. And some fluid for thermal transfer seems to be the obvious starting point.

It would certainly make sense in cooler climates to use that energy for heating in the winter, if you have a way to get it there (e.g. steam tunnels or the like). If you could site a datacenter near something that needs huge quantities of hot water (imagine the world's biggest laundromat), that would be perfect.

There are evolving technologies that promise to do a better job of making use of waste heat. It remains to be seen whether they can do significantly better than the Peltier effect, which isn't efficient enough to bother with for energy recovery from waste heat in datacenters, or from industrial processes that produce a lot of it, like cement plants or oil refineries.
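
For a sense of scale, here's a back-of-envelope Carnot calculation (the 60 °C coolant, 25 °C ambient and the 15%-of-Carnot figure for a thermoelectric generator are all assumptions on my part):

# Thermodynamic ceiling for recovering electricity from low-grade datacenter heat.
# Temperatures and the 15%-of-Carnot factor are illustrative assumptions.
T_HOT  = 273.15 + 60.0      # warm coolant leaving the servers, K
T_COLD = 273.15 + 25.0      # ambient heat sink, K

carnot_limit = 1.0 - T_COLD / T_HOT          # absolute best case
teg_estimate = 0.15 * carnot_limit           # assume a TEG reaches ~15% of Carnot
print(f"Carnot limit: {carnot_limit:.1%}, rough thermoelectric estimate: {teg_estimate:.1%}")
# ~10.5% ceiling and ~1.6% in practice, which is why recovering 10-20% of
# chip-grade waste heat as electricity looks very optimistic.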
 

DrMrLordX

Lifer
Apr 27, 2000
21,791
11,131
136
In the future they'll put the data centers in space. Good access to cheap solar power and no worries about cooling. That and no one can hear the poor techs that have to manage it scream.

Cooling in space is weird. You still have to engineer around it. There's no convection so you're 100% reliant on radiation.

Does this answer your question?

Not necessarily. N5P brings with it certain performance/power advantages over N5, but it's not clear where AMD's optimized N5 variant sits in relation to N5P. If they had just come out and said "yeah we're using the same N5P as everyone else" then we'd know what's what, but they didn't exactly say that.
 
Last edited:
Reactions: Joe NYC

DisEnchantment

Golden Member
Mar 3, 2017
1,682
6,197
136
Does this answer your question?
I guess that is about all we are gonna get. Thanks for the link mate

But this is the main takeaway.
it’s not necessarily the same as some other 5nm technologies out there.

AMD's N7-->N5 transition is not the same as the one advertised by TSMC. They are coming from a much worse efficiency and density starting point than vanilla N7 (their N7 being, of course, highly optimized for perf).
AMD's N7 as they use it is not the same as TSMC's stock N7 offered to others, and neither is the N5 as they will use it.

This is the reason I interpret Lisa's statements exactly as she said them: 2x density, 2x efficiency and 1.25x perf.
However, the 2x efficiency gain will get nullified by almost 2x more active silicon per CCD, so it will be a wash. But as said, we shall find out.
They shared some bits and pieces about their N7 optimizations during the Zen 2/Zen 3 launches, and I guess this time they will do the same for Zen 4.
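
Just to spell out the napkin math behind that "wash" (the 2x active silicon per CCD is my assumption; the other ratios are the marketing numbers):

# "It will be a wash" in numbers. "2x efficiency" is read here as half the
# power per transistor at iso-frequency; the 2x transistor count per CCD is
# an assumption, not a spec.
power_per_xtor = 1.0 / 2.0     # "2x efficiency"
xtors_per_ccd  = 2.0           # assumed ~2x more active silicon per CCD
perf           = 1.25          # "1.25x perf"

ccd_power = power_per_xtor * xtors_per_ccd   # -> 1.0, i.e. roughly unchanged
print(f"Relative CCD power: {ccd_power:.2f}x, relative perf: {perf:.2f}x")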
 

soresu

Platinum Member
Dec 19, 2014
2,934
2,157
136
Cooling is a big problem in space, because a vacuum transfers no heat by conduction or convection, so you can only shed heat via blackbody (thermal) radiation. A datacenter in space would not work very well without a huge radiator area facing away from the sun through which to dump its heat.
By blackbody I assume you mean thermal emission via infrared radiation?

Also, that heat can be put to work keeping other systems or modules from getting too cold if needed; it wouldn't surprise me if the ISS does something like that to reduce the amount of solar-panel electricity spent on heating.

But yes, in general it would make more sense to generate the power in space and beam it down to receivers floating above underwater datacenter modules, like the ones Microsoft has already piloted (successfully?).
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,859
3,404
136
In the future they'll put the data centers in space. Good access to cheap solar power and no worries about cooling. That and no one can hear the poor techs that have to manage it scream.
As far as I know, cooling in space actually sucks. It's completely counterintuitive, but you need atoms to transfer the energy onto.

edit: lol i was beaten to the punch
 

tomatosummit

Member
Mar 21, 2019
184
177
116
In the future they'll put the data centers in space. Good access to cheap solar power and no worries about cooling. That and no one can hear the poor techs that have to manage it scream.
"space"
"no worries about cooling"
"space"
I know you're making a joke but this still hurts.

As for the sea based server nodes, could you just stick them to the base of windmills? Power and physical infrastructure already in place.
 

soresu

Platinum Member
Dec 19, 2014
2,934
2,157
136
As for the sea based server nodes, could you just stick them to the base of windmills? Power and physical infrastructure already in place.
If you mean offshore wind then yeah I suppose - but you still need to connect to comms, so infrastructure is lacking regardless, unless it's all done by microwave point to point at ground level, or ground to satellite.

There's also tidal based and high altitude wind power too.
 

DrMrLordX

Lifer
Apr 27, 2000
21,791
11,131
136


Seems like THEY didn't have any problems with cooling. Or talking for that matter! AMD needs to license that tech from Aperture, stat. Somehow I don't think they'll get any of that ready in time for Zen4, though. Zen5 maybe?

Guess NVidia will have to go from making Ti cards to Th cards.

Yes, makes perfect sense.
 
Last edited:

tomatosummit

Member
Mar 21, 2019
184
177
116
If you mean offshore wind then yeah I suppose - but you still need to connect to comms, so infrastructure is lacking regardless, unless it's all done by microwave point to point at ground level, or ground to satellite.

There's also tidal based and high altitude wind power too.
Yeah, I mean offshore.
I was thinking of them more as co-located rather than added to existing installations.
When erecting an offshore turbine, attach some mounting points for submerged datacenter modules to the pylons below the waterline, and when running the power lines out to the structures, add in some fiber as well.

Either leech some power from the windmills or have it built specifically for the underwater compute.
 

Ajay

Lifer
Jan 8, 2001
16,074
8,103
136
I guess that is about all we are gonna get. Thanks for the link mate

But this is the main takeaway.


AMD's N7-->N5 transition is not the same as the one advertised by TSMC. They are coming from a much worse efficiency and density starting point than vanilla N7 (their N7 being, of course, highly optimized for perf).
AMD's N7 as they use it is not the same as TSMC's stock N7 offered to others, and neither is the N5 as they will use it.

This is the reason I interpret Lisa's statements exactly as she said them: 2x density, 2x efficiency and 1.25x perf.
However, the 2x efficiency gain will get nullified by almost 2x more active silicon per CCD, so it will be a wash. But as said, we shall find out.
They shared some bits and pieces about their N7 optimizations during the Zen 2/Zen 3 launches, and I guess this time they will do the same for Zen 4.
So, maybe I'm having problems with my brain, but CMOS efficiency is power per transistor switch, or just power at iso clock. So with 2x the transistors, if true, net power will be approximately the same (as you point out), but perf/watt, based on the process alone, will be up 25%, so SoC efficiency (perf/watt) goes up. If AMD can augment that performance to ~35% with uArch changes (or more), then perf/watt goes up quite a bit (and this will matter tremendously for Genoa). So process efficiency is an important enabler of increased transistor density, but not the deciding factor in perf/watt (which is often cited as "efficiency" but is a different metric). Maybe this is already on your mind and you chose to scope your response narrowly to the graph at hand, in which case my elaboration hasn't much to do with your brevity. I guess the notion of 'highly optimized for perf' triggered me a bit.
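
Or in plain numbers (the process ratios are the claimed ones; the 2x transistor count and the ~35% figure are what-ifs, not anything AMD has stated):

# Perf/watt napkin math. Process ratios are the claimed 0.5x power/xtor and
# 1.25x perf; the 2x transistor count and the 35% uplift are what-if assumptions.
rel_power    = 0.5 * 2.0      # half the power per xtor, twice the xtors -> ~iso power
perf_process = 1.25           # process-only performance gain
perf_uarch   = 1.35           # hypothetical process + uArch gain

print(f"process only: {perf_process / rel_power:.2f}x perf/watt")
print(f"with uArch  : {perf_uarch / rel_power:.2f}x perf/watt")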

I can't imagine that the 'optimized' N5 allows for much deviation in the actual electrostatic parameters of the N5 transistor. In the past, high-speed logic traded off transistor size for performance by adding an extra fin per transistor; I'm not sure that is possible at N5, which would mean tweaks to the N5 recipe would be limited. There are other process components, as you've mentioned before, such as changes to the metal layers, which can be considered an 'optimization' of a different sort. Makes me think the optimization may have more to do with DTCO than anything else. I wish @Idontcare was still on this board to get a better sense of the available process options.

/end transmission
 

Mopetar

Diamond Member
Jan 31, 2011
8,000
6,433
136
The whole space thing was a bit of a joke, but I did learn a lot of interesting things today as a result. It's not the cooling, the power, or anything else like that which makes it impractical. Putting a datacenter into orbit would cost even more than all of the expensive server hardware, so even if there were a practical cooling solution and energy were basically free, it wouldn't offset the cost of getting the hardware there.

I guess that really just means we need a space fab first. It's got a process node so amazing that it's out of this world.
 

tomatosummit

Member
Mar 21, 2019
184
177
116
The whole space thing was a bit of a joke, but I did learn a lot of interesting things today as a result. It's not the cooling, the power, or anything else like that which makes it impractical. Putting a datacenter into orbit would cost even more than all of the expensive server hardware, so even if there were a practical cooling solution and energy were basically free, it wouldn't offset the cost of getting the hardware there.

I guess that really just means we need a space fab first. It's got a process node so amazing that it's out of this world.
You're killing me here, almost as much as cosmic radiation kills lithography.
 
Reactions: Tlh97 and Mopetar

lightmanek

Senior member
Feb 19, 2017
399
798
136
The whole space thing was a bit of a joke, but I did learn a lot of interesting things today as a result. It's not the cooling, the power, or anything else like that which makes it impractical. Putting a datacenter into orbit would cost even more than all of the expensive server hardware, so even if there were a practical cooling solution and energy were basically free, it wouldn't offset the cost of getting the hardware there.

I guess that really just means we need a space fab first. It's got a process node so amazing that it's out of this world.

But SpaceX's goal is to lower the cost of getting a ton to orbit by 1000x in the near future, as this is the only way to have any chance of getting enough materials to Mars for a colony. From there on, we should have more reasons to expand our space presence, thereby making it even more affordable, eventually making it about as expensive as a flight to Antarctica.

So cost to orbit should be a problem we can overcome with time; latency might be a different matter.
 
Reactions: Tlh97

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
I think AMD uses custom designs. I assume that this resulted in the N7 + N5 HPC variants. The 3xxx series moved to a higher-performance version of N7 before 5xxx used it.

Lisa only mentioned changes with cache (I don't think they will use less) and a more dense/energy-efficient process.

I'm more curious about the way that they will fit 12 memory channels in with 128 cores since it doesn't divide evenly. My only guess is creating a group of 32 cores connected to 3 channels on the I/O die. Or maybe that doesn't matter?
Why would the cores and memory channels have to be divisible like that? The IO die feeds all the cores on the CPU; this isn't the original Zen. At present you can, in theory, get anywhere from 1 to 64 cores with 8 memory channels.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
In the future they'll put the data centers in space. Good access to cheap solar power and no worries about cooling. That and no one can hear the poor techs that have to manage it scream.
I get the joke, but cooling in space is only by radiation. It ain't easy.

edit: Oops, I see Doug S has it covered.
 
Last edited:
Reactions: Tlh97 and Mopetar

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
For a while I have been asking myself whether anyone harvests that thermal energy and converts it back into electricity. I mean, aside from doing the processing, a chip basically wastes 99% of its energy input as heat. Even if they could harvest only 10% to 20% of that as electricity, it would be worthwhile. And some fluid for thermal transfer seems to be the obvious starting point.
Low temp heat, <5% efficiency might be the number.
 

tomatosummit

Member
Mar 21, 2019
184
177
116
Why would the cores and memory channels have to be divisible like that? The IO die feeds all the cores on the CPU; this isn't the original Zen. At present you can, in theory, get anywhere from 1 to 64 cores with 8 memory channels.
As it's still a Zen 3 derivative, my bet is that the IO die is laid out in a similar fashion again, that is, in quads. Each quarter of the IO die is connected to one quarter of the CCDs, the memory controllers and maybe the PCIe root complex.
So in theory it's sort of similar to Zen 1, but the latency for hopping across the data fabric to the other side of the IO die is minimal now.
Bergamo is likely done the same way, just with each quad of the IO die having three memory channels and two 16-core CCDs connected directly to it.

I secretly hoped the Raphael IO die would have three memory channels because of this.
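
Something like this, as a toy sketch (the quadrant grouping and the CCD counts are pure speculation on my part, not a known floorplan):

# Toy model of the speculated quadrant layout: 12 memory channels and the
# CCDs split evenly across 4 IO-die quadrants. Speculation, not a documented design.
MEM_CHANNELS = 12
QUADS = 4

def quad_layout(ccd_count: int) -> dict:
    """Split memory channels and CCDs evenly across the IO-die quadrants."""
    return {
        "channels_per_quad": MEM_CHANNELS // QUADS,   # 3
        "ccds_per_quad": ccd_count // QUADS,
    }

print("Genoa-ish (12x 8-core CCDs):  ", quad_layout(12))   # 3 channels, 3 CCDs per quad
print("Bergamo-ish (8x 16-core CCDs):", quad_layout(8))    # 3 channels, 2 CCDs per quad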
 

Saylick

Diamond Member
Sep 10, 2012
3,372
7,104
136
Bergamo is likely done the same way, just with each quad of the IO die having three memory channels and two 16-core CCDs connected directly to it.
Yeah, the number of cores and memory channels not having a common denominator is weird, but I imagine it's intentionally decoupled anyway, since there's a separate IO die that can accommodate various numbers of chiplets/core counts.

I secretly hoped the Raphael IO die would have three memory channels because of this.
Former Nehalem users when they read that:
 

moinmoin

Diamond Member
Jun 1, 2017
4,993
7,763
136
Was watching Gabe Loh's talk:

It's mostly just a recap of the decisions behind going with chiplets (nothing new really), but what stood out to me regarding our current discussion was this particular slide:


Same numbers as are used now for 5nm! And it's an "or", which looks to me like an internal target to hit before picking up a new node for HVM.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,682
6,197
136
Was watching Gabe Loh's talk:

It's mostly just a recap of the decisions behind going with chiplets (nothing new really), but what stood out to me regarding our current discussion was this particular slide:

Same numbers as are used now for 5nm! And it's an "or", which looks to me like an internal target to hit before picking up a new node for HVM.


Papermaster did speak about that almost 3 years ago during the EPYC Rome launch. (damn... time went by fast)
 

leoneazzurro

Golden Member
Jul 26, 2016
1,003
1,593
136
I think we may expect a performance jump similar to what Zen 2 was to the original Zen, as this will be nearly the same situation (a "refinement" of a new architecture combined with the jump to a new node). Zen to Zen 2 was IIRC a +15% IPC gain, and then the new process allowed for better frequencies. Here, if we really see a jump from 4.4-4.5 GHz all-core to 5 GHz all-core (and we don't know if that's the limit), then, if the IPC increase is the same as Zen to Zen 2, a 25% performance uplift for Zen 4 should definitely be achievable.
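
Quick sanity check on that arithmetic (the IPC and clock figures are just the guesses above, not confirmed specs):

# Combined uplift from the guessed IPC gain and clock bump.
ipc_gain  = 1.15        # Zen -> Zen 2 style IPC increase, assumed for Zen 4
clock_old = 4.45        # ~4.4-4.5 GHz all-core today
clock_new = 5.0         # speculated 5 GHz all-core

uplift = ipc_gain * (clock_new / clock_old)
print(f"~{(uplift - 1) * 100:.0f}% combined uplift")   # ~29%, so 25% looks conservative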
 

DisEnchantment

Golden Member
Mar 3, 2017
1,682
6,197
136
I can't imagine that the 'optimized' N5 allows for much deviation in the actual electrostatic parameters of the N5 transistor. In the past, high-speed logic traded off transistor size for performance by adding an extra fin per transistor; I'm not sure that is possible at N5, which would mean tweaks to the N5 recipe would be limited. There are other process components, as you've mentioned before, such as changes to the metal layers, which can be considered an 'optimization' of a different sort. Makes me think the optimization may have more to do with DTCO than anything else. I wish @Idontcare was still on this board to get a better sense of the available process options.
Difficult to say much about those without running simulations for these cutting-edge processes/SoCs, and I bet nobody is going to share such things, for obvious reasons.
Historically you'd estimate dynamic power as P ∝ C·V²·f. But f is itself a nonlinear function of V: reducing V sounds great, except you then have to raise I to drive a higher switching frequency, which leads to higher I²R losses.
So efficiency at the device level is not going to show the complete picture.
Optimization of the metal layers and the power-delivery network is more and more critical, and accurate parasitic extraction would be hell on these processes.
At the SoC level, things like interconnect efficiency play a big role in compute efficiency because of the energy used in data movement.
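
A toy model of that point, if it helps (the f ∝ (V - Vt) relation and every constant in it are illustrative assumptions, not process data):

# Toy dynamic-power model: P ~ C * V^2 * f, with f itself a (rough, assumed)
# function of V, so power grows much faster than frequency.
C_EFF = 1.0      # normalized effective switched capacitance
V_T   = 0.35     # assumed threshold voltage, V

def freq(v: float) -> float:
    """Crude assumption: frequency scales with the overdrive voltage (V - Vt)."""
    return (v - V_T) / (1.0 - V_T)       # normalized so freq(1.0 V) == 1.0

for v in (0.8, 0.9, 1.0, 1.1):
    p = C_EFF * v**2 * freq(v)           # normalized dynamic power
    print(f"V = {v:.1f} V   f = {freq(v):.2f}   P = {p:.2f}")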
 