Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)

Page 494

Vattila

Senior member
Oct 22, 2004
805
1,394
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!
 
Reactions: richardllewis_01

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,228
1,666
136
Techspot RAM Speed review

Interesting to see the 7950X3D looks to be quite a bit more RAM sensitive than the 5800X3D. I wonder if the V$ CCD is just more sensitive to RAM than last gen, or if it has to do with the other CCD being used sometimes.
 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,169
106
It's beating a 7200 RAM 13900K...
That... has nothing to do with it?
But idk if the 7800x3d even shows much more RAM sensitivity compared to the 5800X3D.
The highest zen 3 supports is 3200, and going to 3800 only gets you a gain of like 2%. Similarly, the highest ram speed supported for zen 4 is 5200, and going to 6000 (what AMD calls the sweet spot) gets you like 3%.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
The highest zen 3 supports is 3200, and going to 3800 only gets you a gain of like 2%. Similarly, the highest ram speed supported for zen 4 is 5200, and going to 6000 (what AMD calls the sweet spot) gets you like 3%.

I don't know what benchmarks you're referring to, but the recent HUB testing shows that AMD gets pretty significant performance gains in most games from tuning the memory. Even well-tuned 5200 will beat poorly configured faster memory, but it's not reasonable to claim that Zen isn't sensitive to memory settings.
 
Reactions: Thunder 57

Geddagod

Golden Member
Dec 28, 2021
1,205
1,169
106
I don't know what benchmarks you're referring to, but the recent HUB testing shows that AMD gets pretty significant performance gains in most games from tuning the memory. Even well-tuned 5200 will beat poorly configured faster memory, but it's not reasonable to claim that Zen isn't sensitive to memory settings.
Also HWUB, I think? Techspot is HWUB's website right?
I'm referring to 5800x3d review, they tested both 3200 and 3800 ram speeds in their OG review. Like 2%.
The other review I'm talking about is their recent 7950X3D memory scaling article. It shows a 3 or 4%, I forget how much exactly, going from ddr5 5200 to ddr5 6000.
I'm specifically referring to the 3D parts btw, not base Zen 4 or base Zen 3. So 5800x3D and 7950X3D, not 5800x and 7950x.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,738
14,771
136
Also HWUB, I think? Techspot is HWUB's website right?
I'm referring to 5800x3d review, they tested both 3200 and 3800 ram speeds in their OG review. Like 2%.
The other review I'm talking about is their recent 7950X3D memory scaling article. It shows a 3 or 4%, I forget how much exactly, going from ddr5 5200 to ddr5 6000.
I'm specifically referring to the 3D parts btw, not base Zen 4 or base Zen 3. So 5800x3D and 7950X3D, not 5800x and 7950x.
And that benchmarking would involve games only ??? If so, no doubt on 3d parts. What about CB23 ??
 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,169
106
And that benchmarking would involve games only ??? If so, no doubt on 3d parts. What about CB23 ??
Ye, the figures I cited were the average of the gaming tests. CB23 wasn't really reliant on mem speeds regardless of the CPU having 3D cache or not, I thought?
Overall though I don't think Zen 4 3D is any more reliant on mem speeds than Zen 3 3D is, at least for gaming tasks.
 

lightmanek

Senior member
Feb 19, 2017
401
805
136
Also HWUB, I think? Techspot is HWUB's website right?
I'm referring to 5800x3d review, they tested both 3200 and 3800 ram speeds in their OG review. Like 2%.
The other review I'm talking about is their recent 7950X3D memory scaling article. It shows a 3 or 4%, I forget how much exactly, going from ddr5 5200 to ddr5 6000.
I'm specifically referring to the 3D parts btw, not base Zen 4 or base Zen 3. So 5800x3D and 7950X3D, not 5800x and 7950x.

I think we are concentrating too much on memory speed and forgetting absolute latency. It would be better to compare memory scaling between Zen 3 and Zen 4 3D models in two ways: keeping absolute latency constant while increasing bandwidth, and decreasing latency by keeping relative timings static while varying memory speed.

I suspect Zen 4 X3D will scale better with lower latency than with increased bandwidth, especially in game workloads.
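To make the two test sweeps concrete, here is a minimal sketch using the standard first-word CAS latency relation (CL cycles times the clock period, where the clock runs at half the data rate). The 10 ns target, the CL30 setting and the list of speeds are hypothetical values picked for illustration, and CAS latency is only one component of the total memory latency a CPU actually sees.

```python
def cas_latency_ns(cl: int, data_rate_mts: float) -> float:
    """First-word CAS latency in nanoseconds.

    DDR transfers twice per clock, so one clock cycle lasts
    2000 / data_rate_mts nanoseconds.
    """
    return cl * 2000.0 / data_rate_mts


def cl_for_latency(target_ns: float, data_rate_mts: float) -> float:
    """CAS cycles needed at a given data rate to hit a target latency."""
    return target_ns * data_rate_mts / 2000.0


# Hypothetical DDR5 speeds to sweep (MT/s).
RATES = (4800, 5200, 5600, 6000)

# Sweep 1: hold absolute latency at an assumed 10 ns while bandwidth rises.
# CL must be a whole number of cycles, so round to the nearest one.
constant_latency = {rate: round(cl_for_latency(10.0, rate)) for rate in RATES}

# Sweep 2: hold the timing fixed at an assumed CL30 and vary the speed,
# so absolute latency falls as the data rate goes up.
constant_timings = {rate: cas_latency_ns(30, rate) for rate in RATES}

if __name__ == "__main__":
    for rate in RATES:
        print(f"DDR5-{rate}: CL{constant_latency[rate]} keeps ~10 ns; "
              f"CL30 gives {constant_timings[rate]:.2f} ns")
```

Sweep 1 isolates the bandwidth effect, sweep 2 mixes bandwidth with falling latency, and the difference between the two would hint at which one the X3D parts respond to.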
 
Reactions: Tlh97 and Geddagod

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,738
14,771
136
Ye, the figures I cited were the average of the gaming tests. CB23 wasn't really reliant on mem speeds regardless of the CPU having 3D cache or not, I thought?
Overall though I don't think Zen 4 3D is any more reliant on mem speeds than Zen 3 3D is, at least for gaming tasks.
My point is that gaming benefits from the big cache on 16 threads or less, whereas CB23 uses all 32 threads, and I bet memory speed helps the cores that are not on the big cache. So with all 32 threads in use, memory speed should make more of a difference. Not so in games, where the 3D cache makes memory speed almost irrelevant.

Note the above is my deduction, not proved (yet) in benchmarks.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
I'm assuming cinebench is pretty tightly coded to work best with prevailing cache sizes. I don't think cache is a big deal beyond the minimum it expects and that most CPUs have. Maybe it's possible to leverage additional cache for more performance, but until that becomes standard, I don't think there's a lot of pressure or reason to pursue those avenues. Zen-3D is a very early adopter of something that I think will eventually become widely adopted, but most software isn't being built with it in mind. The big performance wins are happy coincidence, but there're a lot of products that could be modified to leverage that additional cache.
 
Reactions: ZGR and lightmanek

Joe NYC

Platinum Member
Jun 26, 2021
2,323
2,929
106
Nice new details. Primarily on V-Cache, but also a lot of info on process nodes, die sizes and transistor density.

On V-Cache, it has incredible transistor density, even though it stayed on N7. Die size shrunk from 41 mm2 to 36 mm2, number of transistors remained the same but TSVs are more efficiently and densely placed.

Only signal TSVs remain in the L3 area; power TSVs are now segregated and moved over to the L2 area. So the V-Cache now covers L3 and partially L2 as well.

There is also some info on IO dies for both server and client. Graphics is taking quite a bit of space, but that's space on N6 that does not have to use N5.

The V-Cache bandwidth increased from 2 TB/s to 2.5 TB/s and the overhead to access V-Cache is 4 clocks - which is not bad for L3. Also, AFAIK, this overhead is only applicable if the data is coming from V-Cache portion of L3, not all of L3 (from comments going back to 5800x3d).

 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,169
106
Nice new details. Primarily on V-Cache, but a lot of info on process nodes, die sizes transistor density.

On V-Cache, it has incredible transistor density, even though it stayed on N7. Die size shrunk from 41 mm2 to 36 mm2, number of transistors remained the same but TSVs are more efficiently and densely placed.

Only signal TSVs in the L3 area and power TSVs now segregated and moved over to the L2 area. So the V-Cache now covers L3 and partially L2 as well.

There is also some info on IO dies for both server and client. Graphics is taking quite a bit of space, but that's space on N6 that does not have to use N5.

The V-Cache bandwidth increased from 2 TB/s to 2.5 TB/s and the overhead to access V-Cache is 4 clocks - which is not bad for L3. Also, AFAIK, this overhead is only applicable if the data is coming from V-Cache portion of L3, not all of L3 (from comments going back to 5800x3d).

I think the most impressive part from that article is that Zen 4 uses 5nm HD cells - 6T is HD, right? I thought HD cells can't scale up to the voltages Zen 4 uses...
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
Nice new details. Primarily on V-Cache, but a lot of info on process nodes, die sizes transistor density.

On V-Cache, it has incredible transistor density, even though it stayed on N7. Die size shrunk from 41 mm2 to 36 mm2, number of transistors remained the same but TSVs are more efficiently and densely placed.

Only signal TSVs in the L3 area and power TSVs now segregated and moved over to the L2 area. So the V-Cache now covers L3 and partially L2 as well.

There is also some info on IO dies for both server and client. Graphics is taking quite a bit of space, but that's space on N6 that does not have to use N5.

The V-Cache bandwidth increased from 2 TB/s to 2.5 TB/s and the overhead to access V-Cache is 4 clocks - which is not bad for L3. Also, AFAIK, this overhead is only applicable if the data is coming from V-Cache portion of L3, not all of L3 (from comments going back to 5800x3d).

Finally, confirmation that the cache die for Zen 4 is not exactly the same one as for Zen 3.
I'm positively surprised and impressed that they achieved 7nm-on-5nm SoIC. TSMC never stated this to be a possible variant (much to my surprise back then in 2021 or so).

 

Kaluan

Senior member
Jan 4, 2022
503
1,074
106
Interesting to see the 7950X3D looks to be quite a bit more RAM sensitive than the 5800X3D. I wonder if the V$ CCD is just more sensitive to RAM than last gen, or if it has to do with the other CCD being used sometimes.
That... has nothing to do with it?
But idk if the 7800x3d even shows much more RAM sensitivity compared to the 5800X3D.
The highest zen 3 supports is 3200, and going to 3800 only gets you a gain of like 2%. Similarly, the highest ram speed supported for zen 4 is 5200, and going to 6000 (what AMD calls the sweet spot) gets you like 3%.

My biggest gripe is that they didn't test 1R vs 2R. That's where the 5800X3D saw the most consistent gains.

Followed by "no Infinity Fabric/RAM scaling testing"

But it's still somewhat useful data.
 

Timorous

Golden Member
Oct 27, 2008
1,727
3,152
136
Nice new details. Primarily on V-Cache, but a lot of info on process nodes, die sizes transistor density.

On V-Cache, it has incredible transistor density, even though it stayed on N7. Die size shrunk from 41 mm2 to 36 mm2, number of transistors remained the same but TSVs are more efficiently and densely placed.

Only signal TSVs in the L3 area and power TSVs now segregated and moved over to the L2 area. So the V-Cache now covers L3 and partially L2 as well.

There is also some info on IO dies for both server and client. Graphics is taking quite a bit of space, but that's space on N6 that does not have to use N5.

The V-Cache bandwidth increased from 2 TB/s to 2.5 TB/s and the overhead to access V-Cache is 4 clocks - which is not bad for L3. Also, AFAIK, this overhead is only applicable if the data is coming from V-Cache portion of L3, not all of L3 (from comments going back to 5800x3d).


Interesting.

Z4 CCD is 99M xtors per mm², but if you exclude the IO and focus just on the cores and the cache, the density is around 118M xtors per mm².

If Z4c is 16c with 32MB L3 cache and probably 2 of those IO blocks, then it will need 10.8B transistors (~4.15Bx2 for 16 cores + 2.35B for 32MB L3 + 0.7B*2 for the CCX IO), which, to fit in, say, a Z3-sized CCD, would need a density of around 135M transistors per mm². Given that we know logic scales better than cache on N5, and that the cache amount is staying fixed while the core count is rising, could AMD manufacture it on the current N5 node they already use? If 8c + 32MB L3 is 118M xtors per mm², then it is logical to think 16c + 32MB L3 would be denser just due to the change in the ratio of logic to cache.
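For anyone who wants to replay that arithmetic, here is a rough sketch. The per-block transistor counts are the estimates quoted above. The die area is not given in the post, so the 80 mm² used here is only back-calculated from the 10.8B and 135M per mm² figures and should be treated as an assumption. Note that the itemized terms as written add up to 12.05B rather than 10.8B, so one of the inputs may differ from what was intended.

```python
# Estimates quoted in the post, in billions of transistors.
CORES_8C_B = 4.15   # one block of 8 cores
L3_32MB_B = 2.35    # 32 MB of L3
CCX_IO_B = 0.70     # one CCX IO block


def density_m_per_mm2(transistors_b: float, area_mm2: float) -> float:
    """Transistor density in millions per mm^2."""
    return transistors_b * 1000.0 / area_mm2


# Sum of the itemized terms: 16 cores + 32 MB L3 + 2 IO blocks.
itemized_total_b = 2 * CORES_8C_B + L3_32MB_B + 2 * CCX_IO_B

# Total and density as stated in the post.
stated_total_b = 10.8
stated_density = 135.0

# Assumption: the die area implied by the two stated figures (about 80 mm^2).
implied_area_mm2 = stated_total_b * 1000.0 / stated_density

if __name__ == "__main__":
    print(f"itemized total: {itemized_total_b:.2f}B")
    print(f"implied area:   {implied_area_mm2:.1f} mm^2")
    print(f"density needed for itemized total: "
          f"{density_m_per_mm2(itemized_total_b, implied_area_mm2):.1f}M per mm^2")
```

Either way the required density lands above the 118M per mm² measured for cores plus cache on the current CCD, which is what the question about the shifting logic-to-cache ratio hinges on.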
 

leoneazzurro

Golden Member
Jul 26, 2016
1,005
1,598
136
Reactions: Tlh97 and Kaluan