Speculation: Ryzen 4000 series/Zen 3


Shivansps

Diamond Member
Sep 11, 2013
3,872
1,527
136
Just saw this leak with Fire Strike scores of Renoir APUs.





OK, the DDR4-3200 limitation is clear here, but the graphics scores look a little low to me: 3576 for the 4200G at stock? Isn't that around the same as what the 3200G can do at stock?

 
Reactions: lightmanek

LightningZ71

Golden Member
Mar 10, 2017
1,652
1,938
136
Let’s see, same VEGA architecture? Check!
Similar ram bandwidth? Yes!
Fewer CUs at higher clocks? Yes!
Desktop 3000g series processors running at higher clocks than mobile ones already? Yes!

I don’t know why anyone is shocked.

The big difference will be overclocking. I suspect that Renoir on desktop will tolerate high DRAM clocks much better than Raven Ridge. The iGPU may also tolerate another 10% on the clocks as well. That should be noticeably better than the 3400G.
 

Shivansps

Diamond Member
Sep 11, 2013
3,872
1,527
136
Let’s see, same VEGA architecture? Check!
Similar ram bandwidth? Yes!
Fewer CUs at higher clocks? Yes!
Desktop 3000g series processors running at higher clocks than mobile ones already? Yes!

I don’t know why anyone is shocked.

The big difference will be overclocking. I suspect that Renoir on desktop will tolerate high DRAM clocks much better than Raven Ridge. The iGPU may also tolerate another 10% on the clocks as well. That should be noticeably better than the 3400G.

I don't think anybody is shocked; still, the results look a little low. The top 3200G/Vega 8 score on Fire Strike was achieved at 1780 MHz and DDR4-3466 with a 4650 graphics score (GS), compared to 4300 on the 4700G at 2100 MHz and DDR4-3200... clear RAM bottleneck there, but still.
 

LightningZ71

Golden Member
Mar 10, 2017
1,652
1,938
136
I don't think anybody is shocked; still, the results look a little low. The top 3200G/Vega 8 score on Fire Strike was achieved at 1780 MHz and DDR4-3466 with a 4650 graphics score (GS), compared to 4300 on the 4700G at 2100 MHz and DDR4-3200... clear RAM bottleneck there, but still.

Very reasonable question. I think that we've known that VEGA iGPUs are heavily ram bottlenecked, and a roughly 7-8% difference in RAM bandwidth giving a nearly matching 7-8% difference in the final score follows that well.
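
As a quick sanity check on the "7-8%" figure, here is a minimal sketch using only the numbers quoted above. It assumes bandwidth scales linearly with the DDR4 transfer rate (same channel count and width on both systems), which the thread does not confirm; the variable and function names are made up for illustration.

```python
# Figures quoted in the thread (Fire Strike graphics scores, GS).
top_3200g = {"ram_mts": 3466, "gs": 4650}   # 3200G, iGPU at 1780 MHz
leak_4700g = {"ram_mts": 3200, "gs": 4300}  # 4700G, iGPU at 2100 MHz


def percent_lower(smaller: float, larger: float) -> float:
    """Return how much lower `smaller` is than `larger`, in percent."""
    return (larger - smaller) / larger * 100.0


# Assumption: RAM bandwidth is proportional to the transfer rate.
ram_gap = percent_lower(leak_4700g["ram_mts"], top_3200g["ram_mts"])
gs_gap = percent_lower(leak_4700g["gs"], top_3200g["gs"])

print(f"RAM transfer rate gap: {ram_gap:.1f}%")  # prints 7.7%
print(f"Graphics score gap:    {gs_gap:.1f}%")   # prints 7.5%
```

Both gaps land between 7% and 8%, which is what you would expect if the score is tracking memory bandwidth rather than iGPU clock.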

What I'm hoping for is support for DDR4-4266 with reasonable latencies on current motherboards. I don't think that is too far-fetched, what with the chip's support for LPDDR4X-4266 in mobile configurations. Given how RAM bandwidth constrained VEGA8 is, if it can maintain 2+ GHz clocks with 4266 RAM (greater RAM bandwidth will result in greater GPU utilization, increasing heat output), we should see decent improvements in the scores.
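
To put a rough number on what DDR4-4266 would buy, here is a small sketch of theoretical peak bandwidth. It assumes a standard desktop dual-channel setup with 64-bit (8-byte) channels; these are textbook peaks, not measured figures, and real throughput will be lower. The function name is hypothetical.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit DDR4 channel
CHANNELS = 2            # assumption: dual-channel desktop AM4 board


def peak_bandwidth_gbs(mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given DDR4 data rate."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * CHANNELS / 1000.0


for rate in (3200, 3466, 4266):
    print(f"DDR4-{rate}: {peak_bandwidth_gbs(rate):.1f} GB/s peak")

uplift = peak_bandwidth_gbs(4266) / peak_bandwidth_gbs(3200) - 1.0
print(f"DDR4-4266 vs DDR4-3200: +{uplift * 100:.0f}% peak bandwidth")
```

Under those assumptions the jump from DDR4-3200 to DDR4-4266 is about a third more peak bandwidth, provided the memory controller and board can actually run it.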
 

DrMrLordX

Lifer
Apr 27, 2000
21,784
11,125
136
@LightningZ71

DDR4-4266 is already supported on x570 motherboards. I got DDR4-4400 running once, though the bandwidth numbers were not very good. When you go into async mode, it isn't just latency that suffers.
 
Reactions: mopardude87

Hans Gruber

Platinum Member
Dec 23, 2006
2,209
1,146
136
@LightningZ71

DDR4-4266 is already supported on x570 motherboards. I got DDR4-4400 running once, though the bandwidth numbers were not very good. When you go into async mode, it isn't just latency that suffers.
I think what people are talking about is over DDR4-4000 in coupled mode. That way the latency would not get out of control. With Zen 2, the fabric clock falls out of sync with the memory clock once you go past roughly DDR4-3733. Supposedly Zen 3 solves a lot of the problems we have seen in Zen 2 with the Infinity Fabric, so your DDR4-4400 may be stable with Zen 3.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,672
6,153
136



New super computer award for AMD,
128 Core EPYC Milan

The 128 Core does not seem to be a typo
1 Billion CPU Cores/Year [ 128 Cores * 1000 CPUs * 24 Hours * 365 Days = 1.12 Billion CPU core hours / year ]
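
The bracketed arithmetic can be reproduced in a couple of lines. This is only a sketch of the same multiplication, assuming every core is busy around the clock for a non-leap year; the function name is made up here.

```python
def core_hours_per_year(cores_per_cpu: int, cpus: int, days: int = 365) -> int:
    """Total core-hours per year if every core runs 24 hours a day."""
    return cores_per_cpu * cpus * 24 * days


total = core_hours_per_year(128, 1000)
print(f"{total:,} core-hours/year")            # prints 1,121,280,000
print(f"about {total / 1e9:.2f} billion")      # prints about 1.12 billion
```

With 64 cores per CPU the same formula gives roughly 0.56 billion, so the "1 billion" headline only works out with 128 cores (or threads) per counted unit.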

They could be SMT cores. I hope they are SMT cores.
Doubling core count again would imply AMD is still sticking with tiny cores which IMO needs to change. For future processes sure, it would be a great idea.
Unless ...
 

mopardude87

Diamond Member
Oct 22, 2018
3,348
1,575
96
@LightningZ71

DDR4-4266 is already supported on x570 motherboards. I got DDR4-4400 running once, though the bandwidth numbers were not very good. When you go into async mode, it isn't just latency that suffers.

Oh dayum, I am thinking of trying to amp up mine. I've got horrible timings as it is on this kit. I honestly have no idea what will benefit more, tighter timings on this kit or sheer speed LOL.
 

Shivansps

Diamond Member
Sep 11, 2013
3,872
1,527
136
@LightningZ71

DDR4-4266 is already supported on x570 motherboards. I got DDR4-4400 running once, though the bandwidth numbers were not very good. When you go into async mode, it isn't just latency that suffers.

That's why I've mentioned the possibility of Renoir achieving higher fabric speeds than Matisse. It kind of should, considering it is 7nm monolithic. Not sure if it's enough for 4266 without going async, but maybe 4000.
 

LightningZ71

Golden Member
Mar 10, 2017
1,652
1,938
136
Renoir doesn't couple DRAM speeds to IF speeds like Matisse does. I encourage you to go to your local search engine and look it up. It runs various internal ratios depending on the frequency spread, but it's always uncoupled.

My comment on achievable memory speeds wasn't just directed at X570 boards. I suspect that some AM4 boards are less capable of achieving high memory clocks than others. I also don't know how good the Renoir IMC will be on desktop AM4 sockets. There are a lot of variables at play here.

The only thing that's certain is that Renoir's VEGA will be highly memory bandwidth constrained at 2 GHz and above (and lower too), and every additional MHz of RAM clock will be valuable.
 

moinmoin

Diamond Member
Jun 1, 2017
4,993
7,763
136


New super computer award for AMD,
128 Core EPYC Milan

The 128 Core does not seem to be a typo
1 Billion CPU Cores/Year [ 128 Cores * 1000 CPUs * 24 Hours * 365 Days = 1.12 Billion CPU core hours / year ]

They could be SMT cores. I hope they are SMT cores.
Doubling core count again would imply AMD is still sticking with tiny cores which IMO needs to change. For future processes sure, it would be a great idea.
Unless ...
Personally wouldn't expect another doubling of cores with Milan, only with Genoa. Maybe they actually mean threads instead of cores? If not, that's one hell of an accidental leak.
Also Dell of all companies doing a supercomputer with AMD hardware?
 

Hitman928

Diamond Member
Apr 15, 2012
5,562
8,692
136
Personally wouldn't expect another doubling of cores with Milan, only with Genoa. Maybe they actually mean threads instead of cores? If not, that's one hell of an accidental leak.
Also Dell of all companies doing a supercomputer with AMD hardware?

Yeah, pretty sure it's either threads or they are using dual socket blades and that's where the confusion is coming in.
 
Reactions: ksec

DrMrLordX

Lifer
Apr 27, 2000
21,784
11,125
136
I think what people are talking about is over DDR4-4000 in coupled mode. That way the latency would not get out of control. With Zen 2, the fabric clock falls out of sync with the memory clock once you go past roughly DDR4-3733.

In those cases, it isn't the motherboard that's the bottleneck.
 

Shivansps

Diamond Member
Sep 11, 2013
3,872
1,527
136
Renoir doesn't couple DRAM speeds to IF speeds like Matisse does. I encourage you to go to your local search engine and look it up. It runs various internal ratios depending on the frequency spread, but it's always uncoupled.

My comment on achievable memory speeds wasn't just directed at X570 boards. I suspect that some AM4 boards are less capable of achieving high memory clocks than others. I also don't know how good the Renoir IMC will be on desktop AM4 sockets. There are a lot of variables at play here.

The only thing that's certain is that Renoir's VEGA will be highly memory bandwidth constrained at 2 GHz and above (and lower too), and every additional MHz of RAM clock will be valuable.

That may be a power-saving feature. In the end, what matters is having higher fabric speeds than Matisse; whether it is coupled or not is not that important.
I need to look that up, but I'm not that interested in the mobile versions.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Yeah, pretty sure it's either threads or they are using dual socket blades and that's where the confusion is coming in.

Hmm, maybe not.

1,000 128-core AMD Epyc “Milan” processors
peak performance of 5.3 petaflops.

At 3GHz, a 64 core Rome CPU delivers 3072GFlops, or 3TFlops. 1000 Milan CPUs offering 5.3 PFlops gives us three options:

1. 5.3GHz 64 core
2. 128 core 2.6GHz
3. 64 core 2.6GHz with full-fledged AVX-512 (that is, 2x 512-bit vector units per core)

1 is out of the question. So that leaves us with 128 cores, or double FP performance. Unless AMD is going 5nm, I don't really see them going with that many cores. And even if it's 5nm, you won't double perf/watt, and they'll use some of the efficiency gains on the wider uarch anyway.

The most logical choice seems to be a 64 core part with Zen 3 supporting 2x AVX512 capable units.
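
The three options above fall out of a simple peak-FLOPS formula. The sketch below assumes a Zen 2 core peaks at 16 double-precision FLOPs per cycle (two 256-bit FMA units), which is what makes the quoted 3072 GFlops figure for a 64-core Rome at 3GHz work out; the post itself does not state the per-cycle number, and the function names are made up for illustration.

```python
# Assumption: 2 x 256-bit FMA units x 4 doubles x 2 ops per FMA = 16 FLOPs/cycle.
ZEN2_FLOPS_PER_CYCLE = 16
# Assumption for option 3: doubling vector width doubles FLOPs per cycle.
WIDE_FLOPS_PER_CYCLE = 32


def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak double-precision GFLOPS for one CPU."""
    return cores * ghz * flops_per_cycle


def required_ghz(target_gflops: float, cores: int, flops_per_cycle: int) -> float:
    """Clock needed to hit a target peak with a given core count and width."""
    return target_gflops / (cores * flops_per_cycle)


rome = peak_gflops(64, 3.0, ZEN2_FLOPS_PER_CYCLE)
per_cpu_target = 5.3e6 / 1000  # 5.3 PFlops spread over 1,000 CPUs, in GFLOPS

options = {
    "1. 64 cores, Zen 2 width": (64, ZEN2_FLOPS_PER_CYCLE),
    "2. 128 cores, Zen 2 width": (128, ZEN2_FLOPS_PER_CYCLE),
    "3. 64 cores, doubled width": (64, WIDE_FLOPS_PER_CYCLE),
}

print(f"64-core Rome at 3GHz: {rome:.0f} GFLOPS")
print(f"Target per Milan CPU: {per_cpu_target:.0f} GFLOPS")
for name, (cores, width) in options.items():
    ghz = required_ghz(per_cpu_target, cores, width)
    print(f"{name}: needs about {ghz:.2f} GHz")
```

Under these assumptions option 1 needs a bit over 5 GHz across 64 cores, while options 2 and 3 both land near 2.6 GHz, matching the list above.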

Oh, that was a mistake, they meant 128 cores per node, not per socket.

Or that lol.
 

DrMrLordX

Lifer
Apr 27, 2000
21,784
11,125
136
In the end, what matters is having higher fabric speeds than Matisse; whether it is coupled or not is not that important.

Unless AMD has "fixed" the performance problems that crop up when you're in async mode, having synced IF/RAM is much more important than IF speed.

edit: I want to amend this slightly with some data from my own system.

RAM 1833/IF 1833:
Read: 57994 MB/s
Latency: 65.4ns

RAM 1833/IF 1800 (async):
Read: ~57400 MB/s
Latency: 74.8ns

So clearly, in terms of bandwidth, async alone isn't doing that much to hurt performance (though it stings a bit). Latency is another story.
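
For anyone who wants the relative numbers, here is a minimal sketch that turns the two measurements above into percentages. The async read figure is approximate ("~57400"), so treat the bandwidth result as a rough estimate; the variable names are made up for illustration.

```python
# Measurements quoted above (read bandwidth in MB/s, latency in ns).
synced = {"read_mbs": 57994, "latency_ns": 65.4}   # RAM 1833 / IF 1833
async_ = {"read_mbs": 57400, "latency_ns": 74.8}   # RAM 1833 / IF 1800, read is approximate

read_loss_mbs = synced["read_mbs"] - async_["read_mbs"]
read_loss_pct = read_loss_mbs / synced["read_mbs"] * 100.0
latency_gain_ns = async_["latency_ns"] - synced["latency_ns"]
latency_gain_pct = latency_gain_ns / synced["latency_ns"] * 100.0

print(f"Read bandwidth: -{read_loss_mbs} MB/s ({read_loss_pct:.1f}% lower)")
print(f"Latency: +{latency_gain_ns:.1f} ns ({latency_gain_pct:.1f}% higher)")
```

That works out to roughly 1% less read bandwidth against roughly 14% more latency, which is why latency is the bigger concern in async mode.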
 

Shivansps

Diamond Member
Sep 11, 2013
3,872
1,527
136
Unless AMD has "fixed" the performance problems that crop up when you're in async mode, having synced IF/RAM is much more important than IF speed.

edit: I want to amend this slightly with some data from my own system.

RAM 1833/IF 1833:
Read: 57994 MB/s
Latency: 65.4ns

RAM 1833/IF 1800 (async):
Read: ~57400 MB/s
Latency: 74.8ns

So clearly, in terms of bandwidth, async alone isn't doing that much to hurt performance (though it stings a bit). Latency is another story.

For IGP gaming, latency is secondary to bandwidth. Memory timings that have a large impact on bandwidth matter more than the ones that affect latency, so the same will apply to coupled vs. async. But the roughly 600 MB/s lost there is in fact significant for an APU; for a CPU alone I don't think so. Considering how well those IGPs perform compared to an RX 550 with 128-bit GDDR5, I don't think a CPU core uses much bandwidth while gaming. I'm not sure if this was ever tested; it would also change depending on the game, and it would explain a lot.
 

jamescox

Senior member
Nov 11, 2009
640
1,104
136
I agree. It seems like a fairly substantial jump.

But here's my theory
  • The IOD for Zen3 could possibly be fabbed by TSMC going forward. This could explain the jump in needed wafers because the IOD is not small.
    • Assuming worse density scaling than the CCDs because of the PHYs and IOs it will be bigger than the CCDs (80-100mm2 DT and 250-280mm2 EPYC)
    • This is bound to happen at some point. I am not sure GF can make the super dense micro bumps needed in the future for die stacking.
  • The cores and cache grew in size for Zen3
    • Zen --> Zen2 core saw a growth of ~36% in transistor count including L3 cache. Just the core with the L2 it grew 17%
    • An increase in the CCD die size of 15% lets say means AMD would need 15% more wafers even keeping demand at same pace. And yield would drop even more.
  • Additional wafer allocation for Q4 could be needed to cover additional products for Cezanne/Mobile APU. We know AMD has to be in time for OEM refresh otherwise they will miss the bus.
I don’t know if the chiplet die size will increase much at all over Zen 2. Zen 2 already has 32 MB cache per CCD (2 x 16 MB CCX) and the same number of cores. In Zen 3, it will all be unified into a single 8-core CCX, but it is still the same number of cores and L3 cache. It may have a larger L2, and the new architecture will take more transistors. Floating point hardware takes a lot of die area and there is a good chance Zen 3 has significantly increased FP power. I don’t know if they will go up to a full 4 AVX256 units. Some of the transistor count increase may be offset by a denser process. Any die size increase probably will not be due to L3 cache though, since it is actually the same amount per die. A larger-cache product may exist in some form. That may be what the specialized supercomputer chips are. They could also possibly do something like Intel does (differing number of AVX512 units) and have a chip with more FP units. That gets complicated due to scheduler ports though. Initial Zen 1 was one die to do everything, but AMD has a bit more money now, so they can afford to do more die variants to better cover the market.

I haven’t read this whole thread, so some stuff may have already been mentioned or debunked. I have been thinking that Zen 3, being a completely new architecture, will actually be very conservative in the initial release and use almost the same IO as Zen 2. Then Zen 4 will just be a shrink and/or slightly tweaked version of Zen 3, but with a completely new IO die (PCIe 5, DDR5, etc). It may make sense for the Zen 4 EPYC IO die to be an interposer or just made of multiple chips. If they want to add L4 cache to the EPYC IO die for a more unified last level cache, then they would want to make them on a leading edge process for maximum density. It would also make sense to have them split into separate chips if they have a lot of cache. Things get a bit crazy with an interposer or multi-chip IO die since there is a large number of possibilities.

I expected that the initial Zen 3 launch would be mostly EPYC and a small number of high end desktop parts. The Zen 2 XT parts are a bit confusing though. Are they going to release R9-4900 and R9-4950 in 6 months? I guess they may have just been getting high binning parts, so they decided to release some faster variants to look better against Intel and sell off some Zen 2 stock before Zen 3.

They could be working on a 5 nm APU based on Zen 3. If they have an 8 core single CCX Zen 3 APU, then would there be a reason to sell a single Zen 3 CCD + IO die for the low end desktop market? I guess I could see them making a 2 die APU also, with cpu + IO on one die and a small GPU die. That would allow for maximum flexibility, especially if they make a GPU die with an HBM stack.
 
Reactions: Gideon