Speculation: Ryzen 4000 series/Zen 3

Page 163

Lennox0010

Junior Member
Aug 26, 2020
2
1
36
I thought the current amendment was 7 which runs until 2024?

“Today, the seventh amendment of the WSA spans on January 2019 through March 2024,” Devinder Kumar, AMD CFO, says during the earnings call (via Seeking Alpha). “It establishes purchase commitments and pricing at 12-nanometer and above for the years 2019 through 2021. The amendment also provides AMD full sourcing flexibility at 7-nanometer and beyond without any one-time payments or royalties for products, purchase from other foundries.”


Sorry, I misread it; the purchase commitments span through 2021. But yeah, they do so much volume they may not need another amendment.
 

Veradun

Senior member
Jul 29, 2016
564
780
136
It is also uncertain whether new GF processes will be used at all for the next-gen gauntlet/motherchip/IOdie, which would make any WSA constraint irrelevant.

(This would avoid putting even more pressure on TSMC capacity.)
 

Kedas

Senior member
Dec 6, 2018
355
339
136
All we know is that Zen3 will have shared L3 over 8 cores instead of 4.
And we assume some edits for improvements.

So is Zen3 a new design or just a L3 change of Zen2 with some small changes in the core?
 

coercitiv

Diamond Member
Jan 24, 2014
6,393
12,823
136
So is Zen3 a new design or just a L3 change of Zen2 with some small changes in the core?
New design according to AMD’s Forest Norrod.
When asked about what kind of performance gain Milan's CPU core microarchitecture, which is known as Zen 3, will deliver relative to the Zen 2 microarchitecture that Rome relies on in terms of instructions processed per CPU clock cycle (IPC), Norrod observed that -- unlike Zen 2, which was more of an evolution of the Zen microarchitecture that powers first-gen Epyc CPUs -- Zen 3 will be based on a completely new architecture.

Norrod did qualify his remarks by pointing out that Zen 2 delivered a bigger IPC gain than what's normal for an evolutionary upgrade -- AMD has said it's about 15% on average -- since it implemented some ideas that AMD originally had for Zen but had to leave on the cutting board. However, he also asserted that Zen 3 will deliver performance gains "right in line with what you would expect from an entirely new architecture."
They are emulating the old tick-tock strategy, meaning we can expect evolutionary arch steps on a mature process. The Zen 2 IPC jump over Zen does not fit this model, but was explained as an exception since the original Zen was constrained in such a way that they had to leave some improvements for later.
He also indicated that AMD's server CPU launches are set to rely on the "tick-tock" cadence that was once the hallmark of Intel CPU launches, with the launch of a CPU platform that relies on a new manufacturing process node but the same microarchitecture as the last platform (the "tick") followed by a platform that relies on a new microarchitecture but the same manufacturing process node (the "tock").
The situation with AMD is still fluid though, they may yet have chosen a more aggressive path if they had enough faith in TSMC.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
If Zen3 is the new architecture and not Zen4, shouldn't AVX512 be present in Zen3? (unless they only plan to add it in Zen5)
 

inf64

Diamond Member
Mar 11, 2011
3,764
4,222
136
If Zen3 is the new architecture and not Zen4, shouldn't AVX512 be present in Zen3? (unless they only plan to add it in Zen5)
Zen1 was a new uarchitecture and it did not have native 256-bit L/S and AVX units. Zen3 is in the same boat: they will move to wider L/S and AVX units once they are on a smaller node, just like with Zen2. Zen3->Zen4 is going to be a very similar jump to Zen1(+)->Zen2.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
That could make sense, I thought AVX512 would be much harder than making some paths wider.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
That could make sense, I thought AVX512 would be much harder than making some paths wider.
It's the exact opposite. Balancing the overall design to prevent bottlenecks anywhere is way harder than increasing the performance in one isolated area without consideration to everything else.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
But then you mean Zen4 will be a bigger design update than Zen3, circling back to my first point: Zen3 is not a big design change.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
Why offer AVX512 on Ryzen when you could offer it on Threadripper? What conceivable reason would there be to offer it on regular mainstream lines?
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,688
1,222
136
The big design is Zen5. Family 19h to Family 2xh has equal or greater ST performance enhancement to Family 15h to Family 17h.

Basically, the cores team on Zen/Zen2 was moved to Zen5 on "3nm", while another cores team dealt with Zen3/Zen4 on 5nm.

However, there is a bit of RevA/RevB going on.
Zen3 on 7nm(<== Husked architecture) != Zen3 on 5nm(<== Full new architecture)

7nm Zen3 => Bad, Zen2 + some features with lots of features scrapped.
5nm Zen3 => Good, new architecture w/o features planned for early Zen3. (Fresh design w/o legacy)

48W Zen2/Zen3 7nm equal performance as 32W Zen3 5nm and <25W Zen4. With Zen5 going for >1.5x performance at same power as last 19h architecture aka Zen4.

AMD has a couple of issues with staying on TSMC 7nm now, while new ARM and RISC-V cores are prioritizing launching on TSMC 5nm now to slightly later.

All these AMD roadmaps lead to a bleak picture where N2/N3 and U87/___(X1 response) in HPC SoCs will be preferred before Milan/Trento and Genoa/Torino.
They have SVE and RVV, while AMD killed off their own FPU instruction set team after XOP.

OT: There is hidden text...

If you highlight the above green area: +40% perf SVE 2x256
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
But then you mean Zen4 will be a bigger design update than Zen3, circling back to my first point: Zen3 is not a big design change.
Who is "you"? Please use quotes if you are talking with specific people.

In general Zen 3 is going to have a new core design. From the initial Zen launch until now the Zen cores didn't see any fundamental changes, just tweaks and iterative ones. On a technical level (e.g. OS detection) this makes Zen 3 the first cores from a CPU family different from all previous Zen cores.
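To make the "OS detection" point concrete, here is a minimal sketch of how software typically derives the CPU family from the raw CPUID leaf 1 EAX value. The field layout follows the usual x86 convention; the two raw values at the bottom are illustrative assumptions picked to decode to family 17h and 19h, not numbers taken from this thread.

```python
def decode_cpuid_eax(eax: int) -> dict:
    """Decode family/model/stepping from a CPUID leaf 1 EAX value."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended fields are added in only when the base family is 0xF,
    # which is the case for every Zen-era part.
    if base_family == 0xF:
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


# Illustrative raw values (assumed for this example).
older_core = decode_cpuid_eax(0x00870F10)
newer_core = decode_cpuid_eax(0x00A20F10)
print(hex(older_core["family"]), hex(newer_core["family"]))  # 0x17 0x19
```

An OS or tool that keys its behaviour on the family number sees the second value as a different CPU family, even though both report the same base family of 0xF.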

Zen 4 is expected to support an all new platform, moving to what's assumed to be AM5 with support for PCIe 5 and DDR5. With that the whole package topology, especially the uncore and all I/O is expected to change. Just like the significant package topology changes in Zen 2 this doesn't affect the cores per se.
 
Reactions: Nereus77 and Tlh97

Kedas

Senior member
Dec 6, 2018
355
339
136
Did AMD ever mention Zen3 not to be on 7nm(+)?
Who is "you"? Please use quotes if you are talking with specific people.
The Borg collective obviously.
Sorry, it's a reply based on the combined comments, not necessarily your full point of view. (like A says A=1, B says B=2. wait A+B is not 4)

I don't think the idea of a 7nm and a 5nm version of Zen3 is believable. 5nm has always been on track, right? Then there is no important reason for AMD to change plans that were already made (at least not due to TSMC production readiness). Sure, there can be plenty of other reasons to go back to 7nm, but then that's just bad planning, and AMD seems to be in the zone.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,688
1,222
136
Did AMD ever mention Zen3 not to be on 7nm(+)?
Not on the roadmap. [Comments]
- Supported L3 Cache Physical Design team -Worked with the TSMC 7nm and 5nm
- AMD's 5nm/7nm IC and graphics-card products [early]
- 5nm/7nm AMD CPU/GPU/APU and graphic-card products [later]
- Mixed-Signal layout design in 5nm/7nm FinFET technology
- Synthesis over 7nm/5nm.formality,vsilp,timing over video tiles.
- Cadence Virtuoso tool for TSMC (7nm, 6nm, 5nm)
- Hands-on experience on multiple projects at 5nm
- L2 Macro -5nm FinFET Test Chip Layout
- Built, released, and executed IP verification on standard cell libraries in TSMC 5nm
- feature enhancement on top of FinFET based TSMC 5nm & 7nm technologies.
- high performance Microprocessors,SERDES, DDRs, DAC's, ADC's, PLL's in tsmc7 and 5nm technologies.
- N5 TSMC FINFET, HDSP, UHD2PRF and HDRF2P Testchip and compiler development from scratch for Hi-Silicon and AMD @Synopsys

Some of these are old, some of these are new.

However the newer ones overlap with newer Zen3:
- Part of Zen 3 core RTL team for load store unit.
- upcoming Zen 3 CPU core verification/debug.

The cores team currently handling Zen3 is on 5nm only, while the previous cores team handling Zen3 was working on 7nm only (now with the Zen5 cores team). The contractors deal with both. ODC work is more heavily 5nm than 7nm for Zen3.

Most of the info gets deleted if spotted and posted everywhere:
Renoir - N7
Durango - N6
Rembrandt - N5

30h and greater for 19h is a new architecture unrelated to the 18h family reject Zen3/7.

2015-2017 7nm/7nm+
K18.1 => Zen2 CPU
K18.2 => Zen3 CPU

Pre-Dhyana/JV, K18.0 = Zen, where Family 17h = ARMv8.0-A "K12" in even older docs.

2018-2020 5nm/5nmP
True 19h cores: Zen3/Zen4
20,000 wpm for N5 in 2020. [Q4 2019 onwards] (5nm Zen3)
30,000 wpm for N5P in 2021. [Q4 2020 onwards] (5nm Zen4)
^-- TSMC 5nm WSA
 

turtile

Senior member
Aug 19, 2014
618
296
136
If Zen3 is the new architecture and not Zen4, shouldn't AVX512 be present in Zen3? (unless they only plan to add it in Zen5)

I don't think they will use AVX512 in any Zen architecture. They've said it's better to do in GPU. From patents, it looks like the post-Zen era (likely after Zen 5) will use chiplets with subsets of x86 instructions that can be customized for servers. So if a customer needs AVX512, they can add a chiplet that can do it.
 

jamescox

Senior member
Nov 11, 2009
642
1,104
136
There are a lot of ways to make things small and compact without being wasteful. Note that eliminating the IO die would not be done for performance, but rather, package space. It is 2:30am here and I am on mobile so I won’t go into it now, but I do have some thoughts here.

One thing to remember is, excluding the Threadripper and EPYC product lines (because they need MCM), MCM actually costs more than a more classic design. The vast majority of AMD’s desktop chips contain a single chiplet and IO die as only the top end use 2 chiplets. All of their laptops are also a single monolithic die. I am betting AMD will approach MCM very differently in the future.

EDIT: It looks like Renoir is actually much cheaper to make vs. desktop Ryzen. Granted they cut down on the cache, but they also added a GPU. Don’t be surprised if select future Ryzen chips end up being monolithic. Cheaper manufacturing means higher margins after all. Yeah R&D, tape-out, etc. all cost a fortune, but they are (relatively) fixed costs, and provided AMD can sell enough chips, the trade off is worth it.

The MCM design is a big part of why AMD is so much better than Intel right now. They can make huge numbers of tiny cpu die, which helps yields massively (see the ISSCC 2018 Zeppelin die slides). Current Zen2 Rome with 8 cpu die is over 1000 square mm of silicon total. You just can't do that at all as a monolithic die currently. The reticle limit is somewhere around 830 square mm. Nvidia V100 is one of the largest chips made at 815 square mm. Intel has to use two 28 core cpus at, I think, close to 800 square mm each just to come close. Intel probably can't get yields as good as with tiny chiplets, even on a 14 nm +++ process, with such a large die. That is also a lot less cache. It is only 38.5 MB per die.
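The yield argument can be sketched with the simple Poisson yield model, Y = exp(-A * D0). The defect density and the two die areas below are hypothetical round numbers chosen only to show the shape of the effect; they are not measured figures for any AMD, Intel, or TSMC product.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under the simple Poisson yield model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)


D0 = 0.2             # hypothetical defect density, defects per cm^2
SMALL_DIE_MM2 = 80   # hypothetical small chiplet
LARGE_DIE_MM2 = 800  # hypothetical near-reticle-limit monolithic die

small_yield = poisson_yield(SMALL_DIE_MM2, D0)
large_yield = poisson_yield(LARGE_DIE_MM2, D0)

print(f"small die yield: {small_yield:.1%}")
print(f"large die yield: {large_yield:.1%}")
# Ten small dies cover the same silicon area as one large die, but each
# defect only kills 80 mm^2 of silicon instead of 800 mm^2.
```

Real foundry yield models are more involved (defect clustering, salvage of partly defective dies), so treat this only as an illustration of why die area matters so much.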

The mobile chips are monolithic almost entirely for power consumption. Driving external interfaces takes a lot more power than staying on chip. The IFOP links are very power efficient, much better than IFIS, but nowhere near as low as staying on chip with very short interconnect. SRAM cache also takes a lot of power. AMD mobile chips have generally half the cache of the desktop parts of the same generation to save on power. This is also probably a part of Apple's power efficiency. They only have 6 MB L2, no L3 and generally much lower DRAM than competing parts also.

There is good reason to have a different die for mobile while keeping the same chip for desktop, server, and workstation. They still want to use the same cpu chip across multiple markets for the economy of scale and for the best binning. It may be the case that a monolithic die would be cheaper for desktop parts based on die area or some other metric, but being able to tape out a single type of chip and some cheap IO die on older processes is almost certainly a big win. Intel has quite a large number of die variants to make chips with varying numbers of cores. They use at least 3 for low, medium, and high core count. The product stack is made of salvaged parts for most SKUs. They make something like a 10 core and then salvage maybe 8 and 6 core parts from that production. I haven't looked up Intel's different die variants in a while, so that may not be accurate now. If you are going from 4 core to 64 cores, then making a large number of monolithic die variants doesn't work well. Most of them would be way too large to make efficiently on 7 nm as a monolithic die anyway.

I could see them extending a Zen 3 based APU to cover more of the desktop market though, so I guess I may agree with you on future monolithic die. They could just use the APU for anything 8 cores or less. That would reduce the number of cpu die used for desktop parts significantly, but perhaps their yields are good enough and they are shipping enough Epyc and Threadripper that it isn't a concern. The cache sizes are getting larger than is probably necessary for most desktop applications anyway. If the Zen 3 based APU is half the cache that is on the chiplet, then that would still be 16 MB; don't know what it will be for Zen 4.

Zen 3 Epyc seems that it will probably have roughly the same IO die as zen2 Epyc. I don't know if it is coming with zen 4 or zen 5, but the next step seems to be an active interposer based on what AMD has said about X3D die stacking (2.5D plus 3D stacking). With an active interposer, almost all of the current IO die could actually be in the interposer. The external links would only be PCI-e and DRAM since the cpus would be on top. I am not sure what memory they would stack on top though. HBM memory is not that useful as a cpu-style cache. It is DRAM, so it doesn't really have very good latency. If they wanted to have support for AVX512, then the bandwidth would be useful, but HBM probably would not reduce latency much. They could perhaps stack an SRAM cache chip under or on top of the cpu die; that would make some sense to save space. They would still be limited on area due to reticle size. It is hard to speculate what they will do with an interposer since there are a lot of possibilities. I think they will stick with an 8 core CCX for a while though, even if they put more than one CCX on a die again. If the cpu chiplet is interposer based for Epyc, then a desktop part would also need a small interposer which would make it even more expensive to make compared to the current desktop ryzen parts. This makes it seem more likely that anything 8 cores (or perhaps 16 by Zen 4 or 5 generation) or less will be an APU part.

AVX512 would take a lot of changes due to needing to supply a lot more bandwidth. Current infinity fabric in Zen 2 is 256 bit (32 bytes) wide. A dual channel DRAM controller is 128 bit, but since it is DDR, it can supply 256 bits per actual memory clock. That is half of a 64-byte cache line, and almost all internal paths are 256 bits wide. For AVX512, they would need to go up to 512 bits wide to keep supplying a reasonable amount of bandwidth. If they do AVX512 in Zen 3, then I think it will be split across two clocks with a 256-bit unit. They will need to widen the internal interconnect for DDR5 and PCI-e 5 anyway, but it is unclear when or if they will actually do so. Even though Zen 3 is a new architecture, I expect it to be the same 256 bit interconnect with the same IO die. Widening the pathways is probably Zen 4 or 5, whichever supports DDR5 and PCI-e 5. It may make sense to support AVX512 then. A lot of servers don't need much of any FP processing power, so I am wondering if we will actually get multiple cpu designs at some point.
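The width arithmetic in the paragraph above can be checked with a few lines. This is only a back-of-the-envelope sketch using the bus widths quoted in the post; the helper name is made up for the example.

```python
def bytes_per_clock(width_bits: int, transfers_per_clock: int = 1) -> int:
    """Bytes moved per clock over a path of the given width."""
    return width_bits * transfers_per_clock // 8


# Dual channel DRAM: 128 bits wide, two transfers per memory clock (DDR).
dram = bytes_per_clock(128, transfers_per_clock=2)

# Fabric path as described in the post: 256 bits, one transfer per clock.
fabric = bytes_per_clock(256)

# One full AVX512 operand is 512 bits.
avx512_operand = 512 // 8

# Ceiling division: clocks needed to move one operand over the fabric.
clocks_per_operand = -(-avx512_operand // fabric)

print(dram, fabric, avx512_operand, clocks_per_operand)  # 32 32 64 2
```

So a 256-bit path is matched to the DRAM rate, but a single 512-bit operand takes two clocks to cross it, which is the bottleneck the post is describing.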
 

jamescox

Senior member
Nov 11, 2009
642
1,104
136
I don't think they will use AVX512 in any Zen architecture. They've said it's better to do in GPU. From patents, it looks like the post-Zen era (likely after Zen 5) will use chiplets with subsets of x86 instructions that can be customized for servers. So if a customer needs AVX512, they can add a chiplet that can do it.
They may still want to include AVX512 for HPC. Many servers don't really need much FP processing power at all, but I suspect those will be going ARM eventually anyway. If Zen 3 has AVX 512, then it is probably split into 256-bit chunks and processed over 2 clocks. Going full AVX512 will waste a lot of power since they would need to double a lot of the internal interconnect from 256 to 512 bits wide. I don't think that they did that yet with Zen 3.
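As a toy illustration of "split into 256-bit chunks and processed over 2 clocks", the sketch below models a 512-bit integer add as two passes through a 256-bit unit. It is a software model with made-up function names, not a description of how any actual AMD execution unit is built.

```python
LANES_512 = 16  # sixteen 32-bit lanes in a 512-bit vector
LANES_256 = 8   # eight 32-bit lanes in a 256-bit vector
MASK32 = 0xFFFFFFFF


def add_256(a: list, b: list) -> list:
    """One pass through a 256-bit unit: eight 32-bit adds with wrap-around."""
    assert len(a) == len(b) == LANES_256
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_512_split(a: list, b: list) -> tuple:
    """A 512-bit add done as two 256-bit passes; returns (result, passes)."""
    assert len(a) == len(b) == LANES_512
    result = []
    passes = 0
    for start in range(0, LANES_512, LANES_256):
        end = start + LANES_256
        result += add_256(a[start:end], b[start:end])
        passes += 1
    return result, passes


a = list(range(16))
b = [10] * 16
total, passes = add_512_split(a, b)
print(total[:4], passes)  # [10, 11, 12, 13] 2
```

The result is identical to a native 512-bit add; the cost is that each instruction occupies the unit for two passes instead of one.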

With Zen 4 or Zen 5, they could do a lot of different things using interposers. That makes the wider paths a lot easier to do. Things that can take advantage of a 512 bit vector are probably best run on a gpu anyway, since gpus actually have the bandwidth to support such wide units, but with an interposer they could add HBM to supply the bandwidth. They could also put some form of gpu unit on the interposer.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
Charlie has an intriguing post on Milan.

AMD is going to Celeron Intel’s Xeon margins

"AMD is going to ‘Celeron’ Intel’s Xeon line later this year and there is nothing Intel can do about it. SemiAccurate was quite impressed when we heard about this plan, it shows AMD is really maximizing their resources."

"AMD is somewhat constrained on pricing too, if they raise ASPs too much, their TCO advantage goes away and the pressure comes off Intel. If they keep prices too low, Intel feels the pain but they do too. Other than sending Morris Chang at TSMC large bouquets of flowers and chocolates that spell out, “please give us more wafers”, what can AMD do? With Milan, AMD came up with something really clever that should put Intel in a very painful place when it comes to server pricing."
 

Martimus

Diamond Member
Apr 24, 2007
4,488
153
106
What does that even mean?
One thing I hate about Charlie Demerjian's writing is that he posts emotionally charged comments about technology. It makes it very difficult to take anything he says seriously.

EDIT: Although after reading the article, it isn't really emotional, it is just poorly written. What he is trying to convey is that AMD is pricing EPYC processors so that Intel won't be able to make a profit from their server processors if they are even remotely price competitive.
 
Reactions: Tlh97 and Bigos

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
What does that even mean?
I gather it means a reduction in margin due to moving down the product stack. Product stack meaning the offerings from both companies, not solely Intel. This does not sound like speculation but access to a sort of product map. Hyperbole, yes, but still intriguing.

In any case, the Celeron part was not the interesting piece for me, but the ability to ship more product and easing their supply constraints by some strategy not solely based on increased 7nm wafer supply. The last sentence.
 

jpiniero

Lifer
Oct 1, 2010
14,835
5,452
136
I gather it means a reduction in margin due to moving down the product stack. Product stack meaning the offerings from both companies, not solely Intel. This does not sound like speculation but access to a sort of product map. Hyperbole, yes, but still intriguing.

In any case, the Celeron part was not the interesting piece for me, but the ability to ship more product and easing their supply constraints by some strategy not solely based on increased 7nm wafer supply. The last sentence.

Dual sourcing Zen 3 on some inferior node? Don't think it would be worth it.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
Will be interesting to see what this "clever approach" will be. Milan otherwise would be a "boring" step since unlike previous gens it doesn't improve the most obvious parameter of moving other products down the stack: increasing the core count.
 