Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

Ajay · Apr 8, 2023

CakeMonster said:
How has this been done historically? I mean, it seems the transistor budget these days is rather decent, at least as long as AMD is using TSMC, is there any reason why they are not slowly just adding functions for what they think the future needs? That being whether or not they have planned a radical architecture shift in the next few generations, surely slowly adding that in would be useful? I realize that it's being designed several years ahead, so whatever generation is in its infancy right now will probably have some AI/ML functions added just to be safe (despite the GPUs doing the main work)?

They are. AMD are adding AI/ML "optimizations" to Zen5. So, it doesn't seem (as of yet) like any kind of extended instruction set or even new fixed function units for sure.

A/// · Apr 8, 2023

Anhiel said:
keep things running cool as ~10 years before while the clockspeed race will likely have calmed down.

????????????????????????????

A/// · Apr 8, 2023

Ajay said:
They are. AMD are adding AI/ML "optimizations" to Zen5. So, it doesn't seem (as of yet) like any kind of extended instruction set or even new fixed function units for sure.

That's the Xilinx IP. I remember neural engines being mentioned, which fits that bill, but to what extent is not known.

DrMrLordX · Apr 9, 2023

Ajay said:
Zen5 on 4nm and 3nm. If I were a betting man, I would say the CCDs are coming out on N4 - in keeping with AMD's preference for going with an N-1 node (with appropriate co-optimizations). I don't know if N3 will be for Zen5c or a Zen5 based APU SOC.

Point being, some of this speculation seems a bit odd, given what AMD has publicly published.

I would bet on N4P or similar for Zen5.

A/// · Apr 9, 2023

DrMrLordX said:
I would bet on N4P or similar for Zen5.

Probably a customized version of their performance library. Nvidia is on a 5nm process called 4N, which stands for "For Nvidia" or "Nvidia Performance", and I would expect them to get their own process at 4 or 3 nm with Blackwell, but I don't know if AMD will get a custom process for their own GPUs.

TSMC is due to introduce their GAAFET tech around 2nm, and it'll be interesting to see who has the better process at that juncture, whether it's Intel's GAAFET, TSMC's, or Samsung's if they pick up their internal slack and release their process. In a perfect and just world, all three would succeed.

DrMrLordX · Apr 9, 2023

A/// said:
Probably a customized version of their performance library.

We won't know until well after the fact, but yes, you are likely correct. AMD never used a conventional N7 node either.

NostaSeronx · Apr 9, 2023

Zen5 (#Turin):
CPP=45nm
M2=30nm
SRAM Target = ~6 GHz at 1.2V
3nm [Zen5 was stamped in and greenlit before N3E was ever announced.]

Zen5c (#Turin-Dense):
CPP=49nm
M2=35nm
SRAM Target = ~5 GHz at 1.2V
4nm [Zen2 -> Zen3, but, Zen4c -> Zen5c // Similar architecture setup]

[Zen5] is only on 3nm. Zen5 in Turin = 3nm, Zen5 in Granite = 3nm, Zen5 in Strix = 3nm.

BorisTheBlade82 · Apr 9, 2023

@NostaSeronx
I'd say your information is outdated. Yes, initially Zen5 was supposed to be 3nm - but even back a couple of years that seemed like a stretch goal when comparing the lag between Apple starting on 7nm and 5nm vs. AMD.
So after the trouble of TSMC 3nm all the chatter points to AMD using 4nm for Zen5 and maybe some 3nm variant for Zen5c, which might come a year later.

Geddagod · Apr 9, 2023

BorisTheBlade82 said:
@NostaSeronx
I'd say your information is outdated. Yes, initially Zen5 was supposed to be 3nm - but even back a couple of years that seemed like a stretch goal when comparing the lag between Apple starting on 7nm and 5nm vs. AMD.
So after the trouble of TSMC 3nm all the chatter points to AMD using 4nm for Zen5 and maybe some 3nm variant for Zen5c, which might come a year later.

Not even chatter points, Lisa Su herself said Zen 5 was coming on both 3nm and 4nm. The only 'stretch' I could see there is if the zen 5 CCD uses 3nm and has 4nm stacked on top, by default, but I'm still pretty sure it means that we will see Zen 5 on 4nm and Zen 5C on 3nm.

A/// · Apr 9, 2023

Geddagod said:
Not even chatter points, Lisa Su herself said Zen 5 was coming on both 3nm and 4nm. The only 'stretch' I could see there is if the zen 5 CCD uses 3nm and has 4nm stacked on top, by default, but I'm still pretty sure it means that we will see Zen 5 on 4nm and Zen 5C on 3nm.

Leaving core and cache chiplets spooning off to the corner of this discussion, how does AMD plan on using 3nm if Apple has already bought up a large run of it? Or is Apple on another node? This doesn't add up if Zen 5 is due Q2-Q3 next year.

DisEnchantment · Apr 9, 2023

A/// said:
Leaving core and cache chiplets spooning off to the corner of this discussion, how does AMD plan on using 3nm if Apple has already bought up a large run of it? Or is Apple on another node? This doesn't add up if Zen 5 is due Q2-Q3 next year.

This time there is far more supply than Apple is able to consume. TSMC had to put the construction of F12P9 on hold for fear of underutilization. There is a low-key F21P2 under construction for fabbing N3 in the USA.
There will be 5 gigantic fabs (F18 P5/6/7/8 and F12P8) producing N3 family in 2024. Insane to think TSMC can do close to 200k wpm of N3 in 2024 (with option of F21P2 and F12P9 later for a total of 7 GigaFabs!!!). Good luck to the competition. This is on top of 200k wpm capacity of N5 family (F18P1/2/3/4 and F21P1). And F20 P1 - 4 for N2 in construction at once.
Compare this to ~120k wpm of N7 (F14 & F15 fabs) when Zen 2 launched.

A/// · Apr 9, 2023

DisEnchantment said:
This time there is far more supply than Apple is able to consume. TSMC had to put the construction of F12P9 on hold for fear of underutilization. There is a low-key F21P2 under construction for fabbing N3 in the USA.
There will be 5 gigantic fabs (F18 P5/6/7/8 and F12P8) producing N3 family in 2024. Insane to think TSMC can do close to 200k wpm of N3 in 2024 (with option of F21P2 and F12P9 later for a total of 7 GigaFabs!!!). Good luck to the competition. This is on top of 200k wpm capacity of N5 family (F18P1/2/3/4 and F21P1)
Compare this to ~120k wpm of N7 (F14 & F15 fabs) when Zen 2 launched.

I thought F12P9 was laid out for Intel only, or, quote unquote, the rumors of Intel ordering, cancelling, halting, whatever. I figured Apple's decision to order a halt on the fabs was due to them realizing they're not selling like they did during the pandemic, and the timing is so suspect with WWDC around the corner and an insight into the M3 processors or new M2 extension products, but likely M3.

By the time the Arizona plant is done, most companies like Nvidia, AMD, or Apple will not be on N3, but it is invaluable for others. Unless you're suggesting they begin a ramp-up in 2024?

DisEnchantment · Apr 9, 2023

A/// said:
I thought F12P9 was laid out for Intel only, or, quote unquote, the rumors of Intel ordering, cancelling, halting, whatever. I figured Apple's decision to order a halt on the fabs was due to them realizing they're not selling like they did during the pandemic, and the timing is so suspect with WWDC around the corner and an insight into the M3 processors or new M2 extension products, but likely M3.

By the time the Arizona plant is done, most companies like Nvidia, AMD, or Apple will not be on N3, but it is invaluable for others. Unless you're suggesting they begin a ramp-up in 2024?

Actually the Intel contract changed, it was supposed to be based on F14 with 20K wpm dedicated to Intel, but now it is just a replica of F18 series. Seems they cannot consume the wafer output ...
TSMC feels 5 fabs will be enough, the AZ F21P2 is for N3 in 2025+, F12P9 on hold. AZF21P1 will do N5 in 2024.

DisEnchantment · Apr 9, 2023

NostaSeronx said:
Zen5 (#Turin):
CPP=45nm
M2=30nm
SRAM Target = ~6 GHz at 1.2V
3nm [Zen5 was stamped in and greenlit before N3E was ever announced.]

Zen5c (#Turin-Dense):
CPP=49nm
M2=35nm
SRAM Target = ~5 GHz at 1.2V
4nm [Zen2 -> Zen3, but, Zen4c -> Zen5c // Similar architecture setup]

[Zen5] is only on 3nm. Zen5 in Turin = 3nm, Zen5 in Granite = 3nm, Zen5 in Strix = 3nm.

I believe these were original plans from 2 years ago, even Execufix mentioned them. But things changed a bit

https://twitter.com/x/status/1397816823622639619

But I don't see N4P as being severely disadvantaged vs N3E like N6 vs N5 for instance. Outside of density, N4P is just a tiny bit less efficient than N3 as per official TSMC numbers.
However Zen 4 CCDs are still tiny, there is still quite a lot of room to grow.
From slide below, AMD will likely take the node once it becomes mainstream.

But I am curious where you found M2P for N3? CPP is same from IEDM.

A/// · Apr 9, 2023

DisEnchantment said:
Actually the Intel contract changed, it was supposed to be based on F14 with 20K wpm dedicated to Intel, but now it is just a replica of F18 series. Seems they cannot consume the wafer output ...
TSMC feels 5 fabs will be enough, the AZ F21P2 is for N3 in 2025+, F12P9 on hold. AZF21P1 will do N5 in 2024.

Hmm, anything to do with the "sudden news" of MTL-S being cut from the future lineup?

Tuna-Fish · Apr 9, 2023

Anhiel said:
I've a new realization. Since Windows 11 no longer supports 32b it's possible both AMD and Intel will be moving toward discarding the 32b part of the decoder. I'm not sure x86 will be completely discarded, probably not. In any case, the reduction should be significant and reduce power consumption quite a bit. This probably won't happen with Zen5 yet. I'm guessing a 3-5 year grace period. So probably not happening with Zen6 either.

Not true. Windows 11 still requires 32bit support, there is just no longer a fully 32-bit build.

Also, maintaining the 32-bit paths on the decoder is not a significant die area, performance or power cost.

Joe NYC · Apr 9, 2023

Ajay said:
They are. AMD are adding AI/ML "optimizations" to Zen5. So, it doesn't seem (as of yet) like any kind of extended instruction set or even new fixed function units for sure.

Wouldn't it make sense to add the silly data types like BF16 to the AVX512 instruction set?

Joe NYC · Apr 9, 2023

A/// said:
60 days? idk Joe.

Yes most people confuse this when they read it. AMD is throttling orders based on what they know will sell out. It's a very good lineup but you can 100% over produce epycs and be stuck with inventory. dc refresh cycles come in waves, it's not constant.

Well, it is not June yet, and AMD AM5 socket jumped from being behind Intel, to nearly 3x Intel. And it nearly tied AM4.

The x3d is taking the PC gaming by storm:
#1 7800x3d 1860
#2 5800x3d 1070

https://twitter.com/x/status/1645055506275262464

Ajay · Apr 9, 2023

NostaSeronx said:
Zen5 (#Turin):
CPP=45nm
M2=30nm
SRAM Target = ~6 GHz at 1.2V
3nm [Zen5 was stamped in and greenlit before N3E was ever announced.]

So, AMD openly lied in their recent Financial Analyst's Day presentations??? I don't buy it. Or, possibly, CCDs are on 3N and IOD is on 4nm. I haven't seen anything legit, but pretty sure Zen5 is in AMD labs for testing and verification as we speak. Too bad the semiconductor design houses have gotten so good at quashing leaks.

Ajay · Apr 9, 2023

Joe NYC said:
Wouldn't it make sense to add the silly data types like BF16 to the AVX512 instruction set?

I believe it's already there in Genoa.
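
For anyone wondering what the BF16 data type actually is, here is a minimal sketch, assuming the usual definition: the top 16 bits of an IEEE-754 float32 (same 8-bit exponent, mantissa cut to 7 bits). The helper names are made up for illustration, and this version simply truncates, whereas the AVX512_BF16 conversion instructions round to nearest-even, so it is not a model of the hardware behaviour.

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Hypothetical helper: keep only the upper 16 bits of the float32 encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16  # truncation, not round-to-nearest-even

def bf16_bits_to_f32(b: int) -> float:
    """Hypothetical helper: widen BF16 back to float32 by zero-filling the low bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x

# 1.0 survives exactly; 3.14159 loses most of its mantissa.
one_bits = f32_to_bf16_bits(1.0)
pi_ish = bf16_bits_to_f32(f32_to_bf16_bits(3.14159))
```

The point of the format is that the exponent range matches float32, so the conversion is cheap and only precision is lost.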

BorisTheBlade82 · Apr 9, 2023

Ajay said:
So, AMD openly lied in their recent Financial Analyst's Day presentations??? I don't buy it. Or, possibly, CCDs are on 3N and IOD is on 4nm. I haven't seen anything legit, but pretty sure Zen5 is in AMD labs for testing and verification as we speak. Too bad the semiconductor design houses have gotten so good at quashing leaks.

As already stated: The consensus these days more or less is that Zen5c might arrive in some 3nm variant, while OG Zen5 rather not.
Which slides do you reference specifically?

A/// · Apr 9, 2023

Joe NYC said:
Well, it is not June yet, and AMD AM5 socket jumped from being behind Intel, to nearly 3x Intel. And it nearly tied AM4.

The x3d is taking the PC gaming by storm:
#1 7800x3d 1860
#2 5800x3d 1070

https://twitter.com/x/status/1645055506275262464

To quote our very own "lord" on this Lord's day, @DrMrLordX: AMD ought to have started out with 3D V-Cache models to pump sales all throughout fall and winter and now into spring.

yuri69 · Apr 9, 2023

Joe NYC said:
Well, it is not June yet, and AMD AM5 socket jumped from being behind Intel, to nearly 3x Intel. And it nearly tied AM4.

The x3d is taking the PC gaming by storm:
#1 7800x3d 1860
#2 5800x3d 1070

This week is the 7800 X3D launch week. The stats are most likely just temporary.

Anhiel · Apr 9, 2023

A/// said:
????????????????????????????

Starting from 14nm and compounding the power savings from N10 to N2: 1/(1.35*1.4*1.3*1.15*1.3*1.15) = 0.23673
Adjusting for the clock speed increase, going from the Core i7-4790 (84W TDP, 4GHz) to the i9-13900 (max 219W PL2, 5.2GHz): 5.2/4 = 1.3
=> 0.30775 power => 13900 on N2 at ~67W (PL2)
Of course, in practice everyone will be raising clock speeds, but it should still be in the ballpark of my estimation.
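
The arithmetic in this estimate can be reproduced with a short sketch. The per-node factors and the 219W PL2 figure are taken straight from the post; treating power as scaling linearly with clock speed is the post's own simplification, not an established model, and the variable names are just for illustration.

```python
from math import prod

# Per-node power-reduction factors exactly as listed in the post (N10 through N2).
node_factors = [1.35, 1.4, 1.3, 1.15, 1.3, 1.15]

# Compounded power at iso-performance relative to 14nm.
power_scale = 1 / prod(node_factors)

# Clock speed ratio used in the post: 5.2 GHz vs 4 GHz.
clock_ratio = 5.2 / 4.0

# Assumption from the post: power grows linearly with clock.
adjusted_scale = power_scale * clock_ratio

# Applied to the i9-13900 PL2 figure quoted in the post.
estimate_w = 219 * adjusted_scale

print(f"{power_scale:.5f} {adjusted_scale:.5f} {estimate_w:.1f} W")
```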

Tuna-Fish said:
Not true. Windows 11 still requires 32bit support, there is just no longer a fully 32-bit build.

Also, maintaining the 32-bit paths on the decoder is not a significant die area, performance or power cost.

Removing 32b would at least save 1/3 of the width (dunno about the tail end but assuming they are kind of same length). That would also remove related error handling which should be 1/4 as complex as 64b, hence, saving 1/5.
So a 5-wide 64b only implementation would have saved 5/3 less width, hence, 2/3 less width compared to 4-wide full 32b+64b implementation.
If that isn't significant I don't know what is.

Tuna-Fish · Apr 9, 2023

Anhiel said:
Removing 32b would at least save 1/3 of the width (dunno about the tail end but assuming they are kind of same length). That would also remove related error handling which should be 1/4 as complex as 64b, hence, saving 1/5.
So a 5-wide 64b only implementation would have saved 5/3 less width, hence, 2/3 less width compared to 4-wide full 32b+64b implementation.
If that isn't significant I don't know what is.

What nonsense is this? What do you even mean by width?

The hard part about decoding x86 instructions is length determination, and doing this is exactly as hard with x86_64 as it is on legacy x86. Unlike what ARM did, AMD didn't significantly clean up the encoding. They just retired some legacy ops, but AMD64 is still byte-aligned with tons of separate prefixes. There are no major wins available for retiring 32-bit support on x86.
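
To illustrate the prefix point, here is a toy sketch that only counts the prefix bytes in front of an opcode. It is not a real length decoder (no opcode maps, ModRM, SIB, displacement or immediate handling), and the function name is made up for illustration. The one thing it shows is that 64-bit mode adds the REX byte as yet another prefix on top of the legacy ones instead of replacing them.

```python
# Legacy prefix bytes, valid in both 32-bit and 64-bit mode.
LEGACY_PREFIXES = {
    0xF0, 0xF2, 0xF3,                    # LOCK, REPNE, REP
    0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65,  # segment overrides
    0x66, 0x67,                          # operand-size, address-size overrides
}

def count_prefix_bytes(code: bytes, long_mode: bool) -> int:
    """Hypothetical helper: number of prefix bytes before the opcode."""
    i = 0
    # Legacy prefixes are scanned one byte at a time in either mode.
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        i += 1
    # In 64-bit mode, 0x40-0x4F is a REX prefix; in 32-bit mode the same
    # bytes are INC/DEC opcodes, so they are not counted as a prefix.
    if long_mode and i < len(code) and 0x40 <= code[i] <= 0x4F:
        i += 1
    return i

mov_rax_rbx = bytes([0x48, 0x89, 0xD8])  # REX.W + mov in 64-bit mode
mov_ax_bx = bytes([0x66, 0x89, 0xD8])    # operand-size prefix + mov

print(count_prefix_bytes(mov_rax_rbx, long_mode=True))   # 1
print(count_prefix_bytes(mov_rax_rbx, long_mode=False))  # 0
print(count_prefix_bytes(mov_ax_bx, long_mode=True))     # 1
```

Either way the decoder still has to walk the bytes sequentially before it knows where the opcode starts, which is the length-determination cost being described.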
