Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Page 867

OneEng2

Senior member
Sep 19, 2022
259
357
106
And what of ARL-H?

I still think it'll be better than Strix Point.
Possibly so. As I have said, Arrow Lake appears to be a very good laptop chip.
Nobody is talking about a process shrink. N4P is still made on the 5nm process. Going from N5 to N4P is a big generational jump in both performance and efficiency. If the 9800X3D is only 7-8% better than the 7800X3D, that would fall short of what a 7800X3D made on N4P would deliver (an 11% performance increase). There is a difference between real world and paper statistics. At the same time, TSMC cannot publish performance and power efficiency numbers that are not accurate. Customers would not buy their silicon if the numbers were not accurate.

Forum members should stop reading reviews or watching YouTube videos on how silicon processes work. The claim that you get either an efficiency or a performance increase, but not both, is BS. It depends on the node and the variant TSMC offers. The N3 3nm silicon only gives a significant efficiency increase over 5nm. There are clock regressions with N3 and no performance uplift. You cannot choose "I want performance over efficiency." You have to wait for N3P to get performance and efficiency gains. When they say it's customized for AMD, that simply means they give them an option to lean towards power or efficiency or a mix of both, depending on the process.

When AMD was set to use N3 with Zen 5 they mentioned clock regressions, and suddenly that changed when they had to use N4P: no clock regressions, but clock increases. People attribute increased clocks to the Zen 5 design. I attribute it to N4P. Zen 4 was originally supposed to be on N5P but they could not get N5P. They settled for the base N5 silicon. A 7800X3D on N4P would yield very good efficiency and performance gains based on the N4P silicon.

Sometimes core density matters more than anything. Take Nvidia's Blackwell. That chip is so advanced they really needed to be on TSMC 3nm but Apple bought all the N3 silicon. Instead, Blackwell is on N4P which provides increased silicon density over all other 5nm processes.
When TSMC says "performance" it isn't referring to benchmarks, and it is qualified as to what it does mean.

Your assertion that Zen 4 would perform the same as Zen 5 on the same process node is nonsensical. You surely see that, right?
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,369
1,259
136
Possibly so. As I have said, Arrow Lake appears to be a very good laptop chip.

When TSMC says "performance" it isn't referring to benchmarks, and it is qualified as to what it does mean.

Your assertion that Zen 4 would perform the same as Zen 5 on the same process node is nonsensical. You surely see that, right?
I don't know. Maybe watch all the Hardware Unboxed Zen 5 reviews and get back to me.
 

dr1337

Senior member
May 25, 2020
428
707
136
So any rumors about dual stacked CCDs? Or was all the news about TSVs being on top of the die just people getting their analysis wrong?

Supposedly Zen2 already had TSVs that never got used, maybe a Zen5+ or Zen6 will have fully sandwiched dies? Maybe stacked logic is going to finally become a thing?

I'm still skeptical about them finding a way to join the two CCDs together on a single interposer without changing the IOD, let alone that being economical, but maybe that's what the top side vias are for?
 

Hitman928

Diamond Member
Apr 15, 2012
6,390
11,392
136
So any rumors about dual stacked CCDs? Or was all the news about TSVs being on top of the die just people getting their analysis wrong?

Supposedly Zen2 already had TSVs that never got used, maybe a Zen5+ or Zen6 will have fully sandwiched dies? Maybe stacked logic is going to finally become a thing?

I'm still skeptical about them finding a way to join the two CCDs together on a single interposer without changing the IOD, let alone that being economical, but maybe that's what the top side vias are for?

No stacked logic. Not sure what you mean by TSVs being on top of the die.
 

dr1337

Senior member
May 25, 2020
428
707
136
No stacked logic. Not sure what you mean by TSVs being on top of the die.
High Yield made a video about it, and there was another Twitter tech poster (I can't remember who) that did another Zen 5 die analysis. All from the Fritzchens Fritz pictures. Pretty sure I found both of those from this thread.

So either these people were just making things up, or there would be something else going on with multiple stacks of silicon.
 

Hitman928

Diamond Member
Apr 15, 2012
6,390
11,392
136
High Yield made a video about it, and there was another Twitter tech poster (I can't remember who) that did another Zen 5 die analysis. All from the Fritzchens Fritz pictures. Pretty sure I found both of those from this thread.

So either these people were just making things up, or there would be something else going on with multiple stacks of silicon.

I don’t remember them claiming TSVs on the top of the die. Did you mean TSVs on the top die?
 

dr1337

Senior member
May 25, 2020
428
707
136
I don’t remember them claiming TSVs on the top of the die. Did you mean TSVs on the top die?
As far as I am aware, Fritzchens Fritz has never taken a picture that wasn't a top down view.

AMD put out a video today that showed the cache being under the die... so what else could I mean eh? Surely you're not getting into semantics about the position of flip chips right?
 

gaav87

Member
Apr 27, 2024
180
380
96
Nobody is talking about a process shrink. N4P is still made on the 5nm process. Going from N5 to N4P is a big generational jump in both performance and efficiency. If the 9800X3D is only 7-8% better than the 7800X3D, that would fall short of what a 7800X3D made on N4P would deliver (an 11% performance increase). There is a difference between real world and paper statistics. At the same time, TSMC cannot publish performance and power efficiency numbers that are not accurate. Customers would not buy their silicon if the numbers were not accurate.

Forum members should stop reading reviews or watching YouTube videos on how silicon processes work. The claim that you get either an efficiency or a performance increase, but not both, is BS. It depends on the node and the variant TSMC offers. The N3 3nm silicon only gives a significant efficiency increase over 5nm. There are clock regressions with N3 and no performance uplift. You cannot choose "I want performance over efficiency." You have to wait for N3P to get performance and efficiency gains. When they say it's customized for AMD, that simply means they give them an option to lean towards power or efficiency or a mix of both, depending on the process.

When AMD was set to use N3 with Zen 5 they mentioned clock regressions, and suddenly that changed when they had to use N4P: no clock regressions, but clock increases. People attribute increased clocks to the Zen 5 design. I attribute it to N4P. Zen 4 was originally supposed to be on N5P but they could not get N5P. They settled for the base N5 silicon. A 7800X3D on N4P would yield very good efficiency and performance gains based on the N4P silicon.

Sometimes core density matters more than anything. Take Nvidia's Blackwell. That chip is so advanced they really needed to be on TSMC 3nm but Apple bought all the N3 silicon. Instead, Blackwell is on N4P which provides increased silicon density over all other 5nm processes.
That's wrong, my friend. For TSMC, a 'performance' increase is a BASE clock speed increase at 1.2V, not benchmark perf +x%!
Boost clock has nothing to do with it, except where they clearly specify an Fmax increase. Boost clock depends on many factors, but cba educating. So in other words, +11% for your imaginary node jump for the 7800X3D would increase base clock speed by 11%, not the boost clock. Boost clock could still be the same.
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,369
1,259
136
That's wrong, my friend. For TSMC, a 'performance' increase is a BASE clock speed increase at 1.2V, not benchmark perf +x%!
Boost clock has nothing to do with it, except where they clearly specify an Fmax increase. Boost clock depends on many factors, but cba educating. So in other words, +11% for your imaginary node jump for the 7800X3D would increase base clock speed by 11%, not the boost clock. Boost clock could still be the same.
That's wrong. If you run two identical processors on different processes on the 5nm node, clock for clock the performance on N4P will be 11% better on average than on N5. That means if you run a 7800X3D at 1.2V and 4GHz on N5, the same chip on N4P at the same clock and voltage would perform 11% better. The efficiency gain means that at the same voltage it would take e.g. 10% less power to produce the same result. The increased clocks are primarily due to silicon and/or voltage.

Fun fact. AMD actually wanted the Zen 4 CPUs on N5P but they had to settle for N5.
 

Abwx

Lifer
Apr 2, 2011
11,612
4,469
136
Fun fact. AMD actually wanted the Zen 4 CPUs on N5P but they had to settle for N5.

It is on N5P.

Dr Su reinforced that technology roadmaps are all about making the right choices and the right junctures, and explicitly stated that our 5nm technology is highly optimized for high-performance computing – it’s not necessarily the same as some other 5nm technologies out there

 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
If the 9800x3D is only 7-8% better than the 7800x3D [...]
One link in your chain of errors is your interpretation that anybody claimed that 9800X3D was 8% faster than 7800X3D. AMD at least did not say that. Here is AMD's claim in their October 31 press release:

A video game computer with a Radeon RX 7900 XTX, DDR5-6000, and 9800X3D in it delivers 8% more video frames per second on average (in AMD's selection of game benchmarks) than a video game computer with a Radeon RX 7900 XTX, DDR5-6000, and 7800X3D in it.

So what does this say about how much faster 9800X3D's integer ALUs perform compared to 7800X3D's?
That's a rhetorical question.
 

CouncilorIrissa

Senior member
Jul 28, 2023
575
2,256
96
That's wrong. If you run two identical processors on different processes on the 5nm node, clock for clock the performance on N4P will be 11% better on average than on N5. That means if you run a 7800X3D at 1.2V and 4GHz on N5, the same chip on N4P at the same clock and voltage would perform 11% better. The efficiency gain means that at the same voltage it would take e.g. 10% less power to produce the same result. The increased clocks are primarily due to silicon and/or voltage.
Seek help.
 

yuri69

Senior member
Jul 16, 2013
574
1,017
136

Josh128

Senior member
Oct 14, 2022
511
865
106
That's wrong. If you run two identical processors on different processes on the 5nm node, clock for clock the performance on N4P will be 11% better on average than on N5. That means if you run a 7800X3D at 1.2V and 4GHz on N5, the same chip on N4P at the same clock and voltage would perform 11% better.
No. That doesn't make any sense at all. If you run Proc A at 4 GHz and then you run Proc A on a newer, superior node, but with identical design and logic, it will perform exactly the same at the same frequency, with better efficiency.
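The point above can be put in first-order terms as a rough sketch: throughput is IPC times frequency, and a process change with identical design and logic leaves both untouched at a fixed clock, so only power moves. The `Chip` class and every number below are illustrative assumptions, not measurements of any real part, and the model deliberately ignores memory effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    ipc: float       # instructions per clock, fixed by the core design (assumed)
    freq_ghz: float  # operating frequency in GHz
    power_w: float   # power drawn at that frequency (made-up figure)


def throughput(chip: Chip) -> float:
    """First-order throughput in billions of instructions per second."""
    return chip.ipc * chip.freq_ghz


def perf_per_watt(chip: Chip) -> float:
    """Throughput divided by power: the efficiency figure."""
    return throughput(chip) / chip.power_w


# Same design, same 4 GHz clock; only the (hypothetical) power figure differs.
old_node = Chip(ipc=2.0, freq_ghz=4.0, power_w=100.0)
new_node = Chip(ipc=2.0, freq_ghz=4.0, power_w=90.0)

print("throughput old/new:", throughput(old_node), throughput(new_node))
print("perf per watt old/new:", perf_per_watt(old_node), perf_per_watt(new_node))
```

In this toy model the newer node only shows up in the perf-per-watt line; any extra throughput would have to come from running at a higher clock.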
 

LightningZ71

Golden Member
Mar 10, 2017
1,910
2,260
136
No. That doesn't make any sense at all. If you run Proc A at 4 GHz and then you run Proc A on a newer, superior node, but with identical design and logic, it will perform exactly the same with better efficiency.
That's really going to depend on what was limiting Fmax on that logic. If it was power limited or thermal limited, then, no, you CAN get SOME limited frequency improvement.

But, that's not pertinent to the existing discussion.

TSMC's power/performance/density claims have, for at least the last decade or more, always been in reference to whatever representative test logic chip that they are demoing. I'll repeat, the claims only EXACTLY apply to that ONE, REPRESENTATIVE TEST CHIP, running at a strictly specified frequency, at a strictly specified power level, and implemented at their claimed target circuit density.

AMD does NOT run their chips at that certain power level, at least, not 100% of the time. It may cross that threshold during normal operations, but power draw is dynamic.

AMD does NOT run their chips at that one specific frequency, at least, not 100% of the time. It may cross that frequency threshold during normal operations, but frequency is dynamic.

AMD does not implement their chips at that one particular density level, EVER. AMD typically uses relaxed circuit density in an effort to boost Fmax and reduce leakage among other things. That relaxed density means that their chips have far different behavior than the TSMC test chip.

Now, all that being said, let's look at the 7700X and the 7800X3D. What would be a reasonable expectation of their BEHAVIOR on N4P instead of N5P? The reasonable expectation would be that boost frequencies MAY increase by about 200 MHz. Frequency, not benchmark performance. Why would this be? The frequency would increase because N4P does allow slightly higher circuit frequencies at a given power level. It would also be possible because the power draw at those same frequencies would be slightly lower than it would be on N5P, meaning less heat is being generated. This means that any thermal limits aren't hit until a higher frequency, any power limits aren't hit until a higher frequency, and the core can run a little bit faster. But it isn't 11%!!! AMD already has relaxed circuit density on timing-critical circuits, achieving a lot of what N4P was supposed to gain over N5P, so whatever improvements the refinement offers over N5P are less effective. Because N4P offers a very modest improvement to circuit density, AMD could choose to implement a bit more padding on those same critical circuits and maybe get another slight frequency increase, but arguably it wouldn't help.

Continuing, let's look at all-core frequencies. When the whole chip is running at its power and thermal limits, process refinements are more evident. Just a few % reduction in power draw closer to the given chip's efficiency sweet spot can have a larger impact on the highest achievable frequencies for all cores. This means that they MIGHT be able to get 300 MHz more all-core boost frequency at the same power draw and thermal limits as before. Again, 300 MHz of FREQUENCY, not performance.

Why do I specify frequency as separate from performance? Because, for the x86 chips that we are focusing on, the achievable performance per MHz of clock speed decreases as the processor increases its operating frequency. Why? Because the cores then spend more and more of their time waiting on the memory subsystem to serve them with data and work to do. And this is the rub. BASED ON PROCESS ALONE, cache density and performance barely move at all from N5P to N4P. You MIGHT get 100 MHz more out of the L3. That's not making a big dent in performance.
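The diminishing return from frequency described above can be sketched with an Amdahl-style model: only the core-bound share of run time shrinks with a higher clock, while the share spent waiting on memory stays put. The `speedup` function, the 5.0 to 5.2 GHz step, and the memory-bound fractions are all assumptions chosen for illustration, not measured values for any chip.

```python
def speedup(f_old_ghz: float, f_new_ghz: float, memory_fraction: float) -> float:
    """Estimated speedup from a clock increase.

    memory_fraction is the share of run time (at the old clock) spent waiting
    on the memory subsystem; that share is assumed not to scale with core clock.
    """
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    core_fraction = 1.0 - memory_fraction
    # Core-bound time scales with the clock ratio; memory-bound time is unchanged.
    new_time = core_fraction * (f_old_ghz / f_new_ghz) + memory_fraction
    return 1.0 / new_time


# Hypothetical 200 MHz bump on a 5.0 GHz part, for a few assumed workloads.
for frac in (0.0, 0.3, 0.6):
    gain = speedup(5.0, 5.2, frac)
    print(f"memory-bound share {frac:.0%}: {100 * (gain - 1):.1f}% faster")
```

With no memory stalls the gain equals the 4% clock increase; as the memory-bound share grows, the same clock bump buys less, which is the sense in which frequency and performance are separate quantities here.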

In summary, you MIGHT get a couple hundred MHz more boost speed out of the 7700X and 7800X3D (perhaps another 100 MHz on the X3D part because of its notable sensitivity to thermal load), but it isn't going to make up the whole difference between them and their comparable 9XXX parts. You'd expect maybe a 2-3% performance improvement ON AVERAGE from that increase ON DESKTOP, ENTHUSIAST usage patterns.

However, you are forgetting that Zen 4/5 are server chips too. As we've just seen Michael post on Phoronix, Zen 5 really is managing the rumored 40% improvement over Zen 4 on like-for-like SKUs. Not the few % from above, not even the 10% from the ideal achievable increase due to thermal/power load improvements at frequencies that are very near the sweet spot for the chips, but big improvements. That was achieved by improvements all over the chips. AMD doesn't get that from just an optical shrink. And, if AMD doesn't get that in servers, they're not going to waste the time and money to even attempt it.
 
Reactions: Nothingness

OneEng2

Senior member
Sep 19, 2022
259
357
106
Wow. I have to admit, it has been a while since I have seen a forum post as wrong-headed as this...
That's wrong. If you run two identical processors on different processes on the 5nm node, clock for clock the performance on N4P will be 11% better on average than on N5. That means if you run a 7800X3D at 1.2V and 4GHz on N5, the same chip on N4P at the same clock and voltage would perform 11% better. The efficiency gain means that at the same voltage it would take e.g. 10% less power to produce the same result. The increased clocks are primarily due to silicon and/or voltage.
The non-bold parts are only true for the exact conditions and mix of logic that TSMC quotes the metrics for.

The belief that the clock speed of an entire superscalar processor pipeline is dependent ONLY on the performance of the transistors used in the pipeline shows a lack of understanding of how modern CPUs work.

Also, the power assumption is flawed as well (although perhaps not as off-base as the other assumptions), in that it depends on the mix of transistors in the CPU design.
 

eek2121

Diamond Member
Aug 2, 2005
3,135
4,479
136
Considering what we got from the leaker and the strong hints for 2x X3D dies, I still think the most probable outcome is

5.3 GHz boost clock for the 9900X3D
5.4 GHz boost clock for the 9950X3D

Two dies would also make 9900X3D an actually viable product (in gaming)
I would almost bet money on 5.5 GHz+ for the 9950X3D. I would absolutely not be shocked if the chip hit 5.7 GHz, but 5.5 is plenty. The 9900X3D will be slightly lower.

The 7950X3D topped out at 5.25 GHz for the chiplet with the cache.

I don’t think AMD will have cache on both chiplets; however, if they did and also hit 5.7 GHz SC, they would have a banger of a chip.

AMD stated that the cache was never the thing holding clocks back to begin with; it was cooling for the cores.

Why is the 9800X3D only at 5.2 GHz? Probably mostly so as not to step on the 9900X. Thermals are likely still a small factor as well, but these chips are hitting 5.4 GHz+ with an overclock and are completely unlocked.
 

Hitman928

Diamond Member
Apr 15, 2012
6,390
11,392
136
As far as I am aware, Fritzchens Fritz has never taken a picture that wasn't a top down view.

AMD put out a video today that showed the cache being under the die... so what else could I mean eh? Surely you're not getting into semantics about the position of flip chips right?

I honestly am not sure what you meant, so I guess I'll answer the question as asked: no, no one thought the TSVs were on the top of the CCD die.
 

moinmoin

Diamond Member
Jun 1, 2017
5,145
8,226
136
I honestly am not sure what you meant, so I guess I'll answer the question as asked: no, no one thought the TSVs were on the top of the CCD die.

I think (among others) he's referring to this die shot analysis:
Zen 5 die: TSVs to the 64 MB X3D L3 cache:

It simply must have enough well-distributed TSVs...

It looks like all the TSVs providing power to the X3D L3 die use an MIM decoupling capacitor for each power and ground pair of TSVs, leading to something like an extra ~8500 power/ground TSVs. Under each black square should be two TSVs.

Original die photos: https://www.flickr.com/photos/130561288@N04/


Image in the quote:
 