Discussion AMD cools the pace to Moore's Law Death


GTracing

Member
Aug 6, 2021
168
396
106
Would a new IOD improve SPEC2017 scores?
There are a few ways that a new I/O die could increase performance.
  • Lower memory latency
  • Higher RAM bandwidth
  • Higher bandwidth from the I/O die to the CCD
  • Lower latency between CCDs
I might be missing some, but they all basically come down to improved bandwidth and latency for the L3 cache or RAM.

With that in mind, the question becomes: does SPEC benefit from better L3 and RAM? If you look at Chips and Cheese's 9800X3D vs 9950X benchmarks, I would say yes.

 

inquiss

Senior member
Oct 13, 2010
250
354
136
• New IOD
• Frequency Increase

I have a question: Doesn't the performance uplift brought by the new IOD fall into the category of "IPC" ?
Interesting, but I don't think so. At least going by the latest IPC figure suggested for Zen 5, it seems they stated the IPC based on what the core was capable of, without the limitations of the IOD. That worked against them for Zen 5; maybe it will work for them with Zen 6?
 

yuri69

Senior member
Jul 16, 2013
574
1,017
136
Also, while I think that modifying the IOD of Zen 5 might well unlock more performance for AMD with minimal effort, I wonder if the Zen 5 front end needs to be modified to really take advantage of additional bandwidth?
The main counter point to a new IOD is AMD's history of the cheapest possible solutions. They have done "the XT" twice, why not with Zen 5 too?
 

gdansk

Diamond Member
Feb 8, 2011
3,276
5,186
136
The main counter point to a new IOD is AMD's history of the cheapest possible solutions. They have done "the XT" twice, why not with Zen 5 too?
There is something planned for next year that isn't the cheapest possible solution. But no one will want it and I'm not convinced it'll help much.
 

moinmoin

Diamond Member
Jun 1, 2017
5,145
8,226
136
Would a new IOD improve SPEC2017 scores?
As @GTracing already indicated, the score differs between different Zen 5 products. Epyc Turin famously achieves ~40%, while X3D-less client chips don't even manage 10%, to the disappointment of pretty much everybody.

I expect Zen 6 to remove all the bottlenecks and then some, creating the uncore that is then reused again with Zen 7 which then is bottlenecked by that again. That's the repeating cycle AMD chose to go through with its Zen gens.

----

Regarding the thread's topic, I still think AMD's biggest issue is not the kind of cadence but the fact that it's slowing (@yuri69 pointed it out before in this thread).
 

gdansk

Diamond Member
Feb 8, 2011
3,276
5,186
136
As @GTracing already indicated, the score differs between different Zen 5 products. Epyc Turin famously achieves ~40%, while X3D-less client chips don't even manage 10%, to the disappointment of pretty much everybody.
But a very large portion of that difference is because client Zen 5 runs up against frequency walls while Turin doesn't. How is it going to fix that?

Edit: Even with finflex and N3 I don't expect more than 6050MHz.
 
Reactions: Tlh97 and DavidC1

Doug S

Platinum Member
Feb 8, 2020
2,888
4,911
136
There are a few ways that a new I/O die could increase performance.
  • Lower memory latency
  • Higher RAM bandwidth
  • Higher bandwidth from the I/O die to the CCD
  • Lower latency between CCDs
I might be missing some, but they all basically come down to improved bandwidth and latency for the L3 cache or RAM.

With that in mind, the question becomes: does SPEC benefit from better L3 and RAM? If you look at Chips and Cheese's 9800X3D vs 9950X benchmarks, I would say yes.


Improved memory latency helps SPECint, but improving memory bandwidth only helps SPECfp. Improving the latency between CCDs won't really help SPECrate MT scores because it is a fully parallel benchmark, running multiple independent copies of the SPEC tests.

The reason SPEC is helped by X3D is because of the size of the cache reducing the effective memory latency for stuff that is pushed out of the smaller non-X3D L3 and has to be reloaded from main memory.
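To put rough numbers on that, here is a minimal sketch of the usual average-access-time model: a hit is served at L3 latency, a miss pays the L3 lookup plus the trip to DRAM. The function name, the latencies, and the hit rates are all assumptions picked to show the trend, not measurements of any Zen 5 part.

```python
def effective_latency_ns(l3_hit_rate, l3_latency_ns, dram_latency_ns):
    """Average latency seen by an access that has already missed L2.

    Simple two-level model: hits are served at L3 latency, misses pay
    the L3 lookup and then the DRAM access on top of it.
    """
    if not 0.0 <= l3_hit_rate <= 1.0:
        raise ValueError("l3_hit_rate must be between 0 and 1")
    miss_rate = 1.0 - l3_hit_rate
    return l3_latency_ns + miss_rate * dram_latency_ns


# Hypothetical values, chosen only to illustrate the effect of a bigger L3.
L3_NS = 10.0
DRAM_NS = 80.0

small_l3 = effective_latency_ns(0.60, L3_NS, DRAM_NS)  # smaller cache, more misses
large_l3 = effective_latency_ns(0.80, L3_NS, DRAM_NS)  # larger cache, fewer misses

print(f"smaller L3: {small_l3:.1f} ns, larger L3: {large_l3:.1f} ns")
```

In this toy model the DRAM latency itself never changes; only the fraction of accesses that have to pay it does, which is the effect described above.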
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
AMD makes a new core every 2 years for the past 8 years, why are you surprised that Zen 6 is coming out 2 years after Zen 5?
The cadence has slowed over time. I am not surprised, in fact I argue that it is a waste of time and energy to design something new without the advantage of a significantly improved process to do it on.
What exactly is termed IPC?

(1) Is it the basic core processing "X" instructions for "Y" cycles = X/Y IPC?

(2) Is it the wider CPU processing "X" instructions for "Y" cycles = X/Y IPC?

(3) Something else?


(2) = real world usage for user. IOD improvements apply here but not in case (1), which seems to assume zero external latency effects. I would assume feeding the cores as relevant to IPC, but it seems core designers think more narrowly (my impression) as in case (1).
Unless we are talking about pure single threaded performance ..... which I think is silly in this day and age, it is my opinion that performance per clock is a much better gauge of the design of a core. This is why I keep questioning the assertion that Skymont is a great idea for a unified design. Only when viewed from a single threaded POV is this true. Once you are talking about MT applications, how much performance per core does Skymont attain compared to Zen 5c with SMT?
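For what it's worth, the two IPC definitions quoted above can be written down as a small sketch. The function names and cycle counts are hypothetical; the split into "busy" and "stall" cycles is a simplifying assumption, since real cores overlap the two.

```python
def core_ipc(instructions, busy_cycles):
    """Definition (1): instructions per cycle, ignoring external stalls."""
    return instructions / busy_cycles


def effective_ipc(instructions, busy_cycles, stall_cycles):
    """Definition (2): instructions per cycle including memory/fabric stalls."""
    return instructions / (busy_cycles + stall_cycles)


# Hypothetical counts for one workload on one core design.
INSTR = 1_000_000
BUSY = 250_000

core_only = core_ipc(INSTR, BUSY)                  # unchanged by the IOD
old_iod = effective_ipc(INSTR, BUSY, 250_000)      # more cycles lost waiting
new_iod = effective_ipc(INSTR, BUSY, 150_000)      # fewer cycles lost waiting

print(core_only, old_iod, new_iod)
```

Under these assumptions a faster IOD moves number (2) while leaving number (1) untouched, which is the distinction the quoted post draws.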
The main counter point to a new IOD is AMD's history of the cheapest possible solutions. They have done "the XT" twice, why not with Zen 5 too?
I am not complaining about AMD's efforts to remain very profitable while Intel is burning through money like a drunken sailor (FYI, I have done this personally in a very literal sense). Zen 5 manages to outperform everything Intel has to offer in nearly every way while doing it on a less expensive and less dense process node.
There is something planned for next year that isn't the cheapest possible solution. But no one will want it and I'm not convinced it'll help much.
No one here will want it, perhaps. We are not the majority of customers. I see AMD gaining traction in the corporate laptop market next year by offering the absolute best performance per $. Intel has spent decades using a loss leader in one segment and offsetting it with a profit leader in another. Doing this squeezed AMD's market share and profit, which has kept AMD from building up further and investing to become more competitive. I think that things have changed. Cheap is good sometimes.
 

Thunder 57

Diamond Member
Aug 19, 2007
3,079
4,873
136
The cadence has slowed over time. I am not surprised, in fact I argue that it is a waste of time and energy to design something new without the advantage of a significantly improved process to do it on.

Zen 2 to Zen 3 had great gains gen over gen and both were N7. That said I don't think AMD can gain much more with a better IOD die and fclk.
 
Reactions: Tlh97 and yuri69

OneEng2

Senior member
Sep 19, 2022
259
356
106
Zen 2 to Zen 3 had great gains gen over gen and both were N7. That said I don't think AMD can gain much more with a better IOD die and fclk.
The core die size did go up from 74mm2 to 80mm2 though. Additionally, Zen 2 was the first chiplet design and like Intel, AMD had latency issues in the first design that they fixed with Zen 3 giving basically the same design much better performance.
 

GTracing

Member
Aug 6, 2021
168
396
106
The core die size did go up from 74mm2 to 80mm2 though. Additionally, Zen 2 was the first chiplet design and like Intel, AMD had latency issues in the first design that they fixed with Zen 3 giving basically the same design much better performance.
I don't know what you mean by "latency issues" but that's not what AMD says the IPC comes from.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
Why do you think 10% performance is all that zen 6 will bring to the table? That's just the suggested IPC increase, there are other vectors for improved performance...
Zen 5 is already at 5.7GHz. There's practically no room to grow there. They used up most of it in Zen 4. We aren't going to get another 10%. Intel literally kills Raptor Lake to get there, and in Arrow Lake gives up significant performance potential.

In certain limited scenarios they will get faster. But when they say "10%" they mean on average: across a wide range of workloads it'll get 10%. It doesn't matter whether you have 100,000MT CUDIMM backed up by Optane running at 1THz, and Infinity Fabric at 10GHz. It's all combined.

Well, maybe they can get 15% if not being able to use the clustered decode is a missed target. But that's about it. This is a losing battle.
This is why I keep questioning the assertion that Skymont is a great idea for a unified design. Only when viewed from a single threaded POV is this true. Once you are talking about MT applications, how much performance per core does Skymont attain compared to Zen 5c with SMT?
What's the size of Zen 4c without L2 cache? The performance per clock difference is 25-30% in Integer and 60% in FP. Gracemont clocks quite a bit higher though on the client part at 4.4GHz.

Now on Skymont that difference gets reduced to 5-10% in Integer and 20-25% on FP. Skymont clocks 5% higher at 4.6GHz, even though Lion Cove had to clock lower by 5%, both on the same die. Turin Dense on N3E is 1.9mm2 by the way.
 
Reactions: Tlh97 and OneEng2

moinmoin

Diamond Member
Jun 1, 2017
5,145
8,226
136
Going from Zen 2 to 3, the most obvious change was going from a 4-core CCX to an 8-core CCX. That both doubled the L3$ available to all cores in the CCD ("gamecache") and made cross-CCX latency less of an issue (limited to the two-CCD x900 and x950 chips from that point onward). All this didn't even concern the redesigned core.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
What @moinmoin said. Also,
I don't know what you mean by "latency issues" but that's not what AMD says the IPC comes from.
View attachment 112010
I don't know what @OneEng2 had in mind particularly either. But when AMD highlights, for example, the changed cache prefetching policy of Zen 3, then this is about reduced latency. (See the middle part of page 5 of Ian Cutress' Ryzen 5000 deep dive.) Edit: I haven't looked up to which extent other items in that list, i.e. execution engine etc., were about reduced latency (versus improved throughput, although you hardly can look at either in isolation).

BTW, AMD apparently didn't disclose what the 25 workloads for their 19% figure were. Hence it is impossible to say if these workloads were largely insensitive to the CCX change. Only then would the left side of this slide correspond well with the right side of the slide. (End note R5K-003: See e.g. press release at amd.com.)
 
Reactions: Tlh97

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
The cadence has slowed over time. I am not surprised, in fact I argue that it is a waste of time and energy to design something new without the advantage of a significantly improved process to do it on.
On the other hand, they diversify their products and thus address more target markets or address markets better.
 

GTracing

Member
Aug 6, 2021
168
396
106
What @moinmoin said. Also,

I don't know what @OneEng2 had in mind particularly either. But when AMD highlights, for example, the changed cache prefetching policy of Zen 3, then this is about reduced latency. (See the middle part of page 5 of Ian Cutress' Ryzen 5000 deep dive.)

BTW, AMD apparently didn't disclose what the 25 workloads for their 19% figure were. Hence it is impossible to say if these workloads were largely insensitive to the CCX change. Only then would the left side of this slide correspond well with the right side of the slide. (End note R5K-003: See e.g. press release at amd.com.)
AMD actually did say which 25 workloads they tested.



I can see where better prefetching could be considered lowering latency, but that's clearly not what OneEng2 was talking about, since he said the first generation chiplets were the issue and prefetching has nothing to do with chiplets.
 

Meteor Late

Member
Dec 15, 2023
116
98
61
Zen 5 is already at 5.7GHz. There's practically no room to grow there.

Meh, I used to think the same about the 5GHz barrier. We don't know, but I'm willing to bet 6GHz is possible with TSMC N3P.
If you think about it, Intel's 6GHz was achieved on a very old node (Intel 7), and they are now at 5.7GHz with just TSMC N3B. Are you telling me N3B to N3P doesn't get you from 5.7GHz to 6GHz? I doubt it.
 
Reactions: Tlh97

Meteor Late

Member
Dec 15, 2023
116
98
61
AMD actually did say which 25 workloads they tested.

View attachment 112036

I can see where better prefetching could be considered lowering latency, but that's clearly not what OneEng2 was talking about, since he said the first generation chiplets were the issue and prefetching has nothing to do with chiplets.
Yeah that 19% IPC uplift is BS because it's heavily gaming focused, that's not what you do when you want to assess average IPC uplift across many applications.
Maybe it's truly 19% or so in Spec, but this test for sure was wrong if they wanted to show average IPC.
 

Thunder 57

Diamond Member
Aug 19, 2007
3,079
4,873
136
Yeah that 19% IPC uplift is BS because it's heavily gaming focused, that's not what you do when you want to assess average IPC uplift across many applications.
Maybe it's truly 19% or so in Spec, but this test for sure was wrong if they wanted to show average IPC.

What kind of garbage post is this? 19% in games, 19% or so in SPEC, but it's all wrong if you look at "average IPC"?
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
On the other hand, they diversify their products and thus address more target markets or address markets better.
I think this is the immediate future: more non-homogeneous computing aimed at more specific workloads.

A general-purpose computer is OK at doing many things. Specific hardware, like a DSP for example, can perform its specific task hundreds or even thousands of times better at a fraction of the cost.
 

gdansk

Diamond Member
Feb 8, 2011
3,276
5,186
136
Yeah that 19% IPC uplift is BS because it's heavily gaming focused, that's not what you do when you want to assess average IPC uplift across many applications.
Maybe it's truly 19% or so in Spec, but this test for sure was wrong if they wanted to show average IPC.
It basically matched for Zen 3. But if you haven't realized by now, AMD does their marketing numbers backward. Engineers say what they achieved in SPEC, and then the marketing bros pick a bunch of benchmarks and games to put on the slide such that the geomean is around the target given.

It backfires tremendously when the SPEC and gaming results aren't close like Zen 5.
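As a rough illustration of why the selection matters, here is a minimal sketch of a geomean over per-benchmark speedups. The helper name and every ratio below are made up for the example; none of them are AMD's numbers.

```python
import math


def geomean_uplift(ratios):
    """Geometric mean of per-benchmark speedup ratios (new / old)."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical speedups for one chip across two different benchmark picks.
pick_a = [1.25, 1.22, 1.18, 1.12]   # selection leaning on the best results
pick_b = [1.12, 1.08, 1.05, 1.03]   # selection leaning on the weakest results

print(f"pick A geomean: {geomean_uplift(pick_a):.3f}")
print(f"pick B geomean: {geomean_uplift(pick_b):.3f}")
```

The same hardware yields a different headline figure depending on which results go into the list, so a slide number only tells you about the chosen mix.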
 