Discussion AMD cools the pace to Moore's Law Death


DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
Meh, I used to think the same about the 5GHz barrier. We don't know, but I'm willing to bet 6GHz is possible with TSMC N3P.
If you think about it, Intel reached 6GHz on a very old node (Intel 7), and they are now at 5.7GHz on TSMC N3B. Are you telling me N3B to N3P doesn't get you from 5.7GHz to 6GHz? I doubt it.
Both 5GHz and 6GHz have been only in the realm of overclockers until very, very recently.

5GHz was first reached with the advent of heatsinks that weigh as much as some thin-and-light laptops, and of common watercooling. And with 5GHz CPUs there's very little overclocking headroom, meaning it was used up in factory settings. There has also been increased understanding of how things work (electromagnetic interference, crosstalk, capacitance), along with, perhaps, circuitry such as better error correction.

While 5GHz was reached that way, 6GHz was done with exotic cooling. Remember the world record is still only 9.1GHz. It cannot be sustained in any way, because it requires liquid nitrogen, on a hand-picked lucky part, under voltages that will fry the chip in days. So there's a point that cannot be exceeded, no matter how much anyone wishes for it.

At some point, it's not process that gives you clocks. It's purely everything else. Optimization.
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
AMD usually uses SpecINT2017 as the base for its IPC claims.
In what ways is SpecINT2017 representative of application performance, and where does it not represent application performance?

As far as I know, there aren't many people running SpecINT2017 to do their job today.

On the original topic, I have several more points on the brick wall of process enhancement as a method of increasing processor performance gen over gen:

  1. As mentioned before, the cost of new process equipment is rising exponentially. This cost must be amortized across all the chips produced in the lifetime of that equipment.
  2. The new processes require more time to produce a chip (more process passes), which greatly increases the cost of making a wafer of chips.
  3. Even IF we were able to continue Moore's law of doubling transistor count every 18 months (which we are not), increasing the transistor count is reaching diminishing returns as well (with the exception of highly parallel workloads in DC or workstation). Going from 1 to 2 INT execution units gives you a huge uplift through ILP and OOE; going from 4 to 8? Not so much.
The most likely path forward seems to be more heterogeneous computing, with more specialized compute units for specific tasks.
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
Zen 5 is already at 5.7GHz. There's practically no room to grow there. They used up most of it in Zen 4. We aren't going to get another 10%. Intel literally kills Raptorlake to get there, and in Arrowlake sacrifices significant performance potential.

In certain limited scenarios they will get faster. But when they say "10%" they mean on average: across a wide range of workloads it'll get 10%. It doesn't matter whether you have 100,000MT CUDIMM backed up by Optane running at 1THz, and Infinity Fabric at 10GHz. It's all combined.

Well, maybe they can get 15% if not being able to use the clustered decode is a missed target. But that's about it. This is a losing battle.

What's the size of Zen 4c without L2 cache? The performance per clock difference is 25-30% in Integer and 60% in FP. Gracemont clocks quite a bit higher though on the client part at 4.4GHz.

Now on Skymont that difference gets reduced to 5-10% in Integer and 20-25% on FP. Skymont clocks 5% higher at 4.6GHz, even though Lion Cove had to clock lower by 5%, both on the same die. Turin Dense on N3E is 1.9mm2 by the way.
Wow. Zen 5c on N3E is 1.9mm2 compared to Skymont at 1.73mm2 on N3B! I have found it difficult to find direct PPA comparisons of N3B vs N3E, but from what I have found, N3B had 1.7x density compared to N5 while N3E had 1.6x density over N5, so is it reasonable to guess that N3B is about 10% more dense than N3E?

Given the die sizes above, Skymont on N3B is ~9% smaller than Zen 5c on N3E (equivalently, Zen 5c is ~9.8% larger). 10% anyone?

So normalized for the same process node, Skymont die size is identical to Zen 5c, operates at lower IPC, and lacks SMT and AVX512.

Kind of a context swerve in the thread, but this makes Zen 5c seem pretty darned impressive and Skymont a little less so, don't we think?
 
Reactions: Tlh97

DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
Wow. Zen 5c on N3E is 1.9mm2 compared to Skymont at 1.73mm2 on N3B! I have found it difficult to find direct PPA comparisons of N3B vs N3E, but from what I have found, N3B had 1.7x density compared to N5 while N3E had 1.6x density over N5, so is it reasonable to guess that N3B is about 10% more dense than N3E?
Zen 5c on N3E is 1.9mm2 without the L2 cache. Skymont's 1.73mm2 is with L2. Without it it's 1.15mm2.
 
Reactions: Tlh97

itsmydamnation

Platinum Member
Feb 6, 2011
2,978
3,656
136
Zen 5 is already at 5.7GHz. There's practically no room to grow there. They used up most of it in Zen 4. We aren't going to get another 10%. Intel literally kills Raptorlake to get there, and in Arrowlake sacrifices significant performance potential.

In certain limited scenarios they will get faster. But when they say "10%" they mean on average: across a wide range of workloads it'll get 10%. It doesn't matter whether you have 100,000MT CUDIMM backed up by Optane running at 1THz, and Infinity Fabric at 10GHz. It's all combined.

Well, maybe they can get 15% if not being able to use the clustered decode is a missed target. But that's about it. This is a losing battle.
Zen 5 literally has multiple critical performance structures* that are half the size of those in the "darling cores" you worship in this thread. For example, Zen 5 has half the potential OoOE window of Oryon, let alone whatever Apple is up to these days.

Seriously, do you actually look at the things you're posting about, or do you just have that heavy a bias? No room to improve performance... seriously.

*=
iPRF
ROB
dTLB
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
Zen 5c on N3E is 1.9mm2 without the L2 cache. Skymont's 1.73mm2 is with L2. Without it it's 1.15mm2.
Quite a bit smaller then. Still, there is quite a performance difference in MT between a single Zen 5c and a single Skymont. The question is if the difference in performance justifies the extra die size.
 

Meteor Late

Member
Dec 15, 2023
116
98
61
Quite a bit smaller then. Still, there is quite a performance difference in MT between a single Zen 5c and a single Skymont. The question is if the difference in performance justifies the extra die size.

I don't think so. If you equalize area at the smallest core counts where the totals converge, so maybe 3 Zen 5c vs 5 Skymont, I think Skymont wins.
 
Reactions: Tlh97

gdansk

Diamond Member
Feb 8, 2011
3,276
5,186
136
I have it on good authority (my own measurements from low resolution pictures) the area measurement presented above is incorrect. Closer to 1.65-1.7mm²
 
Reactions: Tlh97 and DavidC1

OneEng2

Senior member
Sep 19, 2022
259
356
106
I don't think so. If you equalize area at the smallest core counts where the totals converge, so maybe 3 Zen 5c vs 5 Skymont, I think Skymont wins.
In threaded loads (DC in particular) Zen 5c gains 40% with SMT. Zen 5c also has marginally higher single-thread performance per clock IIRC, but even if you give Skymont a 1:1 score in single thread, in your example you are looking at 3 * 1.4 = 4.2 for Zen 5c and 5 * 1 = 5 for Skymont in MT (without AVX512).

Note that the actual die area would be 5.7mm2 for 3 Zen 5c vs 5.75mm2 for 5 Skymont (so about equal, as you say).
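
For anyone who wants to check the sums, here is a rough sketch of that comparison in Python. All inputs are the figures quoted in this thread (1.9mm2 and 1.15mm2 per core, a 40% SMT gain, 1:1 single-thread parity), so treat them as assumptions rather than measurements; the helper name `cluster` is just for illustration.

```python
# Figures quoted in the thread (assumptions, not measurements).
ZEN5C_AREA_MM2 = 1.9     # Zen 5c on N3E, without L2
SKYMONT_AREA_MM2 = 1.15  # Skymont on N3B, without L2
SMT_UPLIFT = 1.4         # claimed 40% MT gain from SMT on Zen 5c


def cluster(cores, area_per_core, throughput_per_core):
    """Return (total area in mm2, relative MT throughput) for a group of cores."""
    return cores * area_per_core, cores * throughput_per_core


# 3 Zen 5c with SMT vs 5 Skymont at 1:1 single-thread parity.
zen_area, zen_mt = cluster(3, ZEN5C_AREA_MM2, SMT_UPLIFT)
sky_area, sky_mt = cluster(5, SKYMONT_AREA_MM2, 1.0)

print(f"3x Zen 5c:  {zen_area:.2f} mm2, MT score {zen_mt:.1f}")
print(f"5x Skymont: {sky_area:.2f} mm2, MT score {sky_mt:.1f}")
```

Changing `SMT_UPLIFT` or the per-core areas shows how sensitive the conclusion is to those inputs.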

However, in DC the end customer would end up paying licensing on the extra Skymont cores for all software that uses a per-core license model.

Still, it paints Skymont as a very good desktop/laptop MT core for sure (where no software is licensed like this) .... as long as it can clock as high as Zen 5c that is.
 

burninatortech4

Senior member
Jan 29, 2014
686
352
136
AMD has a lot of work to do on the IOD. They could (but I don't think they will) refresh Zen 5 as Zen 5+ with a new IOD. Something like DDR5 6400 (which still preserves FCLK:MCLK ratio) and 2 RDNA 3 compute units.

I'm not worried about the CCD. But they shouldn't get lazy either - not that Intel's roadmap is anything to fear.
 
Reactions: Tlh97 and Thibsie

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,592
106
I have it on good authority (my own measurements from low resolution pictures) the area measurement presented above is incorrect. Closer to 1.65-1.7mm²
For Skymont?

I took a measurement of Skymont in a high resolution Lunar Lake die shot, and the area is 1.1 mm² without L2 cache.
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
Ok, so reworked, we are saying that Skymont on N3B w/o L2 = 1.15 mm2

Zen 5c on N3E is 1.65mm2. N3B ~ 10% more dense than N3E -> Zen 5c on N3B would be about 1.5mm2

1 Zen 5c ~= 1.5 Skymont in non AVX512 MT app performance.

So 2 Zen 5c ~= 3 Skymont in performance MT

2 Zen 5c = 3mm2 (N3B equivalent)
3 Skymont = 3.45mm2 (N3B)

Skymont design suffers a 3/2 per-core licensing deficit to Zen 5c.

Again, it looks an awful lot like Zen 5c is the better solution for MT.
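
A quick sketch of the reworked numbers, in case anyone wants to plug in their own. The 10% density advantage, the 1.65mm2 and 1.15mm2 core areas, and the 1.5x MT ratio are all taken from this post and are assumptions, not measured data; `normalize_area` is a made-up helper name.

```python
# Inputs are the figures from the post above; all are assumptions, not measurements.
ZEN5C_N3E_MM2 = 1.65         # Zen 5c area estimate on N3E
N3B_OVER_N3E_DENSITY = 1.10  # assumed density advantage of N3B over N3E
SKYMONT_N3B_MM2 = 1.15       # Skymont core area figure used in the post
ZEN5C_MT_PER_SKYMONT = 1.5   # assumed MT throughput of one Zen 5c, in Skymont units


def normalize_area(area_mm2, density_ratio):
    """Scale an area to a denser node by dividing by the density ratio."""
    return area_mm2 / density_ratio


zen5c_n3b = normalize_area(ZEN5C_N3E_MM2, N3B_OVER_N3E_DENSITY)

# Equal-throughput configuration: 2 Zen 5c vs 3 Skymont.
zen_cores, sky_cores = 2, 3
zen_area = zen_cores * zen5c_n3b
sky_area = sky_cores * SKYMONT_N3B_MM2
zen_mt = zen_cores * ZEN5C_MT_PER_SKYMONT
sky_mt = sky_cores * 1.0

print(f"Zen 5c on N3B (est.): {zen5c_n3b:.2f} mm2 per core")
print(f"2x Zen 5c:  {zen_area:.2f} mm2, MT {zen_mt:.1f}, {zen_mt / zen_area:.2f} MT/mm2")
print(f"3x Skymont: {sky_area:.2f} mm2, MT {sky_mt:.1f}, {sky_mt / sky_area:.2f} MT/mm2")
print(f"Per-core license ratio (Skymont/Zen 5c): {sky_cores / zen_cores:.1f}")
```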

Does anyone know what the max clock speeds are for current implementations of Skymont and Zen 5c? I know that Skymont is close, but slightly behind Zen 5c in single core IPC, but clock rates matter.
 

Meteor Late

Member
Dec 15, 2023
116
98
61
Ok, so reworked, we are saying that Skymont on N3B w/o L2 = 1.15 mm2

Zen 5c on N3E is 1.65mm2. N3B ~ 10% more dense than N3E -> Zen 5c on N3B would be about 1.5mm2

1 Zen 5c ~= 1.5 Skymont in non AVX512 MT app performance.

So 2 Zen 5c ~= 3 Skymont in performance MT

2 Zen 5c = 3mm2 (N3B equivalent)
3 Skymont = 3.45mm2 (N3B)

Skymont design suffers a 3/2 per-core licensing deficit to Zen 5c.

Again, it looks an awful lot like Zen 5c is the better solution for MT.

Does anyone know what the max clock speeds are for current implementations of Skymont and Zen 5c? I know that Skymont is close, but slightly behind Zen 5c in single core IPC, but clock rates matter.

I think N3B is more like 6% more dense, not 10%.
-N3B 1.7x compared to N5
-N3E 1.6x compared to N5
So 1.7/1.6 = 1.0625
Zen 5c would be more like 1.55 mm2, still not much of a difference anyway.
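
The same calculation as a small Python sketch, with the earlier 10% guess alongside for comparison. The 1.7x and 1.6x scaling factors and the 1.65mm2 area are the figures quoted in this thread, so they are assumptions; `relative_density` is only an illustrative name.

```python
def relative_density(a_vs_base, b_vs_base):
    """Density of node A relative to node B, given both relative to a common base node."""
    return a_vs_base / b_vs_base


ZEN5C_N3E_MM2 = 1.65  # Zen 5c area estimate on N3E (from the thread)

# N3B and N3E density, both expressed relative to N5 (figures quoted in the thread).
ratio = relative_density(1.7, 1.6)

zen5c_n3b = ZEN5C_N3E_MM2 / ratio           # using the derived ratio
zen5c_n3b_10pct = ZEN5C_N3E_MM2 / 1.10      # using the earlier 10% guess

print(f"N3B vs N3E density ratio: {ratio:.4f}")
print(f"Zen 5c on N3B: {zen5c_n3b:.2f} mm2 (vs {zen5c_n3b_10pct:.2f} mm2 with the 10% guess)")
```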
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
I think N3B is more like 6% more dense, not 10%.
-N3B 1.7x compared to N5
-N3E 1.6x compared to N5
So 1.7/1.6 = 1.0625
Zen 5c would be more like 1.55 mm2, still not much of a difference anyway.
Agree. Not enough difference to shake a stick at, and it is still a totally theoretical situation.

The real point of the entire thread is that it may not make sense for companies to churn out ground-up cores every 18 months. Let's say that it takes 36-48 months (3-4 full years) before process technology improves by ~15-20% as we move forward. I believe that a new design only makes sense when a significant increase in transistor budget is possible.

I do agree that with such long times between major updates to the design, there needs to be a tweaked design in between.

Still, no matter how you paint the picture, it means we will be seeing a big slowdown in the advancement of processing power in the future.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
Zen 5 literally has multiple critical performance structures* that are half the size of those in the "darling cores" you worship in this thread. For example, Zen 5 has half the potential OoOE window of Oryon, let alone whatever Apple is up to these days.

Seriously, do you actually look at the things you're posting about, or do you just have that heavy a bias? No room to improve performance... seriously.
Right, so pointing out a chasm in performance per clock and performance per watt, and with M4 in absolute performance too, has to do with "bias".

no room to improve performance .... seriously
The problem with x86 vendors is TTM (time to market) and execution. A near 2-year cycle from Zen 5 means it would need to be a 20% gain minimum.

No, I said there was no room to increase clocks. How people expect chips to easily reach 6.5-7GHz is beyond me. And even if they could, it would come with big sacrifices, aka Raptorlake, aka Arrowlake.
I have it on good authority (my own measurements from low resolution pictures) the area measurement presented above is incorrect. Closer to 1.65-1.7mm²
That's still a big difference.
 

OneEng2

Senior member
Sep 19, 2022
259
356
106
No, I said there was no room to increase clocks. How people expect chips to easily reach 6.5-7GHz is beyond me. And even if they could, it would come with big sacrifices, aka Raptorlake, aka Arrowlake.

That's still a big difference.
Agree. I think that people who expect more than minor bumps in clock speeds over the next decade are likely young and don't remember Netburst. Still waiting for that 10GHz processor Intel promised.

... and it is a big difference. I am surprised at how small Zen 5c is on N3E.

I would still like to see something that shows performance per clock for Skymont vs Zen 5c in a MT environment. Everyone here seems fixated on IPC which I feel is the incorrect measurement.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
Agree. I think that people who expect more than minor bumps in clock speeds over the next decade are likely young and don't remember Netburst. Still waiting for that 10GHz processor Intel promised.
Anyone who paid attention to world overclocking records could see it coming; you didn't need to know an ounce about CPU uarch. 5GHz was all water until quite recently. Water cooling was very rare in those days, since it was expensive and unreliable, and it leaked quite a bit. Above that it was exotic cooling, not sustainable for perhaps more than an hour.

I'm pleasantly surprised they could get the clocks this high. But the world overclock record is set by the degradation-prone Raptorlake, and it's only at 9.1GHz. And you have near zero OC headroom today, when back in those days you could get 30%.

It's amazing how a significant subset of Intel engineering believed they could reach 5, 6, 7GHz so easily. I guess if you don't constantly try to look at the big picture you get pigeonholed, no matter how smart you are.
I would still like to see something that shows performance per clock for Skymont vs Zen 5c in a MT environment. Everyone here seems fixated on IPC which I feel is the incorrect measurement.
The core is the hard part. If they want a better dedicated chip then they need to at least move to a dual core cluster, or even a core with dedicated L2.

Core 2 was merely ok for servers. But when it moved to IMC and P2P bus it showed how good it really was.
 