Question AMD Phoenix/Zen 4 APU Speculation and Discussion


Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
A full-fat core designed with the potential for high clocks will have lower efficiency at lower clocks, despite being able to hit those clocks with lower voltage.
At iso-voltage, leakage and CaC will be higher for the big core, but at iso-performance, the small core will require substantially higher voltage. We don't have enough information to conclude how those balance out.
 

naukkis

Senior member
Jun 5, 2002
782
636
136
At iso-voltage, leakage and CaC will be higher for the big core, but at iso-performance, the small core will require substantially higher voltage. We don't have enough information to conclude how those balance out.

We do know from existing mobile designs that there is a crossover point at some performance level - below it efficiency cores will win and above it performance cores will be more efficient. And that point is quite high for typical mobile loads. So for mobile devices, best performance and battery life will need two differently optimized core groups. We don't yet know AMD's core configuration for Phoenix - but there are basically two possibilities used by ARM. The big.LITTLE approach with two core groups - for AMD that would mean a different CCX for each group - or the later DynamIQ, where different cores are laid out in the same group - for AMD that would mean that both high-performance and efficient cores are part of the same CCX - and so able to share their L3 cache too. I don't see the point for AMD to go with that older ARM route.
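
The crossover idea can be made concrete with a toy model. This is only a sketch under stated assumptions: dynamic power follows the textbook C*V^2*f relation, leakage grows linearly with voltage, and each core has a linear V/f curve above a shared Vmin. Every constant below (the `BIG` and `DENSE` parameter sets) is hypothetical and chosen for illustration; none of it is measured Zen 4 or Zen 4c data.

```python
def voltage(f_ghz, v_min, f_at_vmin, ghz_per_volt):
    """Voltage needed to reach f_ghz, assuming a linear V/f curve above v_min."""
    if f_ghz <= f_at_vmin:
        return v_min
    return v_min + (f_ghz - f_at_vmin) / ghz_per_volt


def power(f_ghz, cac, leak, v_min, f_at_vmin, ghz_per_volt):
    """Dynamic power (cac * V^2 * f) plus a leakage term linear in V."""
    v = voltage(f_ghz, v_min, f_at_vmin, ghz_per_volt)
    return cac * v * v * f_ghz + leak * v


# Hypothetical cores: the big core has more capacitance and leakage, but
# reaches a higher clock at Vmin and gains more GHz per extra volt.
BIG = dict(cac=1.0, leak=0.30, v_min=0.65, f_at_vmin=2.5, ghz_per_volt=5.0)
DENSE = dict(cac=0.8, leak=0.20, v_min=0.65, f_at_vmin=1.8, ghz_per_volt=3.5)


def crossover(lo=0.5, hi=4.0, steps=50):
    """Bisect for the frequency at which both cores draw the same power."""
    def diff(f):
        return power(f, **DENSE) - power(f, **BIG)

    # The dense core must win at lo and lose at hi for a crossover to exist.
    assert diff(lo) < 0 < diff(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    fx = crossover()
    print(f"crossover near {fx:.2f} GHz (toy numbers only)")
    for f in (1.5, 3.5):
        print(f, round(power(f, **DENSE), 3), round(power(f, **BIG), 3))
```

With these made-up parameters the dense core draws less power below the crossover and the big core draws less above it. Where the real crossover sits depends entirely on the actual curves, which nobody in this thread has.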
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
With same rail both core groups can efficiently share their core-level L3-cache. That might be bigger benefit than performance optimizations with different power rails.
This will never ever be the case!
The two CCX will be totally separate from each other regarding Caches. Otherwise, as stated before: I'll eat a broom.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
At iso-voltage, leakage and CaC will be higher for the big core, but at iso-performance, the small core will require substantially higher voltage. We don't have enough information to conclude how those balance out.
What makes you think that?
Quite the opposite should, and very likely will, be true.
Or to be even more specific: No matter the voltages I fully expect Zen4c to get more work done per Joule than Zen4.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
We do know from existing mobile designs that there is a crossover point at some performance level - below it efficiency cores will win and above it performance cores will be more efficient. And that point is quite high for typical mobile loads. So for mobile devices, best performance and battery life will need two differently optimized core groups. We don't yet know AMD's core configuration for Phoenix - but there are basically two possibilities used by ARM. The big.LITTLE approach with two core groups - for AMD that would mean a different CCX for each group - or the later DynamIQ, where different cores are laid out in the same group - for AMD that would mean that both high-performance and efficient cores are part of the same CCX - and so able to share their L3 cache too. I don't see the point for AMD to go with that older ARM route.
The very strong evidence for separate CCX is that AMD is basically recycling everything - in this case the design blocks from Bergamo and PHX1. They are doing things smarter, not harder.
 

bakyt115

Member
Nov 21, 2016
84
153
106
Much more frustrating: it seems this 'Zen4C' has the same L3 size as Zen4 in PHX2?

Hasn't it been confirmed that Zen4C has half the L3 of Zen4? How much smaller would this 'Zen4C' be than 'Zen4 with half the L3 in APUs'? And after all, couldn't the L3 be globally shared between all cores rather than being separated into two slices/CCX?
for the APU L3 is already halved
 
Reactions: Tlh97 and Kaluan

eek2121

Diamond Member
Aug 2, 2005
3,051
4,276
136
That seems very unclear. They certainly take less area, but if they have a ~GHz penalty, the full fat core should be able to hit the same clocks at a substantially lower voltage. Will be interesting to get more data on for sure.
Intel/AMD chase the last 10% of performance at the expense of enormous amounts of power. If you limit a Zen 4 chip to 3.7 GHz, you will notice power consumption drops like a rock.

Now take a density/power optimized chip and test again, power consumption will end up being even lower, especially if the chip is monolithic.

for the APU L3 is already halved

Correct, however it has not been confirmed if this will be the case for all APUs going forward.

I feel like some folks here don’t realize how much modern chips sacrifice in terms of density just to hit high frequencies.

Side Note: AMD could have a killer product if they made a 170W 8+16 chip with full L3 on the big cores, provided the big cores could hit 5.5 GHz+ ST.

Such a chip would surely beat all of Intel’s chips quite easily.
 

RTX2080

Senior member
Jul 2, 2018
322
511
136
for the APU L3 is already halved

I know that. The reason this comes as a surprise to me is that I heard months ago that the L3 cache of this 'Zen4C in PHX2' is also 'half of the half', which means it has half the L3 of Zen4C.

L3 size (per core) comparison
Zen4: 4 MB
Zen4C: 2 MB
Zen4 APU: 2 MB
rumored Zen4C APU: 1 MB

BUT the leak by Xinoassassin said it also has the same L3 size as the 'Zen4 APU', which would be 2 MB. I wonder how small this 'Zen4C APU' CCX will end up.

edit: the following is also important: AMD doesn't use a hardware scheduler and just uses CPPC instead:

 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
This isn't Intel we are talking about. Too early to make that kind of definitive statement.
What makes you think that?
Quite the opposite should, and very likely will, be true.
Or to be even more specific: No matter the voltages I fully expect Zen4c to get more work done per Joule than Zen4.
These are very fundamental scaling trends. It's the same architecture, and thus same IPC. What AMD has done is built a design that sacrifices clock speed for area. This is great if you want to have a whole bunch of cores running near Vmin for max perf/watt, but in terms of per core performance, any point Zen 4 achieves, Zen 4c will need higher voltage (further up the VF curve) to hit, and we all know how badly voltage scaling hurts power.

Yes, this is very similar to what we see on the Intel side, but that's because the same rules apply. If anything, AMD should demonstrate this even more clearly given the architectural consistency.
Intel/AMD chase the last 10% of performance at the expense of enormous amounts of power. If you limit a Zen 4 chip to 3.7 GHz, you will notice power consumption drops like a rock.

Now take a density/power optimized chip and test again, power consumption will end up being even lower, especially if the chip is monolithic.
This isn't just clipping off the high voltage region; it's fundamentally shifting the entire curve down. If AMD could halve the area just by sacrificing high V support, vanilla Zen 4 would have very little reason to exist in servers, which rarely operate in that region anyway.

And you can see this in the numbers cortex posted above. The performance and power gaps between the two are similar. Which given superlinear power scaling means that Zen 4 would be substantially lower power if you scaled the clocks down to match. The only factor that could break this trend would be leakage dominance at low V.
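
The "similar gaps plus superlinear scaling" step can be worked through as arithmetic. A minimal sketch, assuming the same IPC (so performance tracks frequency) and that power scales as frequency raised to some exponent between 2 and 3 in the region of interest. The 1.3x figures are placeholders, not the numbers posted earlier in the thread, and the model ignores leakage and the flattening of the curve near Vmin, which is exactly the caveat noted above.

```python
def scaled_power(power_ratio, perf_ratio, exponent=3.0):
    """Relative power of the faster core once clocked down to the slower core.

    power_ratio: faster core's power / slower core's power at their own clocks.
    perf_ratio:  faster core's performance / slower core's performance.
    exponent:    assumed power ~ f**exponent (hypothetical, roughly 2 to 3).
    Returns the faster core's power, relative to the slower core, at equal
    performance.
    """
    return power_ratio * (1.0 / perf_ratio) ** exponent


if __name__ == "__main__":
    # Placeholder case: big core is 30% faster and draws 30% more power.
    for exp in (2.0, 2.5, 3.0):
        rel = scaled_power(1.3, 1.3, exp)
        print(f"exponent {exp}: big core at iso-performance uses {rel:.2f}x")
```

Any result below 1.0 means the big core would draw less than the dense core at matched clocks under these assumptions. If leakage dominates at low voltage, the effective exponent drops and the conclusion can flip.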
 
Reactions: Tlh97

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
@Exist50
Sorry, I have to disagree.
Zen4 is clearly a Vmax design, where trade-offs had to be taken - and I am not only talking about density, but consumption on a cell level as well.
If you free yourself of that goal, you can take the function blocks and put them in a totally different mask set with different libraries / cells / constraints / stuff. And that basically is Zen4c.
Your assumption, that the whole V/f curve might get shifted up is wrong - at least within the relevant boundaries. It will reach higher frequencies at, let's say, up to 3 GHz, with less voltage - the whole characteristics will change.
What you observed with Intel was that they operated Gracemont beyond the window it was designed for - just to win CB23 against the 16c SKUs of AMD. That, and the same voltage rail.
And yes, for stupidly parallel work, Bergamo might win overall against Genoa at ISO-TDP.
 
Reactions: Tlh97

Geddagod

Golden Member
Dec 28, 2021
1,205
1,172
106
@Exist50
Sorry, I have to disagree.
Zen4 is clearly a Vmax design, where trade-offs had to be taken - and I am not only talking about density, but consumption on a cell level as well.
If you free yourself of that goal, you can take the function blocks and put them in a totally different mask set with different libraries / cells / constraints / stuff. And that basically is Zen4c.
Your assumption, that the whole V/f curve might get shifted up is wrong - at least within the relevant boundaries. It will reach higher frequencies at, let's say, up to 3 GHz, with less voltage - the whole characteristics will change.
What you observed with Intel was that they operated Gracemont beyond the window it was designed for - just to win CB23 against the 16c SKUs of AMD. That, and the same voltage rail.
And yes, for stupidly parallel work, Bergamo might win overall against Genoa at ISO-TDP.
When you use smaller cells, you need more power for the same performance

About leakage, I have heard different things. I assumed that smaller cells would have lower leakage because the area would be smaller, but I also heard someone say leakage for HP cells is lower than for HD cells because more fins create more surface area/volume.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
When you use smaller cells, you need more power for the same performance
View attachment 78668
About leakage, I have heard different things. I assumed that smaller cells would have lower leakage because the area would be smaller, but I also heard someone say leakage for HP cells is lower than for HD cells because more fins create more surface area/volume.
You seem to have misinterpreted that chart and the source you got it from - quite the opposite is true.

 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
@Exist50
Sorry, I have to disagree.
Zen4 is clearly a Vmax design, where trade-offs had to be taken - and I am not only talking about density, but consumption on a cell level as well.
If you free yourself of that goal, you can take the function blocks and put them in a totally different mask set with different libraries / cells / constraints / stuff. And that basically is Zen4c.
Your assumption, that the whole V/f curve might get shifted up is wrong - at least within the relevant boundaries. It will reach higher frequencies at, let's say, up to 3 GHz, with less voltage - the whole characteristics will change.
What you observed with Intel was that they operated Gracemont beyond the window it was designed for - just to win CB23 against the 16c SKUs of AMD. That, and the same voltage rail.
And yes, for stupidly parallel work, Bergamo might win overall against Genoa at ISO-TDP.
I mean, you can wait for Chips & Cheese to get ahold of it, but halving your inner core area does not come for free. As I said, if the VF curve was truly equivalent up to ~3GHz, then vanilla Zen has very little reason to exist outside of client. 4c would be a miracle core.

And this isn't some knock against Zen 4c. It'll do very well for its intended use cases. But there are real engineering tradeoffs behind AMD's choice in product segmentation.
 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,172
106

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
I mean, you can wait for Chips & Cheese to get ahold of it, but halving your inner core area does not come for free. As I said, if the VF curve was truly equivalent up to ~3GHz, then vanilla Zen has very little reason to exist outside of client. 4c would be a miracle core.

And this isn't some knock against Zen 4c. It'll do very well for its intended use cases. But there are real engineering tradeoffs behind AMD's choice in product segmentation.
There are a lot of ways to reduce size - increasing density is just one. And please have a look at the quoted Wikichip article as well. Denser cells consume less, not more.

OG Zen4 has its benefits: V-Cache option and higher per core performance for workloads that need that. But for embarrassingly parallel work, my bet is on Bergamo.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
There are a lot of ways to reduce size - increasing density is just one. And please have a look at the quoted Wikichip article as well. Denser cells consume less, not more.

OG Zen4 has its benefits: V-Cache option and higher per core performance for workloads that need that. But for embarrassingly parallel work, my bet is on Bergamo.
Well hopefully AMD doesn't keep us waiting long for an answer. Also, note that Zen 4 already uses HD cells. There are knobs available other than process.
 

BorisTheBlade82

Senior member
May 1, 2020
667
1,022
136
@Exist50
Indeed. Bergamo as well as MI300 and now even PHX2 are products, where I can't wait for official numbers and specs.
On Intel's side it is sadly only MTL in the mid-term. And even that might get close before 2023 runs out.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
Okay decided to hop into this:

Zen4:
Standard HPC -> 75% of cells => 2x width/1x height
High-Speed HPC(DTCO knob) -> 25% of cells => greater than 2x width/1x height
Memories are custom high-current, SRAM = 8T HC.

Zen4-Dense:
Low-power HPC (DTCO knob) -> ?% of cells => ~1x width/1x height
Low-power/High-speed HPC (DTCO knob) -> ?% of cells => less than 2x width/1x height
Memories are mostly custom high-speed high-density, SRAM = 8T HSHD.

We probably have to wait for an Athlon 700GE series, if there is one, to see at what point frequency gains require exponential voltage growth and exotic cooling. Server offerings of Zen4 do not reach >5.5 GHz, nor do the mobile offerings go beyond 5.1 GHz, as on AM5 boards. The highest max (SC) boost on Genoa is 4.4 GHz, while mobile sees 5.1 GHz.
 

RTX2080

Senior member
Jul 2, 2018
322
511
136
An old and interesting find: it seems AMD was already naming one of the Zen4 CCDs in Dragon Range as EFFICIENCY cores...? That sounds like a marketing disaster, even though the CCD has lower clocks. So what is the efficiency core inside Phoenix?

 
Reactions: igor_kavinski