Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

naukkis · Jul 31, 2024

StefanR5R said:
So... there is a thin & light laptop which does not run at 5.1 GHz constantly. And this is a fiasco.

OK.

To me a fiasco is if a laptop doesn't have a keyboard with concave keys and enough travel, or lacks a trackpoint, for example.

Wasn't Zen5 supposed to be an Apple killer? The M4 uses 7 W to get something like a 10+ SPECint result in the iPad, while Strix in a light laptop scores just above 7. That's 50% more performance at under half the power; the gap between AMD and Apple is widening, not closing.

naukkis · Jul 31, 2024

Nothingness said:
SVE brings predicates and first-fault ld/st to the table, which can be quite useful for autovectorization. Some of these features have been available since AVX-512 and were also added to Intel's new AVX10.

That said, I agree that most of the time hand-tuned NEON code is as fast as 128-bit SVE. I still think the sweet spot is 256 bits wide, i.e. AVX2 or AVX10.2 with 256-bit vectors. And if area/power matters that much, do as AMD did on Zen4 for AVX-512: use multiple uops on narrower paths. That did well on Zen4.

AMD does not use multiple uops to execute AVX-512. AVX-512 has lane-crossing instructions, and splitting them into multiple uops would tank performance. AMD's AVX-512 on 256-bit hardware uses full 512-bit registers and a single uop per instruction; only the execution ALUs and load/store units are 256 bits wide, so to execute a full instruction the uop is replayed, taking 2 clock cycles instead of the 1 it would take if the execution hardware width matched the instruction. This is nothing new; for example, the Zilog Z80 has 4-bit execution hardware for 8-bit registers.
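To put the replay idea in concrete terms, here is a toy Python sketch. The function name and the simple ceiling-division model are assumptions made up for illustration; this is not how AMD's scheduler is actually implemented, only the arithmetic behind "one uop, several passes".

```python
def passes_per_uop(register_bits: int, datapath_bits: int) -> int:
    """Toy model: passes a single uop needs through the execution
    hardware when the datapath is narrower than the register."""
    if register_bits <= 0 or datapath_bits <= 0:
        raise ValueError("widths must be positive")
    # Ceiling division: a partial chunk still costs a whole pass.
    return -(-register_bits // datapath_bits)


print(passes_per_uop(512, 256))  # AVX-512 instruction on 256-bit hardware
print(passes_per_uop(512, 512))  # datapath width matches the instruction
print(passes_per_uop(8, 4))      # the Z80 analogy: 8-bit register, 4-bit hardware
```

The instruction stays a single uop in this model; only the number of passes through the narrower hardware changes.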

igor_kavinski · Jul 31, 2024

Joe NYC said:
Another pricing leak

https://twitter.com/x/status/1818631374707761456

So X3D pricing would be:

9950X3D: $699
9900X3D: $549
9800X3D: $459

gaav87 · Jul 31, 2024

The prices for the 9700X and 9600X are horrible. The 7800X3D costs $329 in Poland and can be found for as low as $300... For the 9700X, wait for the amDiscount.

StefanR5R · Jul 31, 2024

naukkis said:
Wasn't Zen5 supposed to be an Apple killer?

Not this again. Did Dr. Lisa Su tell you this?

Hyperbole has done more than enough damage to this thread already.

A more practical question is: You have got a laptop which is designed for 17 W default heat dissipation from the SoC but can be reconfigured to put more than this through the SoC. This laptop allows a defective piece of software, which occupies 1 logical CPU 100% of the time, to drive this CPU at 4.98 GHz, but not at 5.10 GHz. What are ASUS's customers missing due to this, well, fiasco?

Edit, and if you think of saying "but Apple..." another time, I can think of saying "trackpoint" one more time, if you wish. ;-)

naukkis · Jul 31, 2024

StefanR5R said:
Not this again. Did Dr. Lisa Su tell you this?

Hyperbole has done more than enough damage to this thread already.

A more practical question is: You have got a laptop which is designed for 17 W default heat dissipation from the SoC but can be reconfigured to put more than this through the SoC. This laptop allows a defective piece of software, which occupies 1 logical CPU 100% of the time, to drive this CPU at 4.98 GHz, but not at 5.10 GHz. What are ASUS's customers missing due to this, well, fiasco?

Edit, and if you think of saying "but Apple..." another time, I can think of saying "trackpoint" one more time, if you wish. ;-)

What a laptop CPU needs is long battery life and good performance for single-threaded and lightly threaded workloads. That could be balanced well for laptops, but consuming 30 W for a single thread isn't exactly helpful. That Asus customer would have a total fiasco sold as a laptop: it would burn his lap and drain its 30 Wh battery in two hours idling. Battery-operated devices should never allow a 1T workload to use the full TDP; they should limit it to some sane power level. AMD seems unable to afford to do that anymore.
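For a rough sense of the numbers, battery runtime is just capacity divided by average draw. A minimal Python sketch follows; the function name is made up for illustration, and the 30 Wh and 30 W figures are the ones claimed in this post, not measurements.

```python
def runtime_hours(battery_wh: float, avg_power_w: float) -> float:
    """Hours a battery lasts at a constant average power draw."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return battery_wh / avg_power_w


print(runtime_hours(30, 30))  # claimed 1T draw on the claimed 30 Wh battery
print(runtime_hours(30, 15))  # average draw that empties 30 Wh in two hours
```

At a sustained 30 W a 30 Wh battery lasts one hour; a two-hour runtime corresponds to a 15 W average draw.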

CouncilorIrissa · Jul 31, 2024

naukkis said:
Strix Point seems worse and worse. It's ridiculous that it cannot even sustain max ST boost clocks on sub-30 W devices. What does full Zen5 at 5.7 GHz consume, a full 100 W in ST workloads? That's starting to be as ridiculous as the Intel Raptor Lake fiasco.

Imagine saying with a straight face that throttling -- we're not even sure if due to power or thermals -- is the same thing as CPUs becoming unstable within a 6-month period.

techjunkie123 · Jul 31, 2024

Josh128 said:
OK!! The fabled 35% has finally reared its head! lol

Looking at the single core performance vs power, wouldn't you want your core to run at ~12W, to get the most performance without absolutely blowing up power consumption? You'd only take a 5-10% hit in performance.

Why does AMD decide to just get maximum performance at the cost of power? It seems Apple is usually (except with M4, although their starting place is so much better) much better in deciding where to sit on the curve.

StefanR5R · Jul 31, 2024

naukkis said:
What a laptop CPU needs is long battery life and good performance for single-threaded and lightly threaded workloads. That could be balanced well for laptops, but consuming 30 W for a single thread isn't exactly helpful. That Asus customer would have a total fiasco sold as a laptop: it would burn his lap and drain its 30 Wh battery in two hours idling. Battery-operated devices should never allow a 1T workload to use the full TDP; they should limit it to some sane power level. AMD seems unable to afford to do that anymore.

Counter to your worries, Notebookcheck's (for instance) battery runtime test results of Zenbook S 16 look OK to my untrained¹ eye, well in line with current Windows and non-Windows competitors of the same weight/ thickness/ screen class.

________
¹) My own newest laptop has got a Haswell i7 and swappable battery.

Hitman928 · Jul 31, 2024

naukkis said:
What a laptop CPU needs is long battery life and good performance for single-threaded and lightly threaded workloads. That could be balanced well for laptops, but consuming 30 W for a single thread isn't exactly helpful. That Asus customer would have a total fiasco sold as a laptop: it would burn his lap and drain its 30 Wh battery in two hours idling. Battery-operated devices should never allow a 1T workload to use the full TDP; they should limit it to some sane power level. AMD seems unable to afford to do that anymore.

You keep repeating that STX needs 30 W for 1T loads but I think you need to provide proof of this if you are going to take that stand.

poke01 said:
Looking into it yes it seems the smaller Asus laptops cannot reach the 5.1GHz. The 16" Zenbook seems fine tho.

View attachment 104212

EDIT: To add some more notes it seems the 5.1GHz boost clock consumes up to ~29 watts. I think that is too much for the smaller Asus laptops to handle for longer periods.

View attachment 104213

This looks to me like a single core holding 5.1 GHz takes 18 W - 20 W. The initial power rush is a transient, probably due to multiple cores spinning up at the beginning of the test as the scene is loaded; power then settles down to 18 W - 20 W once the pure 1T compute starts.

Edit: This is also package power, so the core power is obviously less than that and scaling up to 5.7 GHz won't be nearly as dramatic as you make it seem.
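One way to read such a graph without being misled by the start-up spike is to drop the first few samples and take the median of the rest. Here is a minimal Python sketch; the trace values, the 3-sample warm-up, and the function name are all hypothetical and only stand in for the attached chart.

```python
from statistics import median


def steady_state_power(samples_w, warmup_samples):
    """Median package power after dropping the first `warmup_samples` readings."""
    steady = samples_w[warmup_samples:]
    if not steady:
        raise ValueError("no samples left after warm-up")
    return median(steady)


# Hypothetical package-power trace in watts: a spike while the scene
# loads, then the pure 1T compute phase.
trace = [29.0, 27.5, 21.0, 19.5, 18.8, 19.2, 18.5, 19.0, 19.4, 18.9]

print(max(trace))                                   # peak, dominated by the spike
print(steady_state_power(trace, warmup_samples=3))  # level during 1T compute
```

Quoting the peak describes the loading phase; the median after warm-up is closer to what the single busy core plus the rest of the package actually draws.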

Nothingness · Jul 31, 2024

naukkis said:
AMD does not use multiple uops to execute AVX-512. AVX-512 has lane-crossing instructions, and splitting them into multiple uops would tank performance. AMD's AVX-512 on 256-bit hardware uses full 512-bit registers and a single uop per instruction; only the execution ALUs and load/store units are 256 bits wide, so to execute a full instruction the uop is replayed, taking 2 clock cycles instead of the 1 it would take if the execution hardware width matched the instruction. This is nothing new; for example, the Zilog Z80 has 4-bit execution hardware for 8-bit registers.

I was overusing the word uop to make it clear and obviously failed. I meant that the datapaths are 256 bits wide, which helps reduce area and peak power while still giving a good performance uplift over AVX2. The same applies to 256-bit vectors mapped onto 128-bit datapaths, which was my point. In that context you get the benefit of both the wider registers and the ISA extensions that AVX-512/SVE offer over AVX2/NEON.

Saylick · Jul 31, 2024

techjunkie123 said:
Why does AMD decide to just get maximum performance at the cost of power? It seems Apple is usually (except with M4, although their starting place is so much better) much better in deciding where to sit on the curve.

Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Often times, you can only choose two at the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve.

DAPUNISHER · Jul 31, 2024

naukkis said:
Strix Point seems worse and worse. It's ridiculous that it cannot even sustain max ST boost clocks on sub-30 W devices. What does full Zen5 at 5.7 GHz consume, a full 100 W in ST workloads? That's starting to be as ridiculous as the Intel Raptor Lake fiasco.

Multiple reports for trolling and flame bait. Please stop, or your ban bingo card will fill in quickly.

Mod DAPUNISHER

Hitman928 · Jul 31, 2024

techjunkie123 said:
Looking at the single core performance vs power, wouldn't you want your core to run at ~12W, to get the most performance without absolutely blowing up power consumption? You'd only take a 5-10% hit in performance.

Why does AMD decide to just get maximum performance at the cost of power? It seems Apple is usually (except with M4, although their starting place is so much better) much better in deciding where to sit on the curve.

Because competition exists and their main competitor has been pushing performance at all costs for generations now. I agree 100% with what you are saying, but that doesn't mean it's the right move in light of what will win over customers, who by and large don't understand the nuances of CPU performance and power consumption. Even most reviewers only test performance while plugged in, and then the only unplugged thing they test is battery life. This has allowed some laptop makers to play major games: high performance when plugged in, but terrible, unresponsive performance on battery so that their battery life numbers look amazing. So the general PC market space has a "bigger bar better" mentality that AMD has to compete in.

coercitiv · Jul 31, 2024

Hitman928 said:
This looks to me like a single core holding 5.1 GHz takes 18 W - 20 W. The initial power rush is a transient, probably due to multiple cores spinning up at the beginning of the test as the scene is loaded; power then settles down to 18 W - 20 W once the pure 1T compute starts.

That's exactly how it works: anyone running a CB ST test will see a power spike in the "Preparing project" stage. I'm on my 5600U laptop now, on battery, and the ST test registers 20 W for a second, after which CPU package power runs steadily at 10 W. That spike has nothing to do with single-core power draw.

therealmongo · Jul 31, 2024

Saylick said:
Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Often times, you can only choose two at the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve.

Regarding core budget, this is such a valid, logical point; it's funny how certain posters are 'oblivious' to this reality /s

FlameTail · Jul 31, 2024

Saylick said:
Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Often times, you can only choose two at the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve.

Ah, but here's the thing: it doesn't seem like Apple is using significantly more die area than AMD. If you compare for instance Phoenix/HawkPoint (N4) vs Apple M2 (N5), the core sizes are very similar, and so is the resulting performance. So Apple's microarchitecture is better, and that's something AMD has to work on.

Edit: And the cache sizes are also fairly similar. (L1 not included because it's usually counted with the CPU core's die area).

M2
16 MB + 4 MB L2
8 MB SLC

Phoenix
8 MB L2
16 MB L3

gdansk · Jul 31, 2024

FlameTail said:
Ah, but here's the thing: it doesn't seem like Apple is using significantly more die area than AMD. If you compare for instance Phoenix/HawkPoint (N4) vs Apple M2 (N5), the core sizes are very similar, and so is the resulting performance. So Apple's microarchitecture is better, and that's something AMD has to work on.

Is it significantly better? Just going by GB6 as a proxy my 7840U is better in 1T and MT than M2:

HP HP Pavilion Plus Laptop 14-ey0xxx - Geekbench

Benchmark results for a HP HP Pavilion Plus Laptop 14-ey0xxx with an AMD Ryzen 7 7840U processor.

browser.geekbench.com

vs

MacBook Pro (13-inch, 2022) Benchmarks - Geekbench

It uses more power, but it has more MT and GPU performance, with admittedly useless raytracing hardware, and more FLOP/s.

Hitman928 · Jul 31, 2024

FlameTail said:
Ah, but here's the thing: it doesn't seem like Apple is using significantly more die area than AMD. If you compare for instance Phoenix/HawkPoint (N4) vs Apple M2 (N5), the core sizes are very similar, and so is the resulting performance. So Apple's microarchitecture is better, and that's something AMD has to work on.

Edit: And the cache sizes are also fairly similar. (L1 not included because it's usually counted with the CPU core's die area).

M2
16 MB + 4 MB L2
8 MB SLC

Phoenix
8 MB L2
16 MB L3

Apple's core sizes are decently bigger than Zen cores on equivalent nodes with similar design frequencies. AMD's high-performance cores only come close in size because they are built to reach high frequencies.

Edit: I broke it down here but I'll copy again if you don't want to read the full post:

Zen 4 Core = 2.56 mm² with max boost of ~5.7 GHz.
Zen 4c Core = 1.43 mm² with max boost of ~3.7 GHz.
M2 Core = 2.76 mm² with max boost of 3.5 GHz.

Zen 4 Core + L2 = 3.84 mm² with max boost of ~5.7 GHz.
Zen 4c Core + L2 = 2.48 mm² with max boost of ~3.7 GHz.
M2 Core + L2 ~ 7.06 mm² with max boost of 3.5 GHz.
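To make the comparison easier to read, the figures above can be turned into ratios. This is a small Python sketch using only the numbers listed in this post; the dictionary layout and the `area_ratio` helper are illustrative choices, not part of the original breakdown.

```python
# Areas in mm^2 and max boost in GHz, copied from the list above.
cores = {
    "Zen 4":  {"core_mm2": 2.56, "core_l2_mm2": 3.84, "boost_ghz": 5.7},
    "Zen 4c": {"core_mm2": 1.43, "core_l2_mm2": 2.48, "boost_ghz": 3.7},
    "M2":     {"core_mm2": 2.76, "core_l2_mm2": 7.06, "boost_ghz": 3.5},
}


def area_ratio(a: str, b: str, key: str) -> float:
    """How many times larger core `a` is than core `b` for the given area key."""
    return cores[a][key] / cores[b][key]


for key in ("core_mm2", "core_l2_mm2"):
    for other in ("Zen 4", "Zen 4c"):
        print(f"M2 vs {other} ({key}): {area_ratio('M2', other, key):.2f}x")
```

With these inputs the M2 core alone is about 1.08x the area of a Zen 4 core and about 1.93x a Zen 4c core, which is the closer match in design frequency; with L2 included the ratios grow to about 1.84x and 2.85x.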

techjunkie123 · Jul 31, 2024

Saylick said:
Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Often times, you can only choose two at the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve.

Hitman928 said:
Because competition exists and their main competitor has been pushing performance at all costs for generations now. I agree 100% with what you are saying, but that doesn't mean it's the right move in light of what will win over customers, who by and large don't understand the nuances of CPU performance and power consumption. Even most reviewers only test performance while plugged in, and then the only unplugged thing they test is battery life. This has allowed some laptop makers to play major games: high performance when plugged in, but terrible, unresponsive performance on battery so that their battery life numbers look amazing. So the general PC market space has a "bigger bar better" mentality that AMD has to compete in.

therealmongo said:
Regarding core budget, this is such a valid, logical point; it's funny how certain posters are 'oblivious' to this reality /s

Sure, but as pointed out by @FlameTail, Strix actually has a pretty high xtor budget. It's comparable to M3 Pro. Anyway, there's not much to discuss further since we all agree, but just wanted to point this out.

The comment about testing performance while plugged in is a good point. I guess while plugged in it doesn't matter what the power cost is, so you'd have to test while unplugged to see what the max clocks are like. It would be interesting to cap the boost clocks/TDP at different values and test battery life for 1T/low nT workloads.

FlameTail said:
Ah, but here's the thing: it doesn't seem like Apple is using significantly more die area than AMD. If you compare for instance Phoenix/HawkPoint (N4) vs Apple M2 (N5), the core sizes are very similar, and so is the resulting performance. So Apple's microarchitecture is better, and that's something AMD has to work on.

Edit: And the cache sizes are also fairly similar. (L1 not included because it's usually counted with the CPU core's die area).

M2
16 MB + 4 MB L2
8 MB SLC

Phoenix
8 MB L2
16 MB L3

StefanR5R · Jul 31, 2024

techjunkie123 said:
Looking at the single core performance vs power, wouldn't you want your core to run at ~12W, to get the most performance without absolutely blowing up power consumption? You'd only take a 5-10% hit in performance.

One thing to keep in mind is that the graph shows package power.

Though to get a given computational task done, you need an entire computer. Hence, while core power and SoC power are important parts of the picture, the task energy is ultimately the energy spent by the entire system. Thus, laptop reviewers have their battery runtime tests. SPEC have rules for power efficiency benchmarking too, which happen to be restricted to computer efficiency, not CPU efficiency.

So if you have a workload which is strictly serial, i.e. one that can only make use of 1/24th of the width of a CPU like the AMD HX 370, it is worthwhile to drive this CPU quite far beyond the per-core power efficiency sweet spot.
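A small worked example of that argument, as a hedged Python sketch. All numbers are hypothetical (two made-up operating points and a made-up 15 W for everything outside the core), and the model ignores whatever the machine draws after the task finishes; it only shows how counting the whole system can flip the conclusion.

```python
def task_energy_j(work_units, units_per_s, core_power_w, rest_of_system_w):
    """Energy in joules to finish a fixed amount of serial work."""
    seconds = work_units / units_per_s
    return (core_power_w + rest_of_system_w) * seconds


WORK = 1000  # arbitrary units of serial work

# Hypothetical operating points for the one busy core.
sweet_spot = {"units_per_s": 8, "core_power_w": 4}
pushed = {"units_per_s": 10, "core_power_w": 8}

for name, point in (("sweet spot", sweet_spot), ("pushed", pushed)):
    core_only = task_energy_j(WORK, rest_of_system_w=0, **point)
    whole_system = task_energy_j(WORK, rest_of_system_w=15, **point)
    print(f"{name}: core only {core_only} J, whole system {whole_system} J")
```

With these made-up inputs the sweet-spot point wins on core energy alone (500 J vs 800 J) but loses once the rest of the system is counted (2375 J vs 2300 J), because the display, memory and so on stay powered for the extra 25 seconds. Different inputs can of course tip the result the other way.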

FlameTail · Jul 31, 2024

gdansk said:
Is it significantly better? Just going by GB6 as a proxy my 7840U is better in 1T and MT than M2:

HP HP Pavilion Plus Laptop 14-ey0xxx - Geekbench

Benchmark results for a HP HP Pavilion Plus Laptop 14-ey0xxx with an AMD Ryzen 7 7840U processor.

browser.geekbench.com

vs

MacBook Pro (13-inch, 2022) Benchmarks - Geekbench

1T is pretty similar between M2 and 7840.

Mac mini (2023) - Geekbench

Benchmark results for a Mac mini (2023) with an Apple M2 processor.

browser.geekbench.com

MT advantage of 7840 is expected, because M2 is only 4P+4E, whereas 7840 is 8P with SMT.

And here's the die shot of M2;

You are welcome to do a CPU core area comparison with Phoenix.

Hitman928 · Jul 31, 2024

FlameTail said:
1T is pretty similar between M2 and 7840.

Mac mini (2023) - Geekbench

Benchmark results for a Mac mini (2023) with an Apple M2 processor.

browser.geekbench.com

MT advantage of 7840 is expected, because M2 is only 4P+4E, whereas 7840 is 8P with SMT.

And here's the die shot of M2;
View attachment 104233
You are welcome to do a CPU core area comparison with Phoenix.

I did this already, http://www.portvapes.co.uk/?id=Latest-exam-1Z0-876-Dumps&exid=thread...ranite-ridge-ryzen-9000.2607350/post-41265352

gdansk · Jul 31, 2024

FlameTail said:
1T is pretty similar between M2 and 7840.

Mac mini (2023) - Geekbench

Benchmark results for a Mac mini (2023) with an Apple M2 processor.

browser.geekbench.com

MT advantage of 7840 is expected, because M2 is only 4P+4E, whereas 7840 is 8P with SMT.

And here's the die shot of M2;
View attachment 104233
You are welcome to do a CPU core area comparison with Phoenix.

A quick Google said 2.75mm² vs 2.8mm².

Edit: but I'd defer to Hitman's numbers above

Asterox · Jul 31, 2024

gaav87 said:
The prices for the 9700X and 9600X are horrible. The 7800X3D costs $329 in Poland and can be found for as low as $300... For the 9700X, wait for the amDiscount.

By what logic, if it is $20 cheaper than the R5 7600X ($299) right from the start?

Look around, in today's global situation AMD will certainly not significantly lower the prices. AMD would be crazy to lower the prices, given the circus that is playing in the Intel camp.
