So... there is a thin & light laptop which does not run at 5.1 GHz constantly. And this is a fiasco.
OK.
To me a fiasco is if a laptop doesn't have a keyboard with concave keys and enough travel, or lacks a trackpoint, for example.
SVE brings predicates and first-fault loads to the table, which can be quite useful for autovectorization. Some of these features have been available since AVX-512 and were also added to Intel's new AVX10.
That said, I agree that most of the time hand-tuned NEON code is as fast as 128-bit SVE. I still think the sweet spot is 256 bits wide, i.e. AVX2, or AVX10.2 with 256-bit vectors. And if area/power matters that much, do as AMD did on Zen4 for AVX-512: use multiple uops on narrower paths; that did well on Zen4.
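To make the predication point concrete, here is a toy Python model (not real SVE code) of a predicated vector loop. The names `whilelt` and `predicated_add` and the lane count `vl` are made up for illustration; `whilelt` is only loosely modelled on SVE's WHILELT instruction. The idea is that the tail of the array is handled by masking lanes off, so the compiler does not need a separate scalar remainder loop.

```python
def whilelt(start, n, vl):
    """Predicate with lane k active while start + k < n (loosely modelled on SVE WHILELT)."""
    return [start + k < n for k in range(vl)]


def predicated_add(a, b, vl=4):
    """Add two equal-length lists in chunks of vl lanes, masking off inactive tail lanes."""
    n = len(a)
    out = [0] * n
    for i in range(0, n, vl):
        pred = whilelt(i, n, vl)
        for k, active in enumerate(pred):
            if active:  # inactive lanes are neither read nor written
                out[i + k] = a[i + k] + b[i + k]
    return out
```

With 6 elements and `vl=4`, the second iteration runs with only two active lanes, which is the case a fixed-width ISA without masking would have to peel off into scalar code.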
So X3D pricing would be:
"Wasn't Zen5 supposed to be Apple killer?"
Not this again. Did Dr. Lisa Su tell you this?
Hyperbole has done more than enough damage to this thread already.
A more practical question is: You have got a laptop which is designed for 17 W default heat dissipation from the SoC but can be reconfigured to put more than this through the SoC. This laptop allows a defective software which occupies 1 logical CPU 100% of the time to drive this CPU at 4.98 GHz, but not at 5.10 GHz. What are ASUS's customers missing due to this, well, fiasco?
Edit, and if you think of saying "but Apple..." another time, I can think of saying "trackpoint" one more time, if you wish. ;-)
"Strix point seems worse and worse. It's ridiculous that it cannot even sustain max ST boost clocks on sub-30W devices. What does full Zen5 @ 5.7GHz consume, a full 100W in ST workloads? That's starting to be as ridiculous as the Intel Raptor Lake fiasco."
Imagine saying with a straight face that throttling -- we're not even sure if due to power or thermals -- is the same thing as CPUs becoming unstable within a 6-month period.
Looking at the single core performance vs power, wouldn't you want your core to run at ~12W, to get the most performance without absolutely blowing up power consumption? You'd only take a 5-10% hit in performance.
OK!! The fabled 35% has finally reared its head! lol
"What a laptop CPU needs is long battery life and good performance for single-threaded and lightly threaded loads. That could be balanced well for laptops, but consuming 30W for a single thread isn't exactly helpful. That Asus customer would have a total fiasco sold as a laptop; it would burn his lap and drain its 30Wh battery in two hours idling. Battery-operated devices should never allow a 1T workload to use the full TDP but should limit it to some sane power level. AMD seems unable to afford to do that anymore."
Counter to your worries, Notebookcheck's (for instance) battery runtime test results of Zenbook S 16 look OK to my untrained¹ eye, well in line with current Windows and non-Windows competitors of the same weight/thickness/screen class.
Looking into it, yes, it seems the smaller Asus laptops cannot reach 5.1GHz. The 16" Zenbook seems fine tho.
View attachment 104212
EDIT: To add some more notes it seems the 5.1GHz boost clock consumes up to ~29 watts. I think that is too much for the smaller Asus laptops to handle for longer periods.
View attachment 104213
"AMD does not use multiple uops to execute AVX-512. AVX-512 has lane-crossing instructions, and splitting them into multiple uops would tank performance. AMD's AVX-512 on 256-bit hardware uses full 512-bit registers and a single uop per instruction; only the execution ALUs and load/store units are 256 bits wide, so to execute a full instruction the uop is replayed, taking 2 clock cycles instead of the 1 it takes when the execution hardware width matches the instruction. This is nothing new; for example, the Zilog Z80 has 4-bit execution hardware for 8-bit registers."
I was overusing the word uop to make it clear and obviously failed. I meant the datapaths are 256-bit wide, which helps reduce area and peak power while still giving a good performance uplift over AVX2. This also applies to 256-bit vectors mapped to 128-bit datapaths, which was my point. In that context you get the benefit of both the wider registers and the ISA extensions that AVX-512/SVE offer over AVX2/NEON.
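For anyone who wants the "one uop, two passes" idea spelled out, here is a toy Python model of it. It is only a sketch of the scheme as described in this exchange, not AMD's actual microarchitecture. The lane counts assume 32-bit elements, and the names `alu_add_256` and `add_512` are made up for illustration.

```python
DATAPATH_LANES = 8    # 256-bit datapath = 8 x 32-bit lanes (assumption for this sketch)
REGISTER_LANES = 16   # 512-bit register = 16 x 32-bit lanes


def alu_add_256(x, y):
    """One pass through the narrow ALU: 8 lanes of wrapping 32-bit addition."""
    assert len(x) == len(y) == DATAPATH_LANES
    return [(p + q) & 0xFFFFFFFF for p, q in zip(x, y)]


def add_512(a, b):
    """A single 512-bit add, executed as consecutive passes over the 256-bit ALU."""
    assert len(a) == len(b) == REGISTER_LANES
    result, passes = [], 0
    for lo in range(0, REGISTER_LANES, DATAPATH_LANES):
        hi = lo + DATAPATH_LANES
        result += alu_add_256(a[lo:hi], b[lo:hi])
        passes += 1
    return result, passes
```

`add_512` returns the full 16-lane result plus the number of passes, which is 2 here. The instruction still names one 512-bit register, and only the execution step is narrower.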
"Why does AMD decide to just get maximum performance at the cost of power? It seems Apple is usually (except with M4, although their starting place is so much better) much better in deciding where to sit on the curve."
Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Oftentimes, you can only choose two to the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve.
"Strix point seems worse and worse. It's ridiculous that it cannot even sustain max ST boost clocks on sub-30W devices. What does full Zen5 @ 5.7GHz consume, a full 100W in ST workloads? That's starting to be as ridiculous as the Intel Raptor Lake fiasco."
Multiple reports for trolling and flame bait. Please stop, or your ban bingo card will fill in quickly.
Looking at the single core performance vs power, wouldn't you want your core to run at ~12W, to get the most performance without absolutely blowing up power consumption? You'd only take a 5-10% hit in performance.
Why does AMD decide to just get maximum performance at the cost of power? It seems Apple is usually (except with M4, although their starting place is so much better) much better in deciding where to sit on the curve.
"This looks like a single core holding 5.1 GHz takes 18 W - 20 W to me. The initial power rush is a transition and probably due to multiple cores spinning up at the beginning of the test as the scene is loaded, but then it settles down to 18 W - 20 W once the pure 1T compute starts."
That's exactly how it works; anyone running a CB ST test will see a power spike in the "Preparing project" stage. I'm on my 5600U laptop now, on battery, and the ST test will register 20W for a second, after which CPU package power will run steadily @ 10W. That spike has nothing to do with single-core power draw.
"Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Oftentimes, you can only choose two to the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve."
Regarding core budget, this is such a valid, logical point; it's funny how certain posters are 'oblivious' to this reality /s
"Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Oftentimes, you can only choose two to the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve."
Ah, but here's the thing: it doesn't seem like Apple is using significantly more die area than AMD. If you compare for instance Phoenix/HawkPoint (N4) vs Apple M2 (N5), the core sizes are very similar, and so is the resulting performance. So Apple's microarchitecture is better, and that's something AMD has to work on.
Edit: And the cache sizes are also fairly similar. (L1 not included because it's usually counted with the CPU core's die area).
M2
16 MB + 4 MB L2
8 MB SLC
Phoenix
8 MB L2
16 MB L3
"Ah, but here's the thing: it doesn't seem like Apple is using significantly more die area than AMD. If you compare for instance Phoenix/HawkPoint (N4) vs Apple M2 (N5), the core sizes are very similar, and so is the resulting performance. So Apple's microarchitecture is better, and that's something AMD has to work on."
Is it significantly better? Just going by GB6 as a proxy, my 7840U is better in 1T and MT than M2:
"Because they can. When you design a chip, you try to balance performance, power, and die size (which is analogous to cost). Oftentimes, you can only choose two to the detriment of the last one. In Apple's case, they can take a hit on die size because they can charge a premium for their products. Others don't enjoy that luxury. With a larger xtor budget, you can design a core with suitable performance and power by letting it stay within that sweet spot on the freq/power curve."
Because competition exists and their main competitor has been pushing performance at all costs for generations now. I agree 100% with what you are saying, but that doesn't mean it's the right move in light of what will win over customers, who by and large don't understand the nuances of CPU performance and power consumption. Even most reviewers only test performance while plugged in, and then the only unplugged thing they test is battery life. This has opened the door for some laptop makers to play some major games: high performance when plugged in, but terrible, unresponsive performance on battery so that their battery life numbers look amazing. So the general PC market space has a "bigger bar better" mentality that AMD has to compete in.
"Looking at the single core performance vs power, wouldn't you want your core to run at ~12W, to get the most performance without absolutely blowing up power consumption? You'd only take a 5-10% hit in performance."
One thing to keep in mind is that the graph shows package power.
"Is it significantly better? Just going by GB6 as a proxy, my 7840U is better in 1T and MT than M2:
vs
HP Pavilion Plus Laptop 14-ey0xxx - Geekbench
Benchmark results for an HP Pavilion Plus Laptop 14-ey0xxx with an AMD Ryzen 7 7840U processor. (browser.geekbench.com)"
1T is pretty similar between M2 and 7840.
Mac mini (2023) - Geekbench
Benchmark results for a Mac mini (2023) with an Apple M2 processor. (browser.geekbench.com)
MT advantage of 7840 is expected, because M2 is only 4P+4E, whereas 7840 is 8P with SMT.
And here's the die shot of M2:
View attachment 104233
You are welcome to do a CPU core area comparison with Phoenix.
A quick Google said 2.75mm² vs 2.8mm².
"The price for 9700x and 9600x is horrible. The 7800x3d costs $329 in Poland and can be found as low as $300... 9700x: wait for amDiscount."
By what logic, if it is $20 cheaper than the R5 7600X ($299) right from the start?