It is just really strange, as it seems like a single thread cannot utilise all core resources. Did AMD design what is effectively Bulldozer 2: Electric Boogaloo?
> Yep, ST is very important. All things considered a mid release on the mobile platform in regards to that. However +66% MT is no joke, there will be people who want that.

In a laptop? Who? lol
> In a laptop? Who? lol

People who don't like desktops (not me, I love both)? Strix is essentially a 5950X in terms of MT.
To give them ST credit, they're at M3-family levels of performance in Geekbench ST (~3000). So basically second best already (second only to M4).
> People who don't like desktops (not me, I love both)? Strix is essentially a 5950X in terms of MT.

I guess. For laptops I figured most people would be interested in the M4/Lunar Lake type chips.
> I guess. For laptops I figured most people would be interested in the M4/Lunar Lake type chips.

Yep, most are. It's a very, very tiny space.
> To give them ST credit, they're at M3-family levels of performance in Geekbench ST (~3000). So basically second best already (second only to M4).

Yes, but they did it with a 1 GHz boost compared to M3, and we'll have to see power numbers; I don't think they will be close to M3.
9.71% in SIR 2017.
"In this test, a single Zen 5 thread still performs like a 4-decode x86 core. But when we enable two SMT threads for testing, we can see that the throughput doubles, and the instruction throughput reaches 8 in the L1-L2 and even L3 ranges, and in the DRAM range it returns to the same normal level as Zen 4."
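The quoted claim, that throughput roughly doubles once a second SMT thread shares the core, is the kind of thing you can sanity-check with a crude pin-and-time experiment. A minimal sketch, assuming a Linux box where CPUs 0 and 1 are SMT siblings of one core (check `/sys/devices/system/cpu/cpu0/topology/thread_siblings_list`); wall-clock timing of a Python loop is a blunt stand-in for the perf-counter IPC measurements the article actually ran:

```python
# Crude sanity check of SMT throughput scaling: time the same CPU-bound loop
# run by one process vs. two processes pinned to CPUs 0 and 1, which are
# ASSUMED here to be SMT siblings of one physical core.
import os
import time
from multiprocessing import Process

ITERS = 5_000_000  # enough work to dwarf process start-up cost

def spin(cpu: int) -> None:
    """Pin to `cpu` (best effort) and burn through a dependency-light loop."""
    if hasattr(os, "sched_setaffinity"):  # Linux-only API
        try:
            os.sched_setaffinity(0, {cpu})
        except OSError:
            pass  # CPU may not exist; run unpinned rather than crash
    acc = 0
    for i in range(ITERS):
        acc += i

def timed(cpus) -> float:
    """Wall time for one spin() process per CPU in `cpus`, run concurrently."""
    procs = [Process(target=spin, args=(c,)) for c in cpus]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    one = timed([0])
    two = timed([0, 1])
    # On true SMT siblings, `two` should land between 1x and 2x of `one`:
    # 2x would mean zero SMT benefit, ~1x would mean no resource sharing.
    print(f"1 proc: {one:.2f}s  2 procs: {two:.2f}s  ratio: {two / one:.2f}x")
```

If the ratio lands well under 2x on a sibling pair, the two threads really are sharing backend resources productively; on two separate physical cores you'd trivially see ~1x, so getting the CPU numbering right matters.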
@SarahKerrigan you were onto something, I guess
> However +66% MT is no joke, there will be people who want that.

Where does the +66% MT number come from? I've not seen it mentioned previously in the thread.
Yeah, the Bulldozer comparisons are stupid because Bulldozer was drastically behind Intel on single-thread perf. That's not the case here - not even close.
If they actually managed a huge jump in MT from replicated frontends, and also a small ST bump in the same gen, without blowing out area, that's interesting.
> they spoiled us with previous Zen iterations

To be fair, they also led us on a merry hype train with the RDNA3 dual-issue thing, reporting twice the FLOPS/CU/clk despite that only being relevant to specific use cases.
> as it seems like a single thread cannot utilise all core resources

What did you think SMT was for?
> What did you think SMT was for?

Yeah, and this isn't built like SMT-centric throughput cores (hello POWER) anyway.
> Where does the +66% MT number come from? I've not seen it mentioned previously in the thread.

Over Hawk Point? I think in Cinebench.
Huang's tests also don't show the 2x bandwidth in L1 cache, though L2 does show ~60% improvement with a weird spike up to 90% as L2 starts to get saturated.
> I still don't see how any IPC increase is Bulldozer 2.

I'm not saying it is bad, but it's a different take on a similar goal of nT spam.
> This may blow your mind, but a bunch of structures have been statically partitioned for a while.

Oh I know, but 1t mode seemingly has more static partitions than Z4 had.
(Also, it's entirely possible that in 1t mode, the two frontends work like they do with Atom - early fetch/decode of branch targets.)
> What did you think SMT was for?

SMT is great, but you want to try to avoid having net core performance reliant on SMT use.
In GB6 text processing he measures about 10%, while AMD states it's 19%.
In GB5 and AES-XTS he measures about 12-13%, while AMD states it's 35%, so I don't know how valid his tests are or whether the frequencies were accurate.
His tests are actually fine, considering that Zen 5 in Strix is castrated in a few ways. I wouldn't be surprised if Granite Ridge gets around 5% more IPC in 1T versus Strix, which should put it close to the ~16% figure AMD showed. Interesting that SMT might bring a bigger uplift.
> Core should have high resource utilisation in 1t mode

The problem is that such considerations assume coding and compiler output are optimal, and we all know that very often that is anything but the case, unfortunately.