Seems like the opposite of Bulldozer, if anything - BD was one frontend with two backends, Zen5 appears to be two frontends with one backend.
Only for Bulldozer and Piledriver, starting with Kaveri and all the way to Excavator they used one decoder per core.
It appears to me to be a functional unit with resources split at the thread level. The implementation will obviously be different, but if the purpose is to have two threads share some IP blocks while keeping separate paths through the unit, I see a possible connection.
Seems like the opposite of Bulldozer, if anything - BD was one frontend with two backends, Zen5 appears to be two frontends with one backend.
There were quite a few Chrome tabs open and who knows what else. That's what I've been told.
So for 8000 MHz memory they should be even better! And not worse
I’m talking about the whole family and how hard the thread paths are split.
That would be why I said Bulldozer.
So DozerBull. Gotcha!
Seems like the opposite of Bulldozer, if anything
What is your point?
Does it hinder HPC from achieving higher scalability with SMT?
It appears to me to be a functional unit with resources split at the thread level. The implementation will obviously be different, but if the purpose is to have two threads share some IP blocks while keeping separate paths through the unit, I see a possible connection.
From what I’ve seen it’s still too early to say Bulldozer or the opposite of it, I’m just seeing the possibility of details that would amuse me.
So DozerBull. Gotcha!
How much evidence have you seen that they are not split?
I've seen zero evidence that functional unit allocations are split between threads, and that would sound really undesirable if it were the case. Much of the advantage of SMT is the ability to fill functional units dynamically using uops from all available threads.
Nah, Rezodllub.
How much evidence have you seen that they are not split?
I’m not here to make any claims, I just said I would find it amusing if they did it. If it turns out different, I don’t care.
It appears that there was some per thread thought that went to the design. Maybe I missed the hard facts that show where it ends.
Because there's no reason to make them split. That would be microarchitecturally undesirable, since there's already the plumbing in there for the full set of functional units to be available to a single thread. "Prove they aren't split!" is like saying "prove the FPU doesn't switch to IBM hexadecimal FP encoding at random" - I can't prove it because I don't have a Zen5 in front of me, but it doesn't.
Here's somewhat tuned 7950x3d results (my ccd1 is trash, can't clock above 5750 at all, surely dom can provide even better results on his binned CPUs):
It scores ~3700 as well in GB6.2, which doesn’t include SME.
Someone leaked ahead of the Anandtech post that time?
That said. I think there is some point in WTFTech. They are actually quite good at scanning the tech leak landscape and summarizing findings. Even if they are not the best in accuracy, you have to give them credit for being quick. So if one applies a "common sense filter" to what is published, rendering what is reasonable vs BS, I think it's ok.
I'd much rather keep them as a player in the game than discarding them completely.
Seems like the opposite of Bulldozer, if anything - BD was one frontend with two backends, Zen5 appears to be two frontends with one backend.
So Bulldozer and Zen 5 have similar internal structure the same way apples and oranges share most of their DNA.
Dynamic bananas that can shorten or extend randomly?
But while we're here, the recent performance claims, I think it might be more fair to say that Zen 5 shares a lot of DNA with bananas.
The persistence at pressing your point against an expression of a concept is amusing. In fact that demonstrates why I’d be amused if it were done. All the “why did they do that, there’s no reason to” that would erupt from the crowd.
Because there's no reason to make them split. That would be microarchitecturally undesirable, since there's already the plumbing in there for the full set of functional units to be available to a single thread. "Prove they aren't split!" is like saying "prove the FPU doesn't switch to IBM hexadecimal FP encoding at random" - I can't prove it because I don't have a Zen5 in front of me, but it doesn't.
This is a very random "benchmark", here's two more results, I've run them one after the other with no reboots or settings changed:
Here's somewhat tuned 7950x3d results (my ccd1 is trash, can't clock above 5750 at all, surely dom can provide even better results on his binned CPUs):
Prefer cache:
Gigabyte Technology Co., Ltd. B650E AORUS MASTER - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. B650E AORUS MASTER with an AMD Ryzen 9 7950X3D processor. browser.geekbench.com
Prefer frequency:
Gigabyte Technology Co., Ltd. B650E AORUS MASTER - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. B650E AORUS MASTER with an AMD Ryzen 9 7950X3D processor. browser.geekbench.com
OR the frequency fluctuation on Zen 4 is pretty random?
Gigabyte Technology Co., Ltd. B650E AORUS MASTER - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. B650E AORUS MASTER with an AMD Ryzen 9 7950X3D processor. browser.geekbench.com
Gigabyte Technology Co., Ltd. B650E AORUS MASTER - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. B650E AORUS MASTER with an AMD Ryzen 9 7950X3D processor. browser.geekbench.com
On Debian I am seeing 3262 single - https://browser.geekbench.com/v6/cpu/6668209
I am getting just over 3100 ST on a stock 7950X with 6400C30 tuned RAM.
Gigabyte Technology Co., Ltd. X670E AORUS XTREME - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. X670E AORUS XTREME with an AMD Ryzen 9 7950X processor. browser.geekbench.com
I'll run on some flavor of Linux when I have a chance tomorrow, it's a more fair comparison of hardware to hardware if we want to make estimates for zen 5 and inevitably compare to Apple.
Here it's the opposite. ST is fairly stable in the subtests whereas there are some wild swings in MT between the two distros.
On Debian I am seeing 3262 single - https://browser.geekbench.com/v6/cpu/6668209
On Manjaro I am seeing 3290 single - https://browser.geekbench.com/v6/cpu/6668942
Overall ST performance would seem to compare favorably, but isn't the Apple chip running at 4.5 GHz vs ~5.7 GHz for the theoretical Zen5?
If we apply a 16% uplift from 7950X to 9950X, that would put the 9950X around 3800 ST in Linux. Does that compare favorably against Apple?
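For anyone who wants to redo the napkin math above, here is a rough sketch. The baselines are the Debian and Manjaro ST scores posted in this thread; the flat 16% uplift, the ~5.7 GHz Zen 5 clock, and the ~3700 at ~4.5 GHz figure for the Apple chip are all assumptions taken from the discussion, not measurements, and the function names are made up for illustration.

```python
# Back-of-envelope projection using figures quoted in this thread.
# ASSUMPTIONS (not measurements): a flat 16% ST uplift for Zen 5, a ~5.7 GHz
# Zen 5 boost clock, and ~3700 ST at ~4.5 GHz for the Apple chip.

def project(score: float, uplift: float) -> float:
    """Scale a baseline score by a fractional uplift (0.16 means +16%)."""
    return score * (1.0 + uplift)

def points_per_ghz(score: float, ghz: float) -> float:
    """Crude per-clock figure: benchmark points divided by clock in GHz."""
    return score / ghz

# 7950X single-thread scores posted above (Geekbench 6, Linux).
baselines = {"Debian": 3262, "Manjaro": 3290}
projected = {distro: project(score, 0.16) for distro, score in baselines.items()}

for distro, score in projected.items():
    print(f"{distro}: ~{score:.0f} projected ST, "
          f"~{points_per_ghz(score, 5.7):.0f} points/GHz at an assumed 5.7 GHz")

print(f"Apple (assumed 3700 at 4.5 GHz): ~{points_per_ghz(3700, 4.5):.0f} points/GHz")
```

Under those assumptions the projected score lands near 3800, a bit ahead in absolute ST, while the points-per-GHz figure stays well below the assumed Apple one. Geekbench scores do not scale linearly with clock, so treat the per-GHz numbers as a rough indication only.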
7950X3D super maxed out
Here's somewhat tuned 7950x3d results (my ccd1 is trash, can't clock above 5750 at all, surely dom can provide even better results on his binned CPUs):
Prefer cache:
Gigabyte Technology Co., Ltd. B650E AORUS MASTER - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. B650E AORUS MASTER with an AMD Ryzen 9 7950X3D processor. browser.geekbench.com
Prefer frequency:
Gigabyte Technology Co., Ltd. B650E AORUS MASTER - Geekbench
Benchmark results for a Gigabyte Technology Co., Ltd. B650E AORUS MASTER with an AMD Ryzen 9 7950X3D processor. browser.geekbench.com
That’s because GB6 doesn’t scale with high core count CPUs.
According to GB6, 14900 has the same MT performance as TR with 96 cores