Question Zen 6 Speculation Thread

Page 10 - AnandTech community
Jul 27, 2020
17,712
11,499
106
They always had 15% on the table from SMT.
Question is, does SMT affect the core even when it's available but disabled? Or is Intel using the 5% SMT area for ST specific architectural optimizations? If that 15% were always on the table, we would see that in HT on/off benchmarks:




That's 1.5% improvement in average fps and 3.1% boost in 1% low fps.

There's something Intel isn't telling us.

These are the games that saw the biggest increases but only in 1% low fps:



So Intel is giving up SMT for a minor percentage of games that will benefit from killing SMT?
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
That is not the same as SMT being categorically bad and I don't know why you are having difficulty absorbing that. In a server, +30% throughput perf at +20% power (Intel's numbers, which are worse than I've seen from other vendors) is a really big deal and that's what SMT provides.
I think the move of hyperscalers to Arm-based SoCs that lack SMT tells us something. I guess the decreased cost of having their own SoC outweighs the loss of SMT.

My personal experience of running non-threaded, CPU-heavy workloads on AWS clearly shows that HT kills the performance of my thousands of jobs, since one vCPU is one thread. So for the types of loads I need for my work, SMT is clearly counterproductive. Of course, if my workloads were threaded, that would tell a different story. Conclusion (with my Captain Obvious hat): not all server tasks are created equal.
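
For anyone wanting to check whether two vCPUs on a Linux instance share a physical core, a minimal sketch follows. It assumes the guest kernel exposes the standard sysfs topology file `thread_siblings_list`; the function names `parse_cpu_list` and `smt_siblings` are made up for illustration and do not come from the post above.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse a sysfs CPU list such as "0,4" or "0-1" into CPU numbers."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            # A range like "0-3" covers both endpoints.
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def smt_siblings(cpu: int, sysfs_root: str = "/sys/devices/system/cpu") -> list[int]:
    """Return the logical CPUs that share a physical core with `cpu` (Linux only)."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(path.read_text())
```

If `smt_siblings(0)` returns more than one entry, logical CPU 0 shares its core with another hardware thread, which is the situation where a busy neighbor can slow down a single-threaded job.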
 
Reactions: lightmanek

SarahKerrigan

Senior member
Oct 12, 2014
585
1,396
136
I think the move of hyperscalers to Arm-based SoCs that lack SMT tells us something. I guess the decreased cost of having their own SoC outweighs the loss of SMT.

Well, and Neoverse-V is also a really good IP. (I'd like it better with SMT, but that won't surprise you. Being conservatively modified smartphone IP makes that unlikely though.)

My personal experience of running non-threaded, CPU-heavy workloads on AWS clearly shows that HT kills the performance of my thousands of jobs, since one vCPU is one thread. So for the types of loads I need for my work, SMT is clearly counterproductive. Of course, if my workloads were threaded, that would tell a different story. Conclusion (with my Captain Obvious hat): not all server tasks are created equal.

Noisy neighbors are a real problem, but similar arguments apply to "just use large numbers of weaker cores", which seems to be the server alternative proposed by the anti-SMT folks; you can get lower ST perf for a VM because of a noisy neighbor or you can get lower ST perf for a VM because of a smaller core, but the end result is the same.
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
Well, and Neoverse-V is also a really good IP. (I'd like it better with SMT, but that won't surprise you. Being conservatively modified smartphone IP makes that unlikely though.)
The cost of SMT implementation and its tuning is huge: validation, security issues, performance tuning, etc. That would also likely need a new business model for per core royalties.

That doesn't mean it won't happen. It's just unlikely to happen in the coming years.

Noisy neighbors are a real problem, but similar arguments apply to "just use large numbers of weaker cores", which seems to be the server alternative proposed by the anti-SMT folks; you can get lower ST perf for a VM because of a noisy neighbor or you can get lower ST perf for a VM because of a smaller core, but the end result is the same.
I'm certainly not anti-SMT; I have no valid "scientific" proof that it's worse... or better, at least for my needs.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,736
14,767
136
Question is, does SMT affect the core even when it's available but disabled? Or is Intel using the 5% SMT area for ST specific architectural optimizations? If that 15% were always on the table, we would see that in HT on/off benchmarks:


View attachment 100894

That's 1.5% improvement in average fps and 3.1% boost in 1% low fps.

There's something Intel isn't telling us.

These are the games that saw the biggest increases but only in 1% low fps:

View attachment 100895

So Intel is giving up SMT for a minor percentage of games that will benefit from killing SMT?
Why are we using Intel benchmarks to talk about SMT in a Zen 6 thread? AMD and Intel work very differently in SMT.
 
Reactions: Tlh97 and soresu

SarahKerrigan

Senior member
Oct 12, 2014
585
1,396
136
The cost of SMT implementation and its tuning is huge: validation, security issues, performance tuning, etc. That would also likely need a new business model for per core royalties.

That doesn't mean it won't happen. It's just unlikely to happen in the coming years.

Agreed. I'd like to see it but I don't expect it. Validation is always the hard and mostly-invisible part of turning a microarchitecture into a product.

I'm certainly not anti-SMT; I have no valid "scientific" proof that it's worse... or better, at least for my needs.

You have good takes, as always.

I admit that I'm kind of an SMT zealot; a lot of that is born of spending lots of time with both big Power systems and, on a much more intimate level, embedded cores with aggressive hardware multithreading (usually FGMT, not SMT) for packet processing and other data-movement-heavy work. In both cases the gains were easily worth it.

Of course, I still think I'm right about all of this - but then, I would say that, wouldn't I?
 
Jul 27, 2020
17,712
11,499
106
Why are we using Intel benchmarks to talk about SMT in a Zen 6 thread? AMD and Intel work very differently in SMT.
Because naukkis is saying that Intel is getting rid of SMT to gain 15% ST performance. Based on the results I posted, that sounds doubtful. And Zen 6 will probably still have SMT, so Intel is giving up "free" performance to win some benchmarks.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,736
14,767
136
Because naukkis is saying that Intel is getting rid of SMT to gain 15% ST performance. Based on the results I posted, that sounds doubtful. And Zen 6 will probably still have SMT, so Intel is giving up "free" performance to win some benchmarks.
BTW, I think AMD gets 20-30% more out of SMT. If you find those benchmarks, that would be cool.
 
Jul 27, 2020
17,712
11,499
106
BTW, I think AMD gets 20-30% more out of SMT. If you find those benchmarks, that would be cool.





WOW. You were absolutely right! Even when it suffers a regression, it's less than Intel's regression.

So Intel is being dishonest. They should simply say that they are getting rid of SMT coz they can't quite figure it out as well as AMD.

Source:
 

SarahKerrigan

Senior member
Oct 12, 2014
585
1,396
136
View attachment 100907
View attachment 100908

View attachment 100909

WOW. You were absolutely right! Even when it suffers a regression, it's less than Intel's regression.

So Intel is being dishonest. They should simply say that they are getting rid of SMT coz they can't quite figure it out as well as AMD.

Source:

On the other hand, Intel is spending a hell of a lot less power to get those gains, and it causes my doubts about Intel's "SMT costs 20% higher power!" claims to go into overdrive.
 

naukkis

Senior member
Jun 5, 2002
768
633
136
In a server, +30% throughput perf at +20% power (Intel's numbers, which are worse than I've seen from other vendors) is a really big deal and that's what SMT provides.

Look closer. They also say that without SMT they gain 15% speed at isopower. So even for throughput, non-SMT wins when the design is built around 1T cores.
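
Taking the two figures quoted in this exchange at face value (+30% throughput for +20% power with SMT on, and +15% performance at the same power for a core designed without SMT), the perf-per-watt comparison is simple arithmetic. A rough sketch, with all values normalized to a hypothetical baseline of 1.0 and the helper name `perf_per_watt` chosen only for illustration:

```python
def perf_per_watt(relative_perf: float, relative_power: float) -> float:
    """Performance per unit of power, both relative to a baseline of 1.0."""
    return relative_perf / relative_power


# Figures as quoted in the thread, not independently verified.
smt_on = perf_per_watt(1.30, 1.20)   # +30% throughput at +20% power
no_smt = perf_per_watt(1.15, 1.00)   # +15% performance at iso-power

print(f"SMT on:     {smt_on:.3f}x baseline perf/W")
print(f"SMT-less:   {no_smt:.3f}x baseline perf/W")
```

On these numbers the SMT-less design comes out ahead on perf per watt (1.15 vs about 1.08). Whether the two figures are measured against the same baseline is an assumption here, and it is not settled in the thread.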
 

naukkis

Senior member
Jun 5, 2002
768
633
136
On the other hand, Intel is spending a hell of a lot less power to get those gains, and it causes my doubts about Intel's "SMT costs 20% higher power!" claims to go into overdrive.

It's not SMT off vs SMT on. It's about a core with SMT vs a 1T core. SMT does increase the CPU's critical path size, so every instruction the CPU executes takes more power whether SMT is used or not. Going by Intel's numbers, that difference has a much bigger impact nowadays than it did a few years ago.
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
Maybe coz Intel's HT threads aren't able to properly max out available core resources so they sit idle or maybe even go to sleep!
Or perhaps it's exactly the contrary: AMD CPUs can't use all their resources in ST, while Intel CPUs do better; it might be interesting to redo testing after disabling SMT. Or simply, AMD has more dedicated structures for SMT; that would also explain the worse power efficiency of AMD SMT.

BTW where do these interesting data come from?
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
It's not SMT off vs SMT on. It's about a core with SMT vs a 1T core. SMT does increase the CPU's critical path size, so every instruction the CPU executes takes more power whether SMT is used or not. Going by Intel's numbers, that difference has a much bigger impact nowadays than it did a few years ago.
A critical path length increase would require a decrease in frequency. Has this been demonstrated by measuring frequency with and without SMT enabled under overclocking? Or do you mean it's part of the design from the start?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,736
14,767
136
On the other hand, Intel is spending a hell of a lot less power to get those gains, and it causes my doubts about Intel's "SMT costs 20% higher power!" claims to go into overdrive.
Personally, I think you don't get performance for free. AMD gets more performance out of SMT, so it uses more power. The opposite is true of Intel: they get less from SMT and it costs less power.

My big point here is that they behave very differently with SMT, and Intel loses less performance by dropping it. This would really hurt AMD though.
 

LightningZ71

Golden Member
Mar 10, 2017
1,655
1,938
136
I love this in-depth analysis of SMT! While I've seen it casually mentioned, it really needs to be emphasized a bit more: securing SMT requires a lot of potentially performance-killing implementation work. The effort that must be put in to maintain memory isolation between the various threads being worked on is significant. If both threads are accessing similar regions of memory, then there will be a non-trivial amount of locking, deconfliction, and generally intensive memory management work done.

None of that is easy or quick to implement, and all of it requires a lot of extra data pathways and hardware bounds checking that can introduce latency where you least want it. It also adds a lot of power consumption. While it doesn't account for ALL of the power difference, we do see that AMD, which, up to now, has had a more robust SMT implementation with respect to thread data security, does consume more power from their SMT for at least some of that effort.

Another thing that is encountered in the general computing world that may not be a thing for Sarah's world is memory controller contention. All those extra threads are going to generate their own share of memory requests. Supporting lots of threads requires a more robust memory controller and beefier memory throughput. It affects every level of processor memory, from the L1 cache layer all the way down.

Those things don't simply disappear when SMT is disabled in the BIOS. They might not be active, and they may consume less power, but they don't cease to exist. No matter what, if the circuits are present and you have SMT off, a portion of them becomes wasted circuitry. If you started from the block diagram of the core and built it for maximum ST performance instead of dedicating some circuits to SMT and its security needs, you'd have many places to see gains in reduced latency, simplified logic, or other improvements to throughput and/or efficiency. What I'm getting at is that just toggling SMT on and off doesn't give you the full picture of what the TRUE cost of SMT being present is. It may be super-critical in some tasks, but in others, it's more of an albatross around the core's neck...
 

naukkis

Senior member
Jun 5, 2002
768
633
136
A critical path length increase would require a decrease in frequency. Has this been demonstrated by measuring frequency with and without SMT enabled under overclocking? Or do you mean it's part of the design from the start?

From the start: those Intel numbers aren't SMT on/off but a core designed with or without SMT. By those numbers, SMT really doesn't look like free performance.
 
Reactions: Nothingness

naukkis

Senior member
Jun 5, 2002
768
633
136
Intel's optimizations also mean that Lion Cove with HT is pretty much a totally different core from the one they use in Lunar Lake. Their presentation says that a hypothetical SMT Lion Cove would lose on every metric besides perf/area. So it's pretty plainly obvious that an HT version of Lion Cove isn't coming to desktop, and it probably won't be used in server versions either: perf/area isn't the target for big cores, performance is, and at least those Intel slides say the core without SMT wins.

 

SarahKerrigan

Senior member
Oct 12, 2014
585
1,396
136
Repeating "Intel says so regarding an unreleased core used in a very specific heterogeneous context" over and over to argue SMT is doomed and irrelevant, and ignoring all of the other vendors (IBM, MIPS, Marvell) claiming significant power and area advantage for multithreading, is not all that convincing.

Intel says a lot of things. Betcha when Lion Cove shows up in servers, SMT will be right there.
 

Saylick

Diamond Member
Sep 10, 2012
3,372
7,103
136
Repeating "Intel says so regarding an unreleased core used in a very specific heterogeneous context" over and over to argue SMT is doomed and irrelevant, and ignoring all of the other vendors (IBM, MIPS, Marvell) claiming significant power and area advantage for multithreading, is not all that convincing.

Intel says a lot of things. Betcha when Lion Cove shows up in servers, SMT will be right there.
I think this little footnote is telling:


Sounds like Intel doesn't actually have an optimized P-core with HT components removed because the comparison is purely hypothetical. If they did have a Lion Cove core in LNL that was synthesized without HT, this wouldn't be hypothetical anymore. I think the version in LNL is probably just the same as ARL but with the HT fused off. That's it.
 

naukkis

Senior member
Jun 5, 2002
768
633
136
I think this little footnote is telling:
View attachment 100915

Sounds like Intel doesn't actually have an optimized P-core with HT components removed because the comparison is purely hypothetical. If they did have a Lion Cove core in LNL that was synthesized without HT, this wouldn't be hypothetical anymore. I think the version in LNL is probably just the same as ARL but with the HT fused off. That's it.

Other way around. It's hypothetical because they only have a simulation of an HT-capable Lion Cove. No hardware with HT exists for comparison.
 

Doug S

Platinum Member
Feb 8, 2020
2,467
4,024
136
Because naukkis is saying that Intel is getting rid of SMT to gain 15% ST performance. Based on the results I posted, that sounds doubtful. And Zen 6 will probably still have SMT, so Intel is giving up "free" performance to win some benchmarks.

I'd like to hear the exact wording Intel used, because I'm really skeptical that SMT hurts to that extent, unless they were talking about the hit when a second thread is running. I mean, how many benchmarks have we seen where SMT gains less than 15% throughput? Seems like there are more of those than there are where SMT gains are greater than 15%! So sure, it gains more in some cases, but if the hit was that big I would argue it has NEVER made sense.

I could believe that SMT required 15% more power at iso-frequency when not running, since the additional transistors to support it are all over the core and couldn't be effectively gated.
 