Speculation: Ryzen 4000 series/Zen 3

Ajay · Jul 1, 2020

maddie said:
What I meant is that AMD obviously expended a lot of effort into increasing the effectiveness of SMT and now because of their pricing structure relative to memory, it's more cost effective for some server clients to disable the feature.

In other words, choosing to sell at such reduced prices negated the SMT work. Was this predicted?

AMD still needs standard Zen2 (w/SMT 2) for enterprise servers (SMBs, maybe larger) and the workstation market. It's also a 'selling' point for the, admittedly small, enthusiast market.

LightningZ71 · Jul 1, 2020

maddie said:
What I meant is that AMD obviously expended a lot of effort into increasing the effectiveness of SMT and now because of their pricing structure relative to memory, it's more cost effective for some server clients to disable the feature.

In other words, choosing to sell at such reduced prices negated the SMT work. Was this predicted?

It COULD have been predicted if planners had considered the greed of various software companies. If your hardware is providing end users a benefit that effectively reduces the incentive of that user to license additional sockets, then it's only natural that the software company would seek to increase their bottom line by changing their licensing model to reflect the new trend in processors to offer more cores and more threads. However, given the environment that those original decisions were made in, I can understand why those decisions were made.

maddie · Jul 1, 2020

LightningZ71 said:
It COULD have been predicted if planners had considered the greed of various software companies. If your hardware is providing end users a benefit that effectively reduces the incentive of that user to license additional sockets, then it's only natural that the software company would seek to increase their bottom line by changing their licensing model to reflect the new trend in processors to offer more cores and more threads. However, given the environment that those original decisions were made in, I can understand why those decisions were made.

What I wrote about is completely unrelated to software licenses. It's about CPU pricing, memory pricing and performance/thread with and without SMT as mentioned by Tuna-Fish.

amd6502 · Jul 1, 2020

Tuna-Fish said:
The server market is not currently asking for more SMT. I know for a fact that it has lately become more common for server customers to go the opposite way, and completely disable SMT on machines they purchase. There are two reasons for this:

Firstly, some of the recent security issues hit machines with SMT worse than ones without, and it's disabled for perceived security reasons.

Secondly, it's disabled because disabling it improves the performance/cost of the server.

I can imagine this for sparser workloads, as well as for extending the lifetime of older models with real security vulnerabilities. (Though in less performance-demanding cases this can probably often be compensated for by lowering the frequency.)

My thought is that disabling SMT would often be a very big hit for Perf/Watt, assuming the workloads are heavy and run at a decently high frequency on, say, a modern 4+3 wide core like Zen2 or wider (Power or Zen3), with the OS doing proper tasksetting to optimize energy efficiency.

With very far-ahead OoO execution, much of the energy ends up being spent on looking ahead and speculative execution (all to gain slightly higher IPC). SMT on loaded-up cores would rein in that tendency by moderating the IPC of those threads.

Richie Rich · Jul 2, 2020

Richie Rich said:
SMT has some pros and cons like any other tech. Please do not just talk. Give me proof that more than 50% of servers run with SMT OFF today. Give numbers, links...

I'm still waiting for your proof of how SMT is bad and that nobody has it turned on. Silence means there is no proof and you were wrong about SMT.

I guess a lot of server customers would be happy with a strong 8xALU Zen3 with SMT4.

Thunder 57 · Jul 2, 2020

Richie Rich said:
I'm still waiting for your proof of how SMT is bad and that nobody has it turned on. Silence means there is no proof and you were wrong about SMT.

I guess a lot of server customers would be happy with a strong 8xALU Zen3 with SMT4.

But I thought x86 had no future?

LightningZ71 · Jul 2, 2020

maddie said:
What I wrote about is completely unrelated to software licenses. It's about CPU pricing, memory pricing and performance/thread with and without SMT as mentioned by Tuna-Fish.

I'm not disputing what you said; I'm offering an end user perspective. The company that I work for had to make the very real decision to deal with this on a recent upgrade project. It turned out that it was more cost effective to get 50% more servers with SMT disabled from a total cost point of view than it was to pay for yearly licenses on the extra threads. Enabling SMT on the CPUs gave us a 20% improvement in total server throughput for twice the licensing costs. Given how expensive the licenses are, it was a lot more cost effective to just add more cores. Decisions about hardware features in servers are not made in a vacuum. Had they figured out that software companies were just going to change their licensing model punitively, they may have been dissuaded from putting effort into those features. I'm not arguing that SMT isn't a useful feature. I'm not saying that it was a bad idea in any way. I'm just pointing out that there are external forces that affect these things.
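To put rough numbers on that trade-off, here is a minimal sketch. Only the ratios come from the post (50% more servers, a 20% throughput gain from SMT, and doubled licensing with SMT on). The dollar figures, the service life, and the names `fleet_cost` and `fleet_throughput` are made-up placeholders, and the model assumes licensing scales with the number of threads.

```python
# Hypothetical prices: only the ratios (1.5x servers, 1.2x throughput,
# 2x licensing with SMT on) are taken from the post above.
SERVER_PRICE = 20_000                # assumed hardware cost per server
LICENSE_PER_SERVER_SMT_OFF = 15_000  # assumed yearly license cost per server with SMT off
YEARS = 3                            # assumed service life


def fleet_cost(servers, smt_on):
    """Hardware cost plus licenses over YEARS for a fleet of servers."""
    license_factor = 2.0 if smt_on else 1.0  # SMT on doubles the licensed thread count
    yearly_licenses = servers * LICENSE_PER_SERVER_SMT_OFF * license_factor
    return servers * SERVER_PRICE + yearly_licenses * YEARS


def fleet_throughput(servers, smt_on):
    """Throughput measured in units of one SMT-off server."""
    return servers * (1.2 if smt_on else 1.0)


baseline = 10
for label, servers, smt_on in (
    ("SMT on", baseline, True),
    ("SMT off, 50% more servers", baseline * 1.5, False),
):
    cost = fleet_cost(servers, smt_on)
    work = fleet_throughput(servers, smt_on)
    print(f"{label}: cost {cost:,.0f}, throughput {work:.1f}, cost per unit {cost / work:,.0f}")
```

In this simplified model the larger SMT-off fleet is cheaper in absolute terms whenever one server's SMT-off licensing over its service life costs more than the server itself, and it also delivers more total throughput (1.5x versus 1.2x the baseline).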

Tuna-Fish · Jul 2, 2020

Richie Rich said:
It sounds reasonable in theory. But why is the AWS comparison of Graviton2 vs. Rome done with SMT2 enabled?

Because with SMT disabled, the Graviton2 comes out looking pretty bad. If they enable SMT on x86 they can say that their cores are roughly as good as threads on x86.

Amazon's cloud can disable SMT, but that is mainly requested by customers who are running HPC loads. https://aws.amazon.com/blogs/compute/disabling-intel-hyper-threading-technology-on-amazon-linux/

Amazon does not know what or how their customers run on their servers (or at least, they are not supposed to know, so they can't publish stats about it), they don't know what proportion of them run SMT disabled. It might be asked about most by HPC customers, but no-one knows what proportion of AWS users run with SMT vs what proportion doesn't.

As long as most server systems have SMT ON, you are wrong. However, feel free to provide data showing that most servers are running with SMT OFF. Not to mention that some tasks with low ILP, like SQL, benefit from SMT a lot.

Richie Rich said:
SMT has some pros and cons like any other tech. Please do not just talk. Give me proof that more than 50% of servers run with SMT OFF today. Give numbers, links...

Richie Rich said:
I'm still waiting for your proof of how SMT is bad and that nobody has it turned on. Silence means there is no proof and you were wrong about SMT.

I guess a lot of server customers would be happy with a strong 8xALU Zen3 with SMT4.

I never made any claim about >50% of servers running with SMT off, just that more and more server customers are getting their servers like that. I don't have any industry-wide stats, just anecdotes that seem to be going one way.

As for reason for silence, some of us actually work with this stuff, so we don't have time to post on the forums about it every single day.

fleshconsumed · Jul 2, 2020

LightningZ71 said:
I'm not disputing what you said; I'm offering an end user perspective. The company that I work for had to make the very real decision to deal with this on a recent upgrade project. It turned out that it was more cost effective to get 50% more servers with SMT disabled from a total cost point of view than it was to pay for yearly licenses on the extra threads. Enabling SMT on the CPUs gave us a 20% improvement in total server throughput for twice the licensing costs. Given how expensive the licenses are, it was a lot more cost effective to just add more cores. Decisions about hardware features in servers are not made in a vacuum. Had they figured out that software companies were just going to change their licensing model punitively, they may have been dissuaded from putting effort into those features. I'm not arguing that SMT isn't a useful feature. I'm not saying that it was a bad idea in any way. I'm just pointing out that there are external forces that affect these things.

What software is licensed by the thread? All the software that I know of (Microsoft, Microsoft SQL Server, and VMware) is licensed by the core.

moinmoin · Jul 2, 2020

fleshconsumed said:
What software is licensed by the thread? All the software that I know of (Microsoft, Microsoft SQL Server, and VMware) is licensed by the core.

That's usually by logical core, not physical core.

fleshconsumed · Jul 2, 2020

moinmoin said:
That's usually by logical core, not physical core.

No, it's by the physical core.

Richie Rich · Jul 2, 2020

Tuna-Fish said:
Because with SMT disabled, the Graviton2 comes out looking pretty bad. If they enable SMT on x86 they can say that their cores are roughly as good as threads on x86.

Of course Graviton2 has lower IPC than Zen2. It's based on the weak A76 from 2018. But it has a core area of 1.4mm2, which allows fitting twice as many cores as Zen2 (3.6mm2). Wait for a 128-core Graviton3 based on the A78 (30% more IPC with 5% fewer transistors). And pray that they will not use Cortex X1 cores (60% higher IPC (40% more than Zen2) at 2.1mm2 area). How about that? Is x86 still looking strong?
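The arithmetic behind those figures can be checked with a short sketch. It takes the post's numbers at face value without verifying them, and it reads "60% higher IPC" as relative to the A76, which is an assumption. The Zen2 IPC value is then inferred from the "40% more than Zen2" remark. Clock speed, caches, and uncore area are ignored, and the helper names are hypothetical.

```python
# Core areas in mm^2 as claimed in the post (not independently verified).
AREA_MM2 = {"A76": 1.4, "Zen2": 3.6, "X1": 2.1}

# Relative IPC with A76 = 1.0. X1 at +60% over A76 is an assumed reading of the post;
# Zen2 is inferred from "X1 has 40% more IPC than Zen2".
IPC = {"A76": 1.0, "X1": 1.6}
IPC["Zen2"] = IPC["X1"] / 1.4


def cores_per_zen2_area(core):
    """How many cores of this type fit in the area of one Zen2 core."""
    return AREA_MM2["Zen2"] / AREA_MM2[core]


def ipc_per_mm2(core):
    """Relative IPC delivered per mm^2 of core area."""
    return IPC[core] / AREA_MM2[core]


for core in ("A76", "Zen2", "X1"):
    print(
        f"{core}: {cores_per_zen2_area(core):.2f} cores per Zen2-sized area, "
        f"relative IPC {IPC[core]:.2f}, IPC per mm2 {ipc_per_mm2(core):.2f}"
    )
```

On these numbers about 2.6 A76 cores fit in the area of one Zen2 core, and the implied Zen2 IPC is roughly 14% above the A76. Neither says anything about achievable clocks, which matter just as much for delivered performance.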

Tuna-Fish said:
I never made any claim about >50% of servers running with SMT off, just that more and more server customers are getting their servers like that. I don't have any industry-wide stats, just anecdotes that seem to be going one way.

As for reason for silence, some of us actually work with this stuff, so we don't have time to post on the forums about it every single day.

That claim was made by @LightningZ71. He is silent because he cannot prove his crazy claim. Since you work with this stuff, I'd like to know the numbers from your company. Just give us the number of machines and how many of them are running with SMT OFF.

We could have had an 8xALU, 4xAGU, 4xFPU, SMT4 CPU core in 2003, what a shame. If Zen3 isn't Keller's EV8 resurrection then AMD is in deep deep trouble. I hope Zen3 is at least 6xALU, 3xAGU, 4xFPU, SMT4.

AMD needs to bring more tech features and go forward. However, we know they can also go backward, like with Bulldozer. So who knows :/

blckgrffn · Jul 2, 2020

fleshconsumed said:
No, it's by the physical core.

Could be something Oracle related (I had issues with this 15 years ago in the good old days of ESX 3).

Or any number of business software packages that companies need. With so much abstraction available and the relative explosion of threads/cores per socket, it really messes up some monetization models for companies who rely on per-thread throughput driving revenue.

In that case, it makes the most sense to license based on CPUs, no matter whether they are real or SMT based.

moinmoin · Jul 2, 2020

fleshconsumed said:
No, it's by the physical core.

You mean to tell me SMT4 actually has a purpose? At what point did commercial software licensing get smart enough to not count logical cores?

Edit: Microsoft appears to have introduced that for servers back in April 2017. Can't quickly find anything on desktop Windows or other manufacturers. VMware essentially just limits a socket to 32 physical cores since April this year, that's not really a per-core but a restrictive per-socket license.

NostaSeronx · Jul 2, 2020

Richie Rich said:
If Zen3 isn't Keller's EV8 resurrection then AMD is in deep deep trouble.

They won't. Also, they need to go for clocks again.
PCIe 4.0 x4, PCIe 5.0 x4, PCIe 6.0 x4
DDR5-8400 -> DDR5-12000 -> DDR6-25000

The issue of slow I/O isn't an issue. AMD has also proved they don't need a big core to get high GHz.

Zen3 needs to be a K9 core with high frequency at low TDP and even higher frequency at high TDP, so that it is ready for the upcoming transfer rate boom.

mopardude87 · Jul 2, 2020

Is the 4000 series still slated for 4th quarter this year? Been kinda behind, been focusing on other hobbies and kinda coming back to this.

Hitman928 · Jul 2, 2020

mopardude87 said:
Is the 4000 series still slated for 4th quarter this year? Been kinda behind, been focusing on other hobbies and kinda coming back to this.

Yes.

mopardude87 · Jul 2, 2020

Hitman928 said:
Yes.

Thanks

jpiniero · Jul 2, 2020

Hitman928 said:
Yes.

But it might be 12 and 16 core only to start.

blckgrffn · Jul 2, 2020

moinmoin said:
You mean to tell me SMT4 actually has a purpose? At what point did commercial software licensing get smart enough to not count logical cores?

Edit: Microsoft appears to have introduced that for servers back in April 2017. Can't quickly find anything on desktop Windows or other manufacturers. VMware essentially just limits a socket to 32 physical cores since April this year, that's not really a per-core but a restrictive per-socket license.

Isn't that the issue? That the software can't tell the difference between logical and physical cores and thus charges the same for them? The licensing is *per core*, regardless of how it shows up?

That's why you would run with SMT disabled. You'd get twice as many actual physical cores for your licensing.

Having 128 full cores is going to be better, performance wise, than 64 cores with SMT showing them as 128 cores to the licensing daemon/process/server or however it's being doled out.

Wait, are we talking about @LightningZ71's use case here? Sorry if we aren't...
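The 128-versus-64 comparison above can be written out as a small sketch. It assumes licenses are counted per logical CPU and that SMT2 raises a core's total throughput by 20%, a figure borrowed from the number quoted earlier in the thread. The function name and both constants are illustrative only.

```python
def licensed_throughput(licensed_logical_cpus, smt_ways, smt_uplift):
    """Throughput, in physical-core equivalents, for a fixed logical-CPU license count.

    smt_uplift is the assumed total throughput of one core with SMT on,
    relative to the same core with SMT off.
    """
    physical_cores = licensed_logical_cpus // smt_ways
    per_core = smt_uplift if smt_ways > 1 else 1.0
    return physical_cores * per_core


LICENSES = 128    # logical CPUs covered by the license (from the post)
SMT_UPLIFT = 1.2  # assumption: 20% more throughput per core with SMT2

smt_off = licensed_throughput(LICENSES, smt_ways=1, smt_uplift=SMT_UPLIFT)
smt_on = licensed_throughput(LICENSES, smt_ways=2, smt_uplift=SMT_UPLIFT)
print(f"SMT off: {smt_off:.1f} core-equivalents from {LICENSES} physical cores")
print(f"SMT on:  {smt_on:.1f} core-equivalents from {LICENSES // 2} physical cores")
```

Under a per-logical-CPU license, SMT would have to double a core's throughput just to break even with adding physical cores, which no real SMT2 implementation does on typical workloads.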

mopardude87 · Jul 2, 2020

jpiniero said:
But it might be 12 and 16 core only to start.

Would be fine. Outside of folding this chip is vast overkill, but I got it mostly for folding. Hoping for some high clocks out of a new 12 core. I have no issues sticking with this many threads till the PS6 is a thing. I assume the PS6 may need 16 cores/32 threads, but of course that is just a theory I have.

Will have to see on prices and clocks. If I could get the same clocks I do now and also up to 16 cores while running the same temps or cooler, that would be perfect as well.

fleshconsumed · Jul 2, 2020

blckgrffn said:
Isn't that the issue? That the software can't tell the difference between logical and physical cores and thus charges the same for them? The licensing is *per core*, regardless of how it shows up?

That's why you would run with SMT disabled. You'd get twice as many actual physical cores for your licensing.

Can't speak for every piece of software there is out there, but when it comes to Microsoft the licensing is essentially based on trust. Our department gets volume license keys, giving them pretty much unlimited installs, the catch is that they're supposed to buy what they use. I'm not privy to all the details on how it's done in my company, but I do know they have a yearly review to make sure their licenses are up to date and there are no discrepancies between what they use and what they actually paid for.

Hitman928 · Jul 2, 2020

blckgrffn said:
Could be something Oracle related (I had issues with this 15 years ago in the good old days of ESX 3).

Or any number of business software packages that companies need. With so much abstraction available and the relative explosion of threads/cores per socket, it really messes up some monetization models for companies who rely on per-thread throughput driving revenue.

In that case, it makes the most sense to license based on CPUs, no matter whether they are real or SMT based.

I don't deal with the server / VM side, but the two CAD programs I use the most license per logical core. Our servers have SMT enabled because they run many different things, but for my situation, it would be better to not include SMT threads so I get the max performance possible per thread since I am limited in how many threads I can use due to licensing costs.

blckgrffn · Jul 2, 2020

Hitman928 said:
I don't deal with the server / VM side, but the two CAD programs I use the most license per logical core. Our servers have SMT enabled because they run many different things, but for my situation, it would be better to not include SMT threads so I get the max performance possible per thread since I am limited in how many threads I can use due to licensing costs.

Right, exactly what I was trying in so many words to say.

Tangent:

Back in the day, our Oracle rep was trying to get us to pay for every core in the ESX host, not just what was presented to the VM. Because theoretically at any time we could allocate more cores to it? What I took away from it really is that Oracle distrusted VMs to have adequate & consistent performance and so convinced our DBAs that all Oracle instances needed to be physical.

And then I handed them the proof they needed by giving them a template limited to 256MB of memory (but allocated ~3GB). Performance was obviously terrible, and I didn't catch it until they had started a case with Oracle, and it all went FUBAR from there. This incident cost my group a lot of credibility with that team - and this was at a time when virtualization and consolidation were new and scary and everyone treated their physical servers in the DC like pets.

End result - our first new physical Windows servers in years.

dnavas · Jul 2, 2020

NostaSeronx said:
The issue of slow I/O isn't an issue.

If PCIe 6 is actually delivered in 2021 (spec, not h/w), and RAM follows similarly, then I agree, I/O is actually starting to pick up the pace again. It'll be interesting to see how PCIe 6 is received considering the move from NRZ to PAM4 (and not strictly from, because backwards compatibility requires NRZ *and* PAM4), but bandwidth is certainly getting its day in the sun. It's about time, too.

But...

Also, they need to go for clocks again.

...doesn't follow from that. So long as you have sufficient compute throughput, bandwidth can be put to use. There's a ton of bandwidth on your graphics card, and its clocks are likely less than half of your CPU's. OTOH, if you want to ... say ... run an IDS/IPS software firewall on your brand spanking new 400Gbps fiber drop, and you need packet passing latency to be measured in nanoseconds, then yes, you might need clocks. Heck, running a software firewall at 10Gbps is probably a stretch right now, never mind the network speeds that a faster bus might unlock. Anyway, clocks are about latency and/or single-threaded behavior. Amdahl has something to say here, but you want to eye the applications that require extremely low latency.
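To see why line-rate packet inspection is tight, here is a rough per-packet budget. It reads the 10 and 400 figures above as link rates in Gbit/s and assumes standard Ethernet framing: a 64-byte minimum frame plus 20 bytes of preamble and inter-frame gap on the wire. The 4 GHz core clock is a further assumption and the function names are hypothetical.

```python
WIRE_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap (standard Ethernet)
MIN_FRAME_BYTES = 64      # minimum Ethernet frame size


def ns_per_packet(link_gbps, frame_bytes=MIN_FRAME_BYTES):
    """Time between back-to-back frames at line rate, in nanoseconds."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_gbps  # bits / (Gbit/s) gives nanoseconds


def cycles_per_packet(link_gbps, cpu_ghz, frame_bytes=MIN_FRAME_BYTES):
    """Clock cycles one core has per packet if it handles every packet serially."""
    return ns_per_packet(link_gbps, frame_bytes) * cpu_ghz


CPU_GHZ = 4.0  # assumed core clock
for gbps in (10, 400):
    print(
        f"{gbps} Gbit/s: {ns_per_packet(gbps):.2f} ns per minimum-size packet, "
        f"{cycles_per_packet(gbps, CPU_GHZ):.1f} cycles at {CPU_GHZ:.0f} GHz"
    )
```

With minimum-size frames a single core gets a few hundred cycles per packet at 10 Gbit/s and only a handful at 400 Gbit/s, so the work has to be spread across cores or offloaded to hardware well before clocks could plausibly catch up.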

However, I suspect we're going to see much higher parallelism before higher clocks. If I can make my firewall rule resolution happen in parallel and build a simple hardware bit search to find which rule (if any) applies first, I can make my software solution run 10s of times faster. Call me when you see a path to 50GHz CPUs. So long as we can extract parallelism, I expect core scaling to continue to be an easier problem and a better solution in most cases than higher clocks.
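Amdahl's law, mentioned earlier in this post, puts a ceiling on how far that parallelism goes. A minimal sketch follows; the 95% parallel fraction is an arbitrary example value, not a measurement of any firewall.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup over one worker when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


PARALLEL_FRACTION = 0.95  # assumed share of the work that parallelizes
for workers in (2, 16, 64, 1024):
    print(f"{workers:5d} workers: {amdahl_speedup(PARALLEL_FRACTION, workers):.2f}x")

# With unlimited workers the speedup approaches 1 / serial_fraction.
print(f"upper bound: {1.0 / (1.0 - PARALLEL_FRACTION):.1f}x")
```

The serial remainder is the part where clocks (or per-core IPC) still matter: even with 95% of the work parallelized, no number of cores gets past about 20x.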

Speaking of which, whatever happened to Zen2 embedded V3xxx? It'd be nice to at least move to PCIe 4, never mind the theoretical PCIe 6 we're talking about here.
