AMD Ryzen Gen 2 Set For Q2 2018

krumme · Mar 9, 2018

Atari2600 said:
Perhaps the explanation is quite simple - they expect a minor respin to yield them enough improvement for a clock bump later on, so can launch a 2800X in 6 months time.

A 2800X wouldn't make much difference. It would still be behind in PUBG (which I can't find in the list of games in the PPT) and in older titles.
Instead, put the best dies aside for the 2950X and it will be a tough processor that beats all the SKL-X stuff.

The Stilt · Mar 9, 2018

krumme said:
into the 2950x and it will be a tough processor that beats all the sklx stuff.

As a 7960X owner I might be biased, but IMO you cannot compare Threadripper with any of the SKL-X SKUs. Threadripper is a 2P solution and that causes some limitations, or is otherwise generally unoptimal for consumer workloads.

bsp2020 · Mar 9, 2018

The Stilt said:
As a 7960X owner I might be biased, but IMO you cannot compare Threadripper with any of the SKL-X SKUs. Threadripper is a 2P solution and that causes some limitations, or is otherwise generally unoptimal for consumer workloads.

What are the consumer workloads that you are talking about? Just curious. I never thought that a person would buy a 7960X-class machine for consumer-class workloads. In my mind, consumer workloads mean web browsing, document editing, gaming (not 120Hz competitive gaming), etc. So I'd like to hear what you consider consumer workloads.

DrMrLordX · Mar 9, 2018

FIVR said:
No, it's definitely that they are holding back.

If you say so. I still see no indication from Intel that they will launch anything faster than the 8700K in the next six months. Any launch beyond that, and they are running up against a short market window for the product.

Markfw · Mar 9, 2018

The Stilt said:
As a 7960X owner I might be biased, but IMO you cannot compare Threadripper with any of the SKL-X SKUs. Threadripper is a 2P solution and that causes some limitations, or is otherwise generally unoptimal for consumer workloads.

Why not? It's not a 2P solution, that's EPYC. AMD even says that your 7960X is the competition for the 1950X, and at $1565 vs $900 (current prices).

PeterScott · Mar 9, 2018

The Stilt said:
As a 7960X owner I might be biased, but IMO you cannot compare Threadripper with any of the SKL-X SKUs. Threadripper is a 2P solution and that causes some limitations, or is otherwise generally unoptimal for consumer workloads.

I don't remember seeing that in reviews. IIRC there were a couple of VERY specific use cases where performance dropped, but in most workloads TR looked quite good, with better price/performance than the Intel HEDT.

Though I am really only interested in more consumer parts, not the HEDT parts, so I could be remembering that wrong.

yummycandy · Mar 9, 2018

Markfw said:
Why not? It's not a 2P solution, that's EPYC. AMD even says that your 7960X is the competition for the 1950X, and at $1565 vs $900 (current prices).

I think he means the NUMA nodes and the problems that come with them. SKL-X has only one NUMA node.

Markfw · Mar 9, 2018

yummycandy said:
I think he means the NUMA nodes and the problems that come with them. SKL-X has only one NUMA node.

If you set it to gaming mode, you then have only one NUMA node, I am pretty sure. It does BOTH.

xblax · Mar 9, 2018

Non-consumer workloads shouldn't suffer much from two different NUMA nodes because they are parallelized carefully to avoid inter-thread dependencies as much as possible. If, however, Windows assigns consumer tasks with fewer than 8 threads to cores across two different NUMA nodes, that would just be a bad scheduler implementation by Microsoft.

I don't really get why Windows moves tasks across cores anyway. Sure, that distributes heat better, but it also causes a lot of unnecessary cache misses and context switches. They should rather keep a task on one core and not interrupt it if possible.
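
To make the "keep a task on one core" idea concrete, here is a minimal sketch of pinning a process to a fixed set of cores. It uses Python's `os.sched_setaffinity`, which is only available on Linux, so it does not show what Windows itself does; the helper name `pin_to_cpus` is made up for this illustration.

```python
import os

def pin_to_cpus(cpus):
    """Restrict the calling process to the given CPU ids and return the old set.

    Hypothetical helper for illustration; relies on the Linux-only
    os.sched_getaffinity / os.sched_setaffinity calls.
    """
    previous = os.sched_getaffinity(0)   # 0 means "the calling process"
    os.sched_setaffinity(0, cpus)
    return previous

if hasattr(os, "sched_setaffinity"):
    # Pick the first CPU this process is currently allowed to run on.
    one_core = {sorted(os.sched_getaffinity(0))[0]}
    previous = pin_to_cpus(one_core)
    pinned = os.sched_getaffinity(0)     # scheduler may no longer migrate us
    os.sched_setaffinity(0, previous)    # restore the original affinity
    print("pinned to", pinned, "then restored to", len(previous), "CPUs")
else:
    print("CPU affinity calls are not available on this platform")
```

On a two-node part the same call could be given only the CPU ids of one die, which keeps a lightly threaded task from being spread across NUMA nodes. Which ids belong to which die is system specific and is not assumed here.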

formulav8 · Mar 9, 2018

Markfw said:
If you set it to gaming mode, you then have only one NUMA node, I am pretty sure. It does BOTH.

Gaming mode shuts down half of the TR cores though.

mattiasnyc · Mar 9, 2018

I believe you can set NUMA / NON-NUMA separately from how many cores are used. Correct?

William Gaatjes · Mar 10, 2018

xblax said:
Non-consumer workloads shouldn't suffer much from two different NUMA nodes because they are parallelized carefully to avoid inter-thread dependencies as much as possible. If, however, Windows assigns consumer tasks with fewer than 8 threads to cores across two different NUMA nodes, that would just be a bad scheduler implementation by Microsoft.

I don't really get why Windows moves tasks across cores anyway. Sure, that distributes heat better, but it also causes a lot of unnecessary cache misses and context switches. They should rather keep a task on one core and not interrupt it if possible.

Thread migration does not occur to distribute heat; it is more about being as energy efficient as possible.
At the time of Windows 7, tablets were not the first thing in mind, and Windows was run on desktops or laptops.
With Windows 8 and later Windows 10, Microsoft had a paradigm shift towards energy efficiency, since Windows 10 is thought of as a mobile OS, meaning: save as much power as possible.
Of course, the different competing programming groups at Microsoft always cause good ideas to not come fully to fruition in the long run.

This is kind of why :

https://www.gamersnexus.net/news-pc/2870-ryzen-power-plan-update-min-frequency-90-pct

“Win7 keeps all physical cores awake, and parks SMT cores. Win10 keeps one physical and one logical core awake (Core0+1), then parks the rest as often as possible. This change alone is what’s responsible for the cases where Win7 was faster than Win10 gaming performance, not the scheduler as the community thought.”

And my reasoning about it :

At first sight this is strange behavior for Windows 10 in comparison, except from an energy efficiency perspective.
I can imagine that if a thread is running on a logical core, Windows 10 would migrate it to a physical core instead when utilization of the cores jumps to max.
I can imagine that the kernel has performance counters to track how much utilization there is; I mean, the Task Manager shows it, so it makes sense that the kernel uses it as well.
So for energy efficiency: keep threads on one physical core and its (SMT) logical core, keep all other physical cores and their accompanying logical cores parked, and move a thread from the used logical core (because a physical core and its logical SMT core share the same hardware) to a free but parked physical core when core utilization is getting maxed out. That is thread migration.
Just an idea, I do not know if this is really the case.

krumme · Mar 10, 2018

The Stilt said:
As a 7960X owner I might be biased, but IMO you cannot compare Threadripper with any of the SKL-X SKUs. Threadripper is a 2P solution and that causes some limitations, or is otherwise generally unoptimal for consumer workloads.

SKL-X comes factory overclocked out of the box, with the TDP numbers being wrong.
If AMD bins more aggressively because there is no 2800X model, they end up with more good dies for TR. Combine that with nearly the same factory OC as Intel applies to some variants and I am sure it will prove faster.
You will always pick an 8700K for best gaming, so for consumer workloads I don't know which NUMA-related issues are relevant. AVX2 and AVX-512 are surely faster on the Intel solution, but IMO that covers few workloads.

The Stilt · Mar 10, 2018

Markfw said:
Why not? It's not a 2P solution, that's EPYC. AMD even says that your 7960X is the competition for the 1950X, and at $1565 vs $900 (current prices).

Threadripper IS a 2P solution. The only difference to conventional MP solutions is that both of the processors reside on the same substrate.

If the workloads are NUMA aware, then obviously everything is fine.
However, if (or rather when) the workloads aren't NUMA aware, you should be using UMA mode instead.
In UMA mode the latency is pretty atrocious since some of the memory accesses always have to cross the die boundary (through GMI), causing the latency to increase even further beyond the already high base level.

Threadripper is an excellent solution for certain, very specific workloads, especially when considering its price.
However, personally I went with a 7960X instead because I wanted something which is a true single CPU solution and has better performance in >= 256-bit workloads.
And that's despite the fact that I have a 1950X and a board for it sitting on the shelf, so I didn't make my choice based solely on assumptions either.

The Stilt · Mar 10, 2018

krumme said:
SKL-X comes factory overclocked out of the box, with the TDP numbers being wrong.

While that might technically be the case, the statement above is still untrue.

Just like any other CPU from AMD or Intel, SKL-X obeys its set power limits.
The issue isn't that Intel would be shipping them with a wrong TDP rating, because they aren't.
The issue is that the motherboard ODMs are altering the default power limits (PL1, PL2) and their durations (Tau) automatically.

For 165W SKL-X SKUs the power limits default to, and should remain at, 165W (PL1) and 206W (PL2). However, on most systems, including my own, the power limits defaulted to 4096W (automatically set by the ODM). Obviously any power-limited chip that has its power limiters disabled will exceed its advertised TDP.
I've measured the actual power consumption of the 7960X using DCR, with the power limits and times correctly configured, and the CPU obeys them just as it should.

Same exact thing is done by the ODMs on Z370 platform as well.
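
As a rough illustration of why the 4096W defaults matter, here is a simplified model of the PL1/PL2/Tau mechanism. It is a sketch and not Intel's actual algorithm: it assumes PL2 caps instantaneous power and PL1 caps an exponentially weighted running average with time constant Tau. The 165W, 206W and 4096W figures come from the post above; the 250W demand, the 28-second Tau and the function name `simulate_power` are hypothetical values chosen only for the example.

```python
def simulate_power(demand_w, pl1_w, pl2_w, tau_s, seconds, dt_s=1.0):
    """Return the power (W) granted at each step under a simplified model.

    Assumed model: PL2 caps instantaneous power, PL1 caps an exponentially
    weighted running average whose time constant is Tau.
    """
    alpha = dt_s / tau_s   # weight of the newest sample in the running average
    avg_w = 0.0            # running average starts from idle
    granted = []
    for _ in range(int(seconds / dt_s)):
        # Largest draw that keeps the updated running average at or below PL1.
        budget_w = avg_w + (pl1_w - avg_w) / alpha
        power_w = min(demand_w, pl2_w, budget_w)
        avg_w += alpha * (power_w - avg_w)
        granted.append(power_w)
    return granted

DEMAND_W = 250.0   # hypothetical all-core load demand, not a measured figure
TAU_S = 28.0       # hypothetical time window, the thread gives no value

stock = simulate_power(DEMAND_W, 165.0, 206.0, TAU_S, seconds=300)
unlocked = simulate_power(DEMAND_W, 4096.0, 4096.0, TAU_S, seconds=300)

print(f"stock limits:    starts at {stock[0]:.0f} W, settles at {stock[-1]:.0f} W")
print(f"4096 W 'limits': stays at {unlocked[-1]:.0f} W")
```

Under these assumptions the chip bursts at PL2 for a while and then settles at PL1, while the 4096W setting never constrains it at all, which matches the point that the TDP rating itself is not what is wrong.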

krumme · Mar 10, 2018

The Stilt said:
While that might technically be the case, the statement above is still untrue.

Just like any other CPU from AMD or Intel, SKL-X obeys its set power limits.
The issue isn't that Intel would be shipping them with a wrong TDP rating, because they aren't.
The issue is that the motherboard ODMs are altering the default power limits (PL1, PL2) and their durations (Tau) automatically.

For 165W SKL-X SKUs the power limits default to, and should remain at, 165W (PL1) and 206W (PL2). However, on most systems, including my own, the power limits defaulted to 4096W (automatically set by the ODM). Obviously any power-limited chip that has its power limiters disabled will exceed its advertised TDP.
I've measured the actual power consumption of the 7960X using DCR, with the power limits and times correctly configured, and the CPU obeys them just as it should.

Same exact thing is done by the ODMs on Z370 platform as well.

That's what I would call technically right, but IMO what matters is what hits the market.
What we saw was motherboard manufacturers having a hard time with SKL-X; the frequency and consumption were a surprise. They were stumbling. It was far from the calm introduction we are used to from Intel. The BIOSes were a mess as well.
We saw lots of throttling at standard settings and over-the-top TDP, and not by small numbers (I know it's a practice we have seen for years, along with similar tricks).
We saw a major jump in consumption from Broadwell-E and a loss in efficiency vs. the prior gen (as shown e.g. by THG).

-> It was a sudden change in policy because of TR appearing on the market. They went for top freq.

However we evaluate that, at the end of the day AMD can choose to go the same route if they want to.
If true, the 2700X is already a step in that direction, with higher frequency and more consumption. And with the 2800X apparently out of the picture, they can probably bin more for those high-end TR cores. I think that's what we will see.

Peter Watts · Mar 10, 2018

Are there any indications that the new zen+ motherboards will have onboard graphics?

PeterScott · Mar 10, 2018

Peter Watts said:
Are there any indications that the new zen+ motherboards will have onboard graphics?

?

They will support it if the CPU does, so Raven Ridge CPUs will provide graphics. New zen+ (Ryzen 2700x) will not.

MarkPost · Mar 10, 2018

The Stilt said:
Threadripper IS a 2P solution. The only difference to conventional MP solutions is that both of the processors reside on the same substrate.

If the workloads are NUMA aware, then obviously everything is fine.
However, if (or rather when) the workloads aren't NUMA aware, you should be using UMA mode instead.
In UMA mode the latency is pretty atrocious since some of the memory accesses always have to cross the die boundary (through GMI), causing the latency to increase even further beyond the already high base level.

Threadripper is an excellent solution for certain, very specific workloads, especially when considering its price.
However, personally I went with a 7960X instead because I wanted something which is a true single CPU solution and has better performance in >= 256-bit workloads.
And that's despite the fact that I have a 1950X and a board for it sitting on the shelf, so I didn't make my choice based solely on assumptions either.

Well, I would say that "256-bit workloads" are very specific workloads too. Anyway, it would be nice if you could give us some concrete examples where TR isn't a good solution, I mean concrete HEDT apps.

The Stilt · Mar 10, 2018

MarkPost said:
Anyway, it would be nice if you could give us some concrete examples where TR isn't a good solution, I mean concrete HEDT apps.

Which workloads do you qualify as such?

256-bit workloads aren't too specific IMO, as all of the modern video encoders (x264, x265, VP9, AV1) implement AVX2 and some even AVX-512.

In music production software, and even in something as non-exotic as Blender (with the physics engine implemented), you can see the further amplified effect of the high latency from TR's 2P config.

Obviously most software intended to be run on server grade hardware support NUMA anyway.
If NUMA is supported then obviously the latency isn't any more of an issue than it is on desktop Ryzens.

Atari2600 · Mar 10, 2018

krumme said:
A 2800X wouldn't make much difference. It would still be behind in PUBG (which I can't find in the list of games in the PPT) and in older titles.
Instead, put the best dies aside for the 2950X and it will be a tough processor that beats all the SKL-X stuff.

Behind yes, but if it means AMD have another unit they can charge a higher price for then it makes financial sense for them.

[There is a world beyond the epeen contest...]

MarkPost · Mar 10, 2018

The Stilt said:
Which workloads do you qualify as such?

256-bit workloads aren't too specific IMO, as all of the modern video encoders (x264, x265, VP9, AV1) implement AVX2 and some even AVX-512.

In music production software, and even in something as non-exotic as Blender (with the physics engine implemented), you can see the further amplified effect of the high latency from TR's 2P config.

Obviously most software intended to be run on server grade hardware support NUMA anyway.
If NUMA is supported then obviously the latency isn't any more of an issue than it is on desktop Ryzens.

Well, I've not checked lately, but when TR was released, it performed pretty well in video encoding tasks. SKL-X was only faster with the x265 codec, if I remember correctly. In Blender it performed pretty well too; I don't know about music production though.

Kenmitch · Mar 10, 2018

MarkPost said:
Well, I've not checked lately, but when TR was released, it performed pretty well in video encoding tasks. SKL-X was only faster with the x265 codec, if I remember correctly. In Blender it performed pretty well too; I don't know about music production though.

In the end, the only thing that really matters is whether the CPU meets your performance needs in the things you're going to do. I wouldn't really worry about what a CPU sucks at if you're not going to be doing those things anyway. Those things are more for internet arguments.

Kenmitch · Mar 10, 2018

Atari2600 said:
Behind yes, but if it means AMD have another unit they can charge a higher price for then it makes financial sense for them.

[There is a world beyond the epeen contest...]

I thought I read somewhere that the no-2800X rumor wasn't true... Something about it being different. It was a couple of days back in one of the links, or it might have been a link off one of them.

krumme · Mar 10, 2018

MarkPost said:
Well, I've not checked lately, but when TR was released, it performed pretty well in video encoding tasks. SKL-X was only faster with the x265 codec, if I remember correctly. In Blender it performed pretty well too; I don't know about music production though.

One can say x265 is the important thing today and H.264 less so. Time works for SKL-X here.

But take a TR 1950X at 900 USD and compare it to a similarly priced SKL-X on x265 encoding, and what's the difference then?
Off the top of my head, I would guess most of the advantage of those wide vectors is then gone due to the price disadvantage.
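
The price argument can also be turned around as a break-even figure. The sketch below uses only the two prices quoted earlier in the thread ($1565 for the 7960X, $900 for the 1950X) and computes how much faster the more expensive part would have to be to offer the same performance per dollar. No benchmark numbers are assumed, and the function names are made up for the illustration.

```python
# Prices as quoted earlier in the thread (March 2018), not current figures.
PRICE_7960X_USD = 1565
PRICE_1950X_USD = 900

def break_even_speedup(price_a, price_b):
    """Speedup part A needs over part B to match B's performance per dollar."""
    return price_a / price_b

def perf_per_dollar_ratio(speedup_a_over_b, price_a, price_b):
    """Above 1 means A gives more performance per dollar than B."""
    return speedup_a_over_b / break_even_speedup(price_a, price_b)

needed = break_even_speedup(PRICE_7960X_USD, PRICE_1950X_USD)
print(f"7960X must be {needed:.2f}x faster to match the 1950X per dollar")
# prints: 7960X must be 1.74x faster to match the 1950X per dollar
```

Any measured x265 advantage below that factor would leave the 1950X ahead on price/performance, which is the guess made above. This looks at CPU price only and ignores board and memory costs.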
