Ryzen: Strictly technical

Page 77
Status
Not open for further replies.

B-Riz

Golden Member
Feb 15, 2011
1,530
676
136
Asus X470-Pro, 2700X. Booted this morning for first time, everything on Auto, core voltage was at 1.4+. Set negative offset, but it is still high, needs more tweaking.

My 1700 / AB350 Gaming 3 is set with negative offset and all core at 36.25 multiplier, no stability issues.
 

Timur Born

Senior member
Feb 14, 2016
277
139
116
What happened to the Civilization AI test on Tomshardware? All CPUs are listed at least 30% faster compared to any previous reviews. They get 12 seconds for a stock 1800X, while mine needs about 18.5 seconds (either stock or OCed to 3.95). No way of comparing the 2700X with the 1800X or 8700K with that data.

Too bad, because everyone else keeps listing FPS only for these games, while in practice I have to wait half a minute for Total War's AI to finish its turns.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
It's really all we have now that the boost is so good. First time I've seen decreasing the voltage be the correct move for higher clocks.

The issue which people might be facing with undervolting is the vast difference between the cores.
Check out the voltage the CPU calculates for the best core and the worst core, on this CPU:

The best core (core 2):


The worst core (core 7):


100mV difference.
 

repoman0

Diamond Member
Jun 17, 2010
4,543
3,464
136
The issue which people might be facing with undervolting is the vast difference between the cores.
Check out the voltage the CPU calculates for the best core and the worst core, on this CPU:

The best core (core 2):


The worst core (core 7):


100mV difference.

Interesting, so you are setting process affinity here? For some reason I was assuming that the CPU might not necessarily even boost the worst cores up to 4.35 GHz. But I guess it can't pick and choose what core the OS wants to run something on, and it should behave the same in terms of clock speed regardless of what the OS chooses.

I'd like to understand why the negative offset voltage makes the boost algorithm choose to use higher frequencies in the first place. I don't have enough of an understanding of it to tell if it's a quirk of my particular board or what. Before applying the offset I was a little disappointed in the behavior of the boost on my sample, wouldn't hold 4.35 on one core load and really still won't, though it at least touches it every once in a while.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Interesting, so you are setting process affinity here? For some reason I was assuming that the CPU might not necessarily even boost the worst cores up to 4.35 GHz. But I guess it can't pick and choose what core the OS wants to run something on, and it should behave the same in terms of clock speed regardless of what the OS chooses.

I'd like to understand why the negative offset voltage makes the boost algorithm choose to use higher frequencies in the first place. I don't have enough of an understanding of it to tell if it's a quirk of my particular board or what. Before applying the offset I was a little disappointed in the behavior of the boost on my sample, wouldn't hold 4.35 on one core load and really still won't, though it at least touches it every once in a while.

I'm using manual affinity, however I'm also using the "Eternal XFR" or ASUS Performance Enhancement Mode 3.
Just to check what the practical difference between the cores is (at Fmax).

Normally the worse / worst cores will not boost to the max.
 
Reactions: Drazick

blackarchon

Junior Member
Mar 14, 2017
7
1
11
What happened to the Civilization AI test on Tomshardware? All CPUs are listed at least 30% faster compared to any previous reviews. They get 12 seconds for a stock 1800X, while mine needs about 18.5 seconds (either stock or OCed to 3.95). No way of comparing the 2700X with the 1800X or 8700K with that data.

Too bad, because everyone else keeps listing FPS only for these games, while in practice I have to wait half a minute for Total War's AI to finish its turns.
One of the last patches for Civ 6 brought some performance changes. You cannot compare results from different game versions.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,361
5,020
136

scannall

Golden Member
Jan 1, 2012
1,948
1,640
136
One thing I am noticing from the VRM on some of the high-end X470 boards is they are extremely overkill.

http://www.overclock.net/forum/27199649-post2954.html
https://www.hardwareluxx.de/community/f12/pga-am4-mainboard-vrm-liste-1155146.html

Could it be that Ryzen 2 next year will have 12 core CPUs for the desktop platform?
I think that is very unlikely. It would make the IF topology a LOT more complex, pretty much a ground up redesign. Not saying no way, no how. Just very unlikely.
 

Wall Street

Senior member
Mar 28, 2012
691
44
91
One thing I am noticing from the VRM on some of the high-end X470 boards is they are extremely overkill.

Could it be that Ryzen 2 next year will have 12 core CPUs for the desktop platform?

I think the reason for overkill VRMs may be that Precision Boost Overdrive basically works by disabling the power limiter.
 
Reactions: coercitiv

mat9v

Member
Mar 17, 2017
25
0
66
I think that is very unlikely. It would make the IF topology a LOT more complex, pretty much a ground up redesign. Not saying no way, no how. Just very unlikely.
Unless the CCX would grow to 6 cores, as I suppose it is "destined to" grow in 7nm. While 3 CCX complexes would complicate things, two bigger ones would largely not affect the overall design - after all, CCXs are connected to IF by the L3 cache and you would simply add cores and connect them to the L3 cache on the other end. It is by no means a trivial task, as you would need to rearrange parts to keep the rectangular shape of the whole CPU, but since they are doing it for 7nm, maybe they would be able to do it for 12nm as well. It is also not outside the realm of possibility that it would be an early high-risk production on 7nm - with the right price, as the Intel 8-core part will probably be much pricier than the 8700K.
 

IRobot23

Senior member
Jul 3, 2017
601
183
76
Asus X470-Pro, 2700X. Booted this morning for first time, everything on Auto, core voltage was at 1.4+. Set negative offset, but it is still high, needs more tweaking.

My 1700 / AB350 Gaming 3 is set with negative offset and all core at 36.25 multiplier, no stability issues.

I don't think that you need any tweaking for your system.

With a better cooler it will OC up to ~4.1 GHz. High voltage (which is still within the safe margin) is one of the factors that cause degradation. You have to realize that 1.425 V at 50% load is nowhere near as dangerous as the same voltage at 100% load.

I know there are lots of people claiming that their CPU degraded below 1.4 V at 80 °C within a month or two of playing games. Basically this is impossible.
If I have a CPU that has near-zero degradation below 1.45 V at full 100% load, 1.5 V can still do much more damage in a few months than 1.45 V does in years.

So no, you don't need tweaking unless you want a manual OC. I would suggest you OC the RAM to 3600 MT/s if possible, otherwise 3200-3466 MT/s.
 

Timur Born

Senior member
Feb 14, 2016
277
139
116
One of the last patches for Civ 6 brought some performance changes. You cannot compare results from different game versions.
I am using the latest version myself and measure around 18.5 seconds on a 1800X (OC or stock), which already is a bit faster than all the older reviews' results. But 12 seconds would be a huge jump in performance that I am not getting here.

Does the GPU impact AI performance? If so then spending any kind of money on a new CPU would be rather useless in comparison.

Can anyone test XFR2 using Total War (Warhammer)? This game is completely bottlenecked (maxed out core) by its main-thread, to the point where AI calculations can drop the frame-rate. But on the other hand XFR on the 1800X does absolutely nothing for it, because all the small low load extra threads keep the 1800X at a 3.7 GHz maximum (aka no boost to 4.1 GHz for that bottlenecking main thread). If XFR2 can handle these situations better then it's a real incentive to switch a Zen to a Zen+.
 

IRobot23

Senior member
Jul 3, 2017
601
183
76
I am using the latest version myself and measure around 18.5 seconds on a 1800X (OC or stock), which already is a bit faster than all the older reviews' results. But 12 seconds would be a huge jump in performance that I am not getting here.

Does the GPU impact AI performance? If so then spending any kind of money on a new CPU would be rather useless in comparison.

Can anyone test XFR2 using Total War (Warhammer)? This game is completely bottlenecked (maxed out core) by its main-thread, to the point where AI calculations can drop the frame-rate. But on the other hand XFR on the 1800X does absolutely nothing for it, because all the small low load extra threads keep the 1800X at a 3.7 GHz maximum (aka no boost to 4.1 GHz for that bottlenecking main thread). If XFR2 can handle these situations better then it's a real incentive to switch a Zen to a Zen+.

I am interested in that too... maybe AMD could optimize XFR2 for that.
 

wahdangun

Golden Member
Feb 3, 2011
1,007
148
106
I am using the latest version myself and measure around 18.5 seconds on a 1800X (OC or stock), which already is a bit faster than all the older reviews' results. But 12 seconds would be a huge jump in performance that I am not getting here.

Does the GPU impact AI performance? If so then spending any kind of money on a new CPU would be rather useless in comparison.

Can anyone test XFR2 using Total War (Warhammer)? This game is completely bottlenecked (maxed out core) by its main-thread, to the point where AI calculations can drop the frame-rate. But on the other hand XFR on the 1800X does absolutely nothing for it, because all the small low load extra threads keep the 1800X at a 3.7 GHz maximum (aka no boost to 4.1 GHz for that bottlenecking main thread). If XFR2 can handle these situations better then it's a real incentive to switch a Zen to a Zen+.


It's already doing that: it will boost to the utmost limit, which is why in some reviews the all-core boost reaches 4.2 GHz.
 

Timur Born

Senior member
Feb 14, 2016
277
139
116
It's already doing that: it will boost to the utmost limit, which is why in some reviews the all-core boost reaches 4.2 GHz.
That's nice and dandy, but boosting all cores to a lower frequency instead of boosting two cores to a higher frequency doesn't help much with software that bottlenecks on one or two cores.
 

Wall Street

Senior member
Mar 28, 2012
691
44
91
That's nice and dandy, but boosting all cores to a lower frequency instead of boosting two cores to a higher frequency doesn't help much with software that bottlenecks on one or two cores.

The point is that the 4.2 GHz all-core boost is barely below the maximum single-core boost.

Rather than counting the number of active cores and limiting boost to a fixed number that represents some theoretical maximum power consumption for that many active cores, XFR2 now looks at the actual power consumption. As long as that is below TDP (or 140 W for the 2700X), the temps are in check, and the cores can handle higher frequency/voltage, it will try to boost higher. If most of the cores are running at 20-30% load, they contribute very little to total power draw, which allows the chip to keep on boosting.

Moving all of the threads to two cores just to get a higher clock on those two cores also limits performance, because those two faster cores need to do all that extra work.
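
The behaviour described above can be sketched as a simple limit check. This is only an illustration of the idea in the post, not AMD's actual algorithm: the function name, the 85 °C temperature limit, the 25 MHz step and the 4350 MHz ceiling are all assumptions made for the example; only the 140 W figure comes from the post.

```python
# Hypothetical sketch of a telemetry-driven boost step (not AMD's real algorithm).
POWER_LIMIT_W = 140.0   # figure quoted in the post for the 2700X
TEMP_LIMIT_C = 85.0     # assumed value, for illustration only
FMAX_MHZ = 4350         # assumed ceiling, for illustration only
STEP_MHZ = 25           # assumed step size, for illustration only


def next_frequency(current_mhz, package_power_w, temp_c):
    """Return the next target frequency from measured power and temperature."""
    within_limits = package_power_w < POWER_LIMIT_W and temp_c < TEMP_LIMIT_C
    if within_limits:
        # Headroom left: step up, but never past the ceiling.
        return min(current_mhz + STEP_MHZ, FMAX_MHZ)
    # A limit has been reached: back off one step.
    return current_mhz - STEP_MHZ


# Many lightly loaded cores draw little power, so the chip keeps stepping up.
print(next_frequency(4200, package_power_w=95.0, temp_c=70.0))   # 4225
# A heavy load sitting at the power limit forces a step down.
print(next_frequency(4200, package_power_w=140.0, temp_c=70.0))  # 4175
```

The point of the sketch is that the number of active cores never appears as an input; only measured power and temperature decide whether another step is allowed.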
 

Timur Born

Senior member
Feb 14, 2016
277
139
116
The point is that the 4.2 GHz all-core boost is barely below the maximum single-core boost.
Which is quite a let-down. Single - or rather dual - core boost should be able to hit higher frequencies and thus benefit all the myriads of apps that only make use of 1-2 cores worth of load.

Moving all of the threads to two cores just to get a higher clock on those two cores also limits performance, because those two faster cores need to do all that extra work.
In practice you often have one thread maxing out one single core while all the other threads and processes sum up to one or one-and-a-half more cores' worth of utilization. Running two cores at high frequency then speeds up that one-core bottlenecking thread without slowing down all the others.
 

Wall Street

Senior member
Mar 28, 2012
691
44
91
Which is quite a let-down. Single - or rather dual - core boost should be able to hit higher frequencies and thus benefit all the myriads of apps that only make use of 1-2 cores.

It does go higher by about 100 MHz. These clock speeds and voltages are well out of the efficient range, so the last 100 MHz costs 20-30% more power draw.

Power increases linearly with frequency and with the square of the voltage. Ryzen requires more than a linear voltage increase above 4 GHz, so power increases at more than the cube of the frequency increase.
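
A quick way to see the scaling described above is to plug ratios into the usual dynamic-power approximation, P proportional to f * V^2. This is a rough sketch that ignores leakage and assumes constant switched capacitance; the 5% and 10% voltage steps are made-up inputs for illustration, not measurements.

```python
def relative_power(freq_ratio, volt_ratio):
    """Dynamic power relative to baseline, using P ~ f * V^2."""
    return freq_ratio * volt_ratio ** 2


# 5% more frequency at unchanged voltage: power rises by the same 5%.
same_voltage = relative_power(1.05, 1.00)
# 5% more frequency with voltage also raised 5%: the cubic case.
linear_voltage = relative_power(1.05, 1.05)
# 5% more frequency with voltage raised 10% (assumed steeper-than-linear case).
steep_voltage = relative_power(1.05, 1.10)

for label, value in [("same voltage", same_voltage),
                     ("linear voltage", linear_voltage),
                     ("steep voltage", steep_voltage)]:
    print(f"{label}: {(value - 1) * 100:.1f}% more power")
```

With voltage rising in step with frequency the result is the cubic case (about 16% more power for 5% more clock); once voltage has to rise faster than frequency, the cost climbs well beyond that.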
 

Timur Born

Senior member
Feb 14, 2016
277
139
116
And you mean that running 2 cores at 4.7 GHz draws more power than running 8 cores at 4.2 GHz (and the rest idle/parked)?

On a side-note: Aggressive core parking or core affinity settings can be used to make XFR1 more useful, at least for some applications. Unfortunately software like Total War cannot be used with this, because it is so badly written that its CPU utilization increases from a maximum of 12% to fully choking all remaining cores once you restrict affinities or increase the number of parked cores.

If XFR2 is able to handle these things more elegantly by itself then it is useful.
 

LightningZ71

Golden Member
Mar 10, 2017
1,661
1,945
136
Unless the CCX would grow to 6 cores, as I suppose it is "destined to" grow in 7nm. While 3 CCX complexes would complicate things, two bigger ones would largely not affect the overall design - after all, CCXs are connected to IF by the L3 cache and you would simply add cores and connect them to the L3 cache on the other end. It is by no means a trivial task, as you would need to rearrange parts to keep the rectangular shape of the whole CPU, but since they are doing it for 7nm, maybe they would be able to do it for 12nm as well. It is also not outside the realm of possibility that it would be an early high-risk production on 7nm - with the right price, as the Intel 8-core part will probably be much pricier than the 8700K.

I would imagine that AMD would not want to change the CCX design itself for ZEN anytime soon. The 4 cores in each CCX are effectively directly connected to each other, the shared L3 in the CCX, and the port to the IF for the whole chip. That's 5 high speed links per core. If you increase the core count per CCX to 6, that gives each core 7 total links, with 5 links from each core crossing the CCX to get to the other cores. The CCX gets a LOT more complicated as it goes from having 6 inter-processor links to 12, doubling them. It also degrades the CCX L3 cache efficiency unless you also increase it by 50%, increasing the size of the whole CCX further. This will result in having to rearrange the functional units in the die. Having larger CCXs means that they will have greater requirements to communicate between them. This will likely require that the IF be modified to increase its ability to transmit data between the CCXs.

The alternative is to rearrange the functional units in the die to accommodate a third CCX. The CCX designs can be retained (save for any internal core changes that can be essentially done once and copied through the rest) to keep the shrink simple. What gets more complicated? An additional port is needed in the IF, which was designed as a flexible, adaptable interconnection logic in the first place. That's basically it, aside from some internal chip management logic being tweaked to support it. What are the drawbacks? The IF will be more heavily utilized by inter CCX communications. This can be alleviated by increasing the speed or width of the IF pathways. That's not going to be the simplest of things to be sure, but, it will not only help with inter CCX communications, but it can also help with communication latency between the CCXs and the PCI bus, memory controllers, etc.

So, what we have is both approaches requiring the IF to be modified to accommodate them properly, both approaches requiring the die to get a new floor plan, and both impacting the multi-chip package interconnect traffic (more cores, more cross talk). One requires a major tear-up of the CCX, the other just requires that a third one be added. If AMD is as resource-strapped as they appear to be, which route do you think that they would take?

My guess is that they are going to make slight tweaks to the cores themselves, they will add a third CCX (probably rotating them 90 degrees and lining them up side by side), and they will increase the ability of the IF to move data around inside of the chip. I'd imagine that there will also be a new DRAM controller, likely with either faster DDR4 or maybe in a later revision, DDR5 support (though, I feel that that will require a new socket revision, like AM4+).
 
Reactions: Vattila

mat9v

Member
Mar 17, 2017
25
0
66
I would imagine that AMD would not want to change the CCX design itself for ZEN anytime soon. The 4 cores in each CCX are effectively directly connected to each other, the shared L3 in the CCX, and the port to the IF for the whole chip. That's 5 high speed links per core. If you increase the core count per CCX to 6, that gives each core 7 total links, with 5 links from each core crossing the CCX to get to the other cores. The CCX gets a LOT more complicated as it goes from having 6 inter-processor links to 12, doubling them. It also degrades the CCX L3 cache efficiency unless you also increase it by 50%, increasing the size of the whole CCX further. This will result in having to rearrange the functional units in the die. Having larger CCXs means that they will have greater requirements to communicate between them. This will likely require that the IF be modified to increase its ability to transmit data between the CCXs.

The alternative is to rearrange the functional units in the die to accommodate a third CCX. The CCX designs can be retained (save for any internal core changes that can be essentially done once and copied through the rest) to keep the shrink simple. What gets more complicated? An additional port is needed in the IF, which was designed as a flexible, adaptable interconnection logic in the first place. That's basically it, aside from some internal chip management logic being tweaked to support it. What are the drawbacks? The IF will be more heavily utilized by inter CCX communications. This can be alleviated by increasing the speed or width of the IF pathways. That's not going to be the simplest of things to be sure, but, it will not only help with inter CCX communications, but it can also help with communication latency between the CCXs and the PCI bus, memory controllers, etc.

So, what we have is both approaches requiring the IF to be modified to accommodate them properly, both approaches requiring the die to get a new floor plan, and both impacting the multi-chip package interconnect traffic (more cores, more cross talk). One requires a major tear-up of the CCX, the other just requires that a third one be added. If AMD is as resource-strapped as they appear to be, which route do you think that they would take?

My guess is that they are going to make slight tweaks to the cores themselves, they will add a third CCX (probably rotating them 90 degrees and lining them up side by side), and they will increase the ability of the IF to move data around inside of the chip. I'd imagine that there will also be a new DRAM controller, likely with either faster DDR4 or maybe in a later revision, DDR5 support (though, I feel that that will require a new socket revision, like AM4+).
1. Cores inside a CCX are not connected to each other, they are only connected to the L3 cache - so only 2 additional links to the L3 cache would be needed. https://www.custompcreview.com/wp-content/uploads/2017/06/amd-ccx-epyc-cpu-presentation-slides-1.jpg - there you can see 6 links inside the L3 cache and 4 links from L3 to L2 cache. If anything, adding 2 cores would add 5 or 9 additional links between cache slices, depending on whether you wanted to keep the each-to-each philosophy. It would also be nice to add another 8MB of cache.
2. Links between modules in Threadripper are made from IF links from the "ring" connecting two CCX complexes - as such they would not change.
3. Adding 2 cores to the CCX, while it adds to complexity, would be nowhere near as difficult as creating an additional "port" on IF. It would be a strain on the L3 cache, but it is debatable how big of one it would be. On the other hand, it would allow programs that use 6 cores to avoid crossing IF to communicate with each other - that alone could be much more important to games than any other improvement AMD can come up with.
4. Adding an IF port, while simpler to do, would be much worse performance-wise. Increasing IF frequency is something that AMD is already struggling with (otherwise we would all be running with IF at 1:1 and not 1:2), and changing its width would only compound the problem. Sure, this is something that will have to be taken care of in the future, because ZEN2 in its 64-core variety will definitely be based on 4 modules of 16 cores each - the only question is whether AMD goes the "bigger CCX" or the "more CCXs" route to do that. My only hope is it will be the first one. On the other hand, increasing IF throughput and lowering latency is a great goal in itself, as it will pay dividends in the future. AMD will have to increase throughput between modules in EPYC anyway, by either higher IF frequency or wider links.
It is worth remembering that all those IF links suck a lot of power and are the main reason ZEN is not very efficient at low loads and completely sucks at idle.
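
The slice-to-slice link counts in point 1 can be sanity-checked with a full-mesh count. This is only a sketch: it assumes every L3 slice is linked directly to every other slice, which is what the "each to each" wording suggests, and the function name is made up for the example.

```python
def full_mesh_links(nodes):
    """Point-to-point links needed to connect every node to every other node."""
    return nodes * (nodes - 1) // 2


four_slices = full_mesh_links(4)
six_slices = full_mesh_links(6)

print(four_slices)               # 6
print(six_slices)                # 15
print(six_slices - four_slices)  # 9
```

Four slices give the 6 links visible in the slide, and going to six slices in a full mesh adds 9 more, which matches the upper figure in point 1.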
 

Timur Born

Senior member
Feb 14, 2016
277
139
116
I meant Precision Boost earlier, not XFR, albeit in essence I just talk about frequencies at single/dual-core load. Sorry for the confusion.

According to The Stilt, cores need to deep sleep in the C6 state in order for PB to boost single cores even higher than all-core boosts. I would hope that C3 is enough, but we don't get to control these states via BIOS for the time being anyway. Neither the Balanced nor the AMD Ryzen Balanced power profile seems to let individual cores stay in C6 long enough for PB1 to kick in for a useful length of time.

I did a quick test on my 1800X using the Cinebench Single-Core benchmark. The 1800X does 4.1 GHz single/dual-core and 3.7 GHz all-core PB1 + XFR1. All tests using the AMD Ryzen Balanced power profile.

No core parking: 154
Core parking 50%: 156
Core parking 100% (min 1 core unparked): 159

More cores parked means PB1 being more active in its highest frequency single/dual-core state. This can lead to better performance for any load that doesn't sum up to more than 1-2 cores total.

All cores overclocked to 3.95 GHz (no parking): 161

The latter is interesting, because even with all those core parking tricks PB1 still falls out of single/dual-core mode too often to reach the single-core score of the all-core overclock (at nominally lower frequency).

If PB2 + XFR2 really gets around these restrictions then it should be much more useful.
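
To put the Cinebench single-core scores above into perspective, the relative gains over the no-parking run can be worked out as follows. The scores are the ones listed in this post; the helper name and labels are just for the example.

```python
BASELINE = 154  # Cinebench single-core score with no core parking

scores = {
    "core parking 50%": 156,
    "core parking 100%": 159,
    "all-core OC 3.95 GHz": 161,
}


def gain_percent(score, baseline):
    """Percentage improvement of score over baseline."""
    return (score / baseline - 1) * 100


for label, score in scores.items():
    print(f"{label}: {gain_percent(score, BASELINE):+.1f}%")
# core parking 50%: +1.3%
# core parking 100%: +3.2%
# all-core OC 3.95 GHz: +4.5%
```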
 

mtcn77

Member
Feb 25, 2017
105
22
91

Wall Street

Senior member
Mar 28, 2012
691
44
91
And you mean that running 2 cores at 4.7 GHz draws more power than running 8 cores at 4.2 GHz (and the rest idle/parked)?

Per Gamers Nexus data, going from 4.0 GHz (78 amps at 1.162 volts) to 4.2 GHz (115 amps at 1.381 volts) requires 75% more power for a 5% frequency increase. So yes, each extra 100 MHz past ~4 GHz is that bad.

Also, 4.7 GHz is just completely not possible on Ryzen 2000 without sub-ambient cooling.
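
The 75% figure follows directly from the quoted current and voltage readings, assuming both refer to the same rail so that power is simply amps times volts. A quick check, with variable names made up for the example:

```python
def power_watts(amps, volts):
    """Electrical power from current and voltage on the same rail."""
    return amps * volts


p_low = power_watts(78, 1.162)    # at 4.0 GHz
p_high = power_watts(115, 1.381)  # at 4.2 GHz

power_increase = (p_high / p_low - 1) * 100
freq_increase = (4.2 / 4.0 - 1) * 100

print(f"{p_low:.1f} W -> {p_high:.1f} W")                      # 90.6 W -> 158.8 W
print(f"+{power_increase:.0f}% power for +{freq_increase:.0f}% frequency")  # +75% power for +5% frequency
```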
 