Official AMD Ryzen Benchmarks, Reviews, Prices, and Discussion

Azix

Golden Member
Apr 18, 2014
1,438
67
91
Causes of poor gaming relative to CPU performance of Ryzen:

1. Windows is load-balancing across CCXes.
This means that a thread is being moved around on the CPU - which is normal - so that a single core isn't used more than others. On Ryzen, that needs to happen ONLY within a CCX, otherwise you will incur a massive penalty when that thread no longer finds its data in the caches of the CCX.

2. SMT hurts single threaded performance due to shared structure.
Ryzen statically partitions three structures to support SMT:
a. Micro-op queue (dispatcher)
b. Retirement queue
c. Store queue

This means that, with SMT enabled, these resources are cut, potentially, in HALF (mind you, these are just queues that impact throughput of a single thread).

3. Memory latency quirks still not worked out.
Gaming can be quite sensitive to memory latency and bandwidth. These issues will be, most likely, remedied with BIOS updates.

Combined you can see, clearly, what is happening and most of the reviews make sense.

A Windows driver update to treat each CCX almost as if it were its own CPU will help immensely. The SMT problem is likely PERMANENT... unless AMD can adjust the partitioning with microcode, which I doubt.

What this all means is simple: once the Windows update has landed, BIOSes are patched up, and SMT is disabled, an 8-core Ryzen will likely be competitive with a quad i7 in gaming while blowing past it in multi-threaded. If all you do is game, then the 1700 may well become a very valid option that will work increasingly better in future games.

This also lets us know where Zen 2 will be able to improve the most. Make the impacted queues competitively shared (or just a little larger), improve inter-CCX communications, decouple the L3 speed from core speeds (for higher core clocks), and a few other relatively simple tweaks and you have a second generation Ryzen that steals the show.

We also know why AMD hasn't released anything other than their 8-core chips - these issues need to be ironed out in production. You need thousands of eyes and testers and numerous companies each responding to their customers' needs to get a grip on what is most important to fix before finalizing Zen 2.


What is your source? regarding SMT in particular. Because if there was a major issue there, these chips would not be so close to the 6900k outside of gaming, would they?
 

George Ormonde

Junior Member
Mar 3, 2017
2
0
1
Any advice? My main application is MATLAB,
with a variety of parallel and sequential workloads.

The 3 builds I'm looking at are:
~$900 for Ryzen 1700 --- The performance of the Ryzen in compute is tempting, but the platform seems like a risk at the moment.
~$900 for i7-7700K --- Boring safe choice, might be the best.
~$1200 for i7-6800K --- More than I want to spend, but may be worth it for the compute and the option to expand memory and GPU in future.

Any advice?
 

blublub

Member
Jul 19, 2016
135
61
101
Would someone be so kind as to sum up what the fuss is all about?

I was busy the last 24h, and from skimming the reviews I don't see why the sentiment is so bad.

Thx
 
Reactions: french toast

rvborgh

Member
Apr 16, 2014
195
94
101
Is Ryzen seen by the OS as two NUMA nodes (CCX = node)?

i also use Process Lasso to push my apps onto the 16 "fast" 3.6 GHz cores on my 48 core Opteron. works very well.

i'd be interested in seeing someone OC say 2 out of 4 cores in each CCX to 4.5 GHz... and turn off SMT, use Process Lasso to set thread affinity to the "fast" cores, and then rerunning the gaming benchmarks.

SMT issues will be mitigated to the point that they are only a 2~5% penalty, instead of 10~20% as they can be now. I think the SMT penalty is being made to appear larger than it is by poor scheduling across the CCXes. Fixing that, by itself, should help reduce the deficit.

I intend to use Process Lasso to force any poorly threaded (and problematic) games/apps onto just one CCX. I suspect that VMWare Player will be just such an app, where I will need to force it to the second CCX. That, though, would also allow me to assign games to the first CCX, so I can then play games while running calculations in VMWare Player (usually in Linux Mint these days, but also Haiku OS).

That's a capability I will lack with any other reasonably priced CPU.
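
For anyone who wants to script this kind of pinning instead of clicking through Process Lasso, here is a minimal sketch of how the per-CCX CPU lists and affinity masks could be worked out. The numbering is an assumption, not something confirmed in this thread: it supposes logical CPUs are numbered contiguously, with SMT siblings adjacent and CCX 0 first (so 0-7 on CCX 0 and 8-15 on CCX 1 with SMT on). The function names are made up for illustration; check your own topology before relying on it.

```python
def ccx_logical_cpus(ccx, cores_per_ccx=4, smt=True):
    """Logical CPU indices for one CCX under the ASSUMED contiguous numbering."""
    threads_per_core = 2 if smt else 1
    first = ccx * cores_per_ccx * threads_per_core
    return list(range(first, first + cores_per_ccx * threads_per_core))


def affinity_mask(cpus):
    """Bitmask with one bit set per logical CPU (bit n = logical CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


game_cpus = ccx_logical_cpus(0)  # e.g. games on the first CCX
vm_cpus = ccx_logical_cpus(1)    # e.g. the VM on the second CCX

print(hex(affinity_mask(game_cpus)))  # 0xff
print(hex(affinity_mask(vm_cpus)))    # 0xff00
```

On Linux the CPU list can be handed to `os.sched_setaffinity(pid, cpus)` directly; on Windows the hex mask is the form affinity tools generally take.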
 

moonbogg

Lifer
Jan 8, 2011
10,637
3,095
136
Aww, poor you, Ryzen is being so unfair to little old moonbogg. If you are searching for anti-Ryzen sob-story gaming comparisons that highlight the buggy nature of this release, whilst ignoring what has been said about the BIOS, the SMT bug, game optimization etc., then we can't help you.
Joker Productions, using a different Gigabyte mobo on a better BIOS, somewhat proves this over 10 games; it's not like it was a one- or two-game cherry-picked fluke at 4K.
Why don't you swap your mobo for the Gigabyte and get the best of both worlds?
Failing that, or being patient, why don't you take it back and get a 4-core Intel, or pay $1000 for a 6900K? That way you can put this terrible injustice behind you and we can get some peace.

https://www.youtube.com/watch?v=Lay7YuqPscQ

I noticed that. He doesn't seem to be having as large of an issue as some other reviewers. Not sure what to make of it. I am seriously skeptical that a BIOS update can affect game performance very much, but I'd be very happy if that were the case. The CPU has the performance for some things, so hopefully it can be fixed.
 
Reactions: french toast

looncraz

Senior member
Sep 12, 2011
722
1,651
136
What is your source? regarding SMT in particular. Because if there was a major issue there, these chips would not be so close to the 6900k outside of gaming, would they?

AMD is my source. And years of logic simulations (usually for software, though, but it works for hardware just as well).



This shows the nature of the SMT implementation. I almost called this exact configuration a good six months before AMD made it public - not bragging, it's simply the most logical route given their architecture disclosure in the GCC patch.

Those statically partitioned parts are what we don't know enough about. Those queues are vital for performance. When you statically partition and you are using round-robin for the front-end, you are probably doing equal partitioning of those resources (50/50). That limits the throughput of each thread permanently while SMT is enabled.

The micro-op queue (dispatch), for example, will be able to issue 6 uops per cycle per thread with SMT disabled. With SMT enabled the most logical route would be to have alternating cache lines used for the threads, so now 6uops are issued per thread - every other cycle. You just induced a 1-cycle penalty in addition to cutting uop dispatch throughput in half. This is why Ryzen is 6-wide instead of the contemporary 4-wide.
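
To put rough numbers on that, here is a toy model of the alternating-cycle guess above. It is only a sketch of my hypothesis (strict round-robin, one thread per dispatch slot), not a description of confirmed hardware behaviour, and the function name and parameters are invented for illustration.

```python
import math


def cycles_to_dispatch(uops, width=6, smt_threads=1):
    """Cycles for ONE thread to dispatch `uops` micro-ops, assuming the thread
    gets a dispatch slot every `smt_threads` cycles (strict round-robin) and
    can issue up to `width` uops per slot. The thread is assumed to own the
    very first slot."""
    if uops == 0:
        return 0
    slots = math.ceil(uops / width)
    # The last slot lands (slots - 1) turns after the first one.
    return (slots - 1) * smt_threads + 1


print(cycles_to_dispatch(24, width=6, smt_threads=1))  # 4
print(cycles_to_dispatch(24, width=6, smt_threads=2))  # 7
```

Under these assumptions the same 24 uops take 4 cycles with SMT off and 7 with SMT on, which is the roughly halved per-thread dispatch rate described above.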

On retirement we see the same thing. Here, though, AMD was really smart and went 8-wide, this guarantees that you can retire faster than you can issue - helping to keep the core flowing in either SMT on or off... so long as you can store those results... which happens for one thread per cycle.

(Please note, when I say cycle here, I'm talking about the cadence for each individual unit and not necessarily each clock cycle).
 

french toast

Senior member
Feb 22, 2017
988
825
136
https://www.youtube.com/watch?v=Lay7YuqPscQ

I noticed that. He doesn't seem to be having as large of an issue as some other reviewers. Not sure what to make of it. I am seriously skeptical that a BIOS update can affect game performance very much, but I'd be very happy if that were the case. The CPU has the performance for some things, so hopefully it can be fixed.
Skepticism is good. There was so much hype around Ryzen that this kind of issue was going to deflate people. I do feel AMD should have told people about game optimizations and improving BIOSes at the launch event; that way people would have tempered their expectations. Either that, or they should have waited a month and released Ryzen 5/3 alongside.
It's clear that, as this is a brand new uarch, software needs optimizing even outside of the BIOS: memory benchmarks are apparently not reading AMD's uarch properly and are giving false latency readings, games are optimized for Intel's uarch, and then you have the flaky BIOSes.
Lots to improve.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Is Ryzen seen by the OS as two NUMA nodes (CCX = node)?

i also use Process Lasso to push my apps onto the 16 "fast" 3.6 GHz cores on my 48 core Opteron. works very well.

i'd be interested in seeing someone OC say 2 out of 4 cores in each CCX to 4.5 GHz... and turn off SMT, use Process Lasso to set thread affinity to the "fast" cores, and then rerunning the gaming benchmarks.

I don't think Windows is aware of that need yet. If I were Microsoft, that's exactly how I'd choose to treat it - as two discrete quad core CPUs. They already have enough scheduling logic in the kernel to make that an easy task. Supposedly, though, AMD has submitted a driver to Microsoft. Having never seen what the CPU driver can do in Windows, I can't speculate how AMD would go about treating Ryzen.

I plan to do exactly what you just suggested
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
Ryzen is the clear winner in $/perf, there's no question. Nobody - nobody - puts together a beefed-up enthusiast PC to blindly focus on gaming and nothing else. These are multi-purpose machines after all.
Fact is that Ryzen is more than good enough for gaming for many people. Not everyone needs their chips to be at the top of the charts as long as they get a good gaming experience. The vast majority of people buy hardware that does just that. With Ryzen you get that plus a multithreaded powerhouse. And this is just the start, when very little software has received any optimizations. It's a no-brainer. I'm good with the gaming scores considering how powerful Ryzen is in so many other applications. It competes very effectively IMO.
But really the star of the show is Zen. The implications of this core for server and data center, and the combinations of Naples, HSA, Vega, Instinct, Radeon SSG, Infinity Fabric etc., are big time. The core is extremely efficient when running at its optimal frequency. This may be reflected in Intel's stock being downgraded recently by Bernstein.
Zen is an incredible feat of engineering, and very few believed they could do a ground-up design and come out competing with the best chips on the market 4 years later! In an industry that moves as fast as tech. On a shoestring budget compared to its competitor! I say bravo, they've earned my support. *respect
 

cusideabelincoln

Diamond Member
Aug 3, 2008
3,269
12
81
Causes of poor gaming relative to CPU performance of Ryzen:

1. Windows is load-balancing across CCXes.
This means that a thread is being moved around on the CPU - which is normal - so that a single core isn't used more than others. On Ryzen, that needs to happen ONLY within a CCX, otherwise you will incur a massive penalty when that thread no longer finds its data in the caches of the CCX.

2. SMT hurts single threaded performance due to shared structure.
Ryzen statically partitions three structures to support SMT:
a. Micro-op queue (dispatcher)
b. Retirement queue
c. Store queue

This means that, with SMT enabled, these resources are cut, potentially, in HALF (mind you, these are just queues that impact throughput of a single thread).

3. Memory latency quirks still not worked out.
Gaming can be quite sensitive to memory latency and bandwidth. These issues will be, most likely, remedied with BIOS updates.

Combined you can see, clearly, what is happening and most of the reviews make sense.

A Windows driver update to treat each CCX almost as if it were its own CPU will help immensely. The SMT problem is likely PERMANENT... unless AMD can adjust the partitioning with microcode, which I doubt.

What this all means is simple: once the Windows update has landed, BIOSes are patched up, and SMT is disabled, an 8-core Ryzen will likely be competitive with a quad i7 in gaming while blowing past it in multi-threaded. If all you do is game, then the 1700 may well become a very valid option that will work increasingly better in future games.

This also lets us know where Zen 2 will be able to improve the most. Make the impacted queues competitively shared (or just a little larger), improve inter-CCX communications, decouple the L3 speed from core speeds (for higher core clocks), and a few other relatively simple tweaks and you have a second generation Ryzen that steals the show.

We also know why AMD hasn't released anything other than their 8-core chips - these issues need to be ironed out in production. You need thousands of eyes and testers and numerous companies each responding to their customers' needs to get a grip on what is most important to fix before finalizing Zen 2.


Gaming application performance typically responds well to cache sizes (as well as latency). How does the fact Ryzen's L3 cache is an exclusive victim cache play into all of those, as opposed to if it were an inclusive one?
 

french toast

Senior member
Feb 22, 2017
988
825
136
Most games don't scale past 12 threads, but some are maxing out at 8.
It's clear the best gaming CPUs (up to $400) are going to be the 1600X/1700/1700X/6800/7700K(?) in the medium to long term. If you're on a budget, get a Pentium.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Gaming application performance typically responds well to cache sizes (as well as latency). How does the fact Ryzen's L3 cache is an exclusive victim cache play into all of those, as opposed to if it were an inclusive one?

The L3 will hold global data as needed in addition to evicted cache lines, so there should be pretty much no penalty. Gaming on one CCX should be superior to gaming across the CCX - and by no small margin (15%+). (This is why AMD calls it mostly-exclusive).

Once games are patched they can separate their threads according to their data locality - AI, networking, physics on CCX 0, graphics, sound, data streaming on CCX 1... data interaction already mostly occurs in RAM due to volatile shared structures and coherency requirements.

If those threads exist (or other threads for similar purposes), patches should be rather simple.
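
As a rough sketch of what such a patch might boil down to, the grouping from above can be written as a small lookup that tells each worker thread which logical CPUs to stay on. The thread names and the CPU numbering (8 logical CPUs per CCX, numbered contiguously) are assumptions for illustration only; a real engine would query the topology and call the platform's affinity API.

```python
# Grouping taken from the post above; the CPU numbering is an ASSUMPTION.
CCX_PLAN = {
    0: ["ai", "networking", "physics"],
    1: ["graphics", "sound", "streaming"],
}
LOGICAL_CPUS_PER_CCX = 8  # assumed: 4 cores x 2 SMT threads


def cpus_for_thread(name, plan=CCX_PLAN, per_ccx=LOGICAL_CPUS_PER_CCX):
    """Return the set of logical CPUs a named worker thread should stay on."""
    for ccx, names in plan.items():
        if name in names:
            return set(range(ccx * per_ccx, (ccx + 1) * per_ccx))
    raise KeyError(f"no CCX assigned for thread {name!r}")


physics = cpus_for_thread("physics")
sound = cpus_for_thread("sound")
print(min(physics), max(physics))  # 0 7
print(min(sound), max(sound))      # 8 15
```

Threads that share data end up with the same CPU set, so the scheduler can still move them around, just never across the CCX boundary.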
 

rvborgh

Member
Apr 16, 2014
195
94
101
can't wait to see the results. i think pinning things to a specific CCX should go a long way to fixing some of these issues.

i would also be curious if a CCX could be disabled, and the AIDA tests rerun as well on the remaining CCX.

I don't think Windows is aware of that need yet. If I were Microsoft, that's exactly how I'd choose to treat it - as two discrete quad core CPUs. They already have enough scheduling logic in the kernel to make that an easy task. Supposedly, though, AMD has submitted a driver to Microsoft. Having never seen what the CPU driver can do in Windows, I can't speculate how AMD would go about treating Ryzen.

I plan to do exactly what you just suggested
 

beginner99

Diamond Member
Jun 2, 2009
5,223
1,598
136

This is getting more and more confusing. hardware.fr shows huge gains in BF1 with SMT disabled. Couple pages back the pcgamer review showed a tiny gain by disabling SMT:



Also they then both reach around 140 fps, so almost identical. So this clearly is a mixture of board and software issues. And they both used the Asus board. Maybe different BIOS version or windows settings?

Anyway, results are all over the place, so I can't make a real purchase decision. Just going to wait a bit longer. If this isn't fixed in 2 months' time, I will wait for Skylake-X...
 

Head1985

Golden Member
Jul 8, 2014
1,866
699
136
This is getting more and more confusing. hardware.fr shows huge gains in BF1 with SMT disabled. Couple pages back the pcgamer review showed a tiny gain by disabling SMT:



Also they then both reach around 140 fps, so almost identical. So this clearly is a mixture of board and software issues. And they both used the Asus board. Maybe different BIOS version or windows settings?

Anyway, results are all over the place, so I can't make a real purchase decision. Just going to wait a bit longer. If this isn't fixed in 2 months' time, I will wait for Skylake-X...
GPU bottleneck
 
Reactions: lightmanek

cusideabelincoln

Diamond Member
Aug 3, 2008
3,269
12
81
This is getting more and more confusing. hardware.fr shows huge gains in BF1 with SMT disabled. Couple pages back the pcgamer review showed a tiny gain by disabling SMT:



Also they then both reach around 140 fps, so almost identical. So this clearly is a mixture of board and software issues. And they both used the Asus board. Maybe different BIOS version or windows settings?

Anyway, results are all over the place, so I can't make a real purchase decision. Just going to wait a bit longer. If this isn't fixed in 2 months' time, I will wait for Skylake-X...
Pcgamer's test scenario looks to be more GPU-limited.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
So, with more reviews pouring in, we can look back at the events and "leaks" leading up to the launch and discuss the finer points of Zen:

1) AMD has been mum about CCX interconnects and L3 because they are hurting and will continue to hurt performance. 16MB of L3 per CPU is nice and dandy, but each core can access only 8MB of it @ limited speed, and that speed is also being hurt by coherency traffic - not optimal. I'd take Intel's multi-ring HCC CPU any day over this. They had better have some extra sauce (in the form of extra links?) for the rumoured 32-core server chip; coherency traffic between 8 CCX chunks will kill it.
2) SMT has a 10-15% perf penalty for being enabled; Intel sorted these things out around the Nehalem gen. Very disappointing for desktop loads and okayish for servers.
3) OC rumours are pretty much shot down too. 1-core OC or not, this is another Polaris chip from AMD, loaded with features to extract the most out of the process and bin, leaving scraps on the table for overclockers. ~4GHz is nice for the 1700, but not so nice for a top-end CPU.
4) The larger L2 cache is nice and a lot of perf is coming out of it, which goes a long way toward showing how Intel has been milking customers with 256KB of L2. It turns out a lot of workloads benefit very much from extra L2, and a 4x slower L3 is too wide a gap. SKL-X could have some major surprises for performance; some workloads will benefit nicely.

What stays the same between AMD releases is the tune of "wait for OS/game/software updates". We were supposed to get extra performance from Phenom once the TLB bug was sorted out by recompiles; Bulldozer was about to shine after OS patching, scheduling to one thread per module, and an overall increase in execution thread counts. Now the OS needs to be aware of the CCX and treat it as a NUMA node to extract optimal performance?
 
Reactions: psolord