Ashes of the Singularity User Benchmarks Thread

Page 12

TheELF

Diamond Member
Dec 22, 2012
4,026
753
126
The CPU forum is ^ that a way.

Yeah well, you see the exact same arguments here that you see in the big FX vs Intel Core arguments: more cores doing much more stuff in parallel vs fewer cores doing stuff much faster.

If you think about it, GPUs are just very specialized CPUs, so much so that AMD is calling everything a compute core, no matter if it's x86 or GCN.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
There's no dynamic light source for spells & weapon effects. In the demo shown where they mention async compute, there is. So from a viewer's PoV, one's very static, the other isn't.

Honestly, it's damn near impossible to see dynamic lighting in any of those videos, as the footage quality just isn't there. The only video where dynamic lighting is in your face is the one I posted earlier which demonstrated dynamic global illumination.

Edit: The contention here is that NV's drop-off in performance in Ashes in the full test (while it gains perf in the draw call test!) is perhaps due to poor async compute/shading performance OR poor implementation by Oxide. For the blame on Oxide to be true, you would need to show proof that Kepler/Maxwell is great at async compute/shading. As I said, I only see examples of game devs praising async compute/shading for GCN, not for Kepler/Maxwell.

Let's first come to an agreement on what asynchronous compute actually is.

To my understanding, asynchronous compute allows devs to execute compute shaders in parallel with the rendering, but out of sync. The last part is important, because it means that rendering performance should not be affected since the GPU is using SPARE cycles for asynchronous compute.
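To put that in concrete terms, here's a minimal C++/D3D12 sketch of the pattern (illustrative only - not Oxide's code; the function name and variables are made up): asynchronous compute just means submitting work to a second command queue of type COMPUTE alongside the normal direct/graphics queue. Whether the two actually overlap on the GPU is up to the hardware's scheduler, which is exactly what's being debated.

#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create a graphics (direct) queue and a separate compute queue.
// Work submitted to the compute queue *may* execute concurrently with
// rendering; D3D12 allows the overlap but does not guarantee it.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // normal rendering queue
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&graphicsQueue));

    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // "async compute" queue
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
}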

Now AMD's approach is to use 8 dedicated ACEs, whose only duty is to process asynchronous compute. 8 ACEs is a very large number, but it probably works for AMD because their architecture has a big problem with under-utilization and would be expected to have a lot of spare cycles.

This is in sharp contrast to NVidia's Maxwell, which has no dedicated ACE or similar counterpart, but instead uses the GMU (Grid Management Unit) along with their Hyper-Q technology to keep the GPU as occupied as possible; an approach which seems to be very successful, as Maxwell is an extremely efficient architecture and does not have the under-utilization problem that GCN has.

So in light of this, I submit that cross-comparisons between AMD and NVidia in the realm of asynchronous compute are FUTILE.

If you could magically strap on 8 ACEs to Maxwell, I doubt it would make one single bit of difference since Maxwell's single GMU has very little problem keeping the GPU occupied.

So what's causing the DX12 path to be slower than the DX11 path in AotS for Maxwell? I think it's because Oxide's DX12 optimization for NVidia isn't up to par with the driver optimizations NVidia made for DX11.

Remember, in DX12, the developers have much closer access to the hardware than in DX11, which puts the burden of responsibility for performance mostly on them. With DX11 it was the opposite. NVidia did a TON of driver optimization on the side to give them an edge.

With GCN, the ACEs I believe are controlled in hardware. So the devs probably don't have to do much, if anything to exploit them. But since NVidia does not have dedicated ACEs, tight management of the GMU and HyperQ will become critical for performance.

This explains why the GTX 980 Ti was able to be pretty much on par with the Fury X in the benchmarks, whilst other Maxwell GPUs like the GTX 980, 970 and 960, with their smaller shader arrays, experienced slowdowns in DX12. That's because they need greater optimization due to having fewer spare cycles than the GTX 980 Ti and Titan X.

In DX11, NVidia used their drivers to manage the GMU, but in DX12, the developers will probably have to get their hands dirty when it comes to tapping into it.
 

VR Enthusiast

Member
Jul 5, 2015
133
1
0
Yeah, some alleged PS4 dev says Maxwell might have to use a context switch, which might have a penalty, and at the same time admits that it might not even be useful to have more than one asynchronous pipeline.

Sounds like for sure nvidia is doomed. Yep...

Makes perfect sense, right?

Oh yeah, then he says they don't have a chance because of Mantle.

Do you accept that AMD's LiquidVR and Asynchronous shaders are better at removing latency than Nvidia's solution, which is unacceptable for VR as stated by established industry veterans?
 
Feb 19, 2009
10,457
10
76
http://www.anandtech.com/show/9124/amd-dives-deep-on-asynchronous-shading

Async shaders run concurrently on the same shader hardware but not in sync with rendering; in theory** it should not cause rendering to stall. But it also isn't "spare cycles", which is a confusing/misleading label. The shaders can be doing rendering work, but they can also do compute workloads simultaneously.



** Why it stalls rendering in Ashes is the interesting bit. Zlatan offered an explanation that it's due to context switches on NV GPUs having a performance hit. Some of you think Oxide is just doing a bad job at it for NV. Who is right, time will tell.
 
Last edited:

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
With GCN, the ACEs I believe are controlled in hardware. So the devs probably don't have to do much, if anything to exploit them. But since NVidia does not have dedicated ACEs, tight management of the GMU and HyperQ will become critical for performance.

Hyper-Q is controlled by the hardware:
http://blogs.nvidia.com/blog/2012/08/23/unleash-legacy-mpi-codes-with-keplers-hyper-q/

This explains why the GTX 980 Ti was able to be pretty much on par with the Fury X in the benchmarks, whilst other Maxwell GPUs like the GTX 980, 970 and 960, with their smaller shader arrays, experienced slowdowns in DX12. That's because they need greater optimization due to having fewer spare cycles than the GTX 980 Ti and Titan X.
No, they are just hitting the load wall much earlier. All nVidia cards are losing performance when they are nearly at 100% load.

In DX11, NVidia used their drivers to manage the GMU, but in DX12, the developers will probably have to get their hands dirty when it comes to tapping into it.
Hyper-Q isn't exposed through the DX11 API. DX11 has no concept of different queues. In DX12 you need to write directly against every "engine".
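To make the "engine" point concrete, a rough C++/D3D12 sketch (illustrative only; the function name and parameters are made up, not anyone's shipping code): DX12 exposes direct, compute and copy queues as separate engines, and the developer, not the driver, has to submit to them and synchronize them explicitly with fences.

#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Submit compute work on its own queue, then make the graphics queue wait on it.
// DX11 never exposed any of this; the driver hid scheduling behind one implicit queue.
void SubmitWithAsyncCompute(ID3D12Device* device,
                            ID3D12CommandQueue* graphicsQueue,
                            ID3D12CommandQueue* computeQueue,
                            ID3D12CommandList* gfxList,
                            ID3D12CommandList* computeList)
{
    // Kick off the compute workload on the compute engine.
    ID3D12CommandList* computeLists[] = { computeList };
    computeQueue->ExecuteCommandLists(1, computeLists);

    // Signal a fence when the compute queue reaches this point.
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
    computeQueue->Signal(fence.Get(), 1);

    // The graphics engine must explicitly wait before consuming the results.
    graphicsQueue->Wait(fence.Get(), 1);

    ID3D12CommandList* gfxLists[] = { gfxList };
    graphicsQueue->ExecuteCommandLists(1, gfxLists);
}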

Maxwell was designed for a multithreaded environment. That is true. But for a parallel environment its capabilities are limited with one ACE. In DirectX 12 the schedules are parallel, whereas in DirectX 11 the schedules were multiple but were not happening at the same time. That is the difference between the mechanics here. And that's why Maxwell GPUs are getting worse performance. The problem lies in Asynchronous Shading, whether we like it or not.

Have you re-read asynchronous compute? Nothing happens automatically with DX12. If a developer wants to use AC, he needs to program against every engine. If they are not exposed through the driver then they can't be used.
As far as I know there aren't any differences between the DX11 and DX12 path. So the workload is exactly the same. DX12 will always be faster if a developer is optimizing it for the hardware. The only way a DX12 path will be slower is when the developer has done it intentionally.

We can argue about this. Blaming Oxide is like blaming Microsoft for giving us DirectX 12. With this API you can't lock any developer out of performance. It is simply not possible. You cannot make one vendor look better than the other. That is not the nature of DX12. And no, I'm not a software engineer. But I understand what it means that the application talks directly to the GPU, without CPU intervention.
So, why is Microsoft able to make DX12 faster on nVidia hardware? Do you really think that Oxide doesn't know DX12 and they are now trying to cover their bases with the blame game - it must be the driver or some magical mainboard setting?!
 
Last edited:

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Yeah well, you see the exact same arguments here that you see in the big FX vs Intel Core arguments: more cores doing much more stuff in parallel vs fewer cores doing stuff much faster.

If you think about it, GPUs are just very specialized CPUs, so much so that AMD is calling everything a compute core, no matter if it's x86 or GCN.

How are you drawing the parallel with the FX CPUs' failed performance, though?
 
Feb 19, 2009
10,457
10
76
Btw, does anyone have a video of Ashes running in DX11 mode? I wanna know if there are thousands of dynamic lights (all those units firing & explosions cast a light in DX12 via async compute), since I wonder how they actually do it (if it's there) in DX11 at all.
 

Qbah

Diamond Member
Oct 18, 2005
3,754
10
81
Anyone else get a déjà vu of DX9 and the FX 5xxx series from 12 years ago? :awe: Just the situation, obviously - a whole tier difference after moving to the new technology, where the high-end cards couldn't compete (price/perf) at all using the latest DX.
 

TheELF

Diamond Member
Dec 22, 2012
4,026
753
126
How are you drawing the parallel with the FX CPUs' failed performance, though?
I'm not, I was just making fun of The Programmer's fail post.
If you're supposed to ditch your Nvidia cards because of a game (alpha benchmark) that is coming out in a year, then you should ditch your FX CPUs for the same reason.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
I'm not, I was just making fun of The Programmer's fail post.
If you're supposed to ditch your Nvidia cards because of a game (alpha benchmark) that is coming out in a year, then you should ditch your FX CPUs for the same reason.

I think he's referring to DX12 as the reason to ditch nVidia, not the particular benchmark. Still don't get the reference to FX CPUs. Not too many people running them over Intel currently. Who's supposed to dump them? Maybe The Programmer uses them and I missed it? That would make your post make sense.
 

VR Enthusiast

Member
Jul 5, 2015
133
1
0
The thing is that 99% (if not more) of AAA games coming out are console ports, and people are sitting here talking about the one game designed on PC for PC that is showcasing a worst-case scenario for draw calls. Sure, it's a valid point for this kind of game, but these kinds of games will be very sparse.
And it's not even finished yet, so you can't draw conclusions yet.

The consoles are GCN based and both have asynchronous shaders. This is why this is so important, game devs will (they already do in a few cases) use asynchronous shaders to get the maximum performance out of their console games and AMD will see the benefit in PC games because of that.
 

TheELF

Diamond Member
Dec 22, 2012
4,026
753
126
I think he's referring to DX12 as the reason to ditch nVidia
Well, the only reference to the expected performance that he has (or accepts) is this benchmark, and he is trying to push people towards AMD based solely on this one bench, so it's fair to do the opposite and push people away from AMD.
 

TheELF

Diamond Member
Dec 22, 2012
4,026
753
126
The consoles are GCN based and both have asynchronous shaders. This is why this is so important, game devs will (they already do in a few cases) use asynchronous shaders to get the maximum performance out of their console games and AMD will see the benefit in PC games because of that.
How much faster do you think high-end PC GPUs are in contrast to the console APU's GPU?
Console games will be restricted by what a console GPU can do and not what a PC GPU can do; devs doing additional stuff for the PC version is not the norm.
 

VR Enthusiast

Member
Jul 5, 2015
133
1
0
How much faster do you think high-end PC GPUs are in contrast to the console APU's GPU?
Console games will be restricted by what a console GPU can do and not what a PC GPU can do; devs doing additional stuff for the PC version is not the norm.

It's not additional stuff though, that's the point. It would be additional to remove the async shaders from a console port!

Async shaders are a potential bottleneck on current gen Nvidia cards that devs will either have to remove to increase performance on Nvidia, or just leave it as it is and accept that Nvidia cards will perform less optimally. What do you think they'll do?
 

flopper

Senior member
Dec 16, 2005
739
19
76
This is in sharp contrast to NVidia's Maxwell, which has no dedicated ACE or similar counterpart, but instead uses the GMU along with their Hyper-Q technology to keep the GPU as occupied as possible; an approach which seems to be very successful, as Maxwell is an extremely efficient architecture and does not have the under-utilization problem that GCN has.



So what's causing the DX12 path to be slower than the DX11 path in AotS for Maxwell? I think it's because Oxide's DX12 optimization for NVidia isn't up to par with the driver optimizations NVidia made for DX11.



This explains why the GTX 980 Ti was able to be pretty much on par with the Fury X in the benchmarks, whilst other Maxwell GPUs like the GTX 980, 970 and 960, with their smaller shader arrays, experienced slowdowns in DX12. That's because they need greater optimization due to having fewer spare cycles than the GTX 980 Ti and Titan X.

In DX11, NVidia used their drivers to manage the GMU, but in DX12, the developers will probably have to get their hands dirty when it comes to tapping into it.

The AMD 390 is on par with a 980 Ti for DX12.
Got to love that Nvidia sold you guys old tech, and you think you got a good deal for Win 10 and the future with DX12.

I foresee a load of Nvidia tactics and PR to get (bribe) developers to fix DX12 games their way and cripple them for other brands as usual. No morals or ethics there.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
It's not additional stuff though, that's the point. It would be additional to remove the async shaders from a console port!

Async shaders are a potential bottleneck on current gen Nvidia cards that devs will either have to remove to increase performance on Nvidia, or just leave it as it is and accept that Nvidia cards will perform less optimally. What do you think they'll do?

Okay, I have one question:
How can "Async shaders" be a "potential bottleneck on current gen Nvidia cards" when this is a new feature of DX12 (sic: DX11) to improve the performance of "current gen Nvidia cards"?

So, where is your proof that "Async shaders" costs performance on nVidia hardware when this concept doesn't exist in DX11 and nVidia shows a massive performance improvement with Hyper-Q?
 
Last edited:

VR Enthusiast

Member
Jul 5, 2015
133
1
0
Okay, I have one question:
How can "Async shaders" be a "potential bottleneck on current gen Nvidia cards" when this is a new feature of DX11 to improve the performance of "current gen Nvidia cards"?

Async shaders are a DX12 thing, there is no way to use them in DX11.

So, where is your proof that "Async shaders" cost performance on nVidia hardware when this concept doesn't exist in DX11 and nVidia shows a massive performance improvement with Hyper-Q?

The proof is in the benchmarks which show Nvidia loses perf in DX12 compared to DX11.
 

VR Enthusiast

Member
Jul 5, 2015
133
1
0
Where is the proof that this benchmark is using "Async shaders"?

Looks like it's from the reviewers guide.

Ashes of the Singularity Benchmark Technical Features:


3. Asynchronous Shaders: This allows the schedule of work for the GPU that will be performed. Traditionally, the GPU’s command queue would have had stalls, and DirectX 12 essentially provides more work done for free.

Read more at http://www.legitreviews.com/ashes-o...chmark-performance_170787#vHlbJSkeq2XB4r7D.99

How can "Async shaders" be a a "potential bottleneck on current gen Nvidia card" when this is a new features of DX12 to improve the performance of "current gen Nvidia card".

Who said async shaders improve performance on current gen Nvidia cards? If it's Nvidia who says it, I'd ignore that and look to other sources for confirmation. Nvidia's async shader solution doesn't appear to do anything good in VR, which makes me think that they don't have a good method, period.
 
Last edited:

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Ah, thank you.

Where is your proof that "Asynchronous Shaders" is the reason for the performance impact of DX12?

I like this part:
MSAA is implemented differently on DirectX 12 than DirectX 11. Because it is so new, it has not been optimized yet by us or by the graphics vendors. During benchmarking, we recommend disabling MSAA until we (Oxide/Nvidia/AMD/Microsoft) have had more time to assess best use cases.
Read more at http://www.legitreviews.com/ashes-o...chmark-performance_170787#yUildSjYlRwdgrqz.99
And then they are blaming nVidia for the problems with MSAA(!):
Fundamentally, the MSAA path is essentially unchanged in DX11 and DX12. Any statement which says there is a bug in the application should be disregarded as inaccurate information.
Read more at http://www.legitreviews.com/ashes-o...mark-performance_170787/2#cQov5Yh1BSSE1sKC.99
This company is a joke.


/edit: DX12 is not a vendor-specific API. "Asynchronous Shaders" (aka Asynchronous Compute) is defined by Microsoft to use more compute units to process information at the same time. It doesn't cost performance on hardware which supports DX12. Worst case, nothing happens because there is no difference to DX11.
 
Last edited:
Feb 19, 2009
10,457
10
76
Ah, thank you.

Where is your proof that "Asynchronous Shaders" is the reason for the performance impact of DX12?

I like this part:
And then they are blaming nVidia for the problems with MSAA(!):
This company is a joke.


/edit: DX12 is not a vendor-specific API. "Asynchronous Shaders" (aka Asynchronous Compute) is defined by Microsoft to use more compute units to process information at the same time. It doesn't cost performance on hardware which supports DX12. Worst case, nothing happens because there is no difference to DX11.



The only joke here is you, sontin, repeating the same smear on Oxide (3rd or 4th time already after it's been debunked!) after they already confirmed the MSAA bug is in NVIDIA's DX12 drivers and even offered to help them fix it.

Go ahead, find a source or evidence where NV continues that and denies the bug is in their drivers.

This isn't communist Russia; repeating the same lies over and over again doesn't make them true.

"DX12 is not a vendor-specific API. "Asynchronous Shaders" (aka Asynchronous Compute) is defined by Microsoft to use more compute units to process information at the same time. It doesn't cost performance on hardware which supports DX12. Worst case, nothing happens because there is no difference to DX11." - Are you a DX12 programmer? Please don't pretend to be so certain. Again, if anyone wants to argue specifics, I take the word of developers above that of forum warriors when it comes to facts.

Did you want to address the VR issue too while you're at it? Let's see you smear all the VR devs too, because they seem to think GCN is awesome while Maxwell is lacking at that. While you're at it, smear NV's engineers, because even they think it's not up to scratch.

Seriously, this is exactly what I meant when I said "Who's drawing conclusions?" Right away we have people like the above, who insist Oxide is being dirty against NV.
 
Last edited:

VR Enthusiast

Member
Jul 5, 2015
133
1
0
Ah, thank you.

Where is your proof that "Asynchronous Shaders" is the reason for the performance impact of DX12?

I like this part:
And then they are blaming nVidia for the problems with MSAA(!):
This company is a joke.

I'm not sure what Nvidia said but what Oxide says is true regarding there being no bug in the application. It is clear from the benchmarks (without MSAA) that this is not the reason why Nvidia scores so badly in DX12.

/edit: DX12 is not a vendor-specific API. "Asynchronous Shaders" (aka Asynchronous Compute) is defined by Microsoft to use more compute units to process information at the same time. It doesn't cost performance on hardware which supports DX12. Worst case, nothing happens because there is no difference to DX11.

So what is causing it? It's not MSAA, look at the benchmark from that site I just linked. No MSAA, worse performance in DX12 compared to DX11.



So what else is it, if not MSAA and not the async shaders? The obvious answer is that it is the async shaders.
 

TheELF

Diamond Member
Dec 22, 2012
4,026
753
126
It's not additional stuff though, that's the point. It would be additional to remove the async shaders from a console port!

Async shaders are a potential bottleneck on current gen Nvidia cards that devs will either have to remove to increase performance on Nvidia, or just leave it as it is and accept that Nvidia cards will perform less optimally. What do you think they'll do?

Not what I was saying.
Console games will have async in the amount that the console APU's GPU can handle, which will be way less than what even a mid-range desktop GPU can handle.

Async shaders are a potential bottleneck on current gen Nvidia cards ONLY IF YOU DEVELOP TO PUSH THE BIGGEST OF DESKTOP CARDS. Ashes is doing that (and after a year of optimizing until it comes out, a lot will change), but a classic console port will not.
 

VR Enthusiast

Member
Jul 5, 2015
133
1
0
Not what I was saying.
Console games will have async in the amount that the console APU's GPU can handle, which will be way less than what even a mid-range desktop GPU can handle.

Async shaders are a potential bottleneck on current gen Nvidia cards ONLY IF YOU DEVELOP TO PUSH THE BIGGEST OF DESKTOP CARDS. Ashes is doing that (and after a year of optimizing until it comes out, a lot will change), but a classic console port will not.

I don't believe this to be the case. I believe that the method being employed in async shaders (developed on GCN) just doesn't work very well on Maxwell.

Ashes might be the worst case, but I don't think so. I think that when devs really start using async shaders - i.e. pushing all 8 ACEs on a PS4, for example - then you'll start to see the worst case for Nvidia.

Look at that graph from Carfax above; you can see even the 370 gains with DX12, and it only has 2 ACEs. There are no benchmarks showing lower fps on AMD in DX12. That probably means that Ashes is nowhere near pushing the maximum limit (8 ACEs) on later GCN cards.

I think some of the variation in the totals we see is probably down to it not being a set benchmark. Because of that we can't say for sure that it's an awesome DX12 benchmark but the overall picture should be obvious.
 
Last edited: