[computerbase] Ashes of the Singularity Beta1 DirectX 12 Benchmarks

Page 40 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

Azix

Golden Member
Apr 18, 2014
1,438
67
91
You have only facts from a biased game developed by a biased company which will always defend their sponsor - look at the response to the Guru3D story about the frame display problem.

Why are you allowed to constantly slander a developer contrary to the facts?

Infraction issued for derailing thread.
-- stahlhart
 
Last edited by a moderator:

Paul98

Diamond Member
Jan 31, 2010
3,732
199
106
DX12 and Async compute are in the process of being added to Unreal Engine 4. Considering that is an engine where AMD is currently relatively weak, that alone will be a large boost to all the games that use that engine once it's fully implemented.
 

PhonakV30

Senior member
Oct 26, 2009
987
378
136
@Sontin
Whether you like it or not, AMD cards gain a massive boost from DX12/Vulkan. The time is coming soon. Gears of War will use async compute, so don't be surprised if the R9 390X is close to the GTX 980 Ti. All the DX12/Vulkan benchmarks we have seen so far show a huge boost on AMD cards due to the removal of API overhead. This is fact.
 
Last edited:

Paul98

Diamond Member
Jan 31, 2010
3,732
199
106
What I am most excited about is that GPUs can now use algorithms that were previously either impossible or only possible on the consoles. We could see some HUGE increases in frame rates moving forward.
 

IllogicalGlory

Senior member
Mar 8, 2013
934
346
136
You have only facts from a biased game developed by a biased company which will always defend their sponsor - look at the response to the Guru3D story about the frame display problem.
There is no display problem. The problem is the mix of AMD DX12 + FCAT, which produces false results.

http://www.extremetech.com/extreme/223654-instrument-error-amd-fcat-and-ashes-of-the-singularity
“Where we measure with FCAT is definitive though, it’s what your eyes will see and observe.” Guru3D is wrong. FCAT records output data, but its analysis of that data is based on assumptions it makes about the output — assumptions that don’t reflect what users experience in this case.
Measured correctly, we get this:

 
Last edited:

Dygaza

Member
Oct 16, 2015
176
34
101
Can't find it. What section is it?

\Documents\My Games\Ashes of the Singularity\settings.ini

[System]
FullScreen=1
Resolution=1920,1080
VSync=0
FreeSync=0
HotLoadEnabled=0
UIScale=1.0
CameraPanSpeed=1.0
BindCursor=FullScreen
AFRGPU=0
AsymetricGPU=0
SkipMovie=1
AutoSave=1
HealthBarsAlways=0
SteamAvatars=1
ForceStop=0
EmulateFullscreen=0
AsyncComputeOff=0
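
If you'd rather script the toggle than edit the file by hand, here is a minimal Python sketch. It assumes the file parses as a plain INI file and that AsyncComputeOff=1 is what turns async compute off (going by the key name). The helper name set_async_compute is made up for illustration, and you pass in the full path to your own settings.ini.

```python
import configparser


def set_async_compute(ini_path, enabled):
    """Set AsyncComputeOff in the [System] section of settings.ini.

    Assumption: AsyncComputeOff=1 turns async compute off, 0 leaves it on.
    """
    # Disable interpolation so any '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve key case (AsyncComputeOff rather than asynccomputeoff).
    parser.optionxform = str
    parser.read(ini_path)
    if not parser.has_section("System"):
        parser.add_section("System")
    parser.set("System", "AsyncComputeOff", "0" if enabled else "1")
    with open(ini_path, "w") as handle:
        # Write KEY=value without spaces, matching the layout shown above.
        parser.write(handle, space_around_delimiters=False)
```

Call it as set_async_compute(path, enabled=False) before a run with async off and enabled=True to put it back. Note that configparser drops any comment lines when it rewrites the file, so keep a backup copy of the original.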
 

Dygaza

Member
Oct 16, 2015
176
34
101
One good thing I've noticed when testing DX11 versus DX12 is that when DX12 runs out of video memory, you don't get the stutter you get with DX11. You just get a bit lower performance for a few frames. So yeah, naturally it ain't that smooth, but it's still playable, whereas under DX11 you get those small pauses.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
There is no display problem. The problem is the mix of AMD DX12 + FCAT, which produces false results.

http://www.extremetech.com/extreme/223654-instrument-error-amd-fcat-and-ashes-of-the-singularity

Measured correctly, we get this:


Thanks for taking your time with this response. I'm going to bookmark your reply. We'll be constantly hearing about the Guru3D graph in the hope nobody posts the facts and it slips by and it becomes fact. I'm already to the point that I can't be bothered to type it anymore. Some people are simply going to perpetuate it anyway.

Besides, who could possibly play the game/run the bench and think this represented what they saw when they didn't see any stuttering? I don't know what happened @ Guru3D with this. They are usually pretty good. Although, now that I think about it their subjective analysis is sometimes off. Even if their data is good.

Look at all of the supposed "dropped frames". lol Anyone who ran this on an AMD card and then saw this graph would know there was some issue with the reporting because their eyes saw something totally different than this.
 

airfathaaaaa

Senior member
Feb 12, 2016
692
12
81
This picture shows the difference between the OpenGL API and the Vulkan API. If you would "develop" just to the API there wouldn't be any difference at all.



And Fable Legends doesn't count?
You have only facts from a biased game developed by a biased company which will always defend their sponsor - look at the response to the Guru3D story about the frame display problem.

Man, talk about stubbornness... If they were as biased as you say, why hasn't NVIDIA come forward about the issue so far, considering that this "biased" dev is one of the reasons behind the whole "NVIDIA can't do async compute" story? In fact, after some PR wars between Oxide and NVIDIA, they actually released this https://developer.nvidia.com/dx12-dos-and-donts which is as close as we will get to NVIDIA answering the whole DX12 capability disaster....
It's really funny (if not sad) that even NVIDIA is admitting it and yet people don't.
 

provost

Member
Aug 7, 2013
51
1
16
I haven't read the review, but did the reviewer try to reach out to AMD to get its comments prior to publishing it, as is customary?
And, I don't know if this has anything to do with anything here, but when I run Afterburner OC along with OSD, I experience obvious stuttering. I know from my benching days of Kepler Titan, 780 Ti and 690 that running two programs such as Precision X and Afterburner, or NV Inspector, caused a lot of issues, stuttering, etc. So, I don't know if this is a software conflict or not... and please ignore if my post does not seem like it has anything to do with the price of tea in China... still new to AMD, so sharing some experiences... However, while not running AB and/or an overclock, my gaming was smooth as butter. Maybe it was just my experience.. as different hardware configurations can cause issues too. So others may have a different experience.... {end O/T blabbing..sorry}
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
Thanks for taking your time with this response. I'm going to bookmark your reply. We'll be constantly hearing about the Guru3D graph in the hope nobody posts the facts and it slips by and it becomes fact. I'm already to the point that I can't be bothered to type it anymore. Some people are simply going to perpetuate it anyway.

Besides, who could possibly play the game/run the bench and think this represented what they saw when they didn't see any stuttering? I don't know what happened @ Guru3D with this. They are usually pretty good. Although, now that I think about it their subjective analysis is sometimes off. Even if their data is good.

Look at all of the supposed "dropped frames". lol Anyone who ran this on an AMD card and then saw this graph would know there was some issue with the reporting because their eyes saw something totally different than this.
AMD support the new WDDM 2.0 feature of DX12 and Windows 10. NVIDIA haven't yet implemented it into their driver so NVIDIA are still using DirectFlip.

Since FCAT was built for DirectFlip (that's what it monitors), it shows some weird results when used to gauge AMD's DX12 frame times.

AMD have said that they'll re-add DirectFlip support in a future driver as a means of gauging their frame times.
 

TheELF

Diamond Member
Dec 22, 2012
3,993
744
126
Man, talk about stubbornness... If they were as biased as you say, why hasn't NVIDIA come forward about the issue so far, considering that this "biased" dev is one of the reasons behind the whole "NVIDIA can't do async compute" story? In fact, after some PR wars between Oxide and NVIDIA, they actually released this https://developer.nvidia.com/dx12-dos-and-donts which is as close as we will get to NVIDIA answering the whole DX12 capability disaster....
It's really funny (if not sad) that even NVIDIA is admitting it and yet people don't.
What do you mean?
All I found is this, saying that it is possible, although it might slow things down instead of improving speed.
Which is what we saw in a lot of the benches: little improvement or even a reduction in speed for async compute plus graphics.
Check carefully if the use of a separate compute command queues really is advantageous
Even for compute tasks that can in theory run in parallel with graphics tasks, the actual scheduling details of the parallel work on the GPU may not generate the results you hope for
Be conscious of which asynchronous compute and graphics workloads can be scheduled together
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
This picture shows the difference between the OpenGL API and the Vulkan API. If you would "develop" just to the API there wouldn't be any difference at all.



And Fable Legends doesn't count?
You have only facts from a biased game developed by a biased company which will always defend their sponsor - look at the response to the Guru3D story about the frame display problem.
Develop just to the Vulkan API?



Asynchronous compute + graphics "is" developing to the API. That's an NVIDIA presentation on Vulkan. The first part is the Vulkan API and the second part is GeForce on Vulkan.

The presentation was given by Piers Daniell of NVIDIA during SIGGRAPH 2015.

Asynchronous compute + graphics is central to both the DX12 and Vulkan APIs. NVIDIA's lack of support for the feature is an NVIDIA problem and not a matter of biased developers.
 
Last edited:

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
AMD support the new WDDM 2.0 feature of DX12 and Windows 10. NVIDIA haven't yet implemented it into their driver so NVIDIA are still using DirectFlip.

Since FCAT was built for DirectFlip (that's what it monitors), it shows some weird results when used to gauge AMD's DX12 frame times.

AMD have said that they'll re-add DirectFlip support in a future driver as a means of gauging their frame times.

Yes. My point though is if you looked at those FCAT results but when you watched the run through you didn't see any stuttering would you accept them as correct? Would you even throw that graph up there knowing full well it's not representative of the performance and likely completely erroneous?


Yet, when they got the results above for the 980 SLI they stated this;
The frame-drops in FCAT we are still investigating. Rest assured you cannot see/detect these frame-drops yourself. Hence we think it could be an issue with our FCAT system.

I'll tell you the reason they couldn't see stutters. They were running 155fps average. They instead though write it off to an FCAT issue? I've also never seen the results of their "investigation".

With the AMD result, that they accepted, it was running 60fps locked like vsync was enabled even though it wasn't (you think this might have clued them in something wasn't reporting right?). There were areas with massive dropped frames that easily would have been visible. As well as 30fps to 60fps stutters. Did they report that you could see anything? No! Did they report that you couldn't see it? Why not?

They even went as far as saying.
FCAT then, FCAT is always definitive in the sense as to what you see on-screen compared to what numbers are crunched in the game engine. Above a plotted frame-time results of the test run @ 2560x1440 performed with a single Radeon R9 Fury (4GB) in 2560x1440.

Then they tacked this on to the end;
Update: hours before the release of this article we got word back from AMD. They have confirmed what we are seeing. Radeon Software 16.1 / 16.2 does not support a DirectFlip in DX12, which is mandatory to solve to this specific situation/measurement. AMD intends to resolve this issue in a future driver update. Once that happens we'll revisit FCAT.

What they are saying, without really making it clear, is that without the DirectFlip support that FCAT is FUBAR. The results in fact are not, "always definitive". Poor, poor journalism. Hilbert just wasn't on top of his game with these reviews, at least with his analysis, I guess?
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
Yes. My point though is if you looked at those FCAT results but when you watched the run through you didn't see any stuttering would you accept them as correct? Would you even throw that graph up there knowing full well it's not representative of the performance and likely completely erroneous?


Yet, when they got the results above for the 980 SLI they stated this;


I'll tell you the reason they couldn't see stutters. They were running 155fps average. They instead though write it off to an FCAT issue? I've also never seen the results of their "investigation".

With the AMD result, that they accepted, it was running 60fps locked like vsync was enabled even though it wasn't (you think this might have clued them in something wasn't reporting right?). There were areas with massive dropped frames that easily would have been visible. As well as 30fps to 60fps stutters. Did they report that you could see anything? No! Did they report that you couldn't see it? Why not?

They even went as far as saying.


Then they tacked this on to the end;


What they are saying, without really making it clear, is that without the DirectFlip support that FCAT is FUBAR. The results in fact are not, "always definitive". Poor, poor journalism. Hilbert just wasn't on top of his game with these reviews, at least with his analysis, I guess?
Could be various reasons. One very possible reason is that people just don't want to, or find it hard, to believe that Hawaii grade GPUs are really as powerful as we're seeing.

There's also the "AMD drivers suck" line which has been paraded as absolute truth ever since the ATi days. So people are more prone to blaming AMD rather than an NVIDIA tool (FCAT is an NVIDIA tool).

AMD also had a bad history with frame times (especially true on dual GPU solutions).

I can't say for sure why he said what he said but we do know as a matter of fact that it's not true.
 

airfathaaaaa

Senior member
Feb 12, 2016
692
12
81
What do you mean?
All I found is this, saying that it is possible, although it might slow things down instead of improving speed.
Which is what we saw in a lot of the benches: little improvement or even a reduction in speed for async compute plus graphics.
It was in the very first don't..
Don’ts

Don’t rely on the driver to parallelize any Direct3D12 works in driver threads
On DX11 the driver does farm off asynchronous tasks to driver worker threads where possible – this doesn’t happen anymore under DX12
While the total cost of work submission in DX12 has been reduced, the amount of work measured on the application’s thread may be larger due to the loss of driver threading. The more efficiently one can use parallel hardware cores of the CPU to submit work in parallel, the more benefit in terms of draw call submission performance can be expected.
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
It was in the very first don't..
Don’ts

Don’t rely on the driver to parallelize any Direct3D12 works in driver threads
On DX11 the driver does farm off asynchronous tasks to driver worker threads where possible – this doesn’t happen anymore under DX12
While the total cost of work submission in DX12 has been reduced, the amount of work measured on the application’s thread may be larger due to the loss of driver threading. The more efficiently one can use parallel hardware cores of the CPU to submit work in parallel, the more benefit in terms of draw call submission performance can be expected.
And that explains a lot.

New theory for AMD's high DX11 API overhead:
So here's the deal: AMD likely don't have a multi-threaded driver. This explains their API overhead under DX11.

It's not about deferred contexts or multi-threaded command listing. It's about driver submissions to the command buffer, in between the Host Application Compute Driver and the system memory space reserved for the command buffer.

Basically, we need to assume that for AMD, a DX11 PC has but a single CPU with a single core. This CPU is busy handling in-game physics, simulations and AI as well as translating the command lists into ISA for submission to the command buffer (so even if you use multi-threaded command listing to record the command lists over many threads, AMD's driver only uses the primary thread to translate them all into ISA). So long as the command buffer has commands left to process, the GPU's memory controller can retrieve the commands via the PCI Express bus and place them into GPU memory.

The problem arises when the CPU is busy with other taxing work and cannot translate and submit commands fast enough to keep the command buffer from going empty. Then we get a GPU stall (GPU waiting on the CPU).

So now it makes sense that the AMD engineer stated that they increased the size of the command buffer for Polaris in order to boost single-threaded performance. This would make sure that there is a large enough buffer to avert a GPU stall. Why? Because the CPU will be placing more commands into the buffer, so that when the CPU is busy, the GPU's memory controller can keep pulling commands from the buffer and thus avert a stall, since more commands reside in the buffer than before.

I guess this was cheaper than developing a multi-threaded DX11 driver, or there's another incompatibility elsewhere hardware-wise.
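
To make the buffer-size argument concrete, here is a toy Python sketch of the theory above. It is only a model of the reasoning, not of how any real driver or GPU works: the function name count_gpu_stalls, the tick-based timing, the fill rate of 3 commands per tick and the busy pattern are all invented for illustration. Each tick the CPU either tops up the buffer or is busy with game work, and the GPU tries to pull one command; an empty buffer counts as a stall.

```python
def count_gpu_stalls(buffer_size, cpu_busy, commands_per_tick=3):
    """Count ticks where the GPU finds the command buffer empty.

    cpu_busy is a sequence of booleans, one per tick. All numbers are
    hypothetical; this is a toy model, not a real driver.
    """
    queued = 0
    stalls = 0
    for busy in cpu_busy:
        if not busy:
            # A free CPU submits commands until the buffer is full.
            queued = min(buffer_size, queued + commands_per_tick)
        if queued > 0:
            # The GPU pulls one command per tick.
            queued -= 1
        else:
            # Nothing to pull: the GPU waits on the CPU.
            stalls += 1
    return stalls


# CPU alternates between 4 free ticks and 4 ticks of other work.
pattern = ([False] * 4 + [True] * 4) * 3

for size in (2, 4, 8):
    print(size, count_gpu_stalls(size, pattern))
```

In this made-up scenario the small buffer runs dry during every busy stretch while the largest one rides through them, which is the effect the bigger Polaris command buffer would be aiming for if the theory holds.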
 
Last edited:

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
Dygaza, first thank you for the assistance on disabling Async Compute on my GTX980TI rig below.

I ran the benchmark both with it enabled and disabled at the following settings
Resolution 3440 x 1440p MSAA 8x and all other settings High.

With Async enabled
Average=35.4fps
Normal=40.6
Medium=34.9
Heavy=31.8

With Async disabled:
Ave=36.5
Normal=41.8
Medium=35.9
Heavy=32.8
 
Last edited:

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Dygaza, first thank you for the assistance on disabling Async Compute on my GTX980TI rig below.

I ran the benchmark both with it enabled and disabled at the following settings
Resolution 3440 x 1440p MSAA 8x and all other settings High.

With Async enabled
Average=35.4fps
Normal=40.6
Medium=43.9
Heavy=31.8

With Async disabled:
Ave=36.5
Normal=41.8
Medium=35.9
Heavy=32.8

Everything is within 1fps except for medium which is a lot slower w/o async? Strange result.
 

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
I decided to run both rigs below at the same resolution, 2560x1440, with AsyncCompute on and off. In addition, on the 4790k rig I ran it with CF enabled and disabled (i.e. single card).

The 5960x is OC'd to 4.4 GHz and the EVGA GTX980TI SC is clocked at 1102 vcore.

The 4790k is OC'd to 4.7 GHz and both R9 290 Sapphire-X Tri-X OC cards run at 1000 vcore.

For the GTX980TI;
Enabled
Ave=38.3fps
Norm=43.1
Med=38.3
Heavy=34.4

Async disabled:
Ave=40.1
Norm=45.4
Med=40.2
Heavy=35.8

For the R9-290
Single Card
Enabled
Ave=28
Norm=32.8
Med=28
Heavy=24.5

Single R9-290
Async disabled
Ave=26.8
Norm=30.8
Med=27.1
Heavy=23.5

For both R9 290s in CF
Async enabled
Ave=48.3
Norm=55.4
Med=48.1
Heavy=42.8

Async disabled
Ave=48
Norm=55.7
Med=47.6
Heavy=42.5
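
For a quick read on what those averages mean, here is a small Python sketch that turns the average fps figures above into a percent change from enabling async compute. The function name async_gain_percent is just an illustration, and these are single runs, so small differences may well be within run-to-run variance.

```python
# Average fps at 2560x1440 from the runs above: (async enabled, async disabled).
results = {
    "GTX 980 Ti": (38.3, 40.1),
    "R9 290": (28.0, 26.8),
    "R9 290 CF": (48.3, 48.0),
}


def async_gain_percent(enabled_fps, disabled_fps):
    """Percent change in average fps from turning async compute on."""
    return (enabled_fps - disabled_fps) / disabled_fps * 100.0


for card, (enabled, disabled) in results.items():
    print(f"{card}: {async_gain_percent(enabled, disabled):+.1f}%")
```

That works out to roughly -4.5% for the GTX 980 Ti, +4.5% for the single R9 290 and about +0.6% for the CF setup.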
 

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
Everything is within 1fps except for medium which is a lot slower w/o async? Strange result.

Made a mistake in typing which I corrected above. The Medium enabled score was 34.9 NOT 43.9, transposition in typing ---sorry.:'(
 