(Discussion) Futuremark 3DMark Time Spy DirectX 12 Benchmark

Page 11

railven

Diamond Member
Mar 25, 2010
6,604
561
126
Well I don't know why I bothered reading since my last post.

Good luck Jarnis, I can only imagine what you're facing because you guys aren't doing it the AMD way. id Software got thrown to the wolves, but now magically they are the best devs in the industry.

I should put the red glasses back on, this is becoming unbearable.
 

Elixer

Lifer
May 7, 2002
10,371
762
126
To sum all this up, AMD has a bigger bathtub (for Async compute) than nvidia, and Time Spy isn't filling the tub to the max possible for whatever reason: it could be an engine limitation, it could be that they have to answer to their sponsors, or it could be something else.
We know that nvidia's bathtub doesn't have as much capacity as AMD's, and overflowing the smaller bathtub would cause delays, since the work has to wait until nvidia's fill level drops again. (Funny thing about 'signing off': if AMD said fill it up more, and nvidia said that is already too much, who would win? Who controls the tie-breaker vote, and what happens if it goes against the side saying it is too much?)

Maybe they need a 'hardcore' version of Time Spy that keeps adding work queues until the video card cries uncle? Pretty much like how increasing the tessellation count will sooner or later bog down even the best video card currently out there once it passes a certain threshold.
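
To put the bathtub picture into numbers, here is a minimal toy sketch. It is not how Time Spy or any driver actually schedules work; the function name and every millisecond figure are made up for illustration. The assumption is that async compute costs nothing extra while it fits into a GPU's spare capacity (the tub), and that anything beyond that overflows and runs serially.

```python
def frame_time_ms(graphics_ms, async_compute_ms, spare_capacity_ms):
    """Toy model (hypothetical): async compute is 'free' while it fits
    in the spare capacity; the overflow is added to the frame time."""
    overflow = max(0.0, async_compute_ms - spare_capacity_ms)
    return graphics_ms + overflow


# Hypothetical tub sizes, not measurements of any real GPU.
BIG_TUB_MS = 6.0
SMALL_TUB_MS = 2.0

# 'Hardcore' sweep: keep adding compute work and watch where each tub overflows.
for compute_ms in (0.0, 2.0, 4.0, 6.0, 8.0):
    big = frame_time_ms(16.0, compute_ms, BIG_TUB_MS)
    small = frame_time_ms(16.0, compute_ms, SMALL_TUB_MS)
    print(f"compute={compute_ms:4.1f} ms  big tub={big:5.1f} ms  small tub={small:5.1f} ms")
```

In this model a light compute load looks identical on both tubs, and the difference only shows up once the load passes the smaller tub's capacity, which is the point of the 'hardcore' idea.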

That said, synthetic tests rarely ever match actual game engines, even if they are deterministic (which pretty much all engines are these days).
3dmark should open the source, and stick it on GPUopen.
 

Elixer

Lifer
May 7, 2002
10,371
762
126
A game dev's perspective compared to 3dmark?
(Would be cool if Time Spy was re-written to support Vulkan as well)
“DirectX 12 and Vulkan are conceptually very similar and both clearly inherited a lot from AMD’s Mantle API efforts. The low level nature of those APIs moves a lot of the optimization responsibility from the driver to the application developer, so we don’t expect big differences in speed between the two APIs in the future. On the tools side there is very good Vulkan support in RenderDoc now, which covers most of our debugging needs. We choose Vulkan, because it allows us to support Windows 7 and 8, which still have significant market share and would be excluded with DirectX 12. On top of that Vulkan has an extension mechanism that allows us to work very closely with AMD, NVIDIA and Intel to do very specific optimizations for each hardware.”

Full interview goes live this weekend at: http://www.dsogaming.com/news/id-so...1-and-on-why-it-chose-vulkan-over-directx-12/
 

boozzer

Golden Member
Jan 12, 2012
1,549
18
81
Well I don't know why I bothered reading since my last post.

Good luck Jarnis, I can only imagine what you're facing because you guys aren't doing it the AMD way. id Software got thrown to the wolves, but now magically they are the best devs in the industry.

I should put the red glasses back on, this is becoming unbearable.
when a dev gives an fps increase of 20-60% with a single patch/API, magical is correct.

why are you so down?

when it was thrown to the wolves, it was because from beta to retail, in the span of 1 week, fps went to crap for 1 side of the gpu wars. why do you act like it came out of nowhere?
 

selni

Senior member
Oct 24, 2013
249
0
41
I really cannot imagine how frustrating it must be for FM at the moment to respond to this sort of completely uninformed ranting. Anandtech just published a really good pascal architecture summary, go read that at least first and realize how wrong most of these comments are. Facts just don't matter and the amount of misinformation is incredible - people are just latching on to bad analogies or tiny details they think they understand but don't.

That, or deciding that unlike pretty much every other benchmark to date, this particular one needs multiple codepaths because a slide said it was best practice. It is, sure, but it's not obvious that doing so will ever be common given the extra work. That was always going to be a potential problem with DX12/Vulkan being lower level.
 

Elixer

Lifer
May 7, 2002
10,371
762
126
http://radeon.com/radeon-wins-3dmark-dx12/

...but I'm sure someone will still claim it is somehow biased or "not a proper DX12 benchmark". I mean, clearly AMD is hiding the truth or something.
Gotta love PR people pointing to other PR people.

And from that link to this link http://radeon.com/asynchronous-compute/

Evolving Asynchronous Compute with the Quick Response Queue

Today’s graphics engines provide many opportunities to take advantage of asynchronous compute, but some tasks can still struggle to reach peak benefit because they don’t explicitly know when the graphics card has started executing new work. In these cases, we can resolve the problem by guaranteeing to the game engine that a task will start and end in a certain amount of time.

In order to meet this requirement, these time-critical tasks must be given higher priority access to processing resources than other tasks. One way to accomplish this is through the use of preemption, which works by temporarily suspending all other GPU tasks until the high-priority task can be completed. However, preemption often causes costly time delays as the old tasks are wound down before the new one is started; this can potentially manifest as undesirable stuttering in games.

Instead, the Polaris architecture supports another method for handling time-critical tasks called the Quick Response Queue (QRQ). Tasks submitted to this queue get preferential treatment from GPU resources, while running asynchronously, so they can overlap with other workloads. The game developer can even control how, when, and how much of the GPU is being used by a QRQ task through a hardware component of the Polaris architecture called the Hardware Scheduler (HWS).
Here is the white paper about it http://amd-dev.wpengine.netdna-cdn....10/Asynchronous-Shaders-White-Paper-FINAL.pdf
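
A rough way to picture the difference between preemption and an overlapping high-priority queue, as described in the quoted text, is the toy timeline below. It is only a sketch under loud assumptions: the function names, the fixed context-switch cost, the GPU share, and the perfect-scaling behaviour are all hypothetical and are not taken from AMD's white paper.

```python
def preempt(background_ms, urgent_ms, switch_ms):
    """Preemption (toy): wind down background work, run the urgent task
    alone, then resume. Returns (urgent finish time, total time)."""
    urgent_done = switch_ms + urgent_ms
    # Hypothetical: one switch cost to suspend, one to resume.
    total = background_ms + urgent_ms + 2 * switch_ms
    return urgent_done, total


def overlap(background_ms, urgent_ms, urgent_share):
    """Overlapping queue (toy): the urgent task runs next to background
    work on a fixed share of the GPU, with no switch cost.
    Assumes the urgent task finishes before the background work does."""
    urgent_done = urgent_ms / urgent_share
    # Assumed perfect scaling: total work is conserved, nothing is wound down.
    total = background_ms + urgent_ms
    return urgent_done, total


# Hypothetical numbers: 10 ms of background work, a 2 ms urgent task,
# a 1 ms switch cost, and half the GPU given to the urgent task.
print("preempt:", preempt(10.0, 2.0, 1.0))
print("overlap:", overlap(10.0, 2.0, 0.5))
```

With these made-up numbers preemption finishes the urgent task a little sooner but costs the frame the two switches, while the overlapping queue finishes it slightly later without that penalty. Real hardware will not behave this cleanly.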

If the aim of this benchmark is to show eye candy, then, good job!
If the aim is to get meaningful results that can be used as a performance estimate on future DX 12 / Vulkan games, then, this benchmark just isn't there at this time.
 
Feb 19, 2009
10,457
10
76
No kidding AMD would down-talk a feature they don't use.

To correct you, they can do preemption just as well since GCN 1.0, but it's not their best result. GCN flies with parallel workloads, not serial + preemption, which Pascal excels at.

To sum all this up, AMD has a bigger bathtub (for Async compute) than nvidia, and Time Spy isn't filling the tub to the max possible for whatever reason: it could be an engine limitation, it could be that they have to answer to their sponsors, or it could be something else.
We know that nvidia's bathtub doesn't have as much capacity as AMD's, and overflowing the smaller bathtub would cause delays, since the work has to wait until nvidia's fill level drops again. (Funny thing about 'signing off': if AMD said fill it up more, and nvidia said that is already too much, who would win? Who controls the tie-breaker vote, and what happens if it goes against the side saying it is too much?)

Maybe they need a 'hardcore' version of Time Spy that keeps adding work queues until the video card cries uncle? Pretty much like how increasing the tessellation count will sooner or later bog down even the best video card currently out there once it passes a certain threshold.

That said, synthetic tests rarely ever match actual game engines, even if they are deterministic (which pretty much all engines are these days).
3dmark should open the source, and stick it on GPUopen.

Your summary is spot on. Anyone with GPUView can see that GCN is not being fully utilized in Time Spy; utilization is much lower than in other DX12/Vulkan implementations.

Is it fair to say it's what developers will do? No it is not.

Developers are focused on console optimizations, and it is on that platform that they push parallel Async Compute heavily to see major gains on weak, low shader count hardware. The approach taken with Time Spy does not benefit low shader count GPUs, as is evident from the various benchmarks that have been done. It's a poor implementation of Async Compute, targeting the lowest common denominator: Pascal's preemption + light Async workloads.

I would not have any issue with it if the devs came out and said it, but to pretend it's representative of the gaming industry moving forward, is wrong.

I really cannot imagine how frustrating it must be for FM at the moment to respond to this sort of completely uninformed ranting. Anandtech just published a really good pascal architecture summary, go read that at least first and realize how wrong most of these comments are. Facts just don't matter and the amount of misinformation is incredible - people are just latching on to bad analogies or tiny details they think they understand but don't.

That, or deciding that unlike pretty much every other benchmark to date, this particular one needs multiple codepaths because a slide said it was best practice. It is, sure, but it's not obvious that doing so will ever be common given the extra work. That was always going to be a potential problem with DX12/Vulkan being lower level.

You could not be more wrong. Tech forums are the most informed gamers around. Not your typical social media groups where they swallow PR whole without question. Here, we investigate, go to the source, booting up GPUView to compare GPUs.

It's from tech forums such as these that we uncovered the 3.5GB 970. Where we proved Maxwell was incapable of Preemption. Incapable of Async Compute. We didn't swallow PR whole; we investigated.

And yes, good DX12/Vulkan needs multi-paths to target different architectures. This is what a low-level API is all about, it shifts the responsibilities onto the developers. If they are incapable of it, then stick to DX11.

http://www.overclock3d.net/news/gpu..._they_chose_opengl_vulkan_over_directx11_12/1

DirectX 12 and Vulkan are conceptually very similar and both clearly inherited a lot from AMD’s Mantle API efforts. The low-level nature of those APIs moves a lot of the optimization responsibility from the driver to the application developer, so we don’t expect big differences in speed between the two APIs in the future.

On the tools side there is very good Vulkan support in RenderDoc now, which covers most of our debugging needs. We choose Vulkan, because it allows us to support Windows 7 and 8, which still have significant market share and would be excluded with DirectX 12.

On top of that Vulkan has an extension mechanism that allows us to work very closely with AMD, NVIDIA and Intel to do very specific optimisations for each hardware.
 
Reactions: Grazick

selni

Senior member
Oct 24, 2013
249
0
41
You could not be more wrong. Tech forums are the most informed gamers around. Not your typical social media groups where they swallow PR whole without question. Here, we investigate, go to the source, booting up GPUView to compare GPUs.

It's from tech forums such as these that we uncovered the 3.5GB 970. Where we proved Maxwell was incapable of Preemption. Incapable of Async Compute. We didn't swallow PR whole; we investigated.

And yes, good DX12/Vulkan needs multi-paths to target different architectures. This is what a low-level API is all about, it shifts the responsibilities onto the developers. If they are incapable of it, then stick to DX11.

http://www.overclock3d.net/news/gpu..._they_chose_opengl_vulkan_over_directx11_12/1

And yet we have this thread, and much worse elsewhere, with a significant number of people claiming things like Pascal does preemption not async compute, that multiple render paths are now required despite the very titles being used as comparisons not doing that (should AotS or Doom have stuck to DX11/OGL? of course not), and complaining that there's less compute in a canned benchmark than in AotS. I mean...
 

jckaboom

Junior Member
Jul 20, 2016
2
0
0
Gotta love PR people pointing to other PR people.

And from that link to this link http://radeon.com/asynchronous-compute/


Here is the white paper about it http://amd-dev.wpengine.netdna-cdn....10/Asynchronous-Shaders-White-Paper-FINAL.pdf

If the aim of this benchmark is to show eye candy, then, good job!
If the aim is to get meaningful results that can be used as a performance estimate on future DX 12 / Vulkan games, then, this benchmark just isn't there at this time.


Thanks for the link.
There, they said that overlapping multiple tasks in multiple queues maximizes performance. That makes sense.
But the Fury X uses two queues, while the GTX 1080 uses 2 + "extra queues that do not originate from the engine...". You can read that on the FM website and see it in the GPUView pictures.

http://www.futuremark.com/pressreleases/a-closer-look-at-asynchronous-compute-in-3dmark-time-spy

So, can someone explain the difference in the queues, and how much performance is gained from them, if any?
 