(Discussion) Futuremark 3DMark Time Spy DirectX 12 Benchmark

Red Hawk · Jul 17, 2016

Dygaza said:
My bad for not being clear. I'm not disputing the fact that there is a lot of tessellation. Just the claim that AMD cards are suffering from it. Clearly it's not holding them back, at least not at these framerates yet.

Gotcha, so we're in agreement then.

FM_Jarnis said:
DX12 can handle a ton more geometry - it is one of the major features of it - so we put a ton of geometry in the scenes. That was one of the reasons for the "Museum" theme - lots of transparent display cases with massive number of unique objects. We also raided our old assets fairly extensively and touched them up for physically based rendering to push up the scene complexity.

At the start of the Demo you can see Fire Strike (okay, a piece of it, with some simplifications, but the geometry of the area that is in the case is the same) and it continues running in the display case all through the demo, just as a small part of the whole scene. The demo also has *two* museums and you see the "past" version in the time scope throughout the scene.

AMD hasn't had a problem with large amount of tessellation for several years. You're thinking of very old cards...

Yoink. The Fire Strike case has the same geometry as that area in the Fire Strike demo? And it keeps running while the rest of the Time Spy demo plays? Crazy!

tweakboy · Jul 17, 2016

Will DX12 work on Windows 7?

ThatBuzzkiller · Jul 18, 2016

FM_Jarnis said:
No, it has a single DX12 FL 11_0 code path.

How exactly are you guys advertising support for async compute when separate compute queues are not exposed for FL 11_0 and 11_1, much like how resource binding or ExecuteIndirect has no equivalent in D3D11.3?

Was there any special workaround to get async compute working despite only having a FL 11_0 codepath?

It's a peculiar choice to emphasize only async compute when DX12 also exposes bindless resources, which are natively supported by ALL relevant IHVs and on consoles too. I'm surprised your team didn't make use of that functionality, given that support for it is universal both hardware-wise and software-wise (well, as far as consoles go), but it will soon come to PC ...

Bindless is really interesting for deferred texturing ...

jj109 · Jul 18, 2016

Feature level doesn't dictate the number of queues available.

Using FL11_0 doesn't mean using DX11.3

jj109 · Jul 18, 2016

Elixer said:
As was previously shown in this thread, the Fury card is faster than Nvidia's newest lineup using a fairly well optimized Vulkan game (Doom), and that is a hard pill to swallow for people who just dropped down $500+ for their newest toy.

So, once again, it proves that 3Dmark is basically worthless as a real world test to show what a full implementation of DX12/Vulkan can do.

By a full implementation of DX12, you mean using GCN-specific shaders in a low-level, vendor-specific code path? Intrinsic shaders are where most of the Doom Vulkan gains are coming from.

FM_Jarnis · Jul 18, 2016

You cannot make a fair benchmark if you start bolting on vendor-specific and even specific generation architecture centered optimizations.

DX12 is a standard. We made a benchmark according to the spec, up to the graphics card vendors how their products work implementing the spec (if they do not follow it, MS won't certify the drivers, so they do follow it).

Beyond that, we will be publishing an official clarification on this issue, probably later today or tomorrow. I fear it won't placate all the people who are going nuts over this with their claims, but we'll do our best.

linkgoron · Jul 18, 2016

FM_Jarnis said:
You cannot make a fair benchmark if you start bolting on vendor-specific and even specific generation architecture centered optimizations.

DX12 is a standard. We made a benchmark according to the spec, up to the graphics card vendors how their products work implementing the spec (if they do not follow it, MS won't certify the drivers, so they do follow it).

Beyond that, we will be publishing an official clarification on this issue, probably later today or tomorrow. I fear it won't placate all the people who are going nuts over this with their claims, but we'll do our best.

The question is what FM wants to do with the benchmark tool. If you just want to benchmark FL 11_0, then OK. For example, it could be a tool to tell you if your OC is OK, or if your performance is around the same as others with similar systems.

However, if most *actual* DX12 games use (at least) some vendor-specific optimizations, or (at least) some parts of FL 12_0, then the FM score is basically a meaningless number, at least in the sense of representing how actual _real world_ engines perform.

3DVagabond · Jul 18, 2016

FM_Jarnis said:
You cannot make a fair benchmark if you start bolting on vendor-specific and even specific generation architecture centered optimizations.

DX12 is a standard. We made a benchmark according to the spec, up to the graphics card vendors how their products work implementing the spec (if they do not follow it, MS won't certify the drivers, so they do follow it).

Beyond that, we will be publishing an official clarification on this issue, probably later today or tomorrow. I fear it won't placate all the people who are going nuts over this with their claims, but we'll do our best.

Who's going nuts? People have questions as some of the statements are a bit vague.

dzoni2k2 · Jul 18, 2016

FM sells a product and users have legitimate concerns. It's in their best interest to address those concerns.

PhonakV30 · Jul 18, 2016

FM_Jarnis said:
You cannot make a fair benchmark if you start bolting on vendor-specific and even specific generation architecture centered optimizations.

DX12 is a standard. We made a benchmark according to the spec, up to the graphics card vendors how their products work implementing the spec (if they do not follow it, MS won't certify the drivers, so they do follow it).

Beyond that, we will be publishing an official clarification on this issue, probably later today or tomorrow. I fear it won't placate all the people who are going nuts over this with their claims, but we'll do our best.

The question is: do you allow IHVs to cheat on their scores?
When the benchmark asks the GPU to run parallel queues (or concurrent tasks), the driver should do exactly what the benchmark says, not merge them into a single queue. Those Maxwell scores are invalid. I have no problem with GCN and Pascal, but Maxwell, well, I can't accept.

When you say "I can't do it", your score will be lower than that of one who can.

Flapdrol1337 · Jul 18, 2016

behrouz said:
The question is: do you allow IHVs to cheat on their scores?
When the benchmark asks the GPU to run parallel queues (or concurrent tasks), the driver should do exactly what the benchmark says, not merge them into a single queue. Those Maxwell scores are invalid. I have no problem with GCN and Pascal, but Maxwell, well, I can't accept.

When you say "I can't do it", your score will be lower than that of one who can.

Why does it matter what the driver and graphics card do if the result on screen is the same?

sontin · Jul 18, 2016

behrouz said:
The question is: do you allow IHVs to cheat on their scores?
When the benchmark asks the GPU to run parallel queues (or concurrent tasks), the driver should do exactly what the benchmark says, not merge them into a single queue. Those Maxwell scores are invalid. I have no problem with GCN and Pascal, but Maxwell, well, I can't accept.

When you say "I can't do it", your score will be lower than that of one who can.

So you want the benchmark not to work on Kepler, Maxwell and Intel GPUs?
Or do you want nVidia to disable the "compute queue" for Kepler and Maxwell and force developers to create multiple render paths?

Samwell · Jul 18, 2016

linkgoron said:
The question is what FM wants to do with the benchmark tool. If you just want to benchmark FL 11_0, then OK. For example, it could be a tool to tell you if your OC is OK, or if your performance is around the same as others with similar systems.

However, if most *actual* DX12 games use (at least) some vendor-specific optimizations, or (at least) some parts of FL 12_0, then the FM score is basically a meaningless number, at least in the sense of representing how actual _real world_ engines perform.

All games so far use pure FL11_0, the same as FM. Actually, using FL12_0 would make it meaningless, because it will be some time until we see DX12_0 games.

Dygaza · Jul 18, 2016

sontin said:
So you want the benchmark not to work on Kepler, Maxwell and Intel GPUs?
Or do you want nVidia to disable the "compute queue" for Kepler and Maxwell and force developers to create multiple render paths?

I don't usually agree with you, but this post nailed it rather well. Seriously, I still can't understand why people moan about how nvidia handles the code on their hardware. If their card can't run parallel code, then it's only natural that they run it serially. Even if nvidia enabled async compute in their Maxwell (and Kepler) driver, it would still run serially, since it can't do it any other way.

sontin · Jul 18, 2016

Async Compute is activated. People need to start understanding that Async Compute is an API feature: a software concept for how to schedule workloads to a GPU.

It doesn't say anything about how the driver and hardware handle the workload.

If nVidia disabled the "compute queue", developers wouldn't be able to create a compute command list. Instead of having good compatibility between different hardware platforms, they would be forced to explicitly write their code for each vendor.

Or simply: it wouldn't make sense to use Async Compute at all...

trinibwoy · Jul 18, 2016

behrouz said:
The question is: do you allow IHVs to cheat on their scores?
When the benchmark asks the GPU to run parallel queues (or concurrent tasks), the driver should do exactly what the benchmark says, not merge them into a single queue. Those Maxwell scores are invalid. I have no problem with GCN and Pascal, but Maxwell, well, I can't accept.

When you say "I can't do it", your score will be lower than that of one who can.

This sums up the problem right here.

The DX12 API does not mandate that async queues be executed in parallel. That is a hardware implementation detail. If hardware runs those tasks sequentially it's not "cheating". Parallelism does not change the output that you see on the screen.
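To make that concrete, here is a toy Python model, not D3D12 code. The pass names and millisecond costs are made up for illustration, and the two "drivers" are hypothetical. The same work is submitted to a graphics queue and a compute queue; one driver runs the queues back to back, the other overlaps them perfectly (an idealisation). The output is identical either way, only the frame time differs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cost_ms: float  # made-up cost, for illustration only


def run(queue):
    """Execute every task in a queue; return (outputs, busy time in ms)."""
    outputs = [f"{task.name}:done" for task in queue]
    return outputs, sum(task.cost_ms for task in queue)


def serial_driver(graphics, compute):
    """Hypothetical driver that runs the compute queue after the graphics queue."""
    g_out, g_ms = run(graphics)
    c_out, c_ms = run(compute)
    return g_out + c_out, g_ms + c_ms


def overlapping_driver(graphics, compute):
    """Hypothetical driver that overlaps both queues (idealised: perfect overlap)."""
    g_out, g_ms = run(graphics)
    c_out, c_ms = run(compute)
    # With perfect overlap the frame takes as long as the longer queue.
    return g_out + c_out, max(g_ms, c_ms)


graphics_queue = [Task("shadow_pass", 3.0), Task("g_buffer", 5.0)]
compute_queue = [Task("particles", 2.0), Task("light_culling", 1.5)]

serial_result, serial_ms = serial_driver(graphics_queue, compute_queue)
overlap_result, overlap_ms = overlapping_driver(graphics_queue, compute_queue)

print(serial_result == overlap_result)  # same output on screen
print(serial_ms, overlap_ms)            # different frame time
```

Real hardware never overlaps perfectly, so the gap would be smaller than in this sketch, but the point stands: the application submits the same queues in both cases.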

HWfreak · Jul 18, 2016

FM_Jarnis said:
We've actually discussed the option of using a third party graphics engine for a 3DMark test - our artists at least would sure love the more mature art pipelines of full game engines - but our benchmark development program members have indicated that it would reduce the usefulness of 3DMark to them. "If you want to benchmark with a game engine, run a game made using that engine."

3DMark Time Spy engine is specifically written to be a neutral, "reference implementation" engine for DX12 FL11_0.

Less useful to them?

I understand the engine you're using may well be better suited to what you're doing, but TBH it's a bit of a copout to make a DX12 benchmark that does not even use the full feature set of DX12.

"Specifically written to be a neutral"
It's a benchmark application; it's supposed to push GPUs for all they are worth. If that means AMD comes out on top because of A-Sync Shading, then let Nvidia cry about it. Maybe it's the push they need to update their outdated hardware. With developers constantly bending to their will, innovation is choked, thank you very much.

Still waiting on A-Sync for my 970 Nvidia......

Flapdrol1337 · Jul 18, 2016

FM_Jarnis said:
We've actually discussed the option of using a third party graphics engine for a 3DMark test - our artists at least would sure love the more mature art pipelines of full game engines - but our benchmark development program members have indicated that it would reduce the usefulness of 3DMark to them. "If you want to benchmark with a game engine, run a game made using that engine."

3DMark Time Spy engine is specifically written to be a neutral, "reference implementation" engine for DX12 FL11_0.

Are there useful features in 11_1?

Afaik even kepler has almost all 11.1 features and there are no fermi or vliw dx12 drivers anyway. Seems likely game devs would use those if they're useful.

dzoni2k2 · Jul 18, 2016

trinibwoy said:
This sums up the problem right here.

The DX12 API does not mandate that async queues be executed in parallel. That is a hardware implementation detail. If hardware runs those tasks sequentially it's not "cheating". Parallelism does not change the output that you see on the screen.

I agree with you. The problem that I, and from what I understand others, have is what exactly the objective of the TimeSpy benchmark and its AC implementation is. If the objective is to test the hardware's ability to execute graphics and compute queues concurrently and in parallel, then it obviously failed in that mission. So what is the point of TimeSpy AC if one vendor serializes those queues, which is basically the same as turning it off in custom settings?

If that was not the objective then what was? That's pretty much all I guess.

sontin · Jul 18, 2016

Kepler is Feature Level 11_0 under DX11 and DX12.

dzoni2k2 said:
I agree with you. The problem that I, and from what I understand others, have is what exactly the objective of the TimeSpy benchmark and its AC implementation is. If the objective is to test the hardware's ability to execute graphics and compute queues concurrently and in parallel, then it obviously failed in that mission. So what is the point of TimeSpy AC if one vendor serializes those queues, which is basically the same as turning it off in custom settings?

If that was not the objective then what was? That's pretty much all I guess.

But Time Spy is "testing" the AC implementation. You get more performance when the hardware supports it.

I mean how exactly should Futuremark punish nVidia? Even if nVidia would schedule the compute workload into the compute queue the hardware is unable to process it concurrently with the graphics queue...

HWfreak · Jul 18, 2016

What Nvidia and FM call A-Sync is not A-Sync in the same sense what AMD call A-Sync, the sort of A-Sync FM Time Spy does not even have, the sort of A-Sync that makes a stock 390X beat a stock 980TI in Hitman.

What AMD call A-Sync is Hardware level A-Synchronous Compute Shading, what Nvidia and FM call A-Sync is software level Call Pre-Emption, AMD's solution is far better.

AMD themselves explain it best.

https://www.youtube.com/watch?v=v3dUhep0rBs

Yakk · Jul 18, 2016

HWfreak said:
What Nvidia and FM call A-Sync is not A-Sync in the same sense what AMD call A-Sync, the sort of A-Sync FM Time Spy does not even have, the sort of A-Sync that makes a stock 390X beat a stock 980TI in Hitman.

What AMD call A-Sync is Hardware level A-Synchronous Compute Shading, what Nvidia and FM call A-Sync is software level Call Pre-Emption, AMD's solution is far better.

AMD themselves explain it best.

https://www.youtube.com/watch?v=v3dUhep0rBs

Bingo!

Interchanging the terms Async and Pre-emption balancing is extremely misleading and.. Wrong.

One could almost suspect an IHV's marketing team is working overtime to confuse everyone as much as they can.

FM_Jarnis · Jul 18, 2016

HWfreak said:
Less useful to them?

I understand the engine you're using may well be better suited to what you're doing, but TBH it's a bit of a copout to make a DX12 benchmark that does not even use the full feature set of DX12.

The number of cards that could run the "full feature set of DX12" is still too small. There would've been quite a bit of complaining from people who would've simply said "I can run Hitman, Ashes of the Singularity etc. on DX12 on my <insert DX12 FL11.0 card here>, why doesn't 3DMark work? You guys suck".

So we had to start from somewhere. This will not be the "last 3dmark ever". FL12 is definitely interesting, but games are not yet using it, so it is more of a 2017 thing.

dogen1 · Jul 18, 2016

behrouz said:
The question is: do you allow IHVs to cheat on their scores?
When the benchmark asks the GPU to run parallel queues (or concurrent tasks), the driver should do exactly what the benchmark says, not merge them into a single queue. Those Maxwell scores are invalid. I have no problem with GCN and Pascal, but Maxwell, well, I can't accept.

When you say "I can't do it", your score will be lower than that of one who can.

No. It's Microsoft that allows that. Asynchronous does not mean concurrent. Concurrency is a possibility depending on the hardware, not a requirement.

3DVagabond said:
Who's going nuts? People have questions as some of the statements are a bit vague.

He's most likely talking about the children on the 3DMark Steam board.

HWfreak said:
What Nvidia and FM call A-Sync is not A-Sync in the same sense what AMD call A-Sync, the sort of A-Sync FM Time Spy does not even have, the sort of A-Sync that makes a stock 390X beat a stock 980TI in Hitman.

What AMD call A-Sync is Hardware level A-Synchronous Compute Shading, what Nvidia and FM call A-Sync is software level Call Pre-Emption, AMD's solution is far better.

AMD themselves explain it best.

I thought the hitman devs...nes not EXACTLY how it's supposed to be used?

Samwell · Jul 18, 2016

ThatBuzzkiller said:
It's you that's spreading misinformation ...

https://msdn.microsoft.com/en-us/library/windows/desktop/ff476154(v=vs.85).aspx

How do you explain why the library interface for D3D11.3 is different from D3D12 ?

Why doesn't D3D11.3 expose separate queues or synchronization primitives for them ?

Because DX11.3 is based on DX11. You don't understand the difference between API and hardware levels. DX11.3 was introduced to expose, in the DX11 API, hardware features that were introduced with DX12/12.1, like CR or ROV. You can use your old high-level DX11 engine and add these features. The DX11 API has feature levels from 11.0-11.3.

DX12 is a different API with lower-level programming and support for Multiengine (Async), ExecuteIndirect and so on. These are general features of DX12. You always support them. Hardware levels from 11_0 to 12_1 then have different features in mind. Just because Async was only introduced with the DX12 API doesn't mean you don't have it with feature level 11. Don't confuse DX11.2 or DX11.3 with feature levels of DX12.

But I'm out here. I wanted to help you understand the API better, but this is my last try. You can ask the people at beyond3d, who have more knowledge and can explain it to you in more detail, but in summary it will be the same.
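A rough way to picture that split, as a Python sketch rather than real D3D12 code: the API decides which submission mechanisms exist, while the feature level only says which hardware capabilities are guaranteed. The two tables are a simplified assumption for this example and list only the features named in this thread; the `available` helper is hypothetical.

```python
# Features that come with the API itself, whatever the feature level.
# Simplified: only the features named in this thread are listed.
API_FEATURES = {
    "D3D11": set(),
    "D3D12": {"multi_engine_queues", "execute_indirect"},
}

# Hardware features guaranteed by a feature level (again simplified).
FEATURE_LEVEL_GUARANTEES = {
    "11_0": set(),
    "11_1": set(),
    "12_0": set(),
    "12_1": {"conservative_rasterization", "rasterizer_ordered_views"},
}


def available(api, feature_level):
    """Hypothetical helper: features an engine can rely on for this combination."""
    return API_FEATURES[api] | FEATURE_LEVEL_GUARANTEES[feature_level]


# D3D12 on FL 11_0 hardware: separate queues exist, CR/ROV are not guaranteed.
print(sorted(available("D3D12", "11_0")))

# D3D11 (e.g. 11.3) on FL 12_1 hardware: CR/ROV are there, separate queues are not.
print(sorted(available("D3D11", "12_1")))
```

In this model a FL 11_0 code path under DX12 still gets multi-engine queues, because they sit on the API axis and not on the feature-level axis.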
