It's even worse for ASC (Async Compute). It would be a great opportunity to compare the maximum possible gains between GPU vendors/generations with ASC, but with suboptimal utilisation on some vendors this is not possible at the moment.
It is not suboptimal utilisation. I have no idea where you pulled that from.
There is no way to magically switch between "optimal" and "suboptimal". A DX12 application simply submits work to two queues, DIRECT and COMPUTE, and the rest is up to the driver. AMD engineers themselves worked on suggesting (vendor-neutral) optimizations for the Time Spy async compute code, and as far as I know they are happy with it.
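To make the point concrete, this is roughly what "filing work to two queues" looks like at the D3D12 API level. This is a fragment-style sketch, not the actual Time Spy source; `device`, `gfxLists` and `computeLists` are assumed to exist, and it is not a complete compilable program.

```cpp
// Sketch: how a DX12 title exposes async compute to the driver.
// One DIRECT (graphics) queue and one COMPUTE queue are created;
// there is no API knob that forces them to overlap on the GPU.

D3D12_COMMAND_QUEUE_DESC directDesc = {};
directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
ID3D12CommandQueue* directQueue = nullptr;
device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

D3D12_COMMAND_QUEUE_DESC computeDesc = {};
computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
ID3D12CommandQueue* computeQueue = nullptr;
device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

// The application records graphics work on the DIRECT queue and
// compute work on the COMPUTE queue (gfxLists/computeLists are
// hypothetical command list arrays here).
directQueue->ExecuteCommandLists(1, &gfxLists);
computeQueue->ExecuteCommandLists(1, &computeLists);

// From here on, whether the two queues actually execute
// concurrently on the hardware is entirely up to the driver
// and the GPU's scheduling, not the application.
```

The application's only lever is submitting independent work on separate queues with correct synchronization; how much of it runs concurrently is a driver/hardware decision.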
Yes, there are other low-level optimizations (not related to compute) that could be done for specific architectures, but if you start down that road, where does it end? AMD and NVIDIA coding the benchmark for their own architectures? I mean, if we did vendor-specific paths ourselves, as soon as the first numbers were out, you would be posting in a brand new thread going on and on about how "Futuremark cannot optimize properly" or "Futuremark favored this or that vendor". With any luck, both green and red team "fans" would simultaneously claim the same thing.
Also, as soon as a new architecture shipped, the first reaction would be "well, it is not properly optimized for it, so the scores are not comparable". What use would such a benchmark be for launches of new hardware on a new architecture?
You seem to fundamentally misunderstand what a benchmark is and what it is supposed to do.