[bitsandchips]: Pascal to not have improved Async Compute over Maxwell

Lepton87 · May 18, 2016

sontin said:
Yes, it is possible. nVidia is offloading the compute part to the second GPU.

Are there any benchmarks of Ashes of the Singularity with high-end cards helped by dedicated compute cards? I'd like to know how a 980 Ti + Titan (or 390X) does compared to a Fury + Titan (or 390X).

It seems the lack of async compute is not a problem for a big portion of the enthusiast market. A lot of high-end computer owners can install a dedicated compute card. Most can even run SLI plus a compute card.

krumme · May 18, 2016

LTC8K6 said:
If Pascal runs DX12 well, no one is going to care if it has AC or not, and no one is going to care what NV calls whatever it does have.

Heck, if it "brute forces" its way to running DX12 well, no one will care, either. It may even appeal to a lot of people.

I think that's dependent on how many clean-sheet engines we get, how aggressively async is used in the coming games, and to what degree NV can disable it in the console ports and generally tessellate everything hidden, so to speak.
It might well be, and it's highly likely IMO, that true async and ACE-like functionality doesn't really matter before NV's new arch is ready. They are the market leader and have shown until now that their bets are right. They don't control the tech development now, but they are in control of the PC market. They cannot dictate, but they can manage/adapt, and they are good at that.

sontin · May 18, 2016

Despoiler said:
No, what he said is that "you", i.e. a dev, can specify to use static mode "if you really want to". Dynamic mode is the new mode, and saying "if you really want to" makes it sound like dynamic mode is the default behavior. Dynamic mode is done in software in the driver.

"You can full up the whole machine like you wanted". The decision is not up to the developer.

airfathaaaaa · May 18, 2016

Lepton87 said:
Are there any benchmarks of Ashes of the Singularity with high-end cards helped by dedicated compute cards? I'd like to know how a 980 Ti + Titan (or 390X) does compared to a Fury + Titan (or 390X).

It seems the lack of async compute is not a problem for a big portion of the enthusiast market. A lot of high-end computer owners can install a dedicated compute card. Most can even run SLI plus a compute card.

airfathaaaaa · May 18, 2016

sontin said:
"You can full up the whole machine like you wanted". The decision is not up to the developer.

pretty sure it is

Despoiler · May 18, 2016

sontin said:
"You can full up the whole machine like you wanted". The decision is not up to the developer.

"You" is the developer. He is talking to a room of developers. Do you think he is talking to the video card? lol.

Lepton87 · May 18, 2016

airfathaaaaa said:

I saw those benchmarks and I thought those charts show what happens when both cards do the rendering and the compute instead of reserving one card for just compute.

Abwx · May 18, 2016

Lepton87 said:
I saw those benchmarks and I thought those charts show what happens when both cards do the rendering and the compute instead of reserving one card for just compute.

If the two cards were doing the rendering and the compute equally, then the scores would be the same when inverting the configuration from AMD + Nvidia to Nvidia + AMD.

thesmokingman · May 18, 2016

Lepton87 said:
I saw those benchmarks and I thought those charts show what happens when both cards do the rendering and the compute instead of reserving one card for just compute.

Don't you think that that is the most extravagant work around for compute? Why would one buy two cards just to dedicate one to compute? Someone link us up the jackie chan pic pulling hair out.

Lepton87 · May 18, 2016

thesmokingman said:
Don't you think that that is the most extravagant work around for compute? Why would one buy two cards just to dedicate one to compute? Someone link us up the jackie chan pic pulling hair out.

I'm not talking about buying two cards, just about keeping the card I currently own. It is a very sensible and very CHEAP workaround for NV's architectural inadequacy. It will totally work for me.

If the two cards were doing the rendering and the compute equally, then the scores would be the same when inverting the configuration from AMD + Nvidia to Nvidia + AMD.

No, it is about driver overhead, which is different for NV and AMD.

thesmokingman · May 18, 2016

Lepton87 said:
I'm not talking about buying two cards, just about keeping the card I currently own. It is a very sensible and very CHEAP workaround for NV's architectural inadequacy. It will totally work for me.

No, it is about driver overhead, which is different for NV and AMD.

That's pretty funny, you don't see the irony.

Wrong. AC is DX12, which doesn't bring that overhead with it.

Lepton87 · May 18, 2016

thesmokingman said:
That's pretty funny, you don't see the irony.

Wrong. AC is DX12, which doesn't bring that overhead with it.

Shifting gears, let's take a look at multi-GPU performance on the latest Ashes beta. The focus of our previous article, Ashes' support for DX12 explicit multi-GPU makes it the first game to support the ability to pair up RTG and NVIDIA GPUs in an AFR setup. Like traditional same-vendor AFR configurations, Ashes' AFR setup works best when both GPUs are similar in performance, so although this technology does allow for some unusual cross-vendor comparisons, it does not (yet) benefit from pairing up GPUs that widely differ in performance, such as a last-generation video card with a current-generation video card. Nonetheless, running a Radeon and a GeForce card together is an interesting sight, if only for the sheer audacity of it.

Clearly I think I'm right and both cards are used for rendering.

thesmokingman · May 18, 2016

Lepton87 said:
Clearly I think I'm right and both cards are used for rendering.

What does that have to do with a dedicated card for compute that doesn't exist in the consumer space?

And you're quoting Ashes with Explicit Multi-Adapter. What does that have to do with compute?

2is · May 18, 2016

AtenRa said:
We are not discussing if the game is good or bad or if it sold millions of copies.

Here we are talking about the Async Compute capabilities of the GPUs.

As of now, AoTS and Hitman do use Async Compute, and thus we are using those two games. I'm sure Warhammer will use AC, as well as Deus Ex.
And knowing DICE, it is highly possible that BF1 will also be DX12 and will use Async Compute as well.

So, it is understandable that certain people are trying to divert the conversation away from the thread's title, but that doesn't change the fact that Pascal cannot do Async Compute like GCN does.
And the only way NV hardware will be faster in DX12 games is through its GameWorks initiative.

We are discussing a whole lot of things. I get that you don't like the fact that the pro-AMD argument goes to crap once you factor in relevance, but that's just the reality of the situation.

Lepton87 · May 18, 2016

thesmokingman said:
What does that have to do with a dedicated card for compute that doesn't exist in the consumer space?

And you're quoting Ashes with Explicit Multi-Adapter. What does that have to do with compute?

I asked for benches with a dedicated compute card and someone posted those benches.
Not dedicated like the late AGEIA cards, just a graphics card assigned that role in the drivers.

thesmokingman · May 18, 2016

Lepton87 said:
I asked for benches with a dedicated compute card and someone posted those benches.
Not dedicated like the late AGEIA cards, just a graphics card assigned that role in the drivers.

Those graphs are an early bench of Ashes' implementation of Explicit Multi-Adapter (using two dissimilar discrete GPUs as one, i.e. like SLI/CFX), not a test of compute. Compute, as far as we are concerned, is not an aspect under our control. In fact, as of right now there is no way that I'm aware of to separate the compute, as it is not controlled by the driver. It's out of the card makers' hands and in the game developers'. Nvidia had to ask the AOTS dev to bypass Async Compute in their game during that whole "Nvidia has no AC" debacle until they got things sorted on their end. Stardock had to write specific code to interact with NV GPUs in a specific manner, ignoring that the driver listed AC as supported when it wasn't.

Thus, however you imagine you're going to stick in a card for dedicated compute, well, it does not compute. :sneaky:

Lepton87 · May 18, 2016

thesmokingman said:
Those graphs are an early bench of Ashes' implementation of Explicit Multi-Adapter (using two dissimilar discrete GPUs as one, i.e. like SLI/CFX), not a test of compute. Compute, as far as we are concerned, is not an aspect under our control. In fact, as of right now there is no way that I'm aware of to separate the compute, as it is not controlled by the driver. It's out of the card makers' hands and in the game developers'. Nvidia had to ask the AOTS dev to bypass Async Compute in their game during that whole "Nvidia has no AC" debacle until they got things sorted on their end. Stardock had to write specific code to interact with NV GPUs in a specific manner, ignoring that the driver listed AC as supported when it wasn't.

Thus, however you imagine you're going to stick in a card for dedicated compute, well, it does not compute. :sneaky:

First I asked if it is possible at all. Someone said that it is.

Today 03:24 PM
Lepton87 Can we resolve the performance hit by dedicating another card to compute, like we can dedicate a whole card to PhysX? If so, I'll just keep the Titan in my computer along with the 980 Ti. I had it for PhysX, but there were so few titles that I was going to give up on that idea. If I can do that for DX12, then that would be great.
P.S. Is anyone else annoyed by the "if NV doesn't support something, it might just as well not exist" crowd? This is a terrible approach that stalls progress.

sontin

Quote:

Yes, it is possible. nVidia is offloading the compute part to the second GPU.

thesmokingman · May 18, 2016

Yea, that's credible. lol

EMA has only been done/shown to work in one of two true DirectX games. You could try it if you want: stick a Tesla K20, which is the cheapest compute accelerator at $3K, into your rig and cross your fingers...?

Abwx · May 18, 2016

Lepton87 said:
First I asked if it is possible at all. Someone said that it is.

Yes, but without giving any info, link, or anything that could substantiate his claim. If all that is needed is to say yes or no, then I wouldn't be wrong to say no without even knowing what it actually is...

Sweepr · May 18, 2016

Ryan Smith said:
Just checked with Dan Baker. Async is still functionally disabled on Ashes when it detects an NVIDIA card, including the GTX 1080 (since they don't have one to test against yet).

https://forum.beyond3d.com/posts/1915300

Zodiark1593 · May 18, 2016

AtenRa said:
We are not discussing if the game is good or bad or if it sold millions of copies.

Here we are talking about the Async Compute capabilities of the GPUs.

As of now, AoTS and Hitman do use Async Compute, and thus we are using those two games. I'm sure Warhammer will use AC, as well as Deus Ex.
And knowing DICE, it is highly possible that BF1 will also be DX12 and will use Async Compute as well.

So, it is understandable that certain people are trying to divert the conversation away from the thread's title, but that doesn't change the fact that Pascal cannot do Async Compute like GCN does.
And the only way NV hardware will be faster in DX12 games is through its GameWorks initiative.

Or, I dunno, just lots of brute. If they get enough brute under the hood to match or beat out their competitors with DX12 + Async, there probably wouldn't be a problem for anyone except Nvidia (brute is expensive).

As far as context switching, the high clock speeds should mitigate the penalties for doing so somewhat. (IE, 2K cores @ 2 GHz should suffer less than 4K cores at 1 GHz)
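
A rough back-of-the-envelope sketch of that point, assuming (purely hypothetically) that a context switch costs a fixed number of clock cycles regardless of the chip. The 10,000-cycle figure and the function name are made up for illustration and are not measured values for any real GPU:

```python
def switch_penalty_us(cycles_per_switch: int, clock_hz: float) -> float:
    """Wall-clock time of one context switch, in microseconds."""
    return cycles_per_switch / clock_hz * 1e6


# Hypothetical fixed cost of one context switch, in clock cycles.
SWITCH_CYCLES = 10_000

# The two configurations from the post: same peak throughput, different clocks.
narrow_fast = {"cores": 2000, "clock_hz": 2e9}
wide_slow = {"cores": 4000, "clock_hz": 1e9}

for name, gpu in (("2K cores @ 2 GHz", narrow_fast), ("4K cores @ 1 GHz", wide_slow)):
    peak = gpu["cores"] * gpu["clock_hz"]  # core-cycles per second
    penalty = switch_penalty_us(SWITCH_CYCLES, gpu["clock_hz"])
    print(f"{name}: peak {peak:.1e} core-cycles/s, switch penalty {penalty:.1f} us")
```

Under that assumption both chips have the same peak throughput, but the higher-clocked one sits idle for half as long per switch. If the switch cost were dominated by something that does not scale with clock (memory traffic, say), the advantage would shrink.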

boozzer · May 19, 2016

Det0x said:
I will copy some posts from another forum

What I noticed, is that a couple of reviewers got confused by the term "Async Compute", using it both for the DX11 extension for explicit preemption by high priority context, and the asynchronous queues in DX12. And mixing these together badly, stating that Pascal would now fully support Async Compute in DX12 because it can do preemption now, or that Maxwell could perform the context switch (the reassignment of SMMs) in DX12 at draw call borders.

I would say NV's marketing effort to fuzz this term was a complete success.

Yes, preemption and async are different. Sebbi over at Beyond3D explained this very well. Nvidia didn't talk about async, only preemption, whose granularity it improved in Pascal; that is doable at the compiler level.

The async work in AoTS consists of compute tasks. The feature is called async compute, but you can run it either on an "async-compute" architecture (à la GCN), capable of running compute and graphics tasks at the same time at the CU level, or by dedicating an entire SM to either compute or graphics, like Nvidia does.

No async compute in sight.

You can read and learn more over at beyond3d here:

https://forum.beyond3d.com/threads/nvidia-pascal-reviews-1080-and-1070.57930/page-6

*edit*

Dynamic load balancing is a thing - yes, and it is a hardware feature. But it's nowhere near the same as, or even remotely comparable to, GCN's async execution via the independent command lists dispatched by the ACE units.

Dynamic load balancing is only for efficiently switching between compute and graphics workloads inside a single command list, and for eliminating the need for a full command buffer flush every time the partition scheme changes.

So you can essentially now:
Upload the next compute-only command list while the previous mixed command list is still in execution, as the SMMs may now switch the mode lazily after they have finished the graphics portion.
Vice versa also when switching back to graphics.
The penalty for a driver screwup when you mix compute and graphics inside a single command list is also eliminated.

Technically, that means there is no longer a scheduling problem just from having compute portions in there, and by that you avoid stalling the command processor.

What it doesn't provide yet is the resource sharing or the truly asynchronous scheduling that AMD's hardware features. So using asynchronous queues for compute is now (almost...) "for free", but it's still not gaining you anything.

And without triggering actual, explicit preemption, you are not gaining truly asynchronous, independent execution yet either. You are still subject to all side effects resulting from cooperative scheduling.

But they are unfortunately still referring to their preemption extension for DX11 as "Async Compute" too. On purpose.

that is crazy level FUD. nv PR team are masters.

airfathaaaaa · May 19, 2016

thesmokingman said:
Don't you think that that is the most extravagant work around for compute? Why would one buy two cards just to dedicate one to compute? Someone link us up the jackie chan pic pulling hair out.

i bet we will see this in the future quite a lot...especially now with the new sli...
people might buy 2 for main cards and one for the rest (and we saw it on reddit too showcasing the nvidia demo of ray tracing )

2is · May 19, 2016

airfathaaaaa said:
i bet we will see this in the future quite a lot...especially now with the new sli...
people might buy 2 for main cards and one for the rest (and we saw it on reddit too showcasing the nvidia demo of ray tracing )

How much would you like to bet?

airfathaaaaa · May 19, 2016

2is said:
How much would you like to bet?

https://scontent.cdninstagram.com/t...g?ig_cache_key=MTI0NTAxNzEwMjAyOTc2MjM5OQ==.2

This was the system I was referring to. The middle one was being used for physics, compute, and blah blah, pretty much anything other than graphics and rendering.
(The pic is taken from the Twitter of Kevin W., the Origin PC CEO.)
