AMD vs NVidia asynchronous compute performance

Page 3 - AnandTech community

dogen1

Senior member
Oct 14, 2014
739
40
91
They called it a "subtle change", but now that asynchronous compute is being used in games thanks to Vulkan and DX12, it may not be so subtle anymore.

It could be that, but I would guess it's down to the 290X's higher ratio of ALU to geometry/texture fill/pixel fill throughput vs the 270X.

Maybe that async compute hardware bug that AMD found (and worked around, I guess) has an effect on it too.

It's not a huge difference here anyway, a 7% boost vs a 5% boost.
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
It could be that, but I would guess it's down to the 290X's higher ratio of ALU to geometry/texture fill/pixel fill throughput vs the 270X.

Maybe that async compute hardware bug that AMD found (and worked around, I guess) has an effect on it too.

It's not a huge difference here anyway, a 7% boost vs a 5% boost.
I don't think the problem comes down to limited fill rates, because the Xbox One and PS4 have smaller, slower chips than the 270X, yet they supposedly significantly benefit from async compute. The common denominator between the consoles and the 290X is that they all use GCN 2.

I have a feeling the "async compute bug" is that very difference in the hardware between GCN 1 and 2. If so, it may just not be possible to truly fix it through driver updates.
 

dogen1

Senior member
Oct 14, 2014
739
40
91
I don't think the problem comes down to limited fill rates, because the Xbox One and PS4 have smaller, slower chips than the 270X, yet they supposedly significantly benefit from async compute.

Yeah, it does. That's kinda the whole point. If a shader is stalled or bottlenecked by something else and can't fill the GPU, you run another shader (with a different bottleneck than the first) to maximize utilization.

If you scaled up your entire GPU exactly 2x, it would be able to run the first shader twice as fast, right? (Or at least twice as many threads at once.) It would also have roughly the same percentage of idling compute units.

Now if you added even more compute units to the mix on top of that (Hawaii), even more are gonna be idle, because you didn't add more of the other hardware to keep them busy. This is the most likely reason why Hawaii and Fiji benefit more than most older GCN models. They have a higher ratio of ALU to everything else on the chip, comparatively.

The reason Xbox and PlayStation can benefit so much is because they can tune it to each specific GPU, which makes it a lot easier. Imagine if there was a console with a Fury X in it. Async compute gains would be ridiculous lol.
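To put rough numbers on the ratio argument above, here is a toy model in Python. It is only a sketch: the `idle_fraction` helper and all of the unit counts are made up for illustration, not measurements of any real card, and it assumes the first shader is limited purely by how many compute units the rest of the chip can keep fed.

```python
def idle_fraction(compute_units: int, fed_units: float) -> float:
    """Share of compute units left idle by a shader bottlenecked elsewhere.

    compute_units: total compute units on the (hypothetical) GPU.
    fed_units: how many compute units the non-ALU hardware (geometry,
               texture fill, pixel fill) can keep busy for this shader.
    """
    busy = min(compute_units, fed_units)
    return 1.0 - busy / compute_units


# Hypothetical baseline: 20 compute units, other hardware can feed 16 of them.
baseline = idle_fraction(20, 16)

# Everything scaled exactly 2x: same idle percentage.
scaled_2x = idle_fraction(40, 32)

# 2x scaling plus extra compute units only: idle percentage goes up.
extra_alu = idle_fraction(44, 32)

for name, value in [("baseline", baseline),
                    ("scaled 2x", scaled_2x),
                    ("extra ALU", extra_alu)]:
    print(f"{name}: {value:.1%} of compute units idle")
```

In this made-up setup the idle share stays at 20% when everything doubles and rises to about 27% when only compute units are added. That idle share is the headroom a second, differently bottlenecked shader could fill.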
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Yeah, it does. That's kinda the whole point. If a shader is stalled or bottlenecked by something else and can't fill the GPU, you run another shader (with a different bottleneck than the first) to maximize utilization.

If you scaled up your entire GPU exactly 2x, it would be able to run the first shader twice as fast, right? (Or at least twice as many threads at once.) It would also have roughly the same percentage of idling compute units.

Now if you added even more compute units to the mix on top of that (Hawaii), even more are gonna be idle, because you didn't add more of the other hardware to keep them busy. This is the most likely reason why Hawaii and Fiji benefit more than most older GCN models. They have a higher ratio of ALU to everything else on the chip, comparatively.

The reason Xbox and PlayStation can benefit so much is because they can tune it to each specific GPU, which makes a big difference. Imagine if there was a console with a Fury X in it. Async compute gains would be ridiculous lol.
I see your point. I guess the way to eliminate the architecture difference would be to test a Bonaire card and see if it can get a higher percentage improvement from async compute than the 270X. I do have access to a 260X, maybe I'll get around to testing it...
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Bonaire only has 2 ACEs like the GCN 1 cards, FWIW.
It does, but the AnandTech article mentions that GCN 2 allows for more compute queues per ACE over GCN 1. That could still make a difference. And the Xbox One's GPU is basically a cut-down Bonaire chip, equal in compute units to a 260 (no X). I doubt it has more ACEs itself.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
And the Xbox One's GPU is basically a cut-down Bonaire chip, equal in compute units to a 260 (no X). I doubt it has more ACEs itself.

Was it the PS4 that had more ACEs? I do remember that the consoles differed there as well.
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Was it the PS4 that had more ACEs? I do remember that the consoles differed there as well.
Possibly. I don't think the exact details on their ACEs have ever been released, but I remember the Doom devs mentioning in an interview that there's a difference in available queues between consoles.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Was it the PS4 that had more ACEs? I do remember that the consoles differed there as well.

The PS4 can have a maximum of 64 compute queues with 8 ACEs vs only 16 for the Xbox One with 2 ACEs, so that's a pretty big difference.
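Taking those figures at face value, a quick check in Python shows how the queue totals break down per ACE. The dictionary layout and the `queues_per_ace` helper are just illustrative names, and the only inputs are the ACE and queue counts quoted in this post.

```python
# ACE and compute queue counts as quoted in the post above.
consoles = {
    "PS4": {"aces": 8, "queues": 64},
    "Xbox One": {"aces": 2, "queues": 16},
}


def queues_per_ace(console: dict) -> int:
    """Compute queues handled by each ACE, assuming an even split."""
    return console["queues"] // console["aces"]


for name, console in consoles.items():
    print(f"{name}: {queues_per_ace(console)} queues per ACE")

# How many times more queues the PS4 exposes in total.
queue_ratio = consoles["PS4"]["queues"] / consoles["Xbox One"]["queues"]
print(f"PS4 has {queue_ratio:.0f}x the compute queues")
```

Both work out to 8 queues per ACE, so on these numbers the 4x gap in total queues comes entirely from the ACE count.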
 
Reactions: DarthKyrie

dogen1

Senior member
Oct 14, 2014
739
40
91
The PS4 can have a maximum of 64 compute queues with 8 ACEs vs only 16 for the Xbox One with 2 ACEs, so that's a pretty big difference.

Well, games right now only use a single compute queue AFAIK, at least on PC. I'm not sure having tons of queues will really make much of a difference anyway.
 