[AMD_Robert] Concerning the AOTS image quality controversy


renderstate

Senior member
Apr 23, 2016
237
0
0
To my understanding Pascal can now switch between compute and graphics at the GPC boundary, which definitely is a form of concurrent compute + graphics. I'm just not entirely convinced of the usefulness when you can only switch between the two at 25% granularity (GP106 might be limited to 33% or 50% granularity, GP102 could improve this to 17%).
In comparison, AMD can switch between the two workloads at ~3% granularity on Hawaii to fill holes in the graphics pipeline. That seems a lot more useful, to be honest, for filling both expected and unexpected gaps in the graphics workload.
It could very well be the case. I don't think NVIDIA discussed at what granularity they can switch between graphics and compute tasks and I agree that a finer granularity is likely to bring more performance to the table.

I am just trying to stop this nonsense of Pascal not supporting async compute. Perhaps AMD has a better implementation, which is hardly surprising given they have refined this capability over a number of years.
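
For a rough sense of where those percentages come from, here is the back-of-the-envelope arithmetic. This is only a sketch, assuming GP104 partitions at the GPC level (4 GPCs) and GCN at the CU level (44 CUs on Hawaii); the partition counts are taken from the quote above and public specs, purely for illustration.

```cpp
// Back-of-the-envelope granularity, assuming GP104 partitions at the GPC level
// (4 GPCs) and GCN partitions at the CU level (44 CUs on Hawaii).
#include <cstdio>

int main() {
    const double gp104_gpcs = 4.0;   // one GPC is 25% of the chip
    const double hawaii_cus = 44.0;  // one CU is ~2.3% of the shader array
    std::printf("GPC-level granularity: %.1f%%\n", 100.0 / gp104_gpcs);
    std::printf("CU-level granularity:  %.1f%%\n", 100.0 / hawaii_cus);
    return 0;
}
```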
 

renderstate

Senior member
Apr 23, 2016
237
0
0
No point talking if nobody can understand each other. Which every post seems to imply.
Eh, some just don't want to understand (not you).

Anyway, the async compute deniers are a bit like flat-earthers: you can show them pictures of the Earth taken from space and they will still tell you the Earth is flat.
 

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
Yep.

What's your take on the lack of visual quality shown in the side by side comparison?
There is no lack of visual quality. Nvidia is not rendering the snow properly, so the scene looks sharper when compressed into a YouTube video.

Do you really think AMD is dumb enough to benchmark a card running Medium settings against a card running Crazy settings? Especially when we can easily verify it in three weeks.
 

Piroko

Senior member
Jan 10, 2013
905
79
91
It could very well be the case. I don't think NVIDIA discussed at what granularity they can switch between graphics and compute tasks and I agree that a finer granularity is likely to bring more performance to the table.
I'm going to assume that, if this GPC-boundary context switch is true, an implementation with a lot of context switches and heavy use of small compute tasks will still run slower than not using concurrent c&g at all. But game engines that keep Pascal's architecture in mind and use heavy, bulked compute workloads could get a performance benefit on both Pascal and GCN.

Ironically, that could hurt quick adoption of concurrent c&g, since you can just hack-job compute elements into your graphics engine on GCN and it will give you a performance benefit, but Pascal will likely need a "do it well or don't do it at all" approach.
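
To make the "bulked compute" idea concrete, here is a minimal D3D12-style sketch of the submission pattern: many dispatches recorded into one command list on a dedicated compute queue, so the GPU receives one large chunk of compute work rather than a trickle of tiny ones. The function name, PSO and root signature are assumed to already exist and are purely illustrative.

```cpp
// Sketch only: assumes a valid device, a compute queue created with
// D3D12_COMMAND_LIST_TYPE_COMPUTE, and an existing compute PSO + root signature
// (all hypothetical names). Root parameters/descriptors are omitted for brevity.
#include <d3d12.h>
#include <wrl/client.h>
#include <vector>
using Microsoft::WRL::ComPtr;

void SubmitBulkedCompute(ID3D12Device* device,
                         ID3D12CommandQueue* computeQueue,
                         ID3D12PipelineState* computePso,
                         ID3D12RootSignature* rootSig,
                         const std::vector<UINT>& groupCounts)
{
    ComPtr<ID3D12CommandAllocator> alloc;
    device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_COMPUTE, IID_PPV_ARGS(&alloc));

    ComPtr<ID3D12GraphicsCommandList> list;
    device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_COMPUTE,
                              alloc.Get(), computePso, IID_PPV_ARGS(&list));
    list->SetComputeRootSignature(rootSig);

    // Record many dispatches back to back instead of trickling them in one at a time.
    for (UINT groups : groupCounts)
        list->Dispatch(groups, 1, 1);

    list->Close();
    ID3D12CommandList* lists[] = { list.Get() };
    computeQueue->ExecuteCommandLists(1, lists);  // one big submission on the compute queue
}
```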
 

renderstate

Senior member
Apr 23, 2016
237
0
0
Mistakes can be made. It could be a driver bug (NVIDIA's and/or AMD's), an app bug, an error made by the person who recorded the benchmarks/videos, etc.

With the information we have at this point I am more inclined to believe the developers never tested that FP16 path on Pascal. Pascal is simply ignoring the min-precision hints, which the API spec allows, and running the whole shader at full precision. In this sense there is no right or wrong; both GPUs are playing by the book and according to the spec, but they still generate different results.
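
If it helps, this is roughly how an application could check whether the hardware/driver reports 16-bit minimum-precision support at all; a device that reports no support will legally run min16float code at full 32-bit precision, which is exactly the "ignoring the hints" behaviour described above. A sketch only, not from the game's code:

```cpp
// Sketch: query whether the device reports 16-bit shader minimum-precision support.
// A driver that reports no support will legally run min16float code at full 32-bit
// precision - the "ignoring the hints" behaviour described above.
#include <d3d12.h>
#include <cstdio>

void ReportMinPrecision(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                              &options, sizeof(options))))
    {
        if (options.MinPrecisionSupport & D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT)
            std::printf("16-bit min precision reported: FP16 hints may actually lower precision\n");
        else
            std::printf("No 16-bit min precision: min16float hints run at full FP32\n");
    }
}
```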

For instance, neither OpenGL nor DX really specifies how anisotropic filtering should be implemented, and HW vendors are free to use whatever they choose (and competition drove this feature to be implemented very well across all vendors over many years).

Anyway, we will probably find out soon what is really happening.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
Mistakes can be made. It could be a driver bug (NVIDIA's and/or AMD's), an app bug, an error made by the person who recorded the benchmarks/videos, etc.

With the information we have at this point I am more inclined to believe the developers never tested that FP16 path on Pascal. Pascal is simply ignoring the min-precision hints, which the API spec allows, and running the whole shader at full precision. In this sense there is no right or wrong; both GPUs are playing by the book and according to the spec, but they still generate different results.

For instance, neither OpenGL nor DX really specifies how anisotropic filtering should be implemented, and HW vendors are free to use whatever they choose (and competition drove this feature to be implemented very well across all vendors over many years).

Anyway, we will probably find out soon what is really happening.

It looks like it is not a bug at all. According to a guy from Oxide/Stardock, someone edited the game's source code so that it renders the shaders incorrectly. Both vendors, AMD and Nvidia, have DIRECT access to the game's source code and can tune it for their hardware. That is what Brad Wardell said: https://www.reddit.com/r/pcgaming/c...ng_the_aots_image_quality_controversy/d3t9ml4
Read it again. And guess which thread on Reddit this message is in? This one: https://www.reddit.com/r/Amd/comments/4m692q/concerning_the_aots_image_quality_controversy/

Look at the problems: the GTX 1080 renders the terrain shaders incorrectly. Next step: Wardell from Oxide comments that somebody EDITED the game's source code to make their hardware look better.

We have to wait for the end of Stardock's investigation.
 

Minkoff

Member
Nov 7, 2013
54
8
41
Do you see a problem here?
...stuff...Someone edited the source code ...stuff... somebody EDITED the source code

Original quote from Brad Wardell
"...whether someone has reduced the precision on the new FP16 pipe...we can evaluate whether someone is trying to modify the game behavior"
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
If someone random can edit the source code and Oxide doesn't check it but still publishes it, then Oxide has a huge problem with their procedures. Sounds like Oxide is making another excuse.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
Do you see a problem here?


Original quote from Brad Wardell
"...whether someone has reduced the precision on the new FP16 pipe...we can evaluate whether someone is trying to modify the game behavior"

I agree, I misinterpreted that part. My bad.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
If someone random can edit the source code and Oxide doesn't check it but still publishes it, then Oxide has a huge problem with their procedures. Sounds like Oxide is making another excuse.

Not exactly. This is simply the situation that has been discussed about DX12: a low-level API means that you have to tune the game for specific hardware. I suppose that rather than doing this on their own, Oxide has given the source code to AMD and Nvidia to fine-tune the code for their hardware.

What this implies is that anyone from AMD or Nvidia can change the game's source code.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
It's well known that GPU vendors get access to the source code of games. That's how they optimize shaders and fix renderers. Nobody's going to redo ENBSeries-level reverse engineering for every game around, which is why the vendors get handed some HLSL and C++ goodness.
 

renderstate

Senior member
Apr 23, 2016
237
0
0
I'm going to assume that -if this GPC boundary context switch is true- an implementation with a lot of context switches and heavy use of small compute tasks will still run slower than not using concurrent c&g at all. But game engines that keep Pascals architecture in mind and use heavy, bulked compute workloads could be able to get a performance benefit out of both Pascal and GCN.

Ironically, that could hurt a quick adoption of concurrent c&g since you can just hack-job compute elements into your graphics engine on GCN and it will give you a performance benefit, but Pascal will likely need a "do it well or don't do it at all" approach.
I agree but it's nothing new. It happened pretty much for every new feature.
 

Mikeduffy

Member
Jun 5, 2016
27
18
46
Renderstate - I'm assuming that you're "Ieldra" from HardOCP, is this correct?

I'm asking because that guy spends most of his day flooding every thread over there whenever asynchronous compute is mentioned. If it's not you, then I'm sorry for the call-out.

Anyhow, my point is: Nvidia stated that Maxwell would be able to perform asynchronous compute within the context of DX12 - this was last year, so where is it?

I'm happy that Pascal can handle the workload, but Nvidia made big claims about Maxwell and it seems as though they don't want to acknowledge any faults within DX12.

Why won't anyone do a follow-up article about Maxwell?
 
Last edited:

renderstate

Senior member
Apr 23, 2016
237
0
0
Renderstate - I'm assuming that you're "Ieldra" from HardOCP, is this correct?

I'm asking because that guy spends most of his day flooding every thread over there whenever asynchronous compute is mentioned. If it's not you, then I'm sorry for the call-out.

Anyhow, my point is: Nvidia stated that Maxwell would be able to perform asynchronous compute within the context of DX12 - this was last year, so where is it?

I'm happy that Pascal can handle the workload, but Nvidia made big claims about Maxwell and it seems as though they don't want to acknowledge any faults within DX12.

Why won't anyone do a follow-up article about Maxwell?


Sorry to disappoint you, but I am not Ieldra.

Even if Maxwell supports async compute, it doesn't mean you'll get improved performance out of it. Maxwell's implementation is likely not very good to begin with.

Async compute is not a magic wand that makes every application go faster.
 

Piroko

Senior member
Jan 10, 2013
905
79
91
Anyhow, my point is: Nvidia stated that Maxwell would be able to perform asynchronous compute within the context of DX12 - this was last year, so where is it?

I'm happy that Pascal can handle the workload, but Nvidia made big claims about Maxwell and it seems as though they don't want to acknowledge any faults within DX12.

Why won't anyone do a follow-up article about Maxwell?
The issue is that the term "Asynchronous Compute" was a bad choice to describe what is actually happening.

Technically Maxwell can do asynchronous compute - as in, if the chip's context has been switched to compute mode, you can have multiple compute threads scheduled asynchronously (sorry for the complicated sentence). DX11 or 12 doesn't matter in that case.
What it can't do (to the best of my knowledge) is run both compute and graphics workloads at the same time on different parts of the GPU (also called concurrent compute and graphics). GCN can do this: it can freely schedule any number of CUs to graphics and the rest to compute at the same time, with very few side effects.

It should have been called concurrent c&g from the start, really.
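
For what it's worth, the API side of "concurrent compute and graphics" is just two command queues; whether the GPU actually overlaps them on different parts of the chip (as described above for GCN) is entirely up to the hardware and driver. A minimal sketch with illustrative names:

```cpp
// Sketch: the API side of "concurrent compute + graphics" is just two queues.
// Whether the GPU actually runs them on different parts of the chip at the same
// time (as with GCN above) is up to the hardware and driver.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

bool CreateGraphicsAndComputeQueues(ID3D12Device* device,
                                    ComPtr<ID3D12CommandQueue>& graphicsQueue,
                                    ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfx = {};
    gfx.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;    // graphics + compute + copy
    if (FAILED(device->CreateCommandQueue(&gfx, IID_PPV_ARGS(&graphicsQueue))))
        return false;

    D3D12_COMMAND_QUEUE_DESC cmp = {};
    cmp.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;   // compute + copy only
    return SUCCEEDED(device->CreateCommandQueue(&cmp, IID_PPV_ARGS(&computeQueue)));
}
```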
 

Mikeduffy

Member
Jun 5, 2016
27
18
46
Sorry to disappoint you, but I am not Ieldra.

Even if Maxwell supports async compute, it doesn't mean you'll get improved performance out of it. Maxwell's implementation is likely not very good to begin with.

Async compute is not a magic wand that makes every application go faster.

Never said it was some magic wand.

Some say that asynchronous shading isn't needed by Nvidia because their GPUs are already close to 100% utilization, yet these same people neglect to mention that increased utilization isn't the only benefit - what about reduced latency?

About Maxwell and asynchronous shading, Nvidia was disingenuous when they claimed that their driver team would solve their deficiencies in this area. I would hope everyone could agree on this - no?
 

renderstate

Senior member
Apr 23, 2016
237
0
0
Never said it was some magic wand.



Some say that asynchronous shading isn't needed by Nvidia because their GPUs are already close to 100% utilization, yet these same people neglect to mention that increased utilization isn't the only benefit - what about reduced latency?
I don't believe for a second that utilization is close to 100%. Some rendering passes simply don't fill the shader cores, like shadow map rendering. Care to elaborate on how async compute on a fully utilized machine cuts latency? What latency are we talking about?
 

Piroko

Senior member
Jan 10, 2013
905
79
91
About Maxwell and asynchronous shading, Nvidia was disingenuous when they claimed that their driver team would solve their deficiencies in this area. I would hope everyone could agree on this - no?
There's one possible explanation that isn't too far-fetched: Maxwell does have separate schedulers for graphics and compute. They probably thought there would be a way to effectively use them at the same time, with controlled boundaries defined within their driver stack. It probably ended in a compatibility nightmare, as changing a software scheduler often does.
 
Feb 19, 2009
10,457
10
76
It's not Async Compute or Async Shaders that are important. Sure, getting the shaders to handle graphics & compute better is good, but...

In DX12 and Vulkan, it's actually about Multi-Engine. Read up on it.

ROPs & DMAs deserve some love too.
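
On the DMA point: D3D12's multi-engine model exposes a dedicated COPY queue type alongside DIRECT and COMPUTE, which typically maps to the GPU's DMA engines so transfers can overlap shader work. A tiny sketch, illustrative only:

```cpp
// Sketch: D3D12's third engine type, the COPY queue, typically maps to the GPU's
// DMA engines, letting uploads/readbacks overlap graphics and compute work.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<ID3D12CommandQueue> CreateCopyQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COPY;   // copy/DMA engine
    ComPtr<ID3D12CommandQueue> copyQueue;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&copyQueue));
    return copyQueue;
}
```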
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
If someone random can edit the source code and Oxide doesn't check it but still publishes it, then Oxide has a huge problem with their procedures. Sounds like Oxide is making another excuse.

If it's a GPU vendor submission then Oxide doesn't need any excuse. It will be the GPU vendor that will be scrambling for excuses.
 

Vaporizer

Member
Apr 4, 2015
137
30
66
With the developers getting so excited about async compute, all the Maxwell users should put a lot of pressure on NV to deliver the "promised" driver that will "enable" async compute. And it would be of interest if some reviewers reported on how the driver-assisted compute pans out, because if no such driver is delivered, some vendors will have blatantly lied to their customers. We've been waiting for almost a year and counting.
 

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
The issue is that the term "Asynchronous Compute" was a bad choice to describe what is actually happening.

Technically Maxwell can do asynchronous compute - as in, if the chip's context has been switched to compute mode, you can have multiple compute threads scheduled asynchronously (sorry for the complicated sentence). DX11 or 12 doesn't matter in that case.
What it can't do (to the best of my knowledge) is run both compute and graphics workloads at the same time on different parts of the GPU (also called concurrent compute and graphics). GCN can do this: it can freely schedule any number of CUs to graphics and the rest to compute at the same time, with very few side effects.

It should have been called concurrent c&g from the start, really.
Asynchronous Compute as a name makes perfect sense. It implies that you can feed the CUs with compute shaders at any point in the pipeline, without having to worry about the order they need to be executed in. It allows your engine to be multithreaded and to queue graphics and compute shaders asynchronously, as in independently of each other.

Traditionally this was done synchronously, meaning compute shaders were coupled to the pipeline's ordering before they could be executed.

Maxwell and to some degree Pascal aren't as flexible as GCN in this regard.
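
As a rough illustration of that independence, in D3D12 the compute work goes to its own queue whenever it is ready, and a fence expresses the one point where the graphics queue actually consumes the result; everything else proceeds without ordering against the graphics pipeline. Names and the single-fence setup are illustrative, not taken from any particular engine:

```cpp
// Sketch: compute work is submitted on its own queue, independent of the graphics
// queue's ordering; one fence marks the only point where graphics consumes the
// compute result. All names here are illustrative.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void SubmitIndependently(ID3D12Device* device,
                         ID3D12CommandQueue* computeQueue,
                         ID3D12CommandQueue* graphicsQueue,
                         ID3D12CommandList* computeWork,
                         ID3D12CommandList* dependentGraphicsWork)
{
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // Compute is queued whenever it is ready, not at a fixed spot in the graphics frame.
    computeQueue->ExecuteCommandLists(1, &computeWork);
    computeQueue->Signal(fence.Get(), 1);

    // Graphics waits only where it genuinely needs the compute output.
    graphicsQueue->Wait(fence.Get(), 1);
    graphicsQueue->ExecuteCommandLists(1, &dependentGraphicsWork);
}
```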
 
Last edited:

Piroko

Senior member
Jan 10, 2013
905
79
91
Asynchronous Compute as a name makes perfect sense. It implies that you can feed the CUs with compute shaders at any point in the pipeline, without having to worry about the order they need to be executed in.
Correct, and Maxwell can do this as long as you are already in a compute context.

It allows your engine to be multithreaded and to queue graphics and compute shaders asynchronously, as in independently of each other.
That isn't described by the term "Async Compute" though; that's just something GCN can do beyond scheduling compute tasks asynchronously - scheduling compute alongside graphics with no need for a global context switch.

This may be clear now, but it made for a lot of confusion when this whole discussion started, and it probably led to some of the contradictory statements that came from AMD/Nvidia/developers.
 