Has nothing to do with the RTA units. The RTA only works with raytracing. And even with that, the shader core has to calculate the next BVH leaf.
I'm referring to AMD's implementation. Microsoft's has the RT units sharing resources with the TMUs, so only one of the two can do anything at any given clock for any given CU. The other cannot be used.
And yet Microsoft writes the opposite:
https://devblogs.microsoft.com/directx/dxr-1-1/
Please don't take everything on the internet at face value. DXR 1.1 doesn't change anything for the tracing part of the rays:
You should have probably read further down too:
Inline raytracing gives developers the option to drive more of the raytracing process. As opposed to handing work scheduling entirely to the system. This could be useful for many reasons:
- Perhaps the developer knows their scenario is simple enough that the overhead of dynamic shader scheduling is not worthwhile. For example a well constrained way of calculating shadows.
- It could be convenient/efficient to query an acceleration structure from a shader that doesn’t support dynamic-shader-based rays. Like a compute shader.
- It might be helpful to combine dynamic-shader-based raytracing with the inline form. Some raytracing shader stages, like intersection shaders and any hit shaders, don’t even support tracing rays via dynamic-shader-based raytracing. But the inline form is available everywhere.
- Another combination is to switch to the inline form for simple recursive rays. This enables the app to declare there is no recursion for the underlying raytracing pipeline, given inline raytracing is handling recursive rays. The simpler dynamic scheduling burden on the system might yield better efficiency. This trades off against the large state footprint in shaders that use inline raytracing.
The basic assumption is that scenarios with many complex shaders will run better with dynamic-shader-based raytracing. As opposed to using massive inline raytracing uber-shaders. And scenarios that would use a very minimal shading complexity and/or very few shaders might run better with inline raytracing.
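To make the distinction in the quote a bit more concrete, here's a rough CPU-side toy in Python. It is only a sketch of the two control-flow styles, not DXR itself: in real HLSL the two forms are `TraceRay()` and the `RayQuery` object, while every name below (`Sphere`, `trace_ray`, `in_shadow_inline`, the shader table) is made up for illustration, and the "acceleration structure" is just a flat list. It says nothing about how any hardware or driver actually schedules the work.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Vec = Tuple[float, float, float]


@dataclass
class Sphere:
    center: Vec
    radius: float
    material: str  # key into the hypothetical shader table


def hit_distance(origin: Vec, direction: Vec, s: Sphere) -> Optional[float]:
    """Distance at which a normalized ray enters the sphere, or None on a miss."""
    oc = tuple(o - c for o, c in zip(origin, s.center))
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - s.radius ** 2
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0 else None


def trace_ray(scene: List[Sphere], origin: Vec, direction: Vec,
              hit_shaders: Dict[str, Callable[[float, Sphere], str]],
              miss_shader: Callable[[], str]) -> str:
    """Dynamic-shader style: the 'system' finds the closest hit and then
    decides which shader to invoke; the caller only supplies the table."""
    closest = None
    for s in scene:
        t = hit_distance(origin, direction, s)
        if t is not None and (closest is None or t < closest[0]):
            closest = (t, s)
    if closest is None:
        return miss_shader()
    t, s = closest
    return hit_shaders[s.material](t, s)


def in_shadow_inline(scene: List[Sphere], origin: Vec, direction: Vec,
                     max_t: float) -> bool:
    """Inline style: the caller drives the loop itself and handles the result
    in place. For a shadow test any hit closer than the light is enough."""
    for s in scene:
        t = hit_distance(origin, direction, s)
        if t is not None and t < max_t:
            return True
    return False


scene = [Sphere((0.0, 0.0, 5.0), 1.0, "red"), Sphere((0.0, 0.0, 10.0), 1.0, "blue")]
shaders = {"red": lambda t, s: f"red@{t:.1f}", "blue": lambda t, s: f"blue@{t:.1f}"}

print(trace_ray(scene, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), shaders, lambda: "sky"))
print(in_shadow_inline(scene, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), max_t=20.0))
```

The shadow query never touches the shader table, which is roughly the "well constrained" case from the first bullet. The first function is the shape you'd want once there are many materials with different shading code.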
What I wrote was my own understanding after talking with Nemes. It seems I was partially wrong, but I'm not entirely sure. The first few points describe inline RTRT as almost the opposite of what I was talking about. Rather than limiting the results of sending rays out, it's:
1. A technique used when performing RTRT on a limited number of objects. The example MS gives is shadows, so you would use inline RTRT in a dimly lit area with a limited number of lighting objects.
2. Certain shaders - such as compute shaders - aren't compatible with the standard RTRT methods.
3. More stuff about support.
4. When working with simple recursive rays, inline RTRT can also be more efficient than the normal method.
That being said, nothing here entirely contradicts the basis for my point, which is that it's used for optimising scenarios where the work you want to do is simpler. I was just quite wrong on how it does that. None of this contradicts what Nemes has said either, regarding being able to perform inline RTRT in the RT cores themselves (or well, the lack of that functionality). It says that the exact implementation is left to the hardware/drivers, but does not clearly specify how either would handle it.
Would love some extra reading material though, so if you do find something that actually proves what Nemes wrote wrong, please do share. Or something that proves him right, for that matter. For now, the few pages here will have to suffice...