The GCN 1.x architecture is not actually compute heavy. For example, RDNA 1 has the same compute throughput as the Radeon VII clock for clock, if not more. Of course, RDNA can also do additional scalar operations and is better at branching code.
The issue is that a full wave64 needs 4 cycles to complete on GCN, because each wavefront runs on a SIMD16 (there are 4x SIMD16 per CU).
With compute loads it is always possible to keep the pipeline busy, because each SIMD16 can be engaged every cycle, executing work that belongs to consecutive wavefronts.
So compute loads are better suited for GCN.
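To put rough numbers on the wave64/SIMD16 point, here is a minimal sketch of the arithmetic. It is a simplified model, not a cycle-accurate one: the function names are made up for illustration, and it assumes the commonly cited layouts of 4x SIMD16 per CU for GCN and 2x SIMD32 per CU for RDNA 1.

```python
import math


def cycles_per_instruction(wave_size: int, simd_width: int) -> int:
    """Cycles one SIMD needs to push a whole wavefront through one vector instruction."""
    return math.ceil(wave_size / simd_width)


def lanes_per_clock_per_cu(simds_per_cu: int, simd_width: int) -> int:
    """Peak vector lanes a CU can process per clock if every SIMD is busy."""
    return simds_per_cu * simd_width


# Assumed layouts (simplified): GCN = 4x SIMD16 per CU, RDNA 1 = 2x SIMD32 per CU.
gcn_wave64 = cycles_per_instruction(64, 16)    # 4 cycles
rdna_wave32 = cycles_per_instruction(32, 32)   # 1 cycle
rdna_wave64 = cycles_per_instruction(64, 32)   # 2 cycles

gcn_lanes = lanes_per_clock_per_cu(4, 16)      # 64 lanes per clock
rdna_lanes = lanes_per_clock_per_cu(2, 32)     # 64 lanes per clock

print(gcn_wave64, rdna_wave32, rdna_wave64, gcn_lanes, rdna_lanes)
```

Under these assumptions the peak per-CU throughput per clock is the same for both. What differs is how many cycles a single wavefront occupies a SIMD, so GCN needs enough independent wavefronts in flight to hide that.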
For graphics, it could be that the whole wave has to complete before the next wave can be scheduled (so there is a 4-cycle latency), or it could be that the wavefront is not so wide, i.e. not all of its lanes are filled.
Thus GCN/Vega struggles to keep the SIMDs engaged all the time, and this results in lower performance even though the theoretical TFLOPS figure is fairly high.
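The gap between paper TFLOPS and delivered performance can be illustrated with a toy utilization model. This is only a sketch under stated assumptions: the helper names are hypothetical, the 40 CU / 1.0 GHz part is invented for the example, and the model only counts idle lanes in partially filled wavefronts while ignoring scheduling latency, memory stalls and everything else.

```python
import math


def lane_utilization(active_threads: int, wave_size: int) -> float:
    """Fraction of SIMD lanes doing useful work when threads are packed into waves."""
    if active_threads <= 0 or wave_size <= 0:
        raise ValueError("active_threads and wave_size must be positive")
    waves = math.ceil(active_threads / wave_size)
    return active_threads / (waves * wave_size)


def peak_tflops(compute_units: int, clock_ghz: float, lanes_per_cu: int = 64) -> float:
    """Theoretical FP32 peak, counting one fused multiply-add (2 FLOPs) per lane per clock."""
    return compute_units * lanes_per_cu * 2 * clock_ghz / 1000.0


# Hypothetical part: 40 CUs at 1.0 GHz (made-up numbers, not a real product).
peak = peak_tflops(40, 1.0)

# A small batch of 20 threads wastes more lanes in a wave64 than in a wave32.
util_wave64 = lane_utilization(20, 64)  # 20 of 64 lanes busy
util_wave32 = lane_utilization(20, 32)  # 20 of 32 lanes busy

print(f"peak: {peak:.2f} TFLOPS")
print(f"wave64: {util_wave64:.2%} of lanes, ~{peak * util_wave64:.2f} TFLOPS effective")
print(f"wave32: {util_wave32:.2%} of lanes, ~{peak * util_wave32:.2f} TFLOPS effective")
```

The peak figure is identical in both cases. Only the fraction of it that narrow workloads can actually use changes.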
Instruction-wise, RDNA HW can run all the GCN instructions.
Besides, if we are talking about PC, the shader compiler will JIT the shader code anyway, unlike on consoles, where the shader binaries are shipped precompiled.
That said, LLVM introduces a new set of instructions and extensions for RDNA2 which older GCN HW will not be able to run.