I understand that. I just look and see "spreadsheet dingus in LibreOffice" and simply can't imagine the software where this utilization of the GPU gives a user a benefit. I don't question that it can make a lot of software faster, just that I can't imagine the software where the user will notice.
It's that simple. What software are we talking about? I don't know.
I haven't looked at enough of the code for stuff like Firefox, for example, but there's still plenty of software out there where end-users will eventually want a quicker race to idle. It will take faster storage and memory for people to fully appreciate those improvements, since those things are more likely to be bottlenecks.
The other thing to think about here is this: in an FP workload using both the iGPU and CPU in their entirety, an A10-7850K approached the same 32-bit FP throughput as a decent Haswell i5/i7 doing the same. Think about that for a second. Stuff like OpenCL 2.0 and HSA should have made cheaper APUs/CPUs capable of doing the same work as a higher-end CPU. We're still not getting as much "everyday" processing power out of stuff like i3s and A10s thanks to the shortage of proper OpenCL 2.0 acceleration.
Exactly what I was gonna write ... well, not exactly, but something along those lines.
What apps?
I could maybe see it in games à la PhysX, that is, an APU with a discrete card on the side. Other than that, what? (That wouldn't already benefit from completely offloading to a GPGPU.)
Complete offload requires massively parallel code. iGPGPU can be utilized in any scenario where a small set of calculations can be carried out before a branch/dependency interrupts the parallel workflow. Ditto for SIMD really, but the iGPU can do it better. In the case of something like Carrizo or Kaveri, the iGPU can do it a lot better.
To use SIMD as an example: if I had code where I frequently needed to carry out calculations in blocks of 4, 8, 12, or 16 32-bit addition operations before reaching a dependency, then obviously 128-bit SIMD, or (for the 8-wide and 16-wide blocks) 256-bit SIMD, would potentially be helpful. See the sketch below.
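Here's a minimal sketch of that first case in C with SSE2 intrinsics (the arrays and values are made up purely for illustration): a single 128-bit add covers all four 32-bit lanes before scalar code resumes at the dependency.

#include <stdint.h>
#include <stdio.h>
#include <emmintrin.h>  /* SSE2: 128-bit integer SIMD */

int main(void)
{
    /* Hypothetical inputs: a block of four 32-bit additions that has to
       finish before a dependent branch can be evaluated. */
    int32_t a[4] = { 1,  2,  3,  4};
    int32_t b[4] = {10, 20, 30, 40};
    int32_t sum[4];

    /* One 128-bit add does all four lanes at once; scalar code would
       need four separate add instructions here. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)sum, _mm_add_epi32(va, vb));

    /* The dependency: scalar control flow picks back up once the
       parallel block is done. */
    if (sum[0] + sum[1] + sum[2] + sum[3] > 100)
        printf("took the branch\n");
    else
        printf("fell through\n");
    return 0;
}

The same pattern widens to eight 32-bit lanes with AVX2 (_mm256_add_epi32), which is the 256-bit case above; the iGPU version is the same idea with wider blocks and a kernel dispatch instead of an instruction.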
There is a lot of software out there that uses heavily parallel loads (rendering), and software that could but currently doesn't (SPICE, for example)... The programming techniques being taught today need to change, because blindly assuming that CPUs are going to keep getting faster in frequency and IPC isn't helping things.
Well that, and also there needs to be more work on ubiquitous compilers that can help programmers utilize GPUs for this purpose. It's a shame that most of the OpenCL/HSA work done on Java 9 died on the vine and won't make it into the next release.
Yeah, I get that, but overall if I'm rendering stuff, wouldn't I go the GPGPU route and do OpenCL? Rendering seems like a very specific task. What I'm getting at is: where are the day-to-day apps that would benefit from HSA? I can only come up with games! (And that might be enough given the console deals, I dunno.)
See above. A lot of it has to do with the underlying code of apps that we don't look at on a daily basis. I'll bet you there's more that could be accelerated via iGPUs than people think.