I would say a CPU is more advanced than a GPU, simply because most of a GPU is the same components repeated over and over (or so I think), whereas a CPU has many more distinct components to be designed and linked together.
But I would personally like to see this kind of technology and setup:
Computer
- x86 CPU
- GPU Card
- FPGA Card
- Sound Card
- In/Out Chipset
- RAM
- Storage
The big one there is the FPGA card. What if a user could buy a reconfigurable card for the price of a GPU (in time, if the process gets more accepted and more companies design these units)? What would you do with such a card? My answer is ANYTHING.
Suppose there were a standard for this type of card and its OS drivers, so that any software could see what kind of card you have, how much area it offers, and what bandwidth is available. The companies that write software and games could then just have a switch in their code: if an FPGA is found, program a custom circuit for "Fire Physics", for example. Now when the game starts in "full focus", you would have your x86 running typical non-parallel code, the GPU taking care of all the graphics, and this custom circuit, designed for the game by that game company, doing a task they see fit that would be too slow to do on a CPU. It could be anything, really; just give the software companies the option.
There are so many options here. You could divide up the area so that multiple programs each get a section of the FPGA (newer FPGAs support dynamic partial reconfiguration). I know my wording is most likely not the best, but I think you get the idea: instant application-specific speedup for any kind of code that shows parallelism.
Then I guess you would ask who would write/make these custom hardware circuits. Well, if high-level synthesis isn't mature enough, then why not hire computer engineers and (digital) electrical engineers? Their pay grade is pretty much the same as a software engineer's, so it all seems logical to me.
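To make the "switch in their code" idea a bit more concrete, here is a rough Python sketch of what the check might look like. None of this is a real driver API: `FpgaInfo`, `Circuit`, `choose_backend`, and `reserve` are names I made up, and the area and bandwidth numbers are invented for illustration. It only assumes that a standard driver could report the card type, free area, and bandwidth.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FpgaInfo:
    """Hypothetical capability record a standard driver might report."""
    kind: str              # card/vendor family string
    free_area: int         # unallocated logic cells
    bandwidth_gbps: float  # host <-> card bandwidth


@dataclass
class Circuit:
    """Hypothetical custom circuit shipped with a game or application."""
    name: str
    area_needed: int
    min_bandwidth_gbps: float


def choose_backend(circuit: Circuit, fpga: Optional[FpgaInfo]) -> str:
    """Return "fpga" if the card can host the circuit, otherwise "cpu"."""
    if fpga is None:
        return "cpu"  # no card found: fall back to the plain software path
    if circuit.area_needed > fpga.free_area:
        return "cpu"  # not enough free area left on the card
    if circuit.min_bandwidth_gbps > fpga.bandwidth_gbps:
        return "cpu"  # link too slow for this circuit to pay off
    return "fpga"


def reserve(circuit: Circuit, fpga: FpgaInfo) -> FpgaInfo:
    """Model programs sharing the card: claim a region, return what is left."""
    if choose_backend(circuit, fpga) != "fpga":
        raise ValueError(f"{circuit.name} does not fit on the card")
    return FpgaInfo(fpga.kind, fpga.free_area - circuit.area_needed,
                    fpga.bandwidth_gbps)


if __name__ == "__main__":
    card = FpgaInfo(kind="example-card", free_area=100_000, bandwidth_gbps=8.0)
    fire = Circuit("Fire Physics", area_needed=60_000, min_bandwidth_gbps=4.0)
    print(choose_backend(fire, card))   # fpga
    card = reserve(fire, card)          # 40,000 cells remain
    other = Circuit("Audio Filter", area_needed=50_000, min_bandwidth_gbps=1.0)
    print(choose_backend(other, card))  # cpu
```

The second program falls back to the CPU because the first one already claimed most of the area, which is the kind of sharing decision a real standard would have to define properly.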
I know there is research being done so that this type of process happens completely behind the scenes: you just have some software, and the custom hardware adapts to the normal binary code, creating speedup where needed and updating the binary code going to the CPU. Such devices are called "Warp Processors":
http://www.cs.ucr.edu/~vahid/warp/. This research is great, and would be damn cool one day, but I think there needs to be a slower approach to ease these types of devices, and the speedup they make possible, into the computer industry as a whole, not just the scientific and HPC crowd.
Just my thoughts. Sorry if this is so off topic, but I think such an idea and device would give users the power to run these types of loads instead of just waiting... Also, I am taking a Reconfigurable Computing class right now, so all these kinds of ideas for speeding up typical things I do every day are in my head, and I had to ramble to someone about one of them lol.