That's one hell of a crappy start for something that is supposed to represent the "future"... Intel showcases it in their most advanced leading-edge processors and AMD just kinda tucks it into a market segment that represents the lowest-margin SKUs they can sell.
AMD has sort of been tucking products into various market segments since AM2 launched, though in some cases I think this has been due to the underperformance of some of their products since that time (Brisbane comes to mind). In the case of Llano, I think they want to get the chip into as many systems as possible, as quickly as possible, while bribing developers to modify their code with OpenCL or DirectCompute or . . . something, so that as much FPU-intensive code as possible can be offloaded to the GPU element of the Llano APU.
Making a big deal about that now may not really be a part of their game plan. Making a big deal about what people can do with Llano/Ontario down the road once the code support is there, however . . . I dunno. Just speculating here. Obviously the code to take advantage of Llano isn't out there right now. Support for CPU + GPU based Fusion is almost non-existent at present compared to, say, CUDA-compiled apps. AMD may be waiting for developers to get on board before making much noise.
Fusion is not, and never has been, about gaming: it's about moving most of the FPU functions off the general-purpose CPU and onto the specialized vector processors crammed onto the same die (or on a separate piece of silicon with its own HT link, or stuck on a PCI-e card, or . . .).
At face value, a tweaked Propus looks like it would get its arse handed to it by a Sandy Bridge product targeted at the same market segment, but a Propus with 480 stream processors doing some serious number crunching in relative harmony with the Propus cores is an entirely different ballgame. If Sandy Bridge's GPU is anything remotely like the one on Clarkdale, it will probably be revealed (at some point) that the Sandy Bridge GPU relies in no small part on the x86 cores for computing horsepower, making the Sandy Bridge GPU a poor choice for GPGPU operations.
If AMD can't get developers on board with Fusion, then yeah, you've just got a Propus with a better IGP than the 890GX. Woop woop. Sandy Bridge will kill it in general computing power, and we'll see who wins the IGP battle. Maybe AMD keeps itself afloat in the same segment with better GPU horsepower, though for general-purpose apps, without Fusion-compliant code in the wild, that GPU horsepower might not matter.
As a side note, JF punted on a question I put to him a while back about how APU-based Fusion is going to square with discrete graphics card-based Fusion, since right now, Fusion can either be an APU like Llano or it can be CPU + vid card working in tandem (think: Bulldozer + PCI-e implementation of Southern Islands).
If I recall correctly, the coders behind f@h commented, when developing their various GPU cores, that utilizing video cards for HPC was somewhat difficult since the turnaround time from PCI-e cards is so bad. It was like they had to code an entire module to load into the RAM on the vid card to minimize communications between the CPU and GPU - in other words, offload an algorithm and dataset to the GPU, let the GPU do all the crunching remotely, then have the GPU return the results as quickly as possible.
CPU + vid card Fusion is almost guaranteed to have the same problem since AMD has chosen not to leverage its HTX slot specification (up to this point) to facilitate superior communication between the CPU and GPU.
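To put the round-trip problem in rough terms, here's a toy cost model of the "ship everything over once, crunch remotely, ship the results back" approach versus talking to the card for every work item. Every number in it is a made-up placeholder, not a measurement of any real bus or GPU, and the function and variable names are mine, purely for illustration.

```python
# Toy cost model. All constants are hypothetical placeholders, NOT measurements.

def offload_time_us(n_items, items_per_transfer, round_trip_us, compute_us_per_item):
    """Total time to process n_items when each CPU<->GPU round trip
    carries items_per_transfer work items."""
    # Ceiling division: a partial final batch still costs a full round trip.
    round_trips = -(-n_items // items_per_transfer)
    return round_trips * round_trip_us + n_items * compute_us_per_item

N_ITEMS = 10_000
ROUND_TRIP_US = 20.0        # hypothetical cost of one bus round trip
COMPUTE_US_PER_ITEM = 0.5   # hypothetical GPU time per work item

# "Chatty": one round trip per work item.
chatty = offload_time_us(N_ITEMS, 1, ROUND_TRIP_US, COMPUTE_US_PER_ITEM)
# "Batched": the whole dataset goes over in a single round trip.
batched = offload_time_us(N_ITEMS, N_ITEMS, ROUND_TRIP_US, COMPUTE_US_PER_ITEM)

print(f"chatty:  {chatty:.0f} us")
print(f"batched: {batched:.0f} us")
```

With these placeholder values the chatty version spends nearly all of its time on the bus, which is the general shape of the problem: the compute cost is identical in both cases, and only the number of round trips changes.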
APU-based Fusion suffers from a significant reduction in computational resources (read: fewer and slower stream processors) and has no 1+ GB of memory dedicated to the stream processors involved, but since the stream processors are on the same die as the x86 cores, we can assume that the turnaround time of communications between the x86 cores and the stream processors would be much shorter than that of a CPU attempting to issue commands to a PCI-e-connected GPU. In fact, from recent uh . . . "discussions" of Barcelona, it looks like intra-die communication on K10/K10.5 chips happens about twice as fast as communication between devices connected via HT3.0.
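The trade-off here (slower stream processors but a much shorter trip to reach them) can be sketched as a simple break-even calculation. This is only a back-of-the-envelope model under assumptions I'm making up for illustration: each device is described by a fixed turnaround latency plus a per-item compute cost, and none of the constants come from real Llano or discrete-card figures.

```python
# Break-even sketch. All constants are hypothetical placeholders, NOT real specs.

def job_time_us(n_items, latency_us, compute_us_per_item):
    """Time for one offloaded job: fixed turnaround latency plus compute."""
    return latency_us + n_items * compute_us_per_item

def break_even_items(lat_near, cost_near, lat_far, cost_far):
    """Job size at which the far-but-fast device catches up with the
    near-but-slow one. Assumes lat_far > lat_near and cost_near > cost_far."""
    return (lat_far - lat_near) / (cost_near - cost_far)

# Hypothetical on-die GPU: short turnaround, slower per item.
APU_LATENCY_US, APU_COST_US = 1.0, 0.5
# Hypothetical PCI-e card: long turnaround, faster per item.
CARD_LATENCY_US, CARD_COST_US = 21.0, 0.25

crossover = break_even_items(APU_LATENCY_US, APU_COST_US,
                             CARD_LATENCY_US, CARD_COST_US)

for n in (10, 1000):
    apu = job_time_us(n, APU_LATENCY_US, APU_COST_US)
    card = job_time_us(n, CARD_LATENCY_US, CARD_COST_US)
    winner = "on-die GPU" if apu < card else "PCI-e card"
    print(f"{n} items: {winner} finishes first")
```

Under this model, jobs smaller than the crossover size favor the on-die stream processors and larger ones favor the discrete card, so which flavor of Fusion looks better would depend heavily on how finely the code chops up its work.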
Anybody coding to Llano will probably not have to worry about what is loaded into system memory and what is loaded into the segment of system memory cordoned off as frame buffer for the GPU. Anybody coding to Zambezi + Southern Islands will have to worry a great deal about what they keep in system memory and what they move into the vid card's dedicated memory.
Furthermore, an 80-90 cycle penalty to communicate with the stream processors still doesn't seem that awesome. I don't know if Llano will have L2 shared with the GPU to solve that problem (I'm thinking not), but um, doesn't Bobcat have shared L2? And what if Bobcat shared its L2 with the stream processors that make up Ontario?
Doesn't that sort of play back into the idea Idontcare had about Bobcat saving the day for AMD sometime down the line . . . ?