I would say that AMD's CMT does not work at all. Comparing core for core is misleading, because that comparison leaves out all the support infrastructure the core needs to work. A 32nm Sandy Bridge without the GPU part is 30% smaller than the CPU part of a 32nm APU. And that extra area isn't blank space; it's transistor budget that consumes power even when it is only leaking, which makes the design a lot less efficient than Intel's.
In servers we can see how badly CMT designs scale: once AMD tried to go over 4 modules they could achieve only paltry clocks, while Intel could get much higher clocks out of their big-die parts. While people here on the forums are clamoring for an 8-core Steamroller, Intel has 12-core/24-thread parts that would mop the floor with AMD chips in whatever multithreaded task you can think of, and even the 8C parts would be enough to hold the line against whatever AMD throws at them. AMD has only the poor scaling of its designs to blame for its server debacle.
And speaking of sharing... SMT is about sharing *all* the resources of the core, while CMT is about sharing just a few of them. Intel (and IBM, and Sun, and everyone else with SMT) can go huge on core resources and IPC because those resources will be used by more than one thread at a given time. AMD cannot, because if they go huge on core resources they might end up with nothing but added leakage while the core sits idle for lack of threads. This is the reason for the anemic core (which they tried to compensate for with high clocks), and this is why you cannot expect much IPC from AMD parts (I'll really save that 90% post for posterity). CMT ends up delivering a much more inflexible processor than SMT, one that only shines when there are a lot of light threads in flight, and sucks at everything else.
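To put some toy numbers on the sharing argument, here is a minimal sketch, not a model of any real chip. The widths (one 4-wide core shared by all threads for SMT, two 2-wide cores for a CMT module) and the per-thread "demand" figures are made-up assumptions, and it ignores clocks, caches, and the front end and FPU that a CMT module does share. The only thing it shows is that a shared pool can lend idle issue slots to a heavy thread, while split cores cannot.

```python
def smt_throughput(demands, width):
    """SMT-style: every thread draws from one shared pool of issue slots."""
    return min(width, sum(demands))


def cmt_throughput(demands, width_per_core):
    """CMT-style: each thread gets its own narrow core; idle slots are not lent out."""
    return sum(min(width_per_core, d) for d in demands)


# Hypothetical workloads: each number is how many instructions per cycle
# a thread could use if nothing held it back.
workloads = {
    "one heavy thread": [4],
    "two light threads": [2, 2],
    "one heavy, one light": [4, 1],
}

for name, demands in workloads.items():
    smt = smt_throughput(demands, width=4)          # one wide shared core
    cmt = cmt_throughput(demands, width_per_core=2)  # two narrow cores
    print(f"{name}: SMT={smt} CMT={cmt}")
```

In this toy setup the split design only ties the shared one when both threads are light; with a single heavy thread, half of its execution width just sits there.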