Well if you look at it objectively..
AMD has not been very good at single-core speed, or at least not for a very long time now.
Now AMD claims a 40% IPC increase...
They claim SMT on every core...
Put two and two together, and what is the objective outcome?
Zen will just be a new take on modules: 8-pipeline cores with forced SMT from the get-go, so no thread will be able to use more than 4 pipelines.
That should not really surprise anyone. They use a very loose interpretation of the term "core"; just look at the 12 "core" laptops popping up.
That is objective, much more so than thinking that a company will manage to go from Core 2 speeds to Haswell speeds with no R&D money and in a few months' time.
AMD used to be IPC king. They fell behind only after Intel made the massive Core [2] jump.
Phenom II was nearly a 30% jump over Phenom in integer performance.
Also, your take on SMT doesn't even make sense. SMT just doesn't work that way, and can't work that way. The OS addresses instructions to a CPU and the core simply tags them with a thread ID for SMT. The instructions are thrown into the L1I cache and the instruction fetch will grab a set of instructions, without regard for the thread assignment*. The branch prediction unit will read these instructions in the fetch unit as they proceed to the predecode/pick buffer.
From here, they go into the decoders, which only care about which thread an instruction originated from for register renaming (so the thread uses the right data). From there, the only other place in the entire core that cares about the thread ID again is the retirement reordering logic. After that, it's just memory that is written via the LSU.
* There are naive SMT implementations that do context switching and will only pull instructions from one thread at a time, but AMD's existing fetch logic does not work that way, and they can simply copy their existing logic to achieve the above. They already have register renaming, and a recent patent shows this being put to even better use (likely in Zen).
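To make the difference between the two fetch styles concrete, here is a toy Python model. It is only a sketch of the distinction described above, not a model of AMD's (or anyone's) actual fetch unit. The names `fetch_mixed` and `fetch_switching`, the fetch width of 4, and the made-up instruction labels are all assumptions for illustration.

```python
from collections import deque

def fetch_mixed(stream, width):
    """Thread-agnostic fetch: take the next `width` tagged instructions as they come."""
    pending = deque(stream)
    groups = []
    while pending:
        take = min(width, len(pending))
        groups.append([pending.popleft() for _ in range(take)])
    return groups

def fetch_switching(stream, width):
    """Naive fetch: each fetch group serves exactly one thread, alternating threads."""
    per_thread = {}
    for tid, op in stream:
        per_thread.setdefault(tid, deque()).append((tid, op))
    tids = sorted(per_thread)
    groups = []
    turn = 0
    while any(per_thread.values()):
        queue = per_thread[tids[turn % len(tids)]]
        turn += 1
        if not queue:
            continue  # this thread has nothing left, give the slot to the next one
        take = min(width, len(queue))
        groups.append([queue.popleft() for _ in range(take)])
    return groups

# Hypothetical stream of (thread_id, instruction) pairs, already tagged.
stream = [(0, "a0"), (1, "b0"), (0, "a1"), (0, "a2"), (1, "b1"), (0, "a3")]

for name, fetch in (("mixed", fetch_mixed), ("switching", fetch_switching)):
    print(name)
    for group in fetch(stream, width=4):
        print("  ", group)
```

In the mixed case a single fetch group can hold instructions from both threads; in the switching case every group belongs to one thread only.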
Proper SMT requires the following:
1. Thread tag per instruction (a single bit is popular, but it limits you to only one extra thread per core).
2. Per-thread registers, either by renaming from a pool of registers, or dedicated registers.
3. Instruction retirement reordering respecting the thread tag - only needed for out-of-order execution.
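The three requirements above can be sketched in a few lines of Python. This is a deliberately tiny model under stated assumptions, not a description of any real core: `Uop`, `ToySmtCore`, `threads_for_tag_bits`, the register names, and the pool size are all hypothetical, and physical registers are never freed here to keep it short.

```python
from dataclasses import dataclass

def threads_for_tag_bits(bits):
    """Number of hardware threads a tag of `bits` bits can distinguish."""
    return 2 ** bits

@dataclass
class Uop:
    tid: int            # requirement 1: thread tag carried by every instruction
    seq: int            # program order within its own thread
    dest: str           # architectural destination register name
    phys: int = -1      # physical register, assigned at rename
    done: bool = False  # set once execution has finished (possibly out of order)

class ToySmtCore:
    def __init__(self, n_threads, n_phys):
        self.free = list(range(n_phys))                        # shared physical pool
        self.rename_map = [dict() for _ in range(n_threads)]   # requirement 2: one map per thread
        self.in_flight = []                                    # uops in dispatch order

    def rename(self, uop):
        """Give the uop a physical register and record it in its thread's map."""
        uop.phys = self.free.pop(0)  # assumption: the pool never runs dry in this toy
        self.rename_map[uop.tid][uop.dest] = uop.phys
        self.in_flight.append(uop)
        return uop

    def retire(self):
        """Requirement 3: retire finished uops, keeping program order within each thread."""
        retired, remaining, blocked = [], [], set()
        for uop in self.in_flight:
            if uop.done and uop.tid not in blocked:
                retired.append(uop)
            else:
                blocked.add(uop.tid)  # an older uop of this thread is still pending
                remaining.append(uop)
        self.in_flight = remaining
        return retired

core = ToySmtCore(n_threads=threads_for_tag_bits(1), n_phys=8)
a0 = core.rename(Uop(tid=0, seq=0, dest="rax"))
b0 = core.rename(Uop(tid=1, seq=0, dest="rax"))  # same name, different thread
a1 = core.rename(Uop(tid=0, seq=1, dest="rbx"))
a1.done = b0.done = True                         # a1 finishes before a0
print([(u.tid, u.seq) for u in core.retire()])   # thread 0 must wait for a0
```

Both threads write "rax" but land in different physical registers, and a finished instruction from thread 0 still cannot retire ahead of an older, unfinished instruction from the same thread, while thread 1 is not held up by it.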