We'll have to wait for benchmarks, but I'm growing ever more suspicious that Zen's SMT implementation is more like Power8's than anything Intel's produced so far. Intel's approach has been to allow a second thread to use unused CPU resources, but doesn't really over-provision those resources (a single thread can very nearly saturate the whole CPU). On Power8, they can scale up to 8 threads per core (Zen will only do 2), but they make that viable by doubling down on key CPU resources in the first place (Instruction Cache, rename registers, etc.). The end result is that the second SMT thread on Intel increases overall performance by around 15-25%, but on Power8 the second SMT thread can increase overall performance by around 60% in some workloads. In Layman's terms, Power8's 'hyperthreads' are more useful than Intel's.
AMD haven't talked about rename registers yet, but they have revealed that the instruction cache is 64KB per core; perhaps not-so-coincidentally, that's double the size of Skylake's instruction cache, and the same size as Power8's. The L1 Data cache is only 32K in all of these processors, but its rather odd in processor design to have your instruction cache be twice the size of your L1 data cache -- unless you have a good reason. There's only two reasons I can think of -- either that second thread chews through a lot more instructions than in competing SMT designs, or possibly the uOp Cache can spill to L1. Looking at the slide from HotChips that shows which CPU resources are exclusive, competitively shared, or arithmetically arbitrated, has me leaning toward the former, though they might not have overprovisioned CPU resources enough to match Power8 fully. There were also rumors months back about Zen doing some really novel things with SMT, which would seem to back that up.
The implication of that would be that Zen could run at a lower clockspeed than Intel's current Broadwell DE but still match in overall threaded performance (but perhaps giving up 10-15% single-threaded performance (not clock-normalized)). For the mainstream, they could release a quad-core CPU at similar clocks to Skylake, and outperform it in threaded workloads. In gaming workloads, since current consoles make 6-7 threads available to games, a quad-core Zen with 4 hyper-threads giving ~60% additional performance would give a lot bettter performance than a quad-core i7 with 4 hyper-threads giving ~20% additional performance. In fact, that Zen would would have a throughput comparable to 6-7 dedicated cores.
We won't know until someone does an architecture deep-dive or we have benches showing SMT gains much larger than intel's. But its looking increasingly likely from what I see.