Interesting, if it's accurate. These benchmarks are all over the map. SuperPi (typically an AMD weak point) sees ~30% improvement at the same clock speed as Richland, and wPrime scores are more than 40% better! But Cinebench only gets about an 11% boost, and performance in the Excel benchmark (not sure what it is exactly - Monte Carlo simulation?) is virtually unchanged from Richland.
wPrime is a multi-threaded benchmark, so what we're seeing is partially IPC improvements and partially the removal (or at least reduction) of the CMT penalty. SuperPi is harder to explain; it's single-threaded legacy x87 FPU code, and I can't imagine that this would have been a priority for AMD to optimize.
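To put rough numbers on that split: if you treat the multi-threaded speedup as (per-thread gain) x (module scaling gain), you can back out the scaling part once you guess the per-thread part. A minimal sketch, with the caveat that the 15% per-thread figure is purely a made-up placeholder (the leak doesn't give one), the 40% is the rounded-down wPrime number from above, and the function name is just mine:

```python
def implied_scaling_gain(total_speedup, per_thread_gain):
    """Residual speedup factor left over after removing an assumed per-thread gain.

    Both arguments are ratios (1.40 means 40% faster). Assumes the two
    effects multiply, which is a simplification.
    """
    if per_thread_gain <= 0:
        raise ValueError("per_thread_gain must be positive")
    return total_speedup / per_thread_gain


# wPrime: roughly 40% faster overall (from the leaked numbers).
# 1.15 is a hypothetical per-thread IPC gain, NOT a measured value.
residual = implied_scaling_gain(1.40, 1.15)
print(f"implied CMT/scaling gain: {residual:.3f}x")
```

Plug in whatever per-thread gain you believe; the bigger it is, the less is left over to credit to a reduced CMT penalty.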
What kind of workload is Cinebench, exactly? Integer, floating-point, or a mixture of both? 3D rendering usually uses floating-point, so I'd guess that this is what's in play, and the lack of substantial FPU improvement is why Steamroller only sees lackluster gains. AMD presumably hopes that OpenCL+HSA will help fill that gap, but there are a lot of existing programs that can't or won't be rewritten. Excavator is supposed to increase FPU performance dramatically, but that's still a year or more away.
If the Excel benchmark is both single-threaded and FPU-bound, then that could explain the lack of meaningful gains there. But SuperPi remains a mystery; it should suffer from the same problem, even more so because it uses legacy opcodes that AMD wouldn't be expected to bother optimizing for. Wikipedia says the results can be dependent on memory bandwidth, so maybe that's what we're seeing? (But in that case, wouldn't the graphics core see bigger improvements as well? It seems to be bandwidth-limited in most games on both Richland and Kaveri.)