Phenom I (65nm) was ~5% slower per clock on average than 65nm Conroe, according to a plethora of desktop workloads tested by hardware.fr.
Phenom II (45nm) was similarly slower (~5-6%) than 45nm Penryn (same source as above).
Piledriver (PD) delivers around 90-95% of the single-thread (ST) IPC of the 45nm K10 core, sometimes being notably faster and sometimes notably slower. In modern workloads (read: the latest software and games) it is usually faster, thanks to its newer ISA extensions and to software being better optimized for PD (and BD). On top of this it clocks much higher, both at stock and when manually overclocked.
SR, if it delivers a ~10-15% IPC increase over PD, will be faster per clock than the K10 core. On top of that, we won't have that big a module penalty (if any at all) when running two threads, due to 2x more instructions being issued per cycle per module. On PD that penalty is around 15-20% depending on the workload. You can now calculate what this means for multithreaded code vs. PD and vs. K10 (six-core).
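
For anyone who wants to run the numbers, here is a rough per-clock sketch in Python. It only uses the ranges quoted above (PD at 90-95% of K10 ST IPC, a 15-20% module penalty on PD, a 10-15% IPC gain for SR). Everything else is my assumption for illustration: 8 threads (4 modules) for both PD and SR, perfect scaling across cores, clock speeds ignored, and a leftover SR module penalty of 0-5% as a placeholder, since the real figure isn't known.

```python
# Rough per-clock estimate; clock speed differences are deliberately ignored.
# All figures are relative to one K10 core = 1.0 single-thread IPC.

def mt_throughput(st_ipc, threads, module_penalty=0.0):
    """Relative per-clock throughput with every core busy.

    module_penalty is the fraction each thread loses when both cores
    of a module are loaded (0.0 for a conventional core such as K10).
    Perfect scaling across cores is assumed.
    """
    return threads * st_ipc * (1.0 - module_penalty)


def scenario(pd_vs_k10, pd_penalty, sr_gain, sr_penalty,
             module_threads=8, k10_cores=6):
    """Compare six-core K10, PD and SR under one set of assumptions.

    module_threads=8 (4 modules) is an assumption for both PD and SR.
    sr_penalty is a placeholder: the post only says it should be small.
    """
    k10 = mt_throughput(1.0, k10_cores)
    pd = mt_throughput(pd_vs_k10, module_threads, pd_penalty)
    sr = mt_throughput(pd_vs_k10 * (1.0 + sr_gain), module_threads, sr_penalty)
    return {"k10": k10, "pd": pd, "sr": sr}


if __name__ == "__main__":
    cases = {
        # low end of every quoted range, 5% leftover SR penalty (assumed)
        "pessimistic": scenario(0.90, 0.20, 0.10, 0.05),
        # high end of every quoted range, no leftover SR penalty (assumed)
        "optimistic": scenario(0.95, 0.15, 0.15, 0.00),
    }
    for name, r in cases.items():
        print(f"{name}: SR vs PD = {r['sr'] / r['pd']:.2f}x, "
              f"SR vs K10 six-core = {r['sr'] / r['k10']:.2f}x, "
              f"PD vs K10 six-core = {r['pd'] / r['k10']:.2f}x")
```

Swap in your own penalty and core-count guesses to see how sensitive the result is. Most of the multithreaded gain comes from shrinking the module penalty, not from the ST IPC bump alone.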