Cache latency was one of the big IPC killers on Bulldozer and its progeny. AMD's presentation slides specifically indicated improved cache performance for Zen, so at least there's an internal awareness of this issue and efforts taken to fix it.
Someone speculated that the micro-op cache Sandy Bridge got is what helped it quite a bit versus Bulldozer. I wonder how Bulldozer/Piledriver would perform with one.
Also, lower-latency caches (especially L1) limit how high you can clock, right?
AMD may have decided that, because they didn't have the money to make a desktop-oriented CPU design, they would aim for clockspeed for marketing appeal. This approach seems to have reached its height with the FX-9590.
Many post as if AMD had no idea what the weaknesses would be, but I think strategic compromises were probably made. The problem with betting so much on clockspeed is that savvy buyers would notice that single-core performance is critical (particularly in the years when games were barely making use of the cores in an i5). AMD may have felt enterprise dollars were more important. Since Bulldozer didn't get very far in enterprise, the bet didn't pan out. I also assume the choice to make a design like Bulldozer was due to the financial squeeze (Intel's tactics with OEMs, for instance) that had been put on the company years prior, the consequence of which was little R&D money. One report, citing an anonymous AMD engineer (so it may not be true), is that the company relied very heavily on automated tools as a cost-cutting measure, which also hampered the efficiency of the design. And, finally, it didn't help that the foundries AMD was using weren't making the progress they had intended in terms of a replacement for a 32nm high-power node.