Maybe I don't understand the definition of IPC you are describing from a technical perspective, and I'm confusing performance per core with IPC? Sorry if I did. The way I look at it: if I get a Haswell CPU, how much faster will it be on average in modern apps @ 4.5GHz compared to an i5 2500K @ 4.5GHz? If it's only 10%, it would end up being the lowest real-world increase in IPC/per-core performance compared to C2Q --> Nehalem --> SB.
RS, that is the source of the confusion. You are comparing products, SKUs if you will, full-fledged multi-core chips, which is not the same thing and will not tell you the IPC improvements.
IPC is intended to represent single-core single-thread performance, and it is obviously heavily dependent on the instruction mix of the specific application in question.
There are reasons people care to quantify IPC, the same reasons people care to normalize clockspeeds for comparisons and so on. To be sure, IPC alone doesn't tell you how a final product is going to perform. But if you know IPC, target clockspeeds, and thread-scaling performance, then you can compute the final product performance, which is what you, as the end user, are going to experience or read about in a review.
I think you are mixing the two and that is what is leading to the confusion.
If I asked you how much better Nehalem was than Conroe, the first question you'd have to ask is "which Nehalem? And which Conroe?".
But if I asked you for the IPC of each then that eliminates core count, cache arrangement (mostly, but not entirely), and clockspeed as variables.
At that point there is only one question: "IPC for which set of instructions?"
IPC for SuperPI is going to be different from the IPC for Cinebench R11.5 because those two apps use very different instructions in their code.
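To put rough numbers on why the instruction mix matters, here is a minimal sketch. The per-instruction cycle costs and the two mixes are made-up illustrative values, not measurements of SuperPI, Cinebench, or any real core, and `ipc_for_mix` is just a name I picked for the example.

```python
def ipc_for_mix(instruction_mix, cycles_per_instruction):
    """Average IPC of one core for a given instruction mix.

    instruction_mix: {kind: share of executed instructions}
    cycles_per_instruction: {kind: average cycles that kind costs on this core}
    """
    total = sum(instruction_mix.values())
    # Weighted average cycles per instruction (CPI) across the mix.
    avg_cpi = sum(
        (share / total) * cycles_per_instruction[kind]
        for kind, share in instruction_mix.items()
    )
    # IPC is simply the inverse of CPI.
    return 1.0 / avg_cpi


# Hypothetical core: all three costs are assumptions for illustration only.
core_cpi = {"int": 0.5, "fp": 1.0, "mem": 2.0}

# Two hypothetical applications with very different instruction mixes.
fp_heavy_app = {"int": 0.2, "fp": 0.7, "mem": 0.1}
int_heavy_app = {"int": 0.8, "fp": 0.1, "mem": 0.1}

ipc_fp_heavy = ipc_for_mix(fp_heavy_app, core_cpi)
ipc_int_heavy = ipc_for_mix(int_heavy_app, core_cpi)

print(f"fp-heavy app:  IPC = {ipc_fp_heavy:.2f}")
print(f"int-heavy app: IPC = {ipc_int_heavy:.2f}")
```

Same core, same clock, two different IPC figures, which is why a single "IPC improvement" number only means something once you know which applications it was averaged over.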
That is where we rely on Intel to know what they are doing, to have no reason to deceive or manipulate our expectations, when they state "using a broad range of consumer applications".
Once you know the IPC delta for say Haswell, now you need to factor in core count differences, if any, and clockspeed differences, if any, turbo-boost differences, if any, and you have your single-threaded performance gain, if any.
Why is single-threaded performance relevant? Ask anyone who has heard of Bulldozer and they'll tell you just how much it matters.
Multithreaded performance may well be increasing all the more, but that is a product of more than just IPC improvements. The final performance improvement in multithreaded apps will be dependent on clockspeeds and core counts.
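As a back-of-the-envelope sketch of how IPC, clockspeed, and thread scaling combine into product performance: the model below is my own simplification, and using Amdahl's law for thread scaling is an assumption I am adding for illustration, not something Intel publishes. The function names and the IPC values are hypothetical; only the 10% delta and the 4.5GHz clocks come from the question above.

```python
def thread_scaling(cores, parallel_fraction):
    """Amdahl's-law speedup from running on `cores` cores (an assumed model)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def product_performance(ipc, clock_ghz, cores=1, parallel_fraction=0.0):
    """Relative performance: instructions per cycle x cycles per second x scaling."""
    return ipc * clock_ghz * thread_scaling(cores, parallel_fraction)


# Single-threaded case from the question: both chips at 4.5GHz, with a
# hypothetical 10% IPC advantage for the newer core.
old = product_performance(ipc=1.00, clock_ghz=4.5)
new = product_performance(ipc=1.10, clock_ghz=4.5)
single_thread_gain = new / old - 1.0
print(f"single-threaded gain at equal clocks: {single_thread_gain:.0%}")

# Multithreaded case: same IPC and clock, but core count changes the result.
# The 90% parallel fraction is an illustrative assumption.
quad = product_performance(ipc=1.10, clock_ghz=4.5, cores=4, parallel_fraction=0.9)
hexa = product_performance(ipc=1.10, clock_ghz=4.5, cores=6, parallel_fraction=0.9)
print(f"6 cores vs 4 cores, same IPC and clock: {hexa / quad - 1.0:.0%} faster")
```

At equal clocks and a single thread, the product gain equals the IPC gain. As soon as clocks, core counts, or how well the app threads differ, the product number moves away from the IPC number, which is the distinction I am trying to draw.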