Intel's re-compiled scores are very impressive.
What really impresses me is how much the Athlon and P3 gain from the new compiler: they perform almost twice as fast as they do with the previous compile! From what I've heard and seen so far, Intel probably has the best vectorizing compiler out there, able to take more or less standard code and extract useful parallelism from it. I don't know of anything else on the market that can do this.
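For anyone wondering what "more or less standard code" means here, this is a rough sketch of the kind of loop a vectorizing compiler can usually handle. It is my own illustration, not code from the benchmark: the function name `saxpy` and its signature are made up for the example, and whether any particular compiler actually emits SSE for it depends on the compiler and its flags.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] over n floats.
 * Nothing here is SIMD-specific. The loop has a simple trip count,
 * unit-stride accesses, and no dependence between iterations, which
 * is the pattern auto-vectorizers look for. `restrict` (C99) promises
 * the compiler that x and y do not overlap, so it is free to process
 * several elements per instruction. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The point is that the source stays plain C; any use of packed instructions is the compiler's decision, not the programmer's.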
It would be so interesting to see what the Athlon could do with optimizations actually made for it, but since AMD doesn't have the kind of budget required to write a compiler, I doubt we'll ever see this happen.
Among other things, I've heard that adding prefetch instructions to that kind of streaming-heavy app will give at least a 20% improvement; the default recompile does NOT include the gains from prefetch.
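To give an idea of what adding prefetch looks like, here is a minimal sketch, assuming a GCC- or Clang-style compiler that provides `__builtin_prefetch`. The names `sum_stream`, `PREFETCH_READ`, `PREFETCH_AHEAD` and `FLOATS_PER_LINE` are mine, and the distance and cache-line size are guesses that would need tuning per CPU. It only shows the technique; it says nothing about how much speedup you would actually get.

```c
#include <stddef.h>

/* Use the compiler builtin where available; otherwise compile to nothing.
 * Arguments: address, 0 = prefetch for reading, 0 = no temporal locality
 * (data is streamed through once and not reused). */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_READ(p) __builtin_prefetch((p), 0, 0)
#else
#define PREFETCH_READ(p) ((void)0)
#endif

#define FLOATS_PER_LINE 16  /* assumption: 64-byte cache line, 4-byte float */
#define PREFETCH_AHEAD  64  /* assumption: elements to fetch ahead; tune per CPU */

/* Sum a large array that is read once, front to back. Once per assumed
 * cache line, hint that data a fixed distance ahead will be needed soon.
 * A prefetch is only a hint, so the result is the same with or without it. */
float sum_stream(const float *data, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (i % FLOATS_PER_LINE == 0 && i + PREFETCH_AHEAD < n)
            PREFETCH_READ(&data[i + PREFETCH_AHEAD]);
        total += data[i];
    }
    return total;
}
```

This is the sort of change a plain recompile will not make for you: someone has to decide where the hints go and how far ahead to fetch.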
This has actually revised my opinion of the P4 upwards quite a bit. The reason 3DNow!, SSE, et al. have always had so little impact is that they required special coding. If you can get this kind of performance gain for no extra work on the programmer's side, Intel is going to be back on top in short order...