<<
I think the FPU is double-pumped and as RC5 doesn't use it I doubt a double-pumped FPU would benefit it much - I am not sure about the Integer unit speed. >>
No, it is the ALUs (integer units) that are double pumped, as are other areas that are related to the ALUs.
<<
I read in a recent article that both the FPU and integer-unit are double pumped... P4 could possibly rock on SETI and RC5... >>
No, the FPU is actually relatively weak. Intel is betting that code-writers will optimize for SSE2. With SSE2, Intel will be able to match what the Athlon can do in straight x86 (theoretically).
We know that S@H hasn't had great SIMD support, but with version 3 it seems that they did use some in some parts (S@H guys... wanna clue me in better?). But those parts have not been optimized for SSE2, so who knows.
On the other front, SSE2 includes some integer instructions (like MMX), except that its registers are wider than MMX's (128 bits versus 64). Hopefully, this means that RC5 can be bitsliced (remember DES? Bitslicing alone caused a massive increase in performance. Remember AltiVec with the G4 on RC5? That's bitslicing in action). So perhaps, with the double pumped ALUs and SSE2 optimization, it could do well in RC5.
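For anyone who hasn't seen bitslicing before, here is a rough toy sketch of the idea in Python. It is NOT an RC5 or DES core, just a 4-bit adder, and the names (`LANES`, `WIDTH`, `to_slices`, `bitsliced_add`) are made up for illustration. The point is that once the inputs are transposed, each XOR/AND/OR works on every lane at once, so a wider register means more keys or blocks per instruction.

```python
# Toy bitslicing sketch (illustrative only, not an RC5 or DES core).
# LANES independent inputs are transposed so that slices[i] holds bit i of
# every input; one bitwise op on a slice then acts on all lanes at once.

LANES = 8   # hypothetical lane count; a 128-bit SSE2 register would give 128
WIDTH = 4   # bits per input value in this toy example
MASK = (1 << LANES) - 1


def to_slices(values):
    """Transpose: slices[i] has bit `lane` set iff bit i of values[lane] is set."""
    slices = [0] * WIDTH
    for lane, v in enumerate(values):
        for i in range(WIDTH):
            slices[i] |= ((v >> i) & 1) << lane
    return slices


def from_slices(slices):
    """Inverse transpose: rebuild one integer per lane."""
    return [
        sum(((slices[i] >> lane) & 1) << i for i in range(WIDTH))
        for lane in range(LANES)
    ]


def bitsliced_add(a, b):
    """Ripple-carry add (mod 2**WIDTH) using only XOR/AND/OR on whole slices."""
    out, carry = [], 0
    for x, y in zip(a, b):
        out.append((x ^ y ^ carry) & MASK)
        carry = ((x & y) | (carry & (x ^ y))) & MASK
    return out


xs = [1, 2, 3, 4, 5, 6, 7, 15]
ys = [1, 1, 1, 1, 1, 1, 1, 1]
sums = from_slices(bitsliced_add(to_slices(xs), to_slices(ys)))
print(sums)  # eight additions done with WIDTH rounds of bitwise ops
```

The transposes cost something too, which is why bitslicing pays off mainly when a lot of work is done between them.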
<<
Isn't the P4 supposed to take a larger hit on branch prediction too?
If so the double pumped ALU units may not be as effective.... >>
Well, technically yes, but it's not a "20 stage v. 12 stage" issue. I don't understand it fully, but Paul DeMone has stated that the penalty is not nearly that bad, and I would trust him on it. Also, Intel has stated that their branch prediction has improved by 30% over the P6 core, which I take to mean roughly 30% fewer mispredictions. The P6 core had a branch prediction accuracy of about 90-91%, so this works out to about 94% - the highest to date is the K6-x, with about 95%. This will help to negate the effects of the long pipeline. The trace cache will help as well: because it holds already-decoded instructions, they don't have to be decoded again and again, which in x86 takes up a LOT of resources and a LOT of time.
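The arithmetic behind those numbers can be sketched roughly as follows. Two assumptions here are mine: that "improved by 30%" means 30% fewer mispredictions, and that the cost of a mispredicted branch is about equal to the pipeline depth in cycles (12 for the P6, 20 for the P4). The function names are made up for illustration.

```python
def accuracy_after_reduction(base_accuracy, fewer_mispredicts):
    """Accuracy once the misprediction rate is cut by the given fraction."""
    return 1.0 - (1.0 - base_accuracy) * (1.0 - fewer_mispredicts)


def stall_cycles_per_branch(accuracy, flush_penalty_cycles):
    """Expected cycles lost per branch: misprediction rate times flush cost."""
    return (1.0 - accuracy) * flush_penalty_cycles


# ASSUMPTION: "30% better" is read as 30% fewer mispredictions.
for base in (0.90, 0.91):
    improved = accuracy_after_reduction(base, 0.30)
    print(f"P6 at {base:.0%} -> P4 at about {improved:.1%}")

# ASSUMPTION: flush penalty is roughly the pipeline depth (12 vs 20 stages).
p6 = stall_cycles_per_branch(0.90, 12)
p4 = stall_cycles_per_branch(0.94, 20)
print(f"P6: {p6:.2f} cycles lost per branch, P4: {p4:.2f}")
```

Under those assumptions the two chips lose about the same number of cycles per branch, which fits the point that better prediction can offset the longer pipeline. Real penalties depend on a lot more than stage count, so treat this as back-of-the-envelope only.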
For more information about the P4, I suggest reading these three articles by Paul DeMone: part1, part2, and part3. Some articles from Hans could be useful to read as well, which can be found at this place, with another article over here by the same guy.
If you read all of those and take anything away from them (you don't necessarily have to follow everything they are talking about), you'll see that, while everyone is ragging on the P4, it looks rather revolutionary in terms of architecture. We'll still have to wait and see how it performs.
BK