A thought about the P4....

Deeko

Lifer
Jun 16, 2000
30,213
11
81
Something like this may have been posted here before, but I searched and didn't find it. The Pentium 4 may be really good at RC5. Think about it: RC5 is integer based, and clock speed helps a lot. And doesn't the P4's integer unit run at twice the regular speed? So a 1.5 GHz P4 would have an integer unit running at 3 GHz, right? Wow, that could be really good in RC5 and rip my 'little' Tbird 800@952 apart.
 

vss1980

Platinum Member
Feb 29, 2000
2,944
0
76
I think the FPU is double-pumped and as RC5 doesn't use it I doubt a double-pumped FPU would benefit it much - I am not sure about the Integer unit speed.
 

MrAtheist

Member
Aug 26, 2000
148
0
0
I read in a recent article that both the FPU and integer-unit are double pumped... P4 could possibly rock on SETI and RC5...
 

SilverBack

Golden Member
Oct 10, 1999
1,622
0
0
Err, I read that the FPU wasn't a large consideration for Intel this time and its performance is LESS than the P3's.
I'm looking for the link now; when I find it I'll post it.
 

MikeyP

Member
Jun 14, 2000
170
0
0
The P4 has double-pumped ALUs, as well as some other double-pumped parts of its 20-stage pipeline. The FPU is not double pumped, and is extremely weak compared to the Athlon's. However, the proposed scalability of the clock speed combined with SSE2 (which virtually eliminates the need for x87 floating point) should allow it to be effective. It will probably be slower than the Athlon in FPU-intensive tasks until applications are built from the ground up with SSE2 in mind. (Of course, by that time, AMD could have it too, as I believe they can implement both SSE and SSE2 freely.) The double-pumped integer unit is hard to speculate about; we'll just have to wait and see the benches.
 

SilverBack

Golden Member
Oct 10, 1999
1,622
0
0
Isn't the P4 supposed to take a larger hit on branch misprediction too?
If so, the double-pumped ALUs may not be as effective....
 

BurntKooshie

Diamond Member
Oct 9, 1999
4,204
0
0


<< I think the FPU is double-pumped and as RC5 doesn't use it I doubt a double-pumped FPU would benefit it much - I am not sure about the Integer unit speed. >>


No, it is the ALUs (integer units) that are double pumped, as are other areas related to the ALUs.




<< I read in a recent article that both the FPU and integer-unit are double pumped... P4 could possibly rock on SETI and RC5... >>



No, the FPU is actually relatively weak. Intel is betting that code-writers will optimize for SSE2. With SSE2, Intel will be able to match what the Athlon can do in straight x86 (theoretically).

We know that S@H hasn't had great SIMD support, but with version 3 it seems that they did use some in some parts (S@H guys... wanna clue me in better?). But it has not been optimized for SSE2, so who knows.

On the other front, SSE2 includes some integer stuff (like MMX), except it is wider than MMX. Hopefully, this means that RC5 can be bitsliced (remember DES? Bitslicing alone caused a massive increase in performance. Remember AltiVec with the G4 on RC5? That's bitslicing in action). So perhaps, with the double-pumped ALUs and SSE2 optimization, it could do well in RC5.



<< Isn't the P4 supposed to take a larger hit on branch misprediction too?
If so, the double-pumped ALUs may not be as effective.... >>



Well, technically yes, but it's not a "20 stage v. 12 stage" issue. I don't understand it fully, but Paul DeMone has stated that it's not nearly so, and I would trust him on it. Also, Intel has stated that their branch prediction has improved by 30% over the P6 core. The P6 core had a branch prediction accuracy of about 90-91%, so this works out to about 94%; the highest to date is the K6-x, with about 95%. This will help to negate the effects of the long pipeline. By using a trace cache, instructions don't have to be decoded again and again, which in x86 takes up a LOT of resources and a LOT of time. The trace cache will aid there.
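
To put rough numbers on the branch prediction point, here is a small sketch in Python. It assumes that "improved by 30%" means 30% fewer mispredictions, which is only one possible reading of Intel's claim. The flush penalties of 12 and 20 cycles are hypothetical stand-ins taken from the pipeline depths, which is exactly the naive comparison the post warns about, so treat the second function as an upper-bound illustration and not as how either core really behaves.

```python
def improved_accuracy(old_accuracy, miss_reduction):
    """Accuracy after cutting the misprediction rate by `miss_reduction`.

    Assumption: a "30% improvement" removes 30% of the mispredictions.
    """
    old_miss_rate = 1.0 - old_accuracy
    return 1.0 - old_miss_rate * (1.0 - miss_reduction)


def stall_cycles_per_branch(accuracy, flush_penalty_cycles):
    """Average cycles lost per branch if every miss costs a full flush.

    Assumption: the penalty equals the value passed in; real penalties
    depend on where in the pipeline the branch resolves.
    """
    return (1.0 - accuracy) * flush_penalty_cycles


if __name__ == "__main__":
    for p6_accuracy in (0.90, 0.91):
        p4_accuracy = improved_accuracy(p6_accuracy, 0.30)
        p6_cost = stall_cycles_per_branch(p6_accuracy, 12)  # hypothetical penalty
        p4_cost = stall_cycles_per_branch(p4_accuracy, 20)  # hypothetical penalty
        print(f"P6 {p6_accuracy:.1%} -> P4 {p4_accuracy:.1%}, "
              f"stall/branch: P6 {p6_cost:.2f}, P4 {p4_cost:.2f}")
```

Under that reading the P4 lands at roughly 93 to 93.7%, close to the "about 94%" figure, and even with the worst-case penalties the per-branch cost ends up in the same ballpark as the P6 instead of nearly doubling.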

For more information about the P4, I suggest reading these three articles by Paul DeMone:

part1, part2, and part3. Some articles from Hans could be useful to read as well, which can be found at this place, and there is another article over here by the same guy.

If you read all of those and take anything away from them (you don't necessarily have to understand everything they are talking about), you'll see that, while everyone is ragging on the P4, it looks rather revolutionary in terms of architecture. We'll still have to wait and see how it performs.

BK
 

ss59

Banned
Oct 9, 1999
794
0
0
I also remember a blurb being written that the P4 does a lot of RC5 math in hardware. Might make a difference. The 20-stage pipeline shouldn't hurt too much because RC5 is just doing the same calculations over and over, so it shouldn't be hard for branch prediction to work through. And isn't SSE2 128-bit?
 

ss59

Banned
Oct 9, 1999
794
0
0
Also, Intel is releasing a software package that optimizes code for SSE2. Point-and-click optimization.
 

KarsinTheHutt

Golden Member
Jun 28, 2000
1,687
0
0
As far as clock cycles go, my thoughts are:

The P4 will be equal to the P3 in integer performance MHz for MHz. The double-pumped ALUs and improved prediction should make up for the problems posed by a hyperpipeline.

The P4 will suck majorly in x87.

The P4 will excel in code compiled with SSE2-optimized compilers.

 

Deeko

Lifer
Jun 16, 2000
30,213
11
81
Alright, this is assuming that clock for clock the ALU of the P4 is the same as the P3's, and that there are no special optimizations. That would mean a 1.5 GHz P4 would get 8.4 Mkeys/sec! As in almost twice as much as a 500 MHz G4! Wow! To calculate that, I just went to teamanandtech.dhs.org and entered 3000 under the P3 section. That's frickin amazing. The P4 may end up sucking at everything else, but hey, it will be the RC5 champ.
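
The estimate above boils down to a linear scaling, sketched here in Python. The only data points are the ones from the post (3000 MHz entered under the P3 section gives 8.4 Mkeys/sec). The function name and the `alu_multiplier` parameter are made up for illustration, and the whole thing assumes that the double-pumped ALU behaves like a P3 running at twice the core clock, which is a guess and not a measurement.

```python
# Figures taken from the post: 3000 "P3 MHz" -> 8.4 Mkeys/sec.
P3_REFERENCE_MHZ = 3000
P3_REFERENCE_KEYS_PER_SEC = 8_400_000


def estimated_keys_per_sec(core_mhz, alu_multiplier=2):
    """Linear estimate of RC5 key rate.

    Assumptions: per-clock integer throughput equal to the P3's, and an
    ALU that effectively runs at `alu_multiplier` times the core clock.
    """
    effective_mhz = core_mhz * alu_multiplier
    return effective_mhz * P3_REFERENCE_KEYS_PER_SEC / P3_REFERENCE_MHZ


if __name__ == "__main__":
    for multiplier in (1, 2):
        rate = estimated_keys_per_sec(1500, multiplier)
        print(f"1.5 GHz, ALU x{multiplier}: {rate / 1e6:.1f} Mkeys/sec")
```

If the double pumping turns out not to help RC5 at all (multiplier of 1), the same arithmetic gives only half of that figure, so the estimate is very sensitive to this one assumption.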
 

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
We had this discussion a few months ago. I checked the archives and couldn't find it, so let me summarize what I remember about our discussion.

RC5 uses purely integer calculations. This means that the poor FPU will not hurt RC5. One of the P4's main selling points is SSE2, which will not help RC5 at all. The QDR bus, another selling point, does not help RC5 but may possibly help SETI. The double-pumped ALU should supposedly really help RC5, but there is a problem. Intel has a document on their website documenting the instructions of the processor and how many cycles each one takes. Shifts and rotates were listed as taking 2-4 clock cycles, while on a P3 they both take 1 cycle. RC5 depends heavily on rotates. This could mean that the P4 will be very slow at RC5. Intel also had another document that says that the P4 will be very fast at encrypting RC5. It is possible that because the P4 will be fast at encrypting, it will be fast at decrypting RC5.
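
To show why rotate latency matters so much, here is a minimal Python sketch of one RC5 encryption round as described in Rivest's published RC5 specification, plus a rough count of rotates per key tested. The parameters (32-bit words, 12 rounds, 8-byte key) are my assumption for the RC5-64 contest, and the helper names are made up for illustration. This is not the distributed.net core, just the shape of the math.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (only the low 5 bits of n count)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def rc5_round(a, b, s_even, s_odd):
    """One RC5 encryption round: two data-dependent rotates.

    s_even and s_odd stand for the two expanded-key words of this round.
    """
    a = (rotl32(a ^ b, b) + s_even) & MASK32
    b = (rotl32(b ^ a, a) + s_odd) & MASK32
    return a, b


def rotates_per_key(rounds=12, key_bytes=8):
    """Rough rotate count for expanding one key and encrypting one block.

    Assumption: RC5 with 32-bit words; the key mixing loop runs
    3 * max(t, c) times with two rotates per iteration.
    """
    t = 2 * (rounds + 1)               # expanded key table size in words
    c = max(1, (key_bytes + 3) // 4)   # key length in 32-bit words
    mixing_steps = 3 * max(t, c)
    return 2 * mixing_steps + 2 * rounds


if __name__ == "__main__":
    print(rc5_round(1, 2, 5, 7))
    print("rotates per key:", rotates_per_key())
```

With these assumptions nearly every step in the loop is a rotate feeding the next step, so going from 1 cycle to 2-4 cycles per rotate hits the critical path directly unless several keys are interleaved to hide the latency.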

All these factors mean that we will have to wait and see what the performance really is.

My personal guess is that the P4 will be slower than an equivalently clocked P3 at RC5 with the current cores. Once someone creates a P4 RC5 core, the P4 will be slightly to quite a bit faster than an equivalently clocked P3.
 