Has RC5 been tested on the P4?

Straylight

Junior Member
Jul 21, 2000
14
0
0
I may have missed earlier threads on this, but are there any RC5 benchmarks on the P4? If not, I'll try to run one for you guys on a W2K box with a 1.5 GHz P4. I may as well put the little time I have left at work to good use. -Strayen
 

Train

Lifer
Jun 22, 2000
13,572
66
91
www.bing.com
Yes, the P4 has been benchmarked; look at Anand's P4 review. It actually did quite poorly. There is no core optimized yet for the IA-32 architecture, but hopefully we will have one soon.

I am going to try to tweak the current P3 core and then recompile it with Intel's IA-32 compiler, but I still have to wait for my evaluation version of the compiler.

SSE2 looks pretty promising because of the number of 128-bit operations it can do simultaneously, so hopefully this chip will be worth its price tag.
 

Straylight

Junior Member
Jul 21, 2000
14
0
0
Which one did you want me to try it on, the P3 or the P4? If it's a P3, no problem. If it's a P4, it may be a little tougher. But sure, send me a detailed email with the latest build of your compiler and I'll see what I can do: xentar@onebox.com. I can d/l the files from there. Don't wanna give out my work email for obvious reasons.
 

BurntKooshie

Diamond Member
Oct 9, 1999
4,204
0
0
Another thing: could you bench it with all the different cores, and NOT just set to "autoselect"? That way we know which core is the best for the P4, and can tweak that one....
 

Fandu

Golden Member
Oct 9, 1999
1,341
0
0
BK, I could have the different cores benchmarked, but I actually think it's best to start from scratch. Then one can easily make use of all the new instructions and it would be much easier to order the ops to make maximum use of the pipelines. I imagine that coding for such a deep pipeline is going to require much different ordering of the instructions. For sure when (if) Intel designs a core we'll have the benchmarks versus all the other cores as well.
 

BurntKooshie

Diamond Member
Oct 9, 1999
4,204
0
0
I agree that it's best to start from scratch. BUT, think of it this way: if you can see the differences between the highest performing core on the P4 v. the others, the information about what works well, and what doesn't, can be used IN ADDITION to the "theory" of reading the Intel Optimization guides. What I'm saying is that perhaps it'll be a chance to see "poor code for the P4" v. "not quite so poor code for the P4."
 

Train

Lifer
Jun 22, 2000
13,572
66
91
www.bing.com
BK is right.
Besides, not too many people have the immense concentration it takes to optimize code at the assembly level. It's hard enough for basic instructions, but a new architecture with brand new instructions would take a hell of a lot of study to work out what the best design would be. For starters, I'm going to rely on the new compiler, which Intel is hoping will help applications look a lot better on IA-32 chips. Then we can worry about tweaking at the assembly level.
 

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
To run a benchmark using all the cores, just run "dnetc.exe -bench" and the client will run rc5 long benchmarks for each of the cores and report the results.

Like BK said, and I've said before, the first couple of P4 cores will be incremental improvements. Then when someone has the time (or Intel has the time), someone will totally rewrite the core, and we could see anywhere from another small increase to a core that performs as well as or better than a G4.

Edit: Does anyone know how the P4 does in OGR, or any other distributed computing programs? It would be interesting to see how it compares to other chips when it doesn't have to use its crippled ROL instruction.
 

Train

Lifer
Jun 22, 2000
13,572
66
91
www.bing.com
Some people have speculated that because of its great memory bandwidth it would do well at SETI, but I haven't seen any benchmarks for that.
 

Deeko

Lifer
Jun 16, 2000
30,213
11
81
I expected the P4 to do well with RC5. It has a double-pumped ALU, right? Oh well, poor Intel.
 

Fandu

Golden Member
Oct 9, 1999
1,341
0
0
Request for RC5 benchmark across all cores is in. SETI request is in but pending. I also expect that the P4 is going to do very well with SETI. The newer versions have dropped the L2 cache requirements, but like everything else, it won't really benefit unless SSE2 is implemented.
 

Fandu

Golden Member
Oct 9, 1999
1,341
0
0
Your wish is my command.

[Dec 08 05:48:01 UTC] RC5: using core #0 (RG/BRF class 5).
[Dec 08 05:48:20 UTC] Benchmark for RC5 core #0 (RG/BRF class 5)
0.00:00:16.07 [1,435,244.64 keys/sec]
[Dec 08 05:48:20 UTC] RC5: using core #1 (RG class 3/4).
[Dec 08 05:48:22 UTC] Note: this client does not support
the RC5/486/SMC core.
[Dec 08 05:48:39 UTC] Benchmark for RC5 core #1 (RG class 3/4)
0.00:00:16.42 [1,851,537.20 keys/sec]
[Dec 08 05:48:39 UTC] RC5: using core #2 (RG class 6).
[Dec 08 05:48:59 UTC] Benchmark for RC5 core #2 (RG class 6)
0.00:00:16.18 [1,943,829.52 keys/sec]
[Dec 08 05:48:59 UTC] RC5: using core #3 (RG Cx re-pair).
[Dec 08 05:49:18 UTC] Benchmark for RC5 core #3 (RG Cx re-pair)
0.00:00:16.20 [1,811,998.40 keys/sec]
[Dec 08 05:49:18 UTC] RC5: using core #4 (RG RISC-rotate I).
[Dec 08 05:49:37 UTC] Benchmark for RC5 core #4 (RG RISC-rotate I)
0.00:00:16.04 [1,764,735.16 keys/sec]
[Dec 08 05:49:37 UTC] RC5: using core #5 (RG RISC-rotate II).
[Dec 08 05:49:55 UTC] Benchmark for RC5 core #5 (RG RISC-rotate II)
0.00:00:16.26 [1,934,254.03 keys/sec]
[Dec 08 05:49:55 UTC] RC5: using core #6 (RG/HB ath).
[Dec 08 05:50:15 UTC] Benchmark for RC5 core #6 (RG/HB ath)
0.00:00:16.51 [1,714,436.44 keys/sec]
 

Fandu

Golden Member
Oct 9, 1999
1,341
0
0
I'm actually surprised with the RISC Rotate II core. Looks like it might be good to have a look at.
 

BurntKooshie

Diamond Member
Oct 9, 1999
4,204
0
0
Which speed was this P4? 1.5 GHz, I assume?

The RG class 6 is about the same as the RISC-rotate II...it should be interesting to compare the two.

Thanks Fandu!
 

Train

Lifer
Jun 22, 2000
13,572
66
91
www.bing.com
Remember that core #2 is the MMX core, very outdated IMO. SSE and SSE2 have never been taken advantage of in any RC5 core I've seen.
 

DJ_D

Member
Oct 11, 1999
193
0
0
OK, so the best the P4 did was about 1,943 kkeys/sec. At 1.5 GHz that gives us roughly 1.3 kkeys/sec per MHz. Bleh. Oh well, hopefully an optimized core will make it into a speed demon.
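
For anyone who wants to redo that arithmetic for every core, here is a rough Python sketch. The keys/sec figures are copied from the benchmark log posted earlier in the thread; the 1500 MHz clock is an assumption, since nobody has confirmed the speed of the machine that produced the log, and the helper name `kkeys_per_mhz` is made up for this example.

```python
# Throughput per core in keys/sec, copied from the benchmark log in this thread.
results = {
    "RG/BRF class 5": 1_435_244.64,
    "RG class 3/4": 1_851_537.20,
    "RG class 6": 1_943_829.52,
    "RG Cx re-pair": 1_811_998.40,
    "RG RISC-rotate I": 1_764_735.16,
    "RG RISC-rotate II": 1_934_254.03,
    "RG/HB ath": 1_714_436.44,
}

# Assumption: the benchmarked P4 ran at 1.5 GHz (not confirmed in the thread).
CLOCK_MHZ = 1500


def kkeys_per_mhz(keys_per_sec, clock_mhz=CLOCK_MHZ):
    """Return thousands of keys per second for each MHz of clock."""
    return keys_per_sec / 1000 / clock_mhz


# Sort the cores from fastest to slowest and print both figures.
ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
for name, rate in ranked:
    print(f"{name:<18} {rate / 1000:9.1f} kkeys/sec  "
          f"{kkeys_per_mhz(rate):.3f} kkeys/sec per MHz")
```

If the clock turns out to be something other than 1.5 GHz, only `CLOCK_MHZ` needs to change; the ranking of the cores stays the same either way.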

I thought SSE was floating point SIMD instructions. Does SSE or SSE2 have integer instructions?

Also aren't there only like 3 important instructions for doing RC5? Add, multiply, and rotate left or something? Does anyone know for sure?
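
As far as I know, the three operations in the published RC5 algorithm are addition, XOR, and a data-dependent rotate left, with no multiply. A minimal Python sketch of the block rounds for 32-bit words follows. It is not the dnetc core: the key schedule is left out, the subkey table in the usage lines is a made-up placeholder, and the function names are only for illustration.

```python
MASK = 0xFFFFFFFF  # RC5 with 32-bit words: all arithmetic is modulo 2**32


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (only the low 5 bits of n count)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (only the low 5 bits of n count)."""
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK


def rc5_encrypt_block(a, b, s, rounds=12):
    """Encrypt one two-word block with an already expanded subkey table s."""
    a = (a + s[0]) & MASK
    b = (b + s[1]) & MASK
    for i in range(1, rounds + 1):
        # Each half-round is one XOR, one data-dependent rotate, one add.
        a = (rotl32(a ^ b, b) + s[2 * i]) & MASK
        b = (rotl32(b ^ a, a) + s[2 * i + 1]) & MASK
    return a, b


def rc5_decrypt_block(a, b, s, rounds=12):
    """Undo rc5_encrypt_block by running the half-rounds in reverse."""
    for i in range(rounds, 0, -1):
        b = rotr32((b - s[2 * i + 1]) & MASK, a) ^ a
        a = rotr32((a - s[2 * i]) & MASK, b) ^ b
    b = (b - s[1]) & MASK
    a = (a - s[0]) & MASK
    return a, b


# Placeholder subkeys for illustration only; a real table comes from the
# RC5 key schedule, which this sketch does not implement.
demo_subkeys = [(0x9E3779B9 * (i + 1)) & MASK for i in range(2 * 12 + 2)]
cipher = rc5_encrypt_block(0x01234567, 0x89ABCDEF, demo_subkeys)
plain = rc5_decrypt_block(cipher[0], cipher[1], demo_subkeys)
```

The rotate count comes from the data itself (the other half of the block), which is why the speed of the rotate instruction matters so much for this workload.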
 

BurntKooshie

Diamond Member
Oct 9, 1999
4,204
0
0
Yes, not only does SSE2 include integer SIMD, it is 128-bit SIMD! The reason MMX wasn't so useful for RC5 was that they just weren't able to bitslice it, because MMX isn't a wide enough implementation of SIMD. AltiVec is 128 bits, which allowed it to bitslice RC5, and now...SSE2....
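
To picture what 128 bits buys, here is a toy Python model that treats one 128-bit value as four 32-bit lanes, so four independent keys could be worked on per operation. It is only an illustration, not SSE2 code, and every function name in it is made up. One assumption worth flagging: as far as I know SSE2 offers packed shifts but no packed rotate, so the rotate here is built from two shifts and an OR, and it uses the same count for every lane.

```python
LANES = 4            # four 32-bit doublewords in one 128-bit value
MASK32 = 0xFFFFFFFF


def pack(words):
    """Pack four 32-bit words into one 128-bit integer (lane 0 is lowest)."""
    assert len(words) == LANES
    value = 0
    for i, w in enumerate(words):
        value |= (w & MASK32) << (32 * i)
    return value


def unpack(value):
    """Split a 128-bit integer back into its four 32-bit lanes."""
    return [(value >> (32 * i)) & MASK32 for i in range(LANES)]


def packed_add(x, y):
    """Add lane by lane, wrapping each lane at 32 bits (no carry across lanes)."""
    return pack([(a + b) & MASK32 for a, b in zip(unpack(x), unpack(y))])


def packed_xor(x, y):
    """XOR needs no lane handling: bits never cross lane boundaries."""
    return x ^ y


def packed_shl(x, n):
    """Shift every lane left by the same count, dropping overflowed bits."""
    return pack([(a << n) & MASK32 for a in unpack(x)])


def packed_shr(x, n):
    """Shift every lane right by the same count."""
    return pack([a >> n for a in unpack(x)])


def packed_rotl(x, n):
    """Rotate every lane left by n, composed from two shifts and an OR."""
    n &= 31
    return packed_shl(x, n) | packed_shr(x, 32 - n)


lanes = pack([0x80000001, 0x00000001, 0x00000002, 0xFFFFFFFF])
rotated = unpack(packed_rotl(lanes, 1))
```

RC5's rotate counts depend on the data, so in a real core each lane would want its own count, which this uniform-count model does not capture.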
 

DJ_D

Member
Oct 11, 1999
193
0
0
Yes, I see. Check this out from Intel's IA-32 Intel Architecture Software Developer's Manual, Volume 1: Basic Architecture, page 272.



<< The ability to operate on 128-bit packed integers (bytes, words, doublewords, and
quadwords) in XMM registers provides greater flexibility and greater throughput when
performing SIMD operations on packed integers. This capability is particularly useful for
applications such as RSA authentication and RC5 encryption. Using the full set of SIMD
registers, data types, and instructions provided with the MMX technology and the SSE and
SSE2 extensions, programmers can now develop algorithms that finely mix packed single- and
double-precision floating-point data and 64- and 128-bit packed integer data.
>>



So there is some hope.
 

BurntKooshie

Diamond Member
Oct 9, 1999
4,204
0
0
DJ_D! That's the quote I was looking for! So I do have that PDF file. I knew I had read that a long time ago, I just couldn't remember exactly where! Thanks!
 