This is a small comparison test of akosf's optimized clients for albert_4.37_x86 on Windoze.
CPU, RAM, instruction sets:
AMD Athlon XP (Barton) 3200+@2200 MHz
512MB RAM
MMX(+),3DNow!(+),SSE
Tested WU:
(short) z1_0261.5_2539_S4R2a_1
----------------------------------------------
| version  | time[s] | curr./orig. | speedup |
|--------------------------------------------|
| original |    4553 |      1.0000 |  1.0000 |
|--------------------------------------------|
| 387      |    2294 |      0.5038 |  1.9847 |
| C-37     |    1867 |      0.4100 |  2.4387 |
| C-40     |    1653 |      0.3630 |  2.7544 |
|--------------------------------------------|
| A-36     |    1944 |      0.4269 |  2.3421 |
| D-40     |    1194 |      0.2622 |  3.8132 |
|--------------------------------------------|
| S-38     |    1354 |      0.2974 |  3.3626 |
| S-39     |    1389 |      0.3051 |  3.2779 |
| S-39L    |    1249 |      0.2743 |  3.6453 |
| S-40     |    1144 |      0.2512 |  3.9799 |
----------------------------------------------
So the best result comes from S-40 with SSE, 3.9799x faster than the original.
Second place goes to D-40 with 3DNow!, 3.8132x faster than the original.
Thanks akosf, you have done a great job!!!!
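For anyone who wants to repeat the arithmetic on their own timings, here is a minimal Python sketch of how the two derived columns follow from the measured times: curr./orig. is the version's time divided by the original's time, and speedup is the inverse. The function name `speedup_table` is just an illustrative choice, not part of any BOINC tool.

```python
# Measured run times in seconds for the one test WU, copied from the table.
TIMES = {
    "original": 4553,
    "387": 2294,
    "C-37": 1867,
    "C-40": 1653,
    "A-36": 1944,
    "D-40": 1194,
    "S-38": 1354,
    "S-39": 1389,
    "S-39L": 1249,
    "S-40": 1144,
}


def speedup_table(times, baseline="original"):
    """Return {version: (ratio, speedup)} relative to the baseline time.

    ratio   = time / baseline time   (the curr./orig. column)
    speedup = baseline time / time   (the speedup column)
    """
    base = times[baseline]
    return {name: (t / base, base / t) for name, t in times.items()}


if __name__ == "__main__":
    table = speedup_table(TIMES)
    # List the fastest version first.
    for name in sorted(table, key=lambda n: table[n][1], reverse=True):
        ratio, speedup = table[name]
        print(f"{name:9s} {TIMES[name]:5d} {ratio:7.4f} {speedup:7.4f}")
```

The last digit of a printed ratio can differ by one from the table, because the table's ratio column appears to be truncated rather than rounded.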
Copyright © 2024 Einstein@Home. All rights reserved.
Comparison of clients
So you actually did a controlled comparison of the science apps (not clients) using a single standard work unit. Outstanding.
Thanks for the data. My guess is that your method gives a more accurate comparison for most purposes than those of us providing before/after results on whatever work units we happen to get.
If there is meaningful variation in the type of computation among WUs, your method fails to randomize over it, but you should get a far more accurate comparison of the actual work at hand. I suspect Einstein does not vary much in computation mix (floating vs. integer mix, locality of reference ...) from WU to WU, though I have my doubts on that score with SETI.
Would you care to share your method for "capturing" a work unit and running and timing it "privately"? I assume you disengage boincmgr and boinc.exe, but I don't know just how you handle this. Perhaps some other folks here might try your method on some of the other CPU types.
RE: Thanks akosf, you have
Thanks.
I made a modification to S40; I hope it will reach the 4.000 ratio... :-)
RE: RE: Thanks akosf, you
Are you saying there are two versions of S40 now? And those that got it b4 should d/l again???
:)
98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
RE: RE: I did a
Pooh! I didn't change the version because this modification is very small.
But I will change it before I upload it. Okay?
Can you think of a good codename? :-)
RE: RE: RE: I did a
How about S40.01 or S40again or S40.4x
Yes, S40.4x. The 4x for 4 times faster, lol.
:)
98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
RE: RE: What do you think
S40.01 sounds good, because I have new ideas (small things, but a lot of them).
And two decimal places will be enough. (We will run out of work in a few days!)
RE: RE: This is the small
I used BOINC 5.2.13 optimized by truX.
I downloaded some WUs, then stopped crunching, suspended every unit except z1_0261.5_2539_S4R2a_1, and suspended all network activity. Then I made one copy of the BOINC directory for each tested "science app", and copied each tested "albert_4.37_windows_intelx86.exe" (the original and akosf's versions) into the einstein directory of its own copy, so there were 10 directories.
After that, I started the BOINCmgr from one of the copies and ran only the one WU that was not suspended. When the WU was crunched, I wrote the time into the table. I did this 10 times.
It is quite a simple method, but I think the results are usable.
That's all...
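The copying part of this procedure could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names, the `BOINC_<label>` directory naming and the `projects/einstein.phys.uwm.edu` subdirectory are illustrative guesses, not something stated in this thread, so check your own installation. Suspending the other WUs and the network, starting each copied BOINCmgr and noting the time are still done by hand, as described above.

```python
import shutil
from pathlib import Path

# Executable name as quoted in the post.
APP_NAME = "albert_4.37_windows_intelx86.exe"
# ASSUMPTION: location of the project's app inside a BOINC directory.
PROJECT_SUBDIR = Path("projects") / "einstein.phys.uwm.edu"


def make_test_copies(boinc_dir, variants, out_root, project_subdir=PROJECT_SUBDIR):
    """Clone boinc_dir once per variant and put that variant's exe in the clone.

    variants maps a label (e.g. "S-40") to the path of the executable to test.
    Returns {label: path of the cloned BOINC directory}.
    """
    boinc_dir, out_root = Path(boinc_dir), Path(out_root)
    clones = {}
    for label, exe in variants.items():
        clone = out_root / f"BOINC_{label}"   # hypothetical naming scheme
        shutil.copytree(boinc_dir, clone)      # fails if the clone already exists
        target_dir = clone / project_subdir
        target_dir.mkdir(parents=True, exist_ok=True)
        # Overwrite the science app in this clone with the variant under test.
        shutil.copy2(exe, target_dir / APP_NAME)
        clones[label] = clone
    return clones


def cleanup(clones):
    """Delete the cloned directories once all timings have been recorded."""
    for clone in clones.values():
        shutil.rmtree(clone)
```

A call would look like `make_test_copies(r"C:\BOINC", {"S-40": r"C:\apps\S-40.exe"}, r"C:\boinc_tests")`, where all three paths are made-up examples. The original BOINC directory is never modified, which is why deleting the clones afterwards is all the cleanup the sketch does.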
RE: I download some WUs,
Got it. Suspending network activity would stop spurious reporting of multiple results for the same WU. For cleanup at the end, is just deleting the replicated BOINC directories enough? Or is something more global needed to assure a clean restart when you resume real processing?
RE: RE: Got it. The
Yes, it is enough. After testing, all the replicated BOINC directories were deleted (but I still have one for the next akosf optimized app).
I did not test S-37a, for example, because it needs SSE2, which this CPU does not have.
I don't understand the
I don't understand the naming: D40 vs. S40, etc.
I am running D40 on my AMD XP PCs. Is this correct, or should I switch to S40 for a little more speed?
G