S5Txxxx.dat Patched App Tests - Speed up

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337,135
RAC: 0
Topic 191443

Hi, I tried crunching the same WUs with the original application and with Akos's optimized version. For this test I downloaded 4 WUs. The times with the original app are:
h1_0318.0_S5R1__23088_S5R1a_1 - T: 4012.6 sec
h1_0081.5_S5R1__242_S5R1a_1 - T: 4718.7 sec
l1_0229.0_S5R1__2610_S5R1a_0 - T: 3908.6 sec
l1_0229.0_S5R1__2610_S5R1a_1 - T: 3949.9 sec

Now I'm crunching them with the S5T0003 version:
h1_0318.0_S5R1__23088_S5R1a_1 - T: not finished yet
h1_0081.5_S5R1__242_S5R1a_1 - T: 4773.2 sec
l1_0229.0_S5R1__2610_S5R1a_0 - T: 3916.4 sec
l1_0229.0_S5R1__2610_S5R1a_1 - T: 3916.1 sec

It seems the speed-up is ZERO.
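
The comparison can be checked with a few lines of arithmetic. This is only a minimal sketch using the times posted above; the dictionary layout and the `percent_change` helper are illustrative names, not part of any E@H tool:

```python
# Run times in seconds, copied from the two lists above.
# The unfinished WU (h1_0318.0_S5R1__23088_S5R1a_1) is left out.
original = {
    "h1_0081.5_S5R1__242_S5R1a_1": 4718.7,
    "l1_0229.0_S5R1__2610_S5R1a_0": 3908.6,
    "l1_0229.0_S5R1__2610_S5R1a_1": 3949.9,
}
s5t0003 = {
    "h1_0081.5_S5R1__242_S5R1a_1": 4773.2,
    "l1_0229.0_S5R1__2610_S5R1a_0": 3916.4,
    "l1_0229.0_S5R1__2610_S5R1a_1": 3916.1,
}


def percent_change(old, new):
    """Relative change in run time; negative means the new run was faster."""
    return (new - old) / old * 100.0


changes = {wu: percent_change(original[wu], s5t0003[wu]) for wu in original}
for wu, change in changes.items():
    print(f"{wu}: {change:+.2f}%")
```

All three changes stay within roughly 1.2% in either direction. With a single run per WU, differences that small could just as well be ordinary run-to-run variation.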

Barrie
Joined: 23 Mar 05
Posts: 219
Credit: 21,449,106
RAC: 0

Please tell us what kind of processor these were run on. Your machines are hidden.

Dead men don't get the baby washed. HTH

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337,135
RAC: 0

Message 40267 in response to message 40266

Quote:
Please tell us what kind of processor these were run on. Your machines are hidden.

This was on my A64 2800+ (not overclocked) machine with 1024 MB RAM (DDR400) and WinXP SP2.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4,527,270
RAC: 0

Quote:

Hi, I tried crunching the same WUs with the original application and with Akos's optimized version. For this test I downloaded 4 WUs. The times with the original app are:

...

It seems the speed-up is ZERO.

Thanks LiborA!

Your results are a little surprising, but not too much.
First I had to free up some space in the binary code.
I'm optimizing the hot loop now, but the zero tolerance at validation is a very strong restriction (I cannot remove many of the slow rounding/memory operations).
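
To illustrate why a zero tolerance is such a strong restriction, here is a minimal sketch of the two kinds of comparison. It is not the actual S5 validator, whose code is not shown in this thread; `accepts_bitwise`, `accepts_with_tolerance` and the `max_ulps` parameter are hypothetical names used only for illustration:

```python
import struct


def ordered_bits(x):
    """Map a double to an integer so that adjacent doubles differ by 1."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative doubles are stored sign-magnitude; mirror them onto the negative ints.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulp_distance(a, b):
    """Steps between a and b in last-place units (NaN handling is ignored here)."""
    return abs(ordered_bits(a) - ordered_bits(b))


def accepts_bitwise(reference, candidate):
    """Zero tolerance: every value must match to the last bit."""
    return all(ulp_distance(r, c) == 0 for r, c in zip(reference, candidate))


def accepts_with_tolerance(reference, candidate, max_ulps=1):
    """Hypothetical relaxed check: allow a difference of a few last-place bits."""
    return all(ulp_distance(r, c) <= max_ulps for r, c in zip(reference, candidate))


reference = [1.0, 2.5, -3.75]
candidate = [1.0 + 2.0**-52, 2.5, -3.75]  # first value is off by exactly one last bit

print(accepts_bitwise(reference, candidate))         # False
print(accepts_with_tolerance(reference, candidate))  # True
```

Under a bit-exact check, any optimization that changes the rounding of even one intermediate value can get the whole result rejected.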

Kratylos
Joined: 23 Nov 05
Posts: 28
Credit: 1,669,914
RAC: 0

Akos, I don't know if it helps, but the standard Linux app is around 20% faster than the Windows version.

Thanks for your great work.

Pepperammi
Joined: 20 Feb 05
Posts: 131
Credit: 437,943
RAC: 0

My P4 HT 3.4 GHz was running S5T0001 and now S5T0301, and they cut about 1 hour (maybe 2) off per unit when running two at the same time. I don't know yet about running two different projects at the same time. I'll have another unit to compare times with in an hour (this is compared to one long unit the machine did on its own, so it's not conclusive, sorry).

My P D 830 3.21 GHz might have cut half an hour or more with the later S5T0301-S5T0304. I'm waiting for the other person to return a result so I can see whether it validates normally or needs consensus.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4,527,270
RAC: 0

Message 40271 in response to message 40269

Quote:
Akos, I don't know if it helps, but the standard Linux app is around 20% faster than the Windows version.

I think the official app could be about 2 times faster than it is now. There are lots of needless FPU -> memory -> FPU operations, but I could not remove them because these moves change the last bit of the numbers (only one bit!).

Without these moves the results would be better (the last bit would also be good) and the app would be faster, but... the current S5 validator doesn't accept these (better, faster) results.

edit: perhaps I will do a test with SSE2. It probably doesn't need these corrections, because it works on doubles that are only 64 bits wide.
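
The last-bit effect can be reproduced one precision level down. This is only an analogy under stated assumptions: Python's 64-bit floats stand in for the wider FPU registers, and a round trip through a 32-bit float stands in for the store to memory. It is not the E@H code, and `through_memory` is a made-up helper name:

```python
import struct


def through_memory(x):
    """Round a 64-bit float to 32 bits and back, like a store/load at lower precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_with_stores(values):
    """Round the running total after every addition (the FPU -> memory -> FPU pattern)."""
    total = 0.0
    for v in values:
        total = through_memory(total + v)
    return total


def sum_in_registers(values):
    """Keep the running total at full width and round only once at the end."""
    total = 0.0
    for v in values:
        total += v
    return through_memory(total)


# 2**-24 is half of the last 32-bit mantissa bit at 1.0.
values = [1.0, 2.0**-24, 2.0**-24]

with_stores = sum_with_stores(values)    # each half-bit is rounded away in turn
in_registers = sum_in_registers(values)  # the two halves add up to one full bit

print(with_stores, in_registers, in_registers - with_stores)
```

The two sums differ by exactly one last-place bit of the narrower format, and the version with fewer roundings is the more accurate one. A bit-exact comparison against the first would still reject it.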

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337,135
RAC: 0

Message 40272 in response to message 40271

Quote:
Quote:
Akos, I don't know if it helps, but the standard Linux app is around 20% faster than the Windows version.

I think the official app could be about 2 times faster than it is now. There are lots of needless FPU -> memory -> FPU operations, but I could not remove them because these moves change the last bit of the numbers (only one bit!).

Without these moves the results would be better (the last bit would also be good) and the app would be faster, but... the current S5 validator doesn't accept these (better, faster) results.

edit: perhaps I will do a test with SSE2. It probably doesn't need these corrections, because it works on doubles that are only 64 bits wide.

Now I'm testing S5T0307. At first glance it seems a little faster than the original app and S5T0003. The first of my test WUs will be finished in about 30 min :)

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337,135
RAC: 0

Message 40273 in response to message 40271

Quote:

I think the official app could be about 2 times faster than it is now. There are lots of needless FPU -> memory -> FPU operations, but I could not remove them because these moves change the last bit of the numbers (only one bit!).

Without these moves the results would be better (the last bit would also be good) and the app would be faster, but... the current S5 validator doesn't accept these (better, faster) results.

That's very interesting. What do the people from E@H, such as Bruce and the other project leaders, say about this?

Athlonheizer
Joined: 3 Jun 06
Posts: 33
Credit: 513,937
RAC: 0

Message 40274 in response to message 40273

Quote:
That's very interesting. What do the people from E@H, such as Bruce and the other project leaders, say about this?


"OK, very fast, so we'll make the WUs bigger." :-(

Athlon

Stay tuned and keep crunching

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337,135
RAC: 0

The first test result of S5T0307:

WU: l1_0229.0_S5R1__2610_S5R1a_1 - T: 3404.0 sec

That is about a 14% speed-up compared with the original app (3949.9 sec for the same WU) :)
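
A quick check of that figure, using the two times posted in this thread for the same WU. "Speed-up" can be read either as less run time or as more work per hour, and the two readings give slightly different percentages. The variable names are just for illustration:

```python
original_sec = 3949.9  # l1_0229.0_S5R1__2610_S5R1a_1 with the original app
s5t0307_sec = 3404.0   # the same WU with S5T0307

# Share of the run time that was saved.
time_saved_pct = (original_sec - s5t0307_sec) / original_sec * 100.0

# Extra work done per unit of time (ratio of the run times, minus one).
throughput_gain_pct = (original_sec / s5t0307_sec - 1.0) * 100.0

print(f"run time saved:  {time_saved_pct:.1f}%")       # 13.8%
print(f"throughput gain: {throughput_gain_pct:.1f}%")  # 16.0%
```

The 14% figure corresponds to the first reading, run time saved.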
