Optimized S5 SSE3

ErichZann
Joined: 11 Feb 05
Posts: 120
Credit: 81582
RAC: 0

Message 39047 in response to message 39041

Quote:

I think the official app could be about twice as fast as it is now. There are lots of needless FPU -> memory -> FPU operations, but I could not remove them because these moves change the last bit of the numbers (only one bit!).

The results would be better (the last bit would also be right) without these moves, and the app would be even faster, but... the current S5 validator doesn't accept these (better, faster) results.
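
As a rough illustration of the quoted point, here is a minimal Python sketch of how rounding every intermediate value can change the last bit of a result compared with rounding only once at the end. This is an analogy, not the app's code: Python floats are 64-bit doubles, so the loop below stands in for "store to memory after every step", while math.fsum stands in for "keep the extra precision and round once". The function names are made up for this example.

```python
import math

def sum_rounding_each_step(values):
    """Add values one by one; every partial sum is rounded to a 64-bit double."""
    total = 0.0
    for v in values:
        total += v  # the partial sum is rounded here, on every iteration
    return total

def sum_rounding_once(values):
    """Track the partial sums exactly and round only the final result."""
    return math.fsum(values)

values = [0.1] * 10
stepwise = sum_rounding_each_step(values)
once = sum_rounding_once(values)

print(repr(stepwise))    # 0.9999999999999999
print(repr(once))        # 1.0
print(stepwise == once)  # False
```

The two sums differ only in the last bit of the double, which is the kind of difference a validator that compares results very strictly could reject.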

I can't find the post where you wrote this, so I'm just quoting it from Pepperammi, but:

I also think you could talk with the project leaders about this, couldn't you?
I suppose they would either explain to you why they can't change this or once again listen to your good advice :)

Pepperammi
Joined: 20 Feb 05
Posts: 131
Credit: 437943
RAC: 0

Message 39048 in response to message 39047

Quote:

I can't find the post where you wrote this, so I'm just quoting it from Pepperammi, but:

I also think you could talk with the project leaders about this, couldn't you?
I suppose they would either explain to you why they can't change this or once again listen to your good advice :)


It's here: http://einsteinathome.org/node/191443
Sorry, I wasn't sure my post belonged in that thread, so I put it here. Sorry if I shouldn't have.
I wouldn't want to start interfering myself. I'll leave it to the people who know a lot more about what they're talking about :)

Athlonheizer
Joined: 3 Jun 06
Posts: 33
Credit: 513937
RAC: 0

With 0304 I have 6 valid short WUs on an XP3200+.
Now I'm testing 0307.
0304 is around 8-9% faster on this machine than the standard app.
0303 on my A64 is around 10-12% faster than the standard app, but has produced no valid WUs so far: checked, but no consensus yet.

Athlon

It's nice to be part of this community :-(

Stay tuned and keep crunching

Pepperammi
Joined: 20 Feb 05
Posts: 131
Credit: 437943
RAC: 0

http://einsteinathome.org/workunit/10104420
Again, I have to wait until the other person returns a result. The first few percent were done with S5T0001, the rest with S5T0301. I'll move up to a more recent patch when all of these finally get a validation verdict.

MRAO
Joined: 7 May 05
Posts: 33
Credit: 15770746
RAC: 0

S5T0003: I'm seeing results in the range 53033-57580 seconds, compared to 58339-59193 seconds with the stock app on a dual-processor P3 1266, all on WUs for h1_0600.5. All are validating. That's about an 8-10% improvement, I think. Now on to S5T0307. Mike

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

So, I could estimate the time on the same WU:

S5T0000: 38721 sec
S5T0307: 33304 sec

gain: 38721 / 33304 = 1.163

CPU: Sempron E6 1600 MHz
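
For anyone who wants to redo this arithmetic for their own host, here is a minimal Python sketch. It only restates the calculation above and also expresses it as a percentage of run time saved. The function names are made up for this example, and the two timings are the ones from this post.

```python
def speedup(baseline_seconds, optimized_seconds):
    """Return how many times faster the optimized run is than the baseline."""
    return baseline_seconds / optimized_seconds

def time_saved_percent(baseline_seconds, optimized_seconds):
    """Return the share of the baseline run time that the optimized run saves."""
    return 100.0 * (baseline_seconds - optimized_seconds) / baseline_seconds

s5t0000_seconds = 38721  # S5T0000 timing from the post above
s5t0307_seconds = 33304  # S5T0307 timing on the same WU

gain = speedup(s5t0000_seconds, s5t0307_seconds)
saved = time_saved_percent(s5t0000_seconds, s5t0307_seconds)

print(f"gain: {gain:.3f}")          # gain: 1.163
print(f"time saved: {saved:.1f}%")  # time saved: 14.0%
```

Note that a gain of 1.163 means about 16% more work per unit of time, but about 14% less time per WU. Both figures describe the same measurement.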

Crunchers For More Power
Joined: 3 Aug 05
Posts: 69
Credit: 1071273
RAC: 0

In 1 h I will have a good comparison between the standard app, S5T0301, S5T0304, S5T0305 and S5T0307.

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337135
RAC: 0

Message 39054 in response to message 39052

Quote:

So, I could estimate the time on the same WU:

S5T0000: 38721 sec
S5T0307: 33304 sec

gain: 38721 / 33304 = 1.163

CPU: Sempron E6 1600 MHz

How can I verify the correctness of the results from the optimized app? The files are the same size as those from the original app, but some bytes are different.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

Message 39055 in response to message 39054

Quote:
How can I verify the correctness of the results from the optimized app? The files are the same size as those from the original app, but some bytes are different.

You cannot check it yourself. You have to send the results back to the validator, so you need different WUs for validation tests.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

Message 39057 in response to message 39040

Quote:

S5T0308.dat

- eliminated double jumps
- reduced amount of FPU macro ops
- removed double loads on general purpose registers

- better SSE register usage
- reduced memory and integer register usage
- optimized branch structure
- faster FPU comparisons

- fewer FPU-memory-FPU operations

CPU: SSE compatible
