Optimized S5 SSE3

[B@H] Ray
Joined: 4 Jun 05
Posts: 621
Credit: 49583
RAC: 0
Topic 191417

Any word on an optimized SSE3 app yet? SSE and SSE2 seem to be well covered by the stock app.

I know the credit will be the same, but if it is faster, the high power bill will be easier to swallow.

The stock app is working well on my SSE2 system.

Cheers
Ray


Try the Pizza@Home project, good crunching.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

Optimized S5 SSE3

Quote:
Any word on an optimized SSE3 app yet? SSE and SSE2 seem to be well covered by the stock app.

As far as I know, Bernd would like to build different versions for different platforms, but he is working on the new WUs and the database at the moment.
I don't have enough time to optimize the S5 app either.
I hope you know that SSE3 optimization doesn't mean a big performance improvement (ca. +5%).

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: I hope you know that

Message 38847 in response to message 38846

Quote:
I hope you know that SSE3 optimization doesn't mean a big performance improvement (ca. +5%).


Doesn't SSE3 mainly speed up 3D graphics and Digital Signal Processing?

Chris397
Joined: 25 Aug 05
Posts: 5
Credit: 1423209
RAC: 0

RE: I hope you know that

Message 38848 in response to message 38846

Quote:
I hope you know that SSE3 optimization doesn't mean a big performance improvement (ca. +5%).

Hello Akos,

Just two other questions about optimized S5 clients:
1. Do you know whether all the optimizations you used in the S4 apps are already included in the S5 client (or can we look forward to further improvements besides SSE3)?
2. How much do you think additional optimization at the executable level can achieve for S5 (I remember that you made, for example, special versions for small L1 caches in S4)?

I am sorry to hear that you are very short on time at the moment, but maybe you would like to have some more fun (as you used to call it :-) ) in the future?

Thank you for all your time and the big boost you gave to the project so far,

Chris397

ca_grufti
Joined: 9 Feb 05
Posts: 53
Credit: 4309237
RAC: 0

It appears that there are still a number of type conversions in the compute-intensive parts of the S5 science app. If there is a way to eliminate them, that could speed things up.
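To illustrate the kind of change being described, here is a minimal sketch in C. It is not taken from the S5 science app: the function names, the lookup table, and the 16-bit fixed-point format are all assumptions made up for this example. The first loop converts between integer and floating point on every iteration; the second does the conversion once before the loop and then stays in integer arithmetic. The two agree only when `start` and `step` are non-negative multiples of 2^-16.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAC_BITS 16  /* hypothetical fixed-point format: 16 fractional bits */

/* Hypothetical hot loop: one int->double and one double->int conversion
 * per iteration, just to pick a table bin. */
double sum_with_conversions(const double *table, size_t table_len,
                            double start, double step, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double x = start + step * (double)i;  /* int -> double */
        int bin = (int)x;                     /* double -> int (truncation) */
        acc += table[(size_t)bin % table_len];
    }
    return acc;
}

/* Same result for non-negative start/step that are multiples of 2^-16:
 * convert once up front, then advance an integer phase inside the loop. */
double sum_fixed_point(const double *table, size_t table_len,
                       double start, double step, size_t n)
{
    const double scale = (double)(1u << FRAC_BITS);
    uint64_t phase = (uint64_t)(start * scale);  /* converted once */
    uint64_t inc   = (uint64_t)(step * scale);   /* converted once */
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        size_t bin = (size_t)(phase >> FRAC_BITS);  /* integer shift, no conversion */
        acc += table[bin % table_len];
        phase += inc;
    }
    return acc;
}
```

Whether this kind of rewrite actually pays off depends on the CPU and the compiler, so it would need to be measured on the real code.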

Akos is definitely the person who knows where more speed can be found in the Einstein application ... and for all the various instruction sets.

Stan Pleban
Joined: 2 Dec 05
Posts: 73
Credit: 4635380
RAC: 0

If an optimized SSE3 app meant an improvement of 5-8% when my Pentium 4 2.8GHz is now taking 13+ hours... that could mean an improvement of almost an hour for me. I would be happy with that.
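The estimate can be checked with a few lines of C. This is only a sketch: the function names are made up for the example, and the post does not say whether "improvement" means a shorter runtime or a higher throughput, so both readings are shown.

```c
#include <assert.h>

/* Reading 1 (assumed): the runtime itself shrinks by `percent` percent. */
double hours_saved(double runtime_hours, double percent)
{
    assert(runtime_hours >= 0.0 && percent >= 0.0 && percent <= 100.0);
    return runtime_hours * percent / 100.0;
}

/* Reading 2 (assumed): throughput rises by `percent` percent, so the
 * runtime is divided by (1 + percent/100). */
double hours_saved_throughput(double runtime_hours, double percent)
{
    assert(runtime_hours >= 0.0 && percent >= 0.0);
    return runtime_hours - runtime_hours / (1.0 + percent / 100.0);
}
```

For a 13-hour work unit, the first reading gives 0.65 hours (39 minutes) at 5% and 1.04 hours (about 62 minutes) at 8%. The second reading gives roughly 0.62 and 0.96 hours. Either way, "almost an hour" holds at the upper end of the range.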

Pepperammi
Joined: 20 Feb 05
Posts: 131
Credit: 437943
RAC: 0

RE: If an optimized SS3

Message 38851 in response to message 38850

Quote:
If an optimized SSE3 app meant an improvement of 5-8% when my Pentium 4 2.8GHz is now taking 13+ hours... that could mean an improvement of almost an hour for me. I would be happy with that.


Hell yea. I second that.

ca_grufti
Joined: 9 Feb 05
Posts: 53
Credit: 4309237
RAC: 0

There is probably another 30-50+% to be gained elsewhere.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

I replaced some truncations with SSE3 instructions, but I forgot that I don't have an SSE3 machine at the moment, so I could not test it. :-)

Pepperammi
Joined: 20 Feb 05
Posts: 131
Credit: 437943
RAC: 0

RE: I replaced some

Message 38854 in response to message 38853

Quote:
I replaced some truncations with SSE3 instructions, but I forgot that I don't have an SSE3 machine at the moment, so I could not test it. :-)

Do you need testers? :) I've got two SSE3-capable machines.
Just tell me how you want them tested, if you want my help.

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

RE: RE: I hope you know

Message 38855 in response to message 38847

Quote:
Quote:
I hope you know that SSE3 optimization doesn't mean a big performance improvement (ca. +5%).
Doesn't SSE3 mainly speed up 3D graphics and Digital Signal Processing?

SSE3 is an extension of the SSE2 instructions, so they are CPU operations on double-precision data. 3D graphics usually uses only single precision because it's faster and graphics doesn't need more precise calculations. But SSE3 is sometimes good for DSP; it has some special instructions that simplify certain instruction sequences.
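As an illustration of such a "special instruction", here is a minimal sketch in C of a complex multiplication, a common DSP building block. It is not code from the Einstein application; the function name and the `{re, im}` array layout are assumptions made for the example. The SSE3 path uses the `_mm_loaddup_pd` and `_mm_addsub_pd` intrinsics from `<pmmintrin.h>`; the add/subtract step replaces what would otherwise need a sign flip plus an add in plain SSE2. SSE3 also provides single-precision forms of these operations. A scalar fallback is used when the compiler is not targeting SSE3.

```c
#include <stddef.h>
#if defined(__SSE3__)
#include <pmmintrin.h>  /* SSE3 intrinsics */
#endif

/* Multiply two complex numbers stored as {re, im}: out = a * b.
 * Hypothetical helper for illustration only. */
void complex_mul(const double a[2], const double b[2], double out[2])
{
#if defined(__SSE3__)
    __m128d b_vec  = _mm_loadu_pd(b);                   /* [b.re, b.im] */
    __m128d a_re   = _mm_loaddup_pd(&a[0]);             /* [a.re, a.re] */
    __m128d a_im   = _mm_loaddup_pd(&a[1]);             /* [a.im, a.im] */
    __m128d b_swap = _mm_shuffle_pd(b_vec, b_vec, 1);   /* [b.im, b.re] */
    __m128d t1 = _mm_mul_pd(a_re, b_vec);   /* [a.re*b.re, a.re*b.im] */
    __m128d t2 = _mm_mul_pd(a_im, b_swap);  /* [a.im*b.im, a.im*b.re] */
    /* addsub: the low lane subtracts, the high lane adds */
    _mm_storeu_pd(out, _mm_addsub_pd(t1, t2));
#else
    /* Scalar fallback: compute both parts before storing, so that
     * out may alias a or b. */
    double re = a[0] * b[0] - a[1] * b[1];
    double im = a[0] * b[1] + a[1] * b[0];
    out[0] = re;
    out[1] = im;
#endif
}
```

With GCC or Clang, the SSE3 path is selected by compiling with `-msse3`. For example, multiplying 1+2i by 3+4i gives -5+10i on either path.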
