Any word on an optimized SSE3 app yet? SSE and SSE2 seem to be well covered by the stock app.
I know the credit will be the same, but if it's faster, the high power bill will be easier to swallow.
The stock app is working well on my SSE2 system.
Cheers
Ray
Try the Pizza@Home project, good crunching.
Copyright © 2024 Einstein@Home. All rights reserved.
Optimized S5 SSE3
As far as I know, Bernd would like to build different versions for different platforms, but he is working on the new WUs and the database at the moment.
I don't have enough time to optimize the S5 app either.
I hope you know that SSE3 optimization doesn't mean a big performance improvement (roughly +5%).
RE: I hope you know that
Doesn't SSE3 mainly speed up 3D graphics and digital signal processing?
RE: I hope you know that
Hello Akos,
just two other questions about optimized S5 clients:
1. Do you know if all the optimizations you used in the S4 apps are already included in the S5 client (or can we look forward to further improvements besides SSE3)?
2. How much do you think additional optimization at the executable level can achieve for S5 (I remember that you made, for example, special versions for small L1 caches in S4)?
I am sorry to hear that you are very short on time at the moment, but maybe you would like to have some more fun (as you used to call it :-) ) in the future?
Thank you for all your time and the big boost you gave to the project so far,
Chris397
It appears that there are still a number of type conversions in the compute-intensive parts of the S5 science app. If there is a way to eliminate those, that could speed things up.
Akos is definitely the person who knows where more speed can be found in the Einstein application ... and for all the various instruction sets.
If an optimized SSE3 app meant an improvement of 5-8% when my Pentium 4 2.8 GHz is now taking 13+ hours, that could mean an improvement of almost an hour for me. I would be happy with that.
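As a quick check of that estimate, here is a minimal sketch of the arithmetic. The function name `hours_saved` is made up for this illustration, and it assumes the quoted percentage is a reduction in total runtime.

```c
#include <assert.h>

/* Hours saved when a run of `hours` gets `percent` % shorter.
   Hypothetical helper: assumes the speedup is quoted as a
   percentage of the total runtime. */
double hours_saved(double hours, double percent)
{
    assert(hours >= 0.0 && percent >= 0.0 && percent <= 100.0);
    return hours * percent / 100.0;
}
```

For a 13-hour work unit this gives 0.65 h (about 39 minutes) at 5% and 1.04 h (about 62 minutes) at 8%, so "almost an hour" holds at the upper end of the range.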
RE: If an optimized SS3
Hell yea. I second that.
There is probably 30-50% or more to be gained elsewhere.
I replaced some truncations with SSE3 instructions, but I forgot that I don't have an SSE3 machine at the moment, so I could not test it. :-)
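For readers wondering what a "truncation" looks like in code: SSE3 adds FISTTP, an x87 store that always truncates toward zero regardless of the current rounding mode, so the FPU control word does not have to be switched around each float-to-integer cast. The sketch below only shows the kind of C code where such conversions arise. It is not taken from the Einstein@Home source, and `split_index` is a hypothetical name. Whether a compiler actually emits FISTTP for the cast depends on the target flags.

```c
#include <assert.h>

/* Split a non-negative value into an integer index and the
   fractional remainder, as a table lookup with interpolation
   might do. Hypothetical example, not the S5 app's code. */
void split_index(double x, int *index, double *frac)
{
    assert(x >= 0.0 && x < 2147483647.0);
    *index = (int)x;               /* double -> int, truncates toward zero */
    *frac  = x - (double)*index;   /* int -> double, then subtract */
}
```

Each call performs two type conversions, so in a hot loop their cost (and the cost of how the truncation is implemented) can add up.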
RE: I replaced some
Do you need testers? :) I've got two SSE3-capable machines.
Just tell me how you want them tested, if you want me to.
RE: RE: I hope you know
SSE3 is an extension of the SSE2 instructions, so these are CPU operations that also work on double-precision data. 3D graphics usually uses only single precision because it's faster and graphics doesn't need more precise calculations. But SSE3 is sometimes good for DSP: it has some special instructions that simplify certain instruction sequences.
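One example of such a special instruction is the SSE3 horizontal add (HADDPD), which sums the two doubles inside a register in a single step. With SSE2 alone, the same sum needs a shuffle or unpack followed by an add. A minimal sketch, assuming an x86 compiler with SSE3 enabled (for example `-msse3` on GCC or Clang). The function name `sum_pair` is hypothetical, and the code falls back to plain scalar C when SSE3 is not available.

```c
#include <assert.h>
#if defined(__SSE3__)
#include <pmmintrin.h>   /* SSE3 intrinsics (pulls in the SSE2 header too) */
#endif

/* Return p[0] + p[1]. */
double sum_pair(const double *p)
{
    assert(p != 0);
#if defined(__SSE3__)
    __m128d v = _mm_loadu_pd(p);     /* v = [p0, p1]             */
    __m128d s = _mm_hadd_pd(v, v);   /* s = [p0 + p1, p0 + p1]   */
    return _mm_cvtsd_f64(s);         /* extract the low lane     */
#else
    return p[0] + p[1];              /* scalar fallback without SSE3 */
#endif
}
```

The gain from a single instruction like this is small, which fits the modest overall improvement estimated earlier in the thread.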