Hi!
I put a new Windows App for testing on the Beta Test Page. It incorporates ideas from E@H user "AkosF" and some others, making it much faster than the current official App. How fast exactly is left to you to find out.
As the changes have been made on the source code level, they are probably not as efficient as the latest Assembler-level tuned Apps, but compatibility has always been a higher goal at least for the official Apps.
Give it a try. "AkosF", you may find something to tune on this one, too.
BM
Windows Beta Test App 4.50 available
Thanks a lot Bernd for the Albert 4.50 Beta (Windows Version)
I have some _very_ preliminary results from my Pentium M 2 GHz:
Albert 4.37 - relative factor 1.0
Albert 4.50 - relative factor 1.8
Turboalbert S40 - relative factor 4.1
Edit: I just noticed that 4.50 for Windows does not incorporate vectorized code, so I am adding for comparison:
Turboalbert C40 - relative factor 3.0
1.8 is really nice for sourcecode-level optimizations!
My results are based on only _one_ partially calculated WU, so please take them with a pinch of salt - they may be inaccurate.
Hi!
Wow! Really good job!
I hope you don't mind if I will do some black magic on this app.
AF
RE: I hope you don't mind
Nope. As long as you let us know what you did - the source code will always be a bit behind.
BM
RE: I hope you don't mind
Here's some white magic that didn't make it into the App in time: the innermost loop is always run 32 times (at least for the current S4 run), so you can unroll it if you like.
BM
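To illustrate the unrolling hint above: the thread does not show the actual inner loop, so the following is only a minimal sketch with a made-up loop body (a dot product). The names `N_TERMS`, `sum_rolled` and `sum_unrolled` are hypothetical; only the trip count of 32 comes from the post.

```c
#include <assert.h>
#include <stddef.h>

#define N_TERMS 32  /* trip count stated in the post for the S4 run */

/* Reference version: plain loop over the 32 terms.
   The loop body (a dot product) is an assumption for illustration only. */
float sum_rolled(const float *a, const float *b)
{
    assert(a != NULL && b != NULL);
    float acc = 0.0f;
    for (size_t k = 0; k < N_TERMS; ++k)
        acc += a[k] * b[k];
    return acc;
}

/* Unrolled by 4 with independent accumulators. Because the trip count
   is a fixed multiple of 4, no remainder loop is needed. */
float sum_unrolled(const float *a, const float *b)
{
    assert(a != NULL && b != NULL);
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    for (size_t k = 0; k < N_TERMS; k += 4) {
        acc0 += a[k]     * b[k];
        acc1 += a[k + 1] * b[k + 1];
        acc2 += a[k + 2] * b[k + 2];
        acc3 += a[k + 3] * b[k + 3];
    }
    /* Combine the partial sums once, after the loop. */
    return (acc0 + acc1) + (acc2 + acc3);
}
```

Note that the unrolled version adds the terms in a different order, so for general floating-point input the two results can differ in the last bits. Whether unrolling pays off depends on the compiler and CPU; AkosF reports about 1-2% for his own versions further down.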
I'll give this 4.50 version a try on my Windows 2003 Server machine, which is currently running 4.19 because all other versions fail to run properly.
"Chance is irrelevant. We will succeed."
- Seven of Nine
RE: I'll give this 4.50
4.19 is the BOINC client. Albert was 4.37, and now the Beta is 4.50. When the numbers are this close, it's hard to keep this information separate.
RE: RE: I hope you don't
Thanks!
But that is already done in my code (S38, S39, S40, D40). It gave about a 1-2% improvement.
Another idea: you can mix precisions in this inner loop.
That is, use the FPU to calculate only 4 values (high precision).
The other 28 values are calculated with the SSE/3DNow! reciprocal instructions.
Newton-Raphson iteration isn't needed!
The SSE/3DNow! reciprocal instruction is very fast, because it is just a look-up table read (faster than a simple addition!).
This method works because the middle values are much bigger than the others, so the lower bits of the others aren't important.
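The mixed-precision idea above could look roughly like the sketch below. It is not AkosF's code: the data layout (32 denominators whose reciprocals are summed, with the 4 dominant terms at indices 14 to 17) and the names `rcp4`, `sum_recip_exact`, `sum_recip_mixed` and `EXACT_FIRST` are assumptions for illustration. `_mm_rcp_ps` is the SSE approximate-reciprocal intrinsic (roughly 12 bits of precision); on targets without SSE the sketch falls back to an ordinary division.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_RCP 1
#else
#define HAVE_SSE_RCP 0
#endif

#define N_TERMS     32  /* inner-loop length from the thread */
#define N_EXACT     4   /* terms computed at full precision */
#define EXACT_FIRST 14  /* assumed position of the dominant middle terms */

/* Approximate reciprocals of four floats (no Newton-Raphson refinement). */
static void rcp4(const float *x, float *out)
{
#if HAVE_SSE_RCP
    _mm_storeu_ps(out, _mm_rcp_ps(_mm_loadu_ps(x)));
#else
    for (int i = 0; i < 4; ++i)   /* portable fallback, not approximate */
        out[i] = 1.0f / x[i];
#endif
}

/* Reference: every reciprocal at full (double) precision. */
double sum_recip_exact(const float *d)
{
    assert(d != NULL);
    double sum = 0.0;
    for (size_t k = 0; k < N_TERMS; ++k)
        sum += 1.0 / (double)d[k];
    return sum;
}

/* Mixed: the 4 middle terms exactly, the other 28 approximately.
   For simplicity all 8 blocks go through rcp4 and the 4 middle
   approximate lanes are simply discarded. */
double sum_recip_mixed(const float *d)
{
    assert(d != NULL);
    float approx[N_TERMS];
    for (size_t k = 0; k < N_TERMS; k += 4)
        rcp4(d + k, approx + k);

    double sum = 0.0;
    for (size_t k = 0; k < N_TERMS; ++k) {
        if (k >= EXACT_FIRST && k < EXACT_FIRST + N_EXACT)
            sum += 1.0 / (double)d[k];   /* dominant term, full precision */
        else
            sum += (double)approx[k];    /* small term, ~12-bit reciprocal */
    }
    return sum;
}
```

The error of the approximate terms is scaled by their small share of the total, which is why the sum stays accurate as long as the middle terms really dominate. If they do not, the result degrades to the precision of the approximate reciprocal.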
RE: RE: I'll give this
Ahh - quite true. Thank you for the correction - I do still sometimes confuse the Science App with the BOINC Client. :o
I do still wish that I could find a newer client version that would work with Server 2003. There is a thread from some time back about this issue, which one other gentleman had noticed as well, but no resolution as of yet. Oh well.. as long as the BOINC 4.19 client results continue to be accepted...
"Chance is irrelevant. We will succeed."
- Seven of Nine
RE: I do still wish that I
What kind of problems did you have? I have a pile of Windows Server 2003 (both Standard and Enterprise; some 32 and some 64 bit) machines that all run BOINC clients version 5.x. They used to run 5.2.13 (still the recommended version) and now run 5.3.31 (development one).
A side note: I always run BOINC CC as a service...
Metod ...
Hmm.. I haven't tried running BOINC as a service on that server - I'll give that a shot. Thanks! :)
"Chance is irrelevant. We will succeed."
- Seven of Nine