S5R2

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

RE: Akos, what about the

Message 62321 in response to message 62313

Quote:
Akos, what about the S5R2 hot loop and SSE2? I know SSE1 and 3DNow brought a huge speed increase to the former applications, so the code seems to be fine for vectorization. These two work on 64 Bit, whereas you can do 128 Bits with SSE2. There's just not much benefit in doing so with P3/4/M and AXP/64. However, Core 2 D/Q and K10 would benefit greatly from such a codepath as they can process 128 Bit vectors in one cycle (OK, send and retire one such command per clock per unit :) instead of two cycles for the former CPUs. I see great potential for optimizations here, which is hardly used, if at all, by any BOINC app today (or did I miss some fancy new project?)


The 'main' hot loop is designed for single precision (unlike S5R1), and the double-precision parts are similar to the S5R1 code, where SSE2 code was tried out but wasn't faster (AXP, A64, P4, PM). So I don't think the code will be optimised for SSE2. I know the C2D processors would be faster, but not significantly. There is no time for this at the moment.

CPUs with a 64-bit SSE engine execute an arithmetic instruction (ADD, MUL, SUB) in 5 cycles. CPUs with a 128-bit-wide SSE engine (C2D, K10?) execute them in 4 cycles. So the ratio is 5:4, not 2:1. Of course, that is still a 1-cycle difference.
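
To make the arithmetic concrete, here is a minimal Python sketch that takes the cycle counts quoted above at face value (5 and 4 cycles are the figures from this post, not measurements) and works out the resulting ratio. The function name `latency_ratio` and the constant names are made up for illustration.

```python
from fractions import Fraction


def latency_ratio(old_cycles: int, new_cycles: int) -> Fraction:
    """Return how many times faster one instruction completes (old / new)."""
    return Fraction(old_cycles, new_cycles)


# Cycle counts as quoted in the post (assumed, not measured here).
CYCLES_64BIT_ENGINE = 5
CYCLES_128BIT_ENGINE = 4

ratio = latency_ratio(CYCLES_64BIT_ENGINE, CYCLES_128BIT_ENGINE)
cycles_saved = CYCLES_64BIT_ENGINE - CYCLES_128BIT_ENGINE

print(f"ratio = {ratio.numerator}:{ratio.denominator} ({float(ratio):.2f}x), "
      f"saving {cycles_saved} cycle per instruction")
```

On these numbers the best case per instruction is a 1.25x gain, well short of the 2x one might expect from doubling the vector width.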

Metod, S56RKO
Joined: 11 Feb 05
Posts: 135
Credit: 826469244
RAC: 86719

RE: Besides old, slow

Message 62322 in response to message 62320

Quote:
Besides old, slow machines having a problem with that, so
do newer machines that run a low resource allocation to e@h.
With BIG units coming in, BOINC has to prioritize them
up, with the result of not honoring my chosen resource
shares, and cutting heavily into the time allocated to
my primary project, s@h.

You are aware that BOINC does scheduling on two layers, right? One is when running already downloaded WUs and the other is when deciding when to fetch another WU. If E@H runs into deadline problems and thus consumes more than its share of CPU time, it'll build up long-term debt, which in turn will prevent it from fetching another E@H WU. This effectively means that your resource share split will be honoured over a longer time scale (such as on a monthly basis).

But you are right when saying your resource shares won't be matched daily ...
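
A toy simulation may help show why the split evens out over time. This is only a sketch under simplifying assumptions (one CPU, whole days, one project in deadline trouble for the first two days); it is not BOINC's actual scheduler code. The name `simulate_shares` and the `balance` variable, a stand-in for long-term debt with an assumed sign convention, are hypothetical.

```python
def simulate_shares(days: int, share: float, rush_days: int) -> list:
    """Toy model of one project's cumulative CPU share on a single CPU.

    For the first `rush_days` the project is in deadline trouble and takes
    the whole day. Afterwards it only gets the hours it is still owed, so
    it sits idle (and fetches no new work) until its overuse is paid back.
    """
    hours_per_day = 24.0
    balance = 0.0   # hours owed to the project; negative means it overused
    used = 0.0      # total hours the project has actually run
    history = []
    for day in range(days):
        balance += share * hours_per_day          # today's entitlement
        if day < rush_days:
            run = hours_per_day                   # deadline pressure wins
        else:
            run = min(hours_per_day, max(0.0, balance))
        balance -= run
        used += run
        history.append(used / (hours_per_day * (day + 1)))
    return history


# E@H with a 20% resource share, hogging the CPU for the first 2 days.
history = simulate_shares(days=30, share=0.2, rush_days=2)
for day in (1, 5, 10, 30):
    print(f"after day {day:2d}: {history[day - 1]:.2f} of CPU time")
```

In this model the project's cumulative share starts at 100%, is still well above its allocation after a few days, and settles back to about the configured 20% once the overuse has been paid back.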

Metod ...

Odysseus
Joined: 17 Dec 05
Posts: 372
Credit: 20568311
RAC: 6019

RE: And I would have to

Message 62323 in response to message 62310

Quote:
And I would have to agree with you on the value of the users running 1.0 -1.5 or slower chips, on dialup, and would be very interested in how the completed work would "breakdown" between the various CPU speeds. I think most people would be surprised. But I gotta agree with you that the backbone of many of these programs is made up of people with older machines that work perfectly well, but they are not being used after they are replaced with the latest and greatest screemin' gamer box. They need to take a close look before they run too many of these folks off for good.


Even if the slow machines themselves are considered expendable, i.e. their contributions would not be greatly missed, I hope their owners aren’t. I expect that any participants who may be forced to withdraw will be less inclined to attach their new or upgraded systems to the project in the future. So accommodating the ‘low-end’ users, to earn their loyalty—rather than sneering at them—would probably pay off in the long term.

Annika
Joined: 8 Aug 06
Posts: 720
Credit: 494410
RAC: 0

You're 100% correct there. Still, there are technical necessities which sometimes force a project to have rather high system requirements. Over at CPDN, for example, they demand 512 MB of memory and at least a 1.6 GHz CPU, and believe me, they do need it. When I attached my old notebook (1.3 GHz and 496 MB of memory, so just a little below the recommended minimum configuration) the whole thing was so unstable that I gave it up after a few weeks. I never got more than 8 hours into my model. (And no, it wasn't anything about the system, which cooperated with just about every other BOINC project just fine, nor was it my fault, cause on my more powerful desktop I didn't get any problems. My notebook was, very simply, not powerful enough.) And I don't think they do that to annoy crunchers; it's simply because the app needs it. So, if the current science run and the hardware used for the project on the server side force the developers to make the WUs larger, I don't think that can really be helped. Of course, it is worth considering whether they can extend the deadlines a bit, which in turn would probably mean higher "pendings" for people with medium to fast boxes, but then, having pending credit is not sooooo bad cause you always know you'll get it eventually.

Ananas
Joined: 22 Jan 05
Posts: 272
Credit: 2500681
RAC: 0

My P3s/1266 Tualatins work fine with HadCM3, one dual (1GB RAM) and one single (256MB RAM). I had a full run lately :-)

I wish all projects would have optional "no nonsense" clients though, with only the math core and no graphics and stuff.

Odysseus
Joined: 17 Dec 05
Posts: 372
Credit: 20568311
RAC: 6019

RE: I wish all projects

Message 62326 in response to message 62325

Quote:
I wish all projects would have optional "no nonsense" clients though, with only the math core and no graphics and stuff.


If you install BOINC as a system service, the graphics will be disabled. (Systems with video cards that do their own processing won’t get a lot of speed benefit from this.)

J Langley
Joined: 30 Dec 05
Posts: 50
Credit: 58338
RAC: 0

RE: So i don't think that

Message 62327 in response to message 62321

Quote:
So I don't think the code will be optimised for SSE2. I know the C2D processors would be faster, but not significantly. There is no time for this at the moment.

How about Penryn's SSE4? That should be out late this year / early next year. Are the new instructions of any use to E@H?

Ananas
Joined: 22 Jan 05
Posts: 272
Credit: 2500681
RAC: 0

RE: RE: I wish all

Message 62328 in response to message 62326

Quote:
Quote:
I wish all projects would have optional "no nonsense" clients though, with only the math core and no graphics and stuff.

If you install BOINC as a system service, the graphics will be disabled. (Systems with video cards that do their own processing won’t get a lot of speed benefit from this.)

Not using it isn't the real problem; not loading (and initializing) the .DLL / .SO stuff would be a lot better. The lite and graphical versions could even be in the same program, if the shared library loading were under program control instead of linker control.

p.s.: linking against a module with all dummy functions in it would do that job too of course ;-)
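
As a rough illustration of both ideas (loading under program control, and a module of dummy functions), here is a Python analogy. A native app would do this with `dlopen` or `LoadLibrary`; this sketch uses `importlib` instead. The names `NullGraphics`, `load_graphics` and the module name `hypothetical_gfx` are invented for the example, and a real graphics module would need to expose the same `init`/`render` calls.

```python
import importlib


class NullGraphics:
    """Stand-in 'dummy module': every graphics call is a no-op."""

    def init(self) -> None:
        pass

    def render(self, data) -> None:
        pass


def load_graphics(enabled: bool, module_name: str = "hypothetical_gfx"):
    """Return a graphics backend, importing the real module only on request."""
    if not enabled:
        return NullGraphics()          # nothing is loaded or initialized
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return NullGraphics()          # library missing: fall back to the stub


# "Lite" mode: the math core can call the graphics API without loading it.
gfx = load_graphics(enabled=False)
gfx.init()
gfx.render([1.0, 2.0, 3.0])
print(type(gfx).__name__)
```

The calling code stays the same in both modes; only the decision of what to load moves from link time to run time.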

F. Prefect
Joined: 7 Nov 05
Posts: 135
Credit: 1016868
RAC: 0

RE: RE: RE: I wish all

Message 62329 in response to message 62328

Quote:
Quote:
Quote:
I wish all projects would have optional "no nonsense" clients though, with only the math core and no graphics and stuff.

If you install BOINC as a system service, the graphics will be disabled. (Systems with video cards that do their own processing won’t get a lot of speed benefit from this.)

Not using it isn't the real problem; not loading (and initializing) the .DLL / .SO stuff would be a lot better. The lite and graphical versions could even be in the same program, if the shared library loading were under program control instead of linker control.

p.s.: linking against a module with all dummy functions in it would do that job too of course ;-)

I'm an aging baby boomer who graduated with a minor in physics and astronomy many, many years ago, and still like to take my C-8 out at least once a month, but sometimes I wish to h*ll I could figure out what you youngsters are talkin' about.. I just got to a point where I could put together a machine using the first generation Athlons, and now I buy a board, chip, and accessories for the latest dual, and everything somehow looks kind of foreign :(

Gary

In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.....Douglas Adams

ErichZann
Joined: 11 Feb 05
Posts: 120
Credit: 81582
RAC: 0

Hm, the first S5R2 WU I got needed about 12 hours to complete. The one I have now is at 2.5 hours and tells me: 26 hours to go - wow... That's really long, and for someone who doesn't crunch too often the 14-day limit could become a problem at this length...

(on an Athlon 64 3500+ at 2500 MHz)
