Strange Behaviour of BOINC v 4.19 with e@h 4.78 and cpdn 4.03 on Dual PowerMac

Stefan Urbat
Joined: 9 Feb 05
Posts: 16
Credit: 147672
RAC: 0
Topic 187580

When BOINC launches, it runs three instances instead of two: both CPDN work units and one e@h process together, all at the same priority. The resource share I set is 400 for CPDN and the default 100 for Einstein@Home, so the behaviour I saw shortly after joining should be the correct one: one CPU always runs CPDN while the other switches hourly between e@h and the second CPDN work unit. But as mentioned, something strange is happening (by the way, everything went smoothly while I was running CPDN alone since summer 2004). The messages hint at some kind of confusion over pausing, relaunching and resuming the clients; as mentioned, all three are running initially:

Launch:

2005-02-13 12:12:30 [---] Starting BOINC client version 4.19 for powerpc-apple-darwin
2005-02-13 12:12:30 [climateprediction.net] Project prefs: using your defaults
2005-02-13 12:12:30 [Einstein@Home] Project prefs: using your defaults
2005-02-13 12:12:30 [climateprediction.net] Host ID is 2490
2005-02-13 12:12:30 [Einstein@Home] Host ID is 11089
2005-02-13 12:12:30 [---] General prefs: from climateprediction.net (last modified 2005-02-11 07:07:29)
2005-02-13 12:12:30 [---] General prefs: using your defaults
2005-02-13 12:12:30 [climateprediction.net] Resuming computation for result 1y50_100111826_1 using hadsm3 version 4.03
2005-02-13 12:12:30 [climateprediction.net] Deferring computation for result 418n_000210119_0
2005-02-13 12:12:30 [Einstein@Home] Resuming computation for result H1_0429.9__0430.0_0.1_T05_Test02_4 using einstein version 4.78
2005-02-13 12:12:30 [climateprediction.net] Restarting result 418n_000210119_0 using hadsm3 version 4.03
2005-02-13 12:12:30 [Einstein@Home] Pausing result H1_0429.9__0430.0_0.1_T05_Test02_4 (left in memory)
Starting model in /Volumes/ufs3/cpdn/projects/climateprediction.net...
Created shared memory region key = 24305
Env Used=DYLD_LIBRARY_PATH=/Volumes/ufs3/cpdn/projects/climateprediction.net:../
Starting model in /Volumes/ufs3/cpdn/projects/climateprediction.net...
Created shared memory region key = 24620
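
To make the launch messages easier to compare with what is actually running, here is a minimal sketch that replays them and tracks which results the client claims are active. The mapping of "Resuming"/"Restarting" to running and "Pausing"/"Deferring" to not running is an assumption about the message wording, not something taken from the BOINC source, and `claimed_running` is a hypothetical helper name.

```python
import re

# Assumption: these verbs mean "task is running" / "task is not running".
RUNNING = ("Resuming", "Restarting")
STOPPED = ("Pausing", "Deferring")

# Matches lines such as:
#   2005-02-13 12:12:30 [Einstein@Home] Pausing result H1_... (left in memory)
LINE = re.compile(
    r"^\S+ \S+ \[(?P<project>[^\]]+)\] "
    r"(?P<verb>Resuming|Restarting|Pausing|Deferring)"
    r"(?: computation for)? result (?P<result>\S+)"
)

def claimed_running(log_lines):
    """Return the set of results the messages claim are running at the end."""
    running = set()
    for line in log_lines:
        m = LINE.match(line)
        if m is None:
            continue  # prefs lines, model output, and so on
        if m.group("verb") in RUNNING:
            running.add(m.group("result"))
        elif m.group("verb") in STOPPED:
            running.discard(m.group("result"))
    return running

# The task-related lines from the launch log above, copied verbatim.
LAUNCH_LOG = [
    "2005-02-13 12:12:30 [climateprediction.net] Resuming computation for result 1y50_100111826_1 using hadsm3 version 4.03",
    "2005-02-13 12:12:30 [climateprediction.net] Deferring computation for result 418n_000210119_0",
    "2005-02-13 12:12:30 [Einstein@Home] Resuming computation for result H1_0429.9__0430.0_0.1_T05_Test02_4 using einstein version 4.78",
    "2005-02-13 12:12:30 [climateprediction.net] Restarting result 418n_000210119_0 using hadsm3 version 4.03",
    "2005-02-13 12:12:30 [Einstein@Home] Pausing result H1_0429.9__0430.0_0.1_T05_Test02_4 (left in memory)",
]

if __name__ == "__main__":
    for result in sorted(claimed_running(LAUNCH_LOG)):
        print(result)
```

Read this way, the messages claim that only the two CPDN results are running after launch and that the e@h result is paused, while three processes are in fact computing.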

Claimed switch, which effectively stops one CPDN instance instead:

2005-02-13 13:12:30 [climateprediction.net] Pausing result 418n_000210119_0 (left in memory)
2005-02-13 13:12:30 [Einstein@Home] Resuming result H1_0429.9__0430.0_0.1_T05_Test02_4 using einstein version 4.78

Finally, a correct switch to CPDN:

2005-02-13 14:12:30 [climateprediction.net] Resuming result 418n_000210119_0 using hadsm3 version 4.03
2005-02-13 14:12:30 [Einstein@Home] Pausing result H1_0429.9__0430.0_0.1_T05_Test02_4 (left in memory)
Resuming CPDN!

Since then all seems to be fine...

Any explanation for this?
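
For reference, the expected split of the two CPUs can be worked out from the resource shares. This is a minimal sketch that assumes CPU time is divided in strict proportion to resource share; whether BOINC 4.19 schedules exactly this way is not confirmed here, and `cpu_allocation` is a hypothetical helper name.

```python
def cpu_allocation(shares, ncpus):
    """Split ncpus among projects in proportion to their resource shares."""
    total = sum(shares.values())
    return {name: ncpus * share / total for name, share in shares.items()}

if __name__ == "__main__":
    # Resource shares from the post: 400 for CPDN, default 100 for Einstein@Home.
    alloc = cpu_allocation(
        {"climateprediction.net": 400, "Einstein@Home": 100}, ncpus=2
    )
    for name, cpus in alloc.items():
        print(f"{name}: {cpus:.1f} CPUs")
```

Under that assumption CPDN is entitled to 1.6 CPUs and Einstein@Home to 0.4, so at most two tasks should be computing at any moment, never three.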

C
Joined: 9 Feb 05
Posts: 94
Credit: 189446
RAC: 0


Hi, Stefan.

I'm having similar problems - you can read the background here. I hope Dr. Allen takes a look at this thread on Monday - I'm guessing it's a generic problem for the Mac.

C

Stefan Urbat
Joined: 9 Feb 05
Posts: 16
Credit: 147672
RAC: 0


Message 2933 in response to message 2932

> I'm having similar problems - you can read the background here (http://einsteinathome.org/node/187558).
>
Thanks for the link; I have read most of it. I hadn't found it before.

> I'm guessing it's a generic
> problem for the Mac.
>
That may very well be the case, or maybe not. My client is of course running in keep-in-memory mode, to avoid wasting too much work at the relatively frequent switches.

What is interesting: unlike cpdn and seti, the Einstein@home BOINC client seems to be a multithreaded or (at least on GNU/Linux) multi-process application; I guess that is because only the recent 2.6 Linux kernel offers fully POSIX-conformant threading support. Could it have something to do with that? cpdn, for example, uses a control process of its own to launch the main crunching process, but not several threads/processes for the crunching task itself. Maybe the BOINC client isn't able to cope correctly with such a multithreaded application?

But I haven't studied the BOINC client code in depth, so I'm just guessing, I have to admit. In any case, my suspicion is faulty signal handling between BOINC and the Einstein processes/threads; that seems fairly obvious, since I can manually freeze (stop) and continue the Einstein@h cruncher when I choose the right process/thread.
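
The manual freeze and continue can be sketched as follows. This assumes that "freeze (stop)" and "continue" correspond to the standard POSIX job-control signals SIGSTOP and SIGCONT; the post does not name the signals, and nothing here says which signals the BOINC client itself sends. `freeze` and `thaw` are hypothetical helper names, and the PID has to be that of the process actually doing the crunching.

```python
import os
import signal
import sys

def freeze(pid):
    """Suspend the process with the given PID (SIGSTOP cannot be caught or ignored)."""
    os.kill(pid, signal.SIGSTOP)

def thaw(pid):
    """Let a previously stopped process continue."""
    os.kill(pid, signal.SIGCONT)

if __name__ == "__main__":
    # Usage: python freeze.py stop|cont PID
    action, pid = sys.argv[1], int(sys.argv[2])
    if action == "stop":
        freeze(pid)
    else:
        thaw(pid)
```

If stopping one process of the group halts the crunching while stopping another does not, that would fit the idea that a pause request can reach the wrong process.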

Especially here on GNU/Linux it is rather obvious to me that the crunching process is the last child in a cascade of three processes, all carrying the same name, which means they represent process-simulated threads (by the way, that machine is running a 2.6.10 kernel and could easily use true threading if desired, but that is just a note).
