GNU/Linux S5R3 "power users" App 4.21 available

Metod, S56RKO
Joined: 11 Feb 05
Posts: 135
Credit: 809775752
RAC: 63259

Message 76352 in response to message 76351

Quote:
I'm surprised that BOINC isn't considering the upgraded PC as a completely new machine (and I thought you could not merge machines with different CPUs).

As long as you re-use the host ID, BOINC will happily accept any change you may throw at it (including a platform or OS change).

The BOINC server only assigns a new ID to a host when the host newly attaches to the project (and the server's heuristic can't find a long-idle host with the same characteristics whose ID it can re-use) or when the RPC sequence numbers get out of sync (e.g. when the server's idea of the number of connection attempts is higher than the client's).
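
To make these rules concrete, here is a toy Python model of the decision. It is only a sketch of the behaviour as described in this post, not the actual BOINC scheduler code: the names `HostRecord`, `resolve_host_id` and `rpc_seqno`, and the use of a plain description string to stand in for "same characteristics", are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HostRecord:
    host_id: int
    rpc_seqno: int      # server's count of connection attempts (hypothetical field)
    description: str    # stand-in for CPU/OS characteristics (hypothetical field)
    idle: bool = False  # True if the host has been idle for a long time


def resolve_host_id(claimed_id: Optional[int],
                    client_seqno: int,
                    description: str,
                    hosts: Dict[int, HostRecord],
                    next_id: int) -> int:
    """Return the host ID this toy server would use for a request."""
    if claimed_id is None:
        # Newly attaching host: try to re-use a long-idle host that looks the same.
        for record in hosts.values():
            if record.idle and record.description == description:
                return record.host_id
        return next_id

    record = hosts.get(claimed_id)
    if record is None:
        # Unknown ID: treat like a new host in this simplified model.
        return next_id

    if record.rpc_seqno > client_seqno:
        # Sequence numbers out of sync: the server has seen more requests
        # than the client remembers, so a new ID is assigned.
        return next_id

    # Same ID re-used: hardware or OS changes are accepted as they are.
    return claimed_id
```

Note that the description is only consulted for a newly attaching host; a client that presents a known ID with a matching sequence number keeps that ID even if its CPU has changed.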

I once developed a trick to reduce the number of hosts on my accounts: for each orphaned host I did some magic in client_state.xml and then did a project update. This way I ended up with the same number of orphaned hosts, but all of them had the same description. The BOINC server then allowed me to merge them into one.

Metod ...

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 689012770
RAC: 211915

Message 76353 in response to message 76352

Quote:
Quote:
I'm surprised that BOINC isn't considering the upgraded PC as a completely new machine (and I thought you could not merge machines with different CPUs).

As long as you re-use the host ID, BOINC will happily accept any change you may throw at it (including a platform or OS change).

The BOINC server only assigns a new ID to a host when the host newly attaches to the project (and the server's heuristic can't find a long-idle host with the same characteristics whose ID it can re-use) or when the RPC sequence numbers get out of sync (e.g. when the server's idea of the number of connection attempts is higher than the client's).

I once developed a trick to reduce the number of hosts on my accounts: for each orphaned host I did some magic in client_state.xml and then did a project update. This way I ended up with the same number of orphaned hosts, but all of them had the same description. The BOINC server then allowed me to merge them into one.

Thanks for the explanations. Now that you say it, I should have known better: I once upgraded from an Athlon XP 1800+ to something sold as an "AMD Geode 2001+", which really is a relabelled AMD Athlon XP 2200+ (but was cheaper, go figure :-) ). And the host ID didn't change.

CU
Bikeman

Donald A. Tevault
Joined: 17 Feb 06
Posts: 439
Credit: 73516529
RAC: 0

Message 76354 in response to message 76341

Quote:

Currently my top priority items for Einstein@home Apps are:

* fix the 'signal 11' problem of the Linux App (4.20/21). The Linux Apps might be fast, but currently Linux has the highest failure rate of all platforms, which is mainly due to these 'signal 11' errors.

BM

@Bernd

Here are some of my observations about "signal 11".

With my machines, it only seems to happen when there are communication problems. (Unfortunately, that seems to be happening more and more often lately with my ISP.) It also seems to happen only on machines with the newer BOINC clients. The old 5.8.x clients don't seem to be affected, whereas the 5.10.x clients are. (Of course, that could just be a matter of timing, which means that my theory could be wrong.)

So I'm wondering whether something in BOINC, rather than the Einstein app, could be causing the problem.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 689012770
RAC: 211915

4.21 runs just fine on my Pentium M (Banias) notebook. It will take some more units to say something definitive about speed but about 20 - 30 % speedup should be realistic on this particular machine.

FalconFly
Joined: 16 Feb 05
Posts: 191
Credit: 15650710
RAC: 0

Message 76356 in response to message 76354

Quote:

It also seems to happen only on machines with the newer BOINC clients. The old 5.8.x clients don't seem to be affected, whereas the 5.10.x clients are. (Of course, that could just be a matter of timing, which means that my theory could be wrong.)

So I'm wondering whether something in BOINC, rather than the Einstein app, could be causing the problem.

I've heard a lot of talk in various project forums about BOINC 5.10.x screwing up badly when it encounters connection problems, i.e. erroneously killing all results in progress, and several people have started avoiding 5.10.x until the bug is fixed. (I haven't encountered this myself, but my connection is usually rock stable.)

BOINC Trac currently doesn't list anything like it, though, so all of this remains unverified...

Matt LO
Joined: 7 Feb 06
Posts: 44
Credit: 386731
RAC: 0

Message 76357 in response to message 76356

Quote:
I've heard a lot of talk in various project forums about BOINC 5.10.x screwing up badly when it encounters connection problems, i.e. erroneously killing all results in progress, and several people have started avoiding 5.10.x until the bug is fixed. (I haven't encountered this myself, but my connection is usually rock stable.)

I've seen this behaviour. My internet connection was down, and all my tasks errored out. I am running 5.10.8 on Ubuntu.

Just put the new app on. It's crunchin'...

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

Message 76358 in response to message 76356

Quote:
Quote:

It also seems to happen only on machines with the newer BOINC clients. The old 5.8.x clients don't seem to be affected, whereas the 5.10.x clients are. (Of course, that could just be a matter of timing, which means that my theory could be wrong.)

So I'm wondering whether something in BOINC, rather than the Einstein app, could be causing the problem.

I've heard a lot of talk in various project forums about BOINC 5.10.x screwing up badly when it encounters connection problems, i.e. erroneously killing all results in progress, and several people have started avoiding 5.10.x until the bug is fixed. (I haven't encountered this myself, but my connection is usually rock stable.)

BOINC Trac currently doesn't list anything like it, though, so all of this remains unverified...

I'll add that to my list of reasons to keep using 5.8.16 ;-)

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4273
Credit: 245208851
RAC: 13214

Message 76359 in response to message 76356

Quote:
Quote:

It also seems to happen only on machines with the newer BOINC clients. The old 5.8.x clients don't seem to be affected, whereas the 5.10.x clients are. (Of course, that could just be a matter of timing, which means that my theory could be wrong.)

So I'm wondering whether something in BOINC, rather than the Einstein app, could be causing the problem.

I've heard a lot of talk in various project forums about BOINC 5.10.x screwing up badly when it encounters connection problems, i.e. erroneously killing all results in progress, and several people have started avoiding 5.10.x until the bug is fixed. (I haven't encountered this myself, but my connection is usually rock stable.)

There are some bugs in 5.10.x that have prevented me from using it. At least the 'truncate stderr' bug has been fixed by now. I still see the first tasks after a new installation error out in some cases, apparently because the App is started before all the files have been downloaded completely.

Anyway, I'm pretty sure that there is still a cause of the segfault left in the Einstein App (either in our code or in the BOINC library that's linked into the App). Other problems could be caused by the Client, but hardly that one. And roughly 50% of the 'signal 11' errors I've seen in the DB are from 5.8 Clients.

I have access to a cluster of machines that shows this problem rather frequently. However, it's pretty slow by today's standards (PIII), tasks run quite long there, and until now the problem hasn't appeared under a debugger. I'd guess I won't find it before I'm away for Xmas.

Annika,
if you still see this problem running BOINC as an ordinary user that is logged in, please try:
- touch a file "EAH_DEBUG_DDD" in the BOINC directory
- each time a new task is started, the App will launch the DDD debugger attached to it.
- press the "Cont" button on the "Command toolbar" or type "cont" at the "(gdb)" prompt in the main window.
- If the App catches a signal 11 (shown in the gdb window), type "bt" and post the output (stack backtrace) here. "bt full" gives even more informative but much, much longer output.
- Alternatively you can save a corefile by typing "gcore" (at the "(gdb)" prompt) and compress & upload it somewhere to make it available to me (probably way too big for eMail).
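
The steps above could be sketched in Python roughly as follows. This is only an illustration: the function name `enable_ddd_debugging` and the `GDB_STEPS` list are made up for this example, and the location of the BOINC directory is an assumption that depends on your installation, so it is passed in by the caller. Only the marker file name and the gdb commands come from the instructions above.

```python
from pathlib import Path

# gdb commands named in the steps above, in the order they would be typed
# at the "(gdb)" prompt, each with a short reminder of what it is for.
GDB_STEPS = [
    ("cont", "let the App continue once DDD has attached"),
    ("bt", "print the stack backtrace after a signal 11"),
    ("bt full", "longer backtrace that also shows local variables"),
    ("gcore", "alternatively, save a core file to compress and upload"),
]


def enable_ddd_debugging(boinc_dir):
    """Create the EAH_DEBUG_DDD marker file in the given BOINC directory.

    Equivalent to running `touch EAH_DEBUG_DDD` inside that directory.
    """
    marker = Path(boinc_dir) / "EAH_DEBUG_DDD"
    marker.touch(exist_ok=True)
    return marker


def print_gdb_cheat_sheet():
    """Print the commands to type at the (gdb) prompt."""
    for command, purpose in GDB_STEPS:
        print(f"{command:8} {purpose}")
```

Calling `enable_ddd_debugging("/path/to/your/BOINC")` with your real BOINC directory creates the marker file; deleting the file again should stop the debugger from being launched for new tasks, assuming the App checks for it at each task start as described.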

BM

Annika
Joined: 8 Aug 06
Posts: 720
Credit: 494410
RAC: 0

Okay, the last two points were new to me ;-) I did all the rest but had no idea what to do with the output. Thanks, I'll do that.

Donald A. Tevault
Joined: 17 Feb 06
Posts: 439
Credit: 73516529
RAC: 0

Well, it looks like I can finally make a judgement call on whether the power-app is working for me.

The last several 4.21-app results from my 6000+ machine have all had completion times in the 27,000 second range. Under previous apps, my completion times have never been shorter than in the 29,000 second range. So, it appears that the SSE2 loop really is having an effect.
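
For a rough sense of scale, the two figures above can be turned into a percentage. This is a back-of-the-envelope sketch in Python using the rounded 29,000 and 27,000 second values as stand-ins for the actual run times, so the result is only approximate; the function name is made up for this example.

```python
def time_saved_percent(old_seconds, new_seconds):
    """Percentage reduction in run time going from old to new."""
    return (old_seconds - new_seconds) / old_seconds * 100.0


old_run = 29_000  # approximate best time under previous apps, in seconds
new_run = 27_000  # approximate time with the 4.21 app, in seconds

saved = time_saved_percent(old_run, new_run)
throughput_ratio = old_run / new_run

print(f"run time reduced by about {saved:.1f}%")          # about 6.9%
print(f"throughput ratio of about {throughput_ratio:.3f}")  # about 1.074
```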

But, the jury is still out on my P-4 Xeon. Hopefully, I'll be able to evaluate it in the next few days.
