GNU/Linux S5R3 "power users" App 4.21 available

KSMarksPsych
Moderator
Joined: 15 Oct 05
Posts: 2702
Credit: 4090227
RAC: 0

RE: RE: They are running

Message 76402 in response to message 76401

Quote:
Quote:
They are running different versions of the core client. Windows is 5.10.30 (I think) installed as a service and Linux is 5.10.21 installed via rpm as a system daemon.

Does this run BOINC and the App as root? (Try "ps -ef | grep einstein" or similar.)

BM

As far as I know (and Eric Myers can confirm for you) when installing as a rpm, it runs as its own user. The rpm creates a user boinc and the home directory is /var/lib/boinc.

Here's the output of ps, I hope it makes sense to you because it doesn't make sense to me.

[Kathryn@Galaxy ~]$ ps -ef | grep einstein
boinc 5585 5581 96 04:15 ? 04:36:02 einstein_S5R3_4.21_i686-pc-linux-gnu --method=0 --Freq=724.755964345 --FreqBand=0.0161398745876 --dFreq=6.71056161393e-06 --f1dot=-1.58548959919e-09 --f1dotBand=1.74403855911e-09 --df1dot=3.88447721545e-10 --skyGridFile=skygrid_0730Hz_S5R3.dat --numSkyPartitions=109 --partitionIndex=50 --DataFiles1=h1_0724.60_S5R2 l1_0724.60_S5R2 h1_0724.65_S5R2 l1_0724.65_S5R2 h1_0724.70_S5R2 l1_0724.70_S5R2 h1_0724.75_S5R2 l1_0724.75_S5R2 h1_0724.80_S5R2 l1_0724.80_S5R2 h1_0724.85_S5R2 l1_0724.85_S5R2 h1_0724.90_S5R2 l1_0724.90_S5R2 --tStack=90000 --nStacksMax=84 --pixelFactor=0.500 --nf1dotRes=1 --ephemE=earth --ephemS=sun --nCand1=10000 -o Hough.out --gridType=3 --useWeights=0 --printCand1 --semiCohToplist -d1 --WUfpops=2.0406e+14
boinc 6174 5581 41 06:32 ? 01:02:37 einstein_S5R3_4.21_i686-pc-linux-gnu --method=0 --Freq=724.755964345 --FreqBand=0.0161398745876 --dFreq=6.71056161393e-06 --f1dot=-1.58548959919e-09 --f1dotBand=1.74403855911e-09 --df1dot=3.88447721545e-10 --skyGridFile=skygrid_0730Hz_S5R3.dat --numSkyPartitions=109 --partitionIndex=40 --DataFiles1=h1_0724.60_S5R2 l1_0724.60_S5R2 h1_0724.65_S5R2 l1_0724.65_S5R2 h1_0724.70_S5R2 l1_0724.70_S5R2 h1_0724.75_S5R2 l1_0724.75_S5R2 h1_0724.80_S5R2 l1_0724.80_S5R2 h1_0724.85_S5R2 l1_0724.85_S5R2 h1_0724.90_S5R2 l1_0724.90_S5R2 --tStack=90000 --nStacksMax=84 --pixelFactor=0.500 --nf1dotRes=1 --ephemE=earth --ephemS=sun --nCand1=10000 -o Hough.out --gridType=3 --useWeights=0 --printCand1 --semiCohToplist -d1 --WUfpops=2.0406e+14
Kathryn 6659 6576 0 09:01 pts/1 00:00:00 grep einstein

I also had a unit finish overnight without issue, and another one has about an hour to go. The finished one is awaiting validation, as are the first couple I finished.

Kathryn :o)

Einstein@Home Moderator

Melvin Bobo Slacke
Joined: 22 Jan 05
Posts: 32
Credit: 1903599
RAC: 4872

Damn :(
Lost another 35 units when network was cut off, hostid=1027762
Boinc 5.10.21

I erased everything and restarted with Boinc 5.2.13. Guess I will do that with the other boxen too.

OK, guess I should have aborted the ones "In Progress" first, sorry..

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: I erased everything

Message 76404 in response to message 76403

Quote:

I erased everything and restarted with Boinc 5.2.13. Guess I will do that with the other boxen too.

What is this with "boxen"? LOL. I asked someone else over on LHC what the plural of "moose" was, and they replied with the "correct answer", which is "mooxen" (or "moosen"), according to an old Brian Regan comedy skit, "Stupid In School"... You can watch this YouTube video, a stick-figure sketch someone made with his skit as the audio... A decent understanding of English concepts is required to "get" some of the humor... English is, honestly, a very odd language, as it has different pronunciations for words that you'd think are pronounced the same, like "comb" and "tomb"...

Also, try this more "on topic" skit snippet...

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4349
Credit: 253189960
RAC: 40940

RE: As far as I know (and

Message 76405 in response to message 76402

Quote:
As far as I know (and Eric Myers can confirm for you) when installing as a rpm, it runs as its own user. The rpm creates a user boinc and the home directory is /var/lib/boinc.


Well, what the rpm does depends on the distributor. It may be more common by now to create a dedicated user (which is good), but I've seen installations where the client (and thus the App) ran as root.

Quote:
Here's the output of ps, I hope it makes sense to you because it doesn't make sense to me.


It does. The App is running as user "boinc".

It also reveals that this is a dual-CPU machine, which might be the reason why I couldn't reproduce the problem. I'll try with a dual-core VM.
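
For anyone who wants to run the same check on their own host, here is a minimal sketch in shell. It assumes that `ps -ef` prints the usual eight columns (UID first, command starting in column 8) and that the science App's executable name contains `einstein_`, as it does in the output quoted above. The function name `summarise_app_owners` is made up for this example.

```shell
# Summarise which user owns the running science-App processes.
# Reads `ps -ef` style output on stdin and prints "<user> <count>" per owner.
# $1 is a pattern matched against the command name only (column 8), so the
# "grep einstein" line itself is never counted.
summarise_app_owners() {
    awk -v pat="$1" '
        $8 ~ pat { count[$1]++ }                       # tally per owning user
        END { for (u in count) print u, count[u] }
    '
}

# Usage on a live system (the "einstein_" pattern is an assumption):
#   ps -ef | summarise_app_owners 'einstein_'
```

If this prints a non-root user with a count of 2, that matches the output above: two App instances, so at least two usable CPUs, both owned by "boinc".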

Are others seeing this problem (only) on multi-CPU/core machines?

Thanks,

BM

BM

Donald A. Tevault
Joined: 17 Feb 06
Posts: 439
Credit: 73516529
RAC: 0

RE: RE: As far as I know

Message 76406 in response to message 76405

Quote:
Quote:
As far as I know (and Eric Myers can confirm for you) when installing as a rpm, it runs as its own user. The rpm creates a user boinc and the home directory is /var/lib/boinc.

Well, what the rpm does depends on the distributor. It may be more common by now to create a dedicated user (which is good), but I've seen installations where the client (and thus the App) ran as root.
Quote:
Here's the output of ps, I hope it makes sense to you because it doesn't make sense to me.

It does. The App is running as user "boinc".

It also reveals that this is a dual-CPU machine, which might be the reason why I couldn't reproduce the problem. I'll try with a dual-core VM.

Are others seeing this problem (only) on multi-CPU/core machines?

Thanks,

BM

Come to think of it, all of my signal 11 problems have been on dual processor machines.

Computer 1042068--This problem machine is an SGI 1200 server, with dual Pentium III 700 processors. It has 256 Meg of registered, ECC memory, if it helps to know that. It's the only machine that still has crashed results showing up in its workunit list. There are four, but only two are from the signal-11 problem. The other two crashed when I removed the app_info file so that it could finally upgrade to the 4.20 app.

Computer 1059057--This one is a home-brew box that I built from a dual Pentium III 866 motherboard that I got from eBay. It has 512 Meg of non-registered, non-ECC PC-133 memory. If I remember correctly, all of its problem workunits were running with the 4.14 app. I've since upgraded it to the 4.20 app as well.

Computer 1060000--The third one is an old IBM Intellistation with dual 2.8 GHz Xeons and 512 Meg of memory. I neglected to note what kind of memory it has when I had the box open, but it's probably DDR. If I remember correctly, the problems on this one occurred with the 4.20 app. I've since upgraded to the 4.21 power-users' app, and have had no problems with it.

Most, and perhaps all, of the signal 11 problems occurred when I had network problems. Also, the problem machines are all running the newer 5.10.x versions of BOINC.

****

Having said all of this, I also have to note that I have several other dual-processor machines, and one dual-core machine, that have never had signal-11 problems. But, all of the problems I have had have been on the dual-processor machines listed above; the single-processor machines haven't had any problems at all.

Annika
Joined: 8 Aug 06
Posts: 720
Credit: 494410
RAC: 0

Well... mine is a Core Duo, so it would kinda fit. But I don't have any single-core Linux boxes to test whether they stay problem-free.

KSMarksPsych
Moderator
Joined: 15 Oct 05
Posts: 2702
Credit: 4090227
RAC: 0

RE: RE: As far as I know

Message 76408 in response to message 76405

Quote:
Quote:
As far as I know (and Eric Myers can confirm for you) when installing as a rpm, it runs as its own user. The rpm creates a user boinc and the home directory is /var/lib/boinc.

Well, what the rpm does depends on the distributor. It may be more common by now to create a dedicated user (which is good), but I've seen installations where the client (and thus the App) ran as root.

Sorry. I should have been clearer. Eric did package it up.

Quote:
Quote:
Here's the output of ps, I hope it makes sense to you because it doesn't make sense to me.

It does. The App is running as user "boinc".

It also reveals that this is a dual-CPU machine, which might be the reason why I couldn't reproduce the problem. I'll try with a dual-core VM.

Are others seeing this problem (only) on multi-CPU/core machines?

Thanks,

BM

Would it help in testing a) to limit BOINC to 1 CPU and/or b) to try an older core client? I'm comfortable enough with installing BOINC to uninstall the rpm and install an older version (5.4.x or 5.8.x?).

Kathryn :o)

Einstein@Home Moderator

rroonnaalldd
Joined: 12 Dec 05
Posts: 116
Credit: 537221
RAC: 0

The problems may really be related to interrupted network access, but I don't think so: I often have trouble reaching the SETI project or the less frequently used projects LHC and Chess960, and BOINC logs this every time, yet the latest Einstein units are running like hell with the 4.21 power app.

Now I'm running Augustine's BOINC 5.10.30 in a SUSE 10.2 x86-64 dual-core VM, as root, in the folder /root/BOINC. I think all the problems I posted earlier in this thread came only from faulty libraries or broken links to and between them. All dependencies had been resolved earlier, but a fuse comes to the end... I ran the fsck mentioned earlier without errors, but my problems persisted, and resetting all projects was useless. Since I reinstalled all the GNOME libs plus a few more, everything has been fine. No more errors are reported from this host. Knocking on wood.

[edit]
I rather believe that BOINC has, under some circumstances, problems handling work slots, especially with 2 tasks from the same project. But shouldn't that be solved with 5.10.x?
What is strange to me is that only Einstein and Spinhenge have had these problems; all other projects run fine on that host.
[/edit]

Melvin Bobo Slacke
Joined: 22 Jan 05
Posts: 32
Credit: 1903599
RAC: 4872

Funny thing happened here when switching from 3 to 4 cores:

Quote:

2008-01-11 11:51:57 [Einstein@Home] Started upload of h1_0734.45_S5R2__204_S5R3a_0_0
2008-01-11 11:51:58 [Einstein@Home] Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
2008-01-11 11:51:58 [Einstein@Home] General preferences have been updated
2008-01-11 11:51:58 [---] General prefs: from Einstein@Home (last modified 2008-01-11 11:40:31)
2008-01-11 11:51:58 [---] General prefs: no separate prefs for home; using your defaults
2008-01-11 11:51:58 [---] Number of usable CPUs has changed. Running benchmarks.
2008-01-11 11:51:58 [---] Suspending computation and network activity - running CPU benchmarks
2008-01-11 11:51:58 [Einstein@Home] Pausing result h1_0734.45_S5R2__175_S5R3a_0 (left in memory)
2008-01-11 11:51:58 [Einstein@Home] Pausing result h1_0734.45_S5R2__174_S5R3a_0 (left in memory)
2008-01-11 11:51:58 [Einstein@Home] Pausing result h1_0734.45_S5R2__173_S5R3a_0 (left in memory)
SIGSEGV: segmentation violation
Stack trace (8 frames):
./boinc[0x80845b2]
[0xffffe420]
./boinc[0x8058d77]
./boinc[0x8057e71]
./boinc[0x8078819]
./boinc[0x807895f]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc)[0xb7e4aebc]
./boinc(shmat+0x59)[0x804bf21]

Exiting...

Ubuntu 7.04 Boinc 5.2.13

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 812482199
RAC: 1265534

RE: Funny thing happened

Message 76411 in response to message 76410

Quote:

Funny thing happened here when switching from 3 to 4 cores:

Quote:

2008-01-11 11:51:57 [Einstein@Home] Started upload of h1_0734.45_S5R2__204_S5R3a_0_0
2008-01-11 11:51:58 [Einstein@Home] Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
2008-01-11 11:51:58 [Einstein@Home] General preferences have been updated
2008-01-11 11:51:58 [---] General prefs: from Einstein@Home (last modified 2008-01-11 11:40:31)
2008-01-11 11:51:58 [---] General prefs: no separate prefs for home; using your defaults
2008-01-11 11:51:58 [---] Number of usable CPUs has changed. Running benchmarks.
2008-01-11 11:51:58 [---] Suspending computation and network activity - running CPU benchmarks
2008-01-11 11:51:58 [Einstein@Home] Pausing result h1_0734.45_S5R2__175_S5R3a_0 (left in memory)
2008-01-11 11:51:58 [Einstein@Home] Pausing result h1_0734.45_S5R2__174_S5R3a_0 (left in memory)
2008-01-11 11:51:58 [Einstein@Home] Pausing result h1_0734.45_S5R2__173_S5R3a_0 (left in memory)
SIGSEGV: segmentation violation
Stack trace (8 frames):
./boinc[0x80845b2]
[0xffffe420]
./boinc[0x8058d77]
./boinc[0x8057e71]
./boinc[0x8078819]
./boinc[0x807895f]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc)[0xb7e4aebc]
./boinc(shmat+0x59)[0x804bf21]

Exiting...

Ubuntu 7.04 Boinc 5.2.13

Wow!

But this is a segmentation fault inside the BOINC software itself, so someone should file a bug report in BOINC's Trac system, I guess. This issue should not be related to Einstein@Home.
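
If someone does file that report, symbolic names for the frames inside the client would probably help the BOINC developers. Here is a rough sketch, assuming the trace has been saved to a file (`trace.txt` is a hypothetical name) and that `addr2line` from binutils is installed. The helper `extract_boinc_frames` is made up for this example, and `addr2line` will only print useful names if the `boinc` binary still contains symbols.

```shell
# Print the hex addresses of the stack frames that lie inside ./boinc.
# Reads a trace in the format shown above on stdin, e.g. "./boinc[0x80845b2]"
# or "./boinc(shmat+0x59)[0x804bf21]". Frames from libc or without a module
# name are skipped.
extract_boinc_frames() {
    sed -n 's|^\./boinc[^[]*\[\(0x[0-9a-fA-F]*\)].*|\1|p'
}

# Usage (run in the directory that holds the crashing boinc binary):
#   extract_boinc_frames < trace.txt | xargs addr2line -f -e ./boinc
```

For the trace quoted above, the helper would pass six addresses to `addr2line`, starting with 0x80845b2.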

CU
Bikeman
