strange BOINC lockup

fjalar
Joined: 20 Feb 05
Posts: 4
Credit: 4,956
RAC: 0
Topic 187931

Hi everybody, I'm running the following:
AMD Duron 1600, 384MB RAM, Linux 2.6.8-24, BOINC V4.19.

Since I joined Einstein (I'm running SETI and CPDN too), I've had the problem twice that the whole thing crashed, apparently while switching from CPDN to Einstein. In the ps forest the boinc process is still there, but all related subprocesses have turned into zombies. I'll have logs written now and try to get deeper into it in case the process crashes again. Has anybody experienced the same problem?

Matthias

S@NL - Skipper
Joined: 15 Feb 05
Posts: 1
Credit: 625,390
RAC: 0

I have not seen this problem (also running SETI, CPDN and Einstein).
Then again, this machine has an Intel 4 proc and more memory, and is running windoze ;)

EclipseHA
Joined: 19 Feb 05
Posts: 41
Credit: 10,540,182
RAC: 0

> Hi everybody, I'm running the following:
> AMD Duron 1600, 384MB RAM, Linux 2.6.8-24, BOINC V4.19.
>
> Since I joined Einstein (I'm running SETI and CPDN too), I've had the problem
> twice that the whole thing crashed, apparently while switching from CPDN to
> Einstein. In the ps forest the boinc process is still there, but all related
> subprocesses have turned into zombies. I'll have logs written now and try to get
> deeper into it in case the process crashes again. Has anybody experienced the
> same problem?
>
> Matthias
>

I too have seen this on Linux. Win2k seems fine. This seems to be a Linux-only problem.

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1,113
Credit: 172,127,663
RAC: 0

Message 4920 in response to message 4919

> > Hi everybody, I'm running the following:
> > AMD Duron 1600, 384MB RAM, Linux 2.6.8-24, BOINC V4.19.
> >
> > Since I joined Einstein (I'm running SETI and CPDN too), I've had the
> > problem twice that the whole thing crashed, apparently while switching
> > from CPDN to Einstein. In the ps forest the boinc process is still there,
> > but all related subprocesses have turned into zombies. I'll have logs
> > written now and try to get deeper into it in case the process crashes
> > again. Has anybody experienced the same problem?
> >
> > Matthias

> I too have seen this on Linux. Win2k seems fine. This seems to be a
> Linux-only problem.

We do all of our development under Linux and have not seen this, so I am puzzled. Could I suggest that you try out the current experimental core client + clientgui (4.23 or 4.24)? You can get these by going to the very bottom of the download page and selecting the 'show experimental clients' link.

Bruce

Director, Einstein@Home

fjalar
Joined: 20 Feb 05
Posts: 4
Credit: 4,956
RAC: 0

Message 4921 in response to message 4920

> Could I suggest that you try out the current experimental core
> client + clientgui (4.23 or 4.24).

Done.

Maybe I'm missing something obvious. The core client asks for a project URL, then starts to download the Einstein 4.80 client and a workunit. It does not honor any state files, project data, present WUs, whatever.

Maybe I'm missing something very obvious here... it's the end of a long night shift. Can someone point me to the mistake I made?

Yes, I'm running it in my BOINC root directory, with the state files and a project folder.

Matthias

Blank Reg
Joined: 18 Jan 05
Posts: 228
Credit: 40,599
RAC: 0

Message 4922 in response to message 4921

> > Could I suggest that you try out the current experimental core
> > client + clientgui (4.23 or 4.24).
>
> Done.
>
> Maybe I'm missing something obvious. The core client asks for a project URL,
> then starts to download the Einstein 4.80 client and a workunit. It does not
> honor any state files, project data, present WUs, whatever.
>
> Maybe I'm missing something very obvious here... it's the end of a long night
> shift. Can someone point me to the mistake I made?
>
> Yes, I'm running it in my BOINC root directory, with the state files and a
> project folder.
>
> Matthias
>

When you install BOINC, it is not necessary to detach from the project.

EclipseHA
Joined: 19 Feb 05
Posts: 41
Credit: 10,540,182
RAC: 0

Message 4923 in response to message 4920


> We do all of our development under Linux, and have not seen this, so I am
> puzzled. Could I suggest that you try out the current experimental core
> client + clientgui (4.23 or 4.24). You can get these by going to the very
> bottom of the download page and selecting the 'show experimental clients'
> link.
>
> Bruce
>

The key seems to be running CP too... The weird hangs occur when task switching "happens". (60 min switch time, keep apps in memory)

(3 times in 3 days with Einstein/CP only). My solution for now is to not run Einstein on that box.

fjalar
Joined: 20 Feb 05
Posts: 4
Credit: 4,956
RAC: 0

Message 4924 in response to message 4923


> The key seems to be running CP too... The weird hangs occur when task
> switching "happens". (60 min switch time, keep apps in memory)

Exactly. It just happened again right now. Here's the process tree:

11950 ? S 0:04 ./boinc_4.19_athlon-pc-linux-gnu -allow_remote_gui_rpc
11954 ? SN 55:14 \_ setiathome_4.02_i686-pc-linux-gnu
13338 ? SNl 0:00 \_ einstein_4.80_i686-pc-linux-gnu @conf -f869.298 -o .Ha --startTime 757247505 --endTime 75728
14582 ? SN 0:00 \_ hadsm3_4.04_i686-pc-linux-gnu 1x4m_100110504
14584 ? TN 48:20 \_ hadsm3um_4.04_i686-pc-linux-gnu 24740 14582
15168 ? ZN 0:00 \_ [cp]
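
For anyone who wants to check a listing like this automatically rather than by eye, here is a minimal Python sketch. It is not part of BOINC; it only assumes the five-column layout shown above (PID, TTY, STAT, TIME, COMMAND), and the names `parse_ps_forest` and `find_by_state` are made up for illustration.

```python
import re

# Process listing copied from the post above (columns: PID TTY STAT TIME COMMAND).
SAMPLE = r"""
11950 ? S 0:04 ./boinc_4.19_athlon-pc-linux-gnu -allow_remote_gui_rpc
11954 ? SN 55:14 \_ setiathome_4.02_i686-pc-linux-gnu
13338 ? SNl 0:00 \_ einstein_4.80_i686-pc-linux-gnu @conf -f869.298 -o .Ha --startTime 757247505 --endTime 75728
14582 ? SN 0:00 \_ hadsm3_4.04_i686-pc-linux-gnu 1x4m_100110504
14584 ? TN 48:20 \_ hadsm3um_4.04_i686-pc-linux-gnu 24740 14582
15168 ? ZN 0:00 \_ [cp]
"""

# Assumed layout: four whitespace-separated fields, then the command.
PS_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$")
# Tree-drawing prefix that the forest view puts in front of child commands.
TREE_PREFIX = re.compile(r"^[|\s]*\\_\s*")


def parse_ps_forest(text):
    """Turn a ps forest listing into a list of dicts, skipping lines that do not fit."""
    rows = []
    for line in text.splitlines():
        match = PS_LINE.match(line)
        if not match:
            continue
        pid, tty, stat, cpu_time, command = match.groups()
        rows.append({
            "pid": int(pid),
            "tty": tty,
            "stat": stat,
            "time": cpu_time,
            "command": TREE_PREFIX.sub("", command),
        })
    return rows


def find_by_state(rows, letter):
    """Return the rows whose STAT field starts with the given state letter."""
    return [row for row in rows if row["stat"].startswith(letter)]


rows = parse_ps_forest(SAMPLE)
for label, letter in (("zombie", "Z"), ("stopped", "T")):
    for row in find_by_state(rows, letter):
        print(label, row["pid"], row["command"])
```

In the STAT column, a leading `Z` marks a zombie (defunct) process and a leading `T` a stopped one; the trailing `N` only means the process runs at low priority.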

Btw, 4.24 is still not running properly. The TCP port for the GUI doesn't come up, and the state files seem to be ignored.
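
One way to confirm that a TCP port really is not up is simply to try connecting to it. The following is a rough Python sketch of that check. The helper name `port_is_listening` is invented here, and the host and port are taken from the command line because the port the GUI uses is not stated in this thread.

```python
import socket
import sys


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, and so on.
        return False


# Usage (script name is hypothetical): python check_port.py <host> <port>
if __name__ == "__main__" and len(sys.argv) == 3:
    host, port = sys.argv[1], int(sys.argv[2])
    state = "listening" if port_is_listening(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

A refused connection only shows that nothing is accepting connections on that port; it does not say why the client failed to open it.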

If this continues, I'll disconnect from Einstein. It's been running stable without it.

EclipseHA
Joined: 19 Feb 05
Posts: 41
Credit: 10,540,182
RAC: 0

It seems there are reports of this on SETI/BOINC when running just SETI and PP too (no CP or Einstein, so it looks like a BOINC problem), with the 4.19 CC and above.

It's not Windows; I see only Linux reports (no Mac?).

It also seems to occur (and this is what I've seen too) when one project is switching to another with a brand new WU.

I think I'll just run a single project on Linux boxes for a while...

fjalar
Joined: 20 Feb 05
Posts: 4
Credit: 4,956
RAC: 0

Message 4926 in response to message 4925

> Also it seems to occur (and this is what I've seen too) when one project is
> switching to another with a brand new WU.

I've had it three times now, and once I happened to be watching when it occurred. It seems to me that it occurs when switching from CP to Einstein.

I've changed the option "keep application in memory" to "No". Since then it's been switching through all projects at least once without hanging up. I'll keep watching it.

Matthias

debugas
Joined: 11 Nov 04
Posts: 170
Credit: 77,331
RAC: 0

Message 4927 in response to message 4926

Even though I do not use the keep-in-memory option, I still sometimes experience a "lock up" when playing with the graphics window. I open it by popping up the menu on a WU and clicking "show graphics", which runs in software mode (i.e. no OpenGL).

This happens with both Einstein and Climate Prediction graphics.
I suspect it happens when data is being written to or read from disk.

Running on Windows 98 SE with a non-OpenGL video card.
