Notebook failures

WayneKFord

Joined: 21 Aug 05

Posts: 25

Credit: 1424120

RAC: 0

28 Jul 2006 13:32:22 UTC

Topic 191625

I have three computers working on einstein, two desktops and a Dell i600m notebook. They each run the same version of einstein, which seems to be S5 R1.402

The notebook rarely returns a valid result. The id for this computer is 393166. The log file is full of info, none of which makes sense to me. The desktops are pretty successful.

These computers also run other projects and have no 'trouble'.

Any ideas? Can anyone who can help examine the log files?

Scott Brown

Joined: 9 Feb 05

Posts: 38

Credit: 215235

RAC: 0

Notebook failures

28 Jul 2006 13:39:56 UTC

Message 43140

Quote:

I have three computers working on einstein, two desktops and a Dell i600m notebook. They each run the same version of einstein, which seems to be S5 R1.402

The notebook rarely returns a valid result. The id for this computer is 393166. The log file is full of info, none of which makes sense to me. The desktops are pretty successful.

These computers also run other projects and have no 'trouble'.

Any ideas? Can anyone who can help examine the log files?

I am certainly not an expert at reading the log files, so perhaps someone with greater knowledge will also chime in...

But I noticed that all your results seem to begin with an error reading or finding the checkpoint file. You might take a look at this to make sure something simple isn't going on (such as a glitch that made the file or directory read-only).
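
One quick way to rule out the read-only possibility is to scan the project directory for anything whose write permission has been cleared. The following is only a minimal sketch, not a BOINC tool: the function name `find_read_only` is made up for illustration, and you would have to point it at wherever BOINC keeps its data on your machine.

```python
import os
import stat


def find_read_only(root):
    """Return sorted paths under root (directories and files) that lack the owner-write bit."""
    read_only = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            try:
                mode = os.stat(path).st_mode
            except OSError:
                continue  # file vanished or cannot be inspected; skip it
            # On Windows, Python derives this bit from the read-only attribute.
            if not mode & stat.S_IWUSR:
                read_only.append(path)
    return sorted(read_only)
```

Calling `find_read_only(...)` with the BOINC data directory returns a list of suspect paths. An empty list suggests that file permissions are not the cause of the checkpoint message.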

Michael Karlinsky

Joined: 22 Jan 05

Posts: 888

Credit: 23502182

RAC: 0

RE: Any ideas? Can anyone

28 Jul 2006 14:22:08 UTC

Message 43141

Quote:

Any ideas? Can anyone who can help examine the log files?

Quote:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0040BF6A read attempt to address 0x2B6B2830

Engaging BOINC Windows Runtime Debugger...

Maybe heat. The usual advice is to run memtest and prime95.

Michael

Team Linux Users Everywhere

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 3044694278

RAC: 2026484

I seem to remember a

28 Jul 2006 17:35:45 UTC

Message 43142

I seem to remember a discussion some time ago about computer suspend/hibernate functions: IIRC, the hypothesis was that if Windows was fully powered down, applications such as BOINC were given plenty of time to flush their state files to disk before closedown; but if Windows went to suspend/hibernate, it used an accelerated procedure which didn't give BOINC enough time.

Your problem is with your notebook, which I guess is more likely to be suspended - could that be the problem?

If you use the suspend function, you could try closing BOINC manually (a full file|exit, not just hiding the window) at the end of your work sessions for a few days. If you get fewer errors/more credit, then all you have to do is find a way of automating the process!

WayneKFord

Joined: 21 Aug 05

Posts: 25

Credit: 1424120

RAC: 0

RE: I seem to remember a

28 Jul 2006 18:52:31 UTC

Message 43143 in response to message 43142

Quote:

I seem to remember a discussion some time ago about computer suspend/hibernate functions: IIRC, the hypothesis was that if Windows was fully powered down, applications such as BOINC were given plenty of time to flush their state files to disk before closedown; but if Windows went to suspend/hibernate, it used an accelerated procedure which didn't give BOINC enough time.

I should also say that the last several wu's completed (and invalidated) were computed over several days when I never turned the beast off (used as a desktop most of the time.)

Your problem is with your notebook, which I guess is more likely to be suspended - could that be the problem?

If you use the suspend function, you could try closing BOINC manually (a full file|exit, not just hiding the window) at the end of your work sessions for a few days. If you get fewer errors/more credit, then all you have to do is find a way of automating the process!

I think there may be a thread of truth to this theory, because I frequently see in the event logs a comment (from my memory now) about some app not releasing some part of the registry in time and that the registry will be somehow restored later (sorry for not being precise here). That app is always something to do with boinc (I run it as a service, not a screen saver).

Regarding memtest etc., I've run those and never had any trouble. Several other projects also run on this notebook and don't report the same type of issue. (I actually posted this complaint about 12 months ago, but a lot has changed since then: I am now getting almost all my einstein wu's invalidated.)

More comments are welcome. When my queue flushes on the other projects I expect I will try to re-install the einstein stuff from scratch to see if something is corrupted. Too bad all that diagnostic stuff isn't useful.

Annika

Joined: 8 Aug 06

Posts: 720

Credit: 494410

RAC: 0

I have never encountered any

10 Aug 2006 13:50:28 UTC

Message 43144 in response to message 43143

I have never encountered any problems or crashes with either SETI or E@H and very rarely get any invalid results on my laptop (although it is a relatively slow Celeron and I use hibernate/standby a lot). If you are sure the problem is not temperature-related I would try switching to not leaving WUs in memory. I guess that might do the trick as your WUs will be saved on the hard disk before you shut down. I'm afraid I am not sure if this works as I have never used a different setting (because I always seem to have too little RAM anyway, using a shared RAM gfx card and so on) but as I said, BOINC works fine for me the way it is...

WayneKFord

Joined: 21 Aug 05

Posts: 25

Credit: 1424120

RAC: 0

Here is some more data. After

17 Aug 2006 14:51:39 UTC

Message 43145

Here is some more data. After crunching successfully for a while (completing wu's), the notebook had about 13 compute errors in a row. If you look at the sequence, the first failed unit ran for 3470 sec before failing (log below), but the subsequent 12 failed right at the start. The notebook is shared with seti, and I think the einstein wu's were started one after another because the seti queue was exhausted. See computer 393166.

2006-08-17 06:28:02.7900 [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_S5R1_4.24_windows_intelx86.exe'.
2006-08-17 06:28:02.8000 [normal]: Started search at lalDebugLevel = 0
2006-08-17 06:28:03.6813 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-08-17 06:28:03.6813 [normal]: No usable checkpoint found, starting from beginning.
Detected CPU type 1
small x
small x
small x
small x
small x
ERROR! sftIndex = -2147483648 < 0 in TestLALDemod run 0
alpha=51, xTemp=542621.64819343563000000, Dterms=16, ifmin=542535
Level 0: $Id: ComputeFStatistic.c,v 1.371 2006/06/09 12:48:58 reinhard Exp $
Function call `TestLALDemod(status, &Fstat, SFTData, DemodParams)' failed.
file ComputeFStatistic.c, line 958
2006-08-17 07:27:36.4286 [normal]:
Level 1: $Id: CFSLALDemod_SSEgas.c,v 1.6 2006/07/28 17:01:40 bema Exp $
2006-08-17 07:27:36.4286 [normal]: Status code 3: Invalid input
2006-08-17 07:27:36.4286 [normal]: function TestLALDemod, file FDS_isolated/CFSLALDemod_SSEgas.c, line 190
2006-08-17 07:27:36.4286 [CRITICAL]: BOINC_ERR_EXIT(): now calling boinc_finish()
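
For anyone who wants to pick a log like this apart rather than read it by eye, here is a rough sketch that pulls out the timestamped lines and measures how long the task ran before the CRITICAL entry. The helper names (`parse_log`, `seconds_until_critical`) and the line pattern are assumptions based only on the excerpt above, not on any documented BOINC log format.

```python
import re
from datetime import datetime

# Matches lines such as: 2006-08-17 06:28:02.7900 [normal]: message
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[(\w+)\]:\s?(.*)$"
)


def parse_log(text):
    """Return (timestamp, level, message) tuples for the timestamped lines."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            entries.append((stamp, match.group(2), match.group(3)))
    return entries


def seconds_until_critical(entries):
    """Wall-clock seconds from the first entry to the first CRITICAL entry, or None."""
    for stamp, level, _message in entries:
        if level == "CRITICAL":
            return (stamp - entries[0][0]).total_seconds()
    return None


# A few lines copied from the excerpt above.
LOG = """\
2006-08-17 06:28:02.7900 [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_S5R1_4.24_windows_intelx86.exe'.
2006-08-17 06:28:03.6813 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
ERROR! sftIndex = -2147483648 < 0 in TestLALDemod run 0
2006-08-17 07:27:36.4286 [normal]: Status code 3: Invalid input
2006-08-17 07:27:36.4286 [CRITICAL]: BOINC_ERR_EXIT(): now calling boinc_finish()
"""

entries = parse_log(LOG)
print(seconds_until_critical(entries))  # elapsed wall-clock seconds

# Check whether the reported index is the smallest 32-bit signed integer.
index = int(re.search(r"sftIndex = (-?\d+)", LOG).group(1))
print(index == -2**31)
```

On this excerpt the elapsed time comes out at just under an hour of wall-clock time, the same order of magnitude as the 3470 sec mentioned above (which is presumably CPU time). The reported sftIndex is exactly -2^31, the smallest value a 32-bit signed integer can hold. That may hint at an overflowed or invalid number rather than a real index, but this is a guess and the log itself does not confirm it.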

Mahray

Joined: 11 Nov 04

Posts: 43

Credit: 150028400

RAC: 379924

Do you have any defrag

24 Aug 2006 9:43:53 UTC

Message 43146 in response to message 43145

Do you have any defrag software running? I've had problems in the past with it trying to move Boinc files when they weren't necessarily being used, but still open (or something like that, I know Boinc really doesn't like being defragged while running).
