Various "Error while computing" problems

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 1094183964
RAC: 731199
Topic 195549

Over the past several days, I have had various errors on Einstein work units:

Error without even starting:
http://einsteinathome.org/task/213893576

Error while running:
http://einsteinathome.org/task/213161417

Computer Details:
http://einsteinathome.org/host/3780436

Tasks:
http://einsteinathome.org/host/3780436/tasks

I only seem to be having problems with Einstein WU, SETI WU ran without any issues at the same time.

This is a new computer used mainly for BOINC crunching.

Any help would be appreciated.

Thanks, Ron K.

BilBg
Joined: 27 May 07
Posts: 56
Credit: 23998
RAC: 0

What is your antivirus?

Try excluding the BOINC data directory from antivirus scans, both on-demand (scheduled) and real-time (on-read/write/open).

(There were reports of AVG, Kaspersky, Comodo, McAfee, Symantec (Norton), Trend Micro, ... (but not NOD32) blocking, deleting or quarantining the files downloaded by BOINC.)

- ALF - "Find out what you don't do well ..... then don't do it!" :)

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 1094183964
RAC: 731199

Quote:
What is your antivirus?


I am using Avast with only "Real Time" scanning.

I am using all 4 cores on the processor, and the SETI WU (optimized clients) running at the same time are not being affected, only Einstein.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 775024773
RAC: 1281100

Quote:

Over the past several days, I have had various errors on Einstein work units:

Error without even starting:
http://einsteinathome.org/task/213893576

Error while running:
http://einsteinathome.org/task/213161417

Computer Details:
http://einsteinathome.org/host/3780436

Tasks:
http://einsteinathome.org/host/3780436/tasks

I only seem to be having problems with Einstein WU, SETI WU ran without any issues at the same time.

This is a new computer used mainly for BOINC crunching.

Any help would be appreciated.

Thanks, Ron K.

There were two problems.
First, BOINC was complaining that the app to be executed was missing (BOINC should have downloaded this app automatically). This can happen when anti-virus software quarantines an app.

But that must have been solved now.

Second, you get this one a lot:

Quote:
2010-12-31 23:32:24.7895 (2900) [normal]: Reading input data ... Memeory reallocation for SFTs failed: nSFT:121101, length:121100, add:100
Error[2] 14: function LALSFTdataFind, file /home/bema/EinsteinAtHome/EinsteinAtHome/source/lalsuite/lalpulsar/src/SFTfileIO.c, line 351, $Id$
ABORT: Out of memory

You have a quad core, and maybe you also run GPU apps for other projects on your ATI card, so 3 GB of RAM might not be enough once you add anything non-BOINC that happens to be running on your PC.
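
For anyone who wants to check their own task output for the same failure, here is a minimal Python sketch. It only pattern-matches the log line quoted above; the field names (nSFT, length, add) are copied from that line, their exact meaning is an assumption, and the function name is made up for illustration.

```python
import re

# Matches the reallocation failure quoted above. "Mem\w*" tolerates the
# misspelling that appears in the actual log output ("Memeory").
REALLOC_FAILURE = re.compile(
    r"Mem\w* reallocation for SFTs failed: nSFT:(\d+), length:(\d+), add:(\d+)"
)

def find_realloc_failures(log_text):
    """Return one (nSFT, length, add) tuple of ints per matching log line."""
    return [tuple(int(v) for v in m.groups())
            for m in REALLOC_FAILURE.finditer(log_text)]

# Sample text copied from the task output quoted in this thread.
sample = (
    "2010-12-31 23:32:24.7895 (2900) [normal]: Reading input data ... "
    "Memeory reallocation for SFTs failed: nSFT:121101, length:121100, add:100\n"
    "ABORT: Out of memory\n"
)
print(find_realloc_failures(sample))
```

If most of a host's failed tasks contain this line, memory is the thing to look at first.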


I would consider playing around with the BOINC preferences in the web interface, or with the local preferences in BOINC Manager on that PC (these override the settings you select in the web interface). You might want to limit the number of CPUs (= cores) that BOINC can use simultaneously, or require BOINC to start new apps only if sufficient memory is available. This might lead to fewer E@H apps being started, but at least they would hopefully finish.

It might also be smart to check your swap file configuration: if, for example, you disabled the swap file altogether or limited its size too strictly, this kind of problem is likely to happen.
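
To make the trade-off concrete, here is a rough Python sketch of the arithmetic. The 3 GB and 4 cores are from this thread; the 50% share for BOINC and the per-task footprints are hypothetical numbers chosen for illustration, not measured values for any E@H app, and the function is not part of BOINC.

```python
def max_concurrent_tasks(total_ram_mb, boinc_ram_fraction, per_task_mb, cores):
    """How many tasks fit in BOINC's share of RAM, capped at the core count."""
    if per_task_mb <= 0:
        raise ValueError("per_task_mb must be positive")
    budget_mb = total_ram_mb * boinc_ram_fraction
    return min(cores, int(budget_mb // per_task_mb))

# 3072 MB RAM and 4 cores as in this thread; everything else is assumed.
for per_task_mb in (200, 400, 600):
    n = max_concurrent_tasks(3072, 0.5, per_task_mb, cores=4)
    print(f"{per_task_mb} MB per task -> {n} task(s) at once")
```

The point is only that once the per-task footprint grows, the memory limit rather than the core count decides how many tasks can run side by side.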

CU

HB

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 1094183964
RAC: 731199

Quote:
You have a quad core and maybe you also run some GPU apps on your ATI card on other projects, so for that the 3 GB of RAM might not be enough (together with any stuff that is not BOINC related and happens to run on your PC).

I am not running any apps on the (integrated) GPU. According to several memory optimization programs, I am using less than half of the available memory. The only programs running on the computer are some minor background programs and BOINC.

Quote:
It might also be smart to check your swapfile configuration: if you for example disabled the swap file altogether, or limited its size too strictly, this kind of problem is likely to happen.


I am letting Windows handle the pagefile. It is currently around 3.4 GB.

During installation, I let BOINC put all the directories in their default locations and I am the only user on the computer.

One item I mentioned that nobody has commented on is the fact that only Einstein is being affected and not SETI.

Thanks, Ron K.

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

Einstein uses more RAM than all the other BOINC projects I am running (6). I have 5 GB RAM on my 32-bit Linux-pae and I am going to bring it to a maximum of 8 GB because I am also running 2 virtual machines via VirtualBox. One is Solaris Express, the other is an Alpha project.
Tullio

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118567917578
RAC: 19192228

Quote:
One item I mentioned that nobody has commented on is the fact that only Einstein is being affected and not SETI.


Over the years, I've seen many similar reports, quite often accompanied by a statement to the effect that, "... it can't be my computer because Seti runs fine - only Einstein has a problem so it must be Einstein."

The E@H apps seem to put a greater stress on the machine than those of other projects. In similar previous cases, problems like yours often turn out to be caused by overclocking and/or excessive heat, or some other hardware issue such as PSU or RAM. Are you overclocking your machine at all? If you are, try backing off a little on the overclock and see what happens.

Cheers,
Gary.

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 1094183964
RAC: 731199

Gary,
Yes, I do have some slight overclocking on the machine. I did back it down some, and when I brought BOINC back online, one of the Einstein WU that was running at the time crashed on me. I will wait until all the in-progress Einstein WU finish and back it down some more.

My current CPU temperature is running around 51C. This is from the TThrottle app. Several other apps showed around the same temps.

Quote:
The E@H apps seem to put a greater stress on the machine than those of other projects.


With both SETI and Einstein apps running at the same time on different cores, could a hardware problem affect one app and not the other?

Thanks, Ron K.

mikey
Joined: 22 Jan 05
Posts: 12809
Credit: 1879705999
RAC: 1276783

Quote:
Quote:
You have a quad core and maybe you also run some GPU apps on your ATI card on other projects, so for that the 3 GB of RAM might not be enough (together with any stuff that is not BOINC related and happens to run on your PC).

I am not running any apps on the (integrated) GPU. According to several memory optimization programs, I am using less than half of the available memory. The only programs running on the computer are some minor background programs and BOINC.
Thanks, Ron K.

When running Win7 it is NOT advisable to run any memory optimization programs, because Win7 uses extra/free memory as a supplemental cache to speed up the machine. Microsoft's new thinking about memory is that free memory is wasted memory: why pay for something that is not being used? Win7 will claim and release memory as programs need it or give it up; older versions of Windows do not do this, though.

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 1094183964
RAC: 731199

Quote:
When running Win7 it is NOT advisable to run any memory optimization programs .....


I only used the memory apps to see how much memory I was using. I disabled all of the optimization functions and used them purely to monitor memory usage.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118567917578
RAC: 19192228

Quote:
With both SETI and Einstein apps running at the same time on different cores, could a hardware problem affect one app and not the other?


That's not quite what I was saying. Some part of your hardware is running 'close to the edge'. The Seti app doesn't push it over but the Einstein app does. It's not clear what is causing 'the edge'. It may be CPU frequency, it may be RAM frequency and/or timings, it may be a thermal condition, or it may be something else in your system that is being affected by, for example, noise/ripple from your PSU.

The only way to diagnose is to progressively and systematically experiment (one at a time) with all the possibilities. Put CPU frequency back to stock, relax your RAM timings, change components like RAM sticks and PSU, check your motherboard capacitors for any signs of leakage or swelling, etc. Sooner or later you should find something that will cure the problem.

What I am saying is that your symptoms seem to suggest hardware. It is possible that it may be software/firmware but that seems less likely from what you have reported so far. I'm running around 70 machines and at one point a couple of years ago I had over 200. The vast majority of my hosts are overclocked. I see compute errors caused by hardware all the time. Almost invariably I can cure the problem by cleaning the heat sink, or improving the ventilation, or backing off on the overclock, or finding and replacing swollen caps, or servicing the fan in a PSU, etc. I rarely see problems where the app is clearly to blame.

Good luck with finding the cause of your problems!

Cheers,
Gary.
