Why all the errors?

Michael Murray
Joined: 13 Jun 09
Posts: 5
Credit: 17083
RAC: 0
Topic 194630

The last 12 tasks that I've run were all flagged as receiving a client error, some of them after running a very long time. What's going on? I'm running seti@home along with Einstein and am not having this problem with their tasks. This has started happening within the past month and I'm at a loss to explain the cause or what to do to fix it. My apologies for this post being identical to the one in Cafe Einstein but I'd like to get back to contributing.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109381476167
RAC: 35970949

Why all the errors?

Quote:
The last 12 tasks that I've run were all flagged as receiving a client error, some of them after running a very long time. What's going on? ...


I presume you noticed these errors by looking at the list of tasks for your computer. In that list, each task has a task ID which is itself a clickable link. If you follow the link you will see the error messages your client returned to the server with each failed task. For example, here is part of the error message for the most recent failure visible at the time of this reply. This was for task ID 147839349.

6.6.36

too many exit(0)s

l]: Start of BOINC application '..\..\projects\einstein.phys.uwm.edu\einstein_S5R6_3.01_windows_intelx86_2.exe'.
Activated exception handling...
14:08:28 (2588): Can't acquire lockfile (32) - waiting 35s
14:09:03 (2588): Can't acquire lockfile (32) - exiting
14:09:03 (2588): Error: The process cannot access the file because it is being used by another process. (0x20)


The basic error message is that the science app exited normally (i.e. claimed it had finished processing) too many times - there is a limit of 100 exit(0)s (I think) before BOINC gives up. When any process exits, it sets an exit code, a number representing the status of the process when it exited. If that number is anything but zero, the process has exited abnormally and the number indicates what went wrong or what caused the abnormal exit. So ostensibly, the science app is claiming that it was finished and exited normally, whereas BOINC knows (through other indicators) that it really wasn't finished and so instructs the app to try again. After 100 such attempts, BOINC is programmed to give up and return the client error.
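To make the restart logic described above concrete, here is a minimal Python sketch. It is not BOINC's actual code: `run_until_done`, `is_finished` and `MAX_PREMATURE_EXITS` are hypothetical names, and the limit of 100 is taken from the "(I think)" figure in this post.

```python
import subprocess
import sys

MAX_PREMATURE_EXITS = 100  # assumption: the post says the limit is 100 "(I think)"


def run_until_done(cmd, is_finished, max_premature_exits=MAX_PREMATURE_EXITS):
    """Restart cmd while it exits with status 0 before the work is finished."""
    premature_exits = 0
    while True:
        exit_code = subprocess.run(cmd).returncode
        if exit_code != 0:
            # Non-zero status: the process itself reports an abnormal exit.
            return f"abnormal exit ({exit_code})"
        if is_finished():
            # Exit status 0 and the work really is complete.
            return "success"
        # Exit status 0, but other indicators say the work is not done.
        premature_exits += 1
        if premature_exits >= max_premature_exits:
            return "too many exit(0)s"


if __name__ == "__main__":
    # A stand-in "science app" that always exits 0 without finishing.
    quits_early = [sys.executable, "-c", "raise SystemExit(0)"]
    print(run_until_done(quits_early, is_finished=lambda: False, max_premature_exits=3))
```

The point of the sketch is only that a zero exit status on its own is not trusted; a separate "is the work really finished" check decides whether to restart or to count another premature exit.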

The science app itself gives a further clue to the real problem by telling us that it couldn't acquire a lockfile (i.e. exclusive access to a required file), even after trying for 35 seconds, which is why it decided to exit with a zero exit status. It also reported that a file it needed was already being accessed (and was therefore locked) by another process. If you continue browsing the error output for the above task you will find approximately 100 iterations of the above sequence of messages about not being able to acquire a lock on a file.
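As a rough illustration of "wait for a lock, then give up", here is a small Python sketch. It is an assumption-laden stand-in, not the science app's code: the real failure in the log is a Windows sharing violation (error 32), whereas this sketch uses exclusive file creation as the lock, and `acquire_lockfile` and `release_lockfile` are hypothetical helper names. Only the 35 second wait comes from the log above.

```python
import os
import time


def acquire_lockfile(path, wait_seconds=35.0, poll_interval=1.0):
    """Try to create path exclusively; return a file descriptor, or None on timeout."""
    deadline = time.monotonic() + wait_seconds
    while True:
        try:
            # O_EXCL makes creation fail if another holder already created the file.
            return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return None  # the caller decides what to do, e.g. exit
            time.sleep(poll_interval)


def release_lockfile(fd, path):
    """Close and delete the lockfile so the next caller can acquire it."""
    os.close(fd)
    os.remove(path)
```

If something else (such as a scanner) keeps the file held for longer than the wait, every attempt ends the same way, which matches the repeated pattern in the task's error output.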

This situation has come up a number of times recently and in a lot of cases it is due to an over-zealous anti-virus system locking some of BOINC's files. The solution is to exclude the BOINC tree from being scanned by your anti-virus system.

Can you please advise exactly what anti-virus system you are using and if you have made any changes recently that might have triggered this spate of errors? It could be a different cause but anti-virus activity is probably the most likely so let's rule that in or out first before looking for other possible causes.

Cheers,
Gary.

Michael Murray
Joined: 13 Jun 09
Posts: 5
Credit: 17083
RAC: 0

I'm using Kaspersky Internet Security 2010 which thus far is proving to be something of a pain in the arse. I got it because I'd heard it was less of a resource hog than Norton 360 but from several recent experiences I'm prepared to throw both of them out and look elsewhere. If anyone can give me specific information on how to exempt BOINC I'd be more than happy to do so; right now I don't have time to go digging through the rather sizable user manual trying to figure out what needs to be done.

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

RE: I'm using Kaspersky

Message 95542 in response to message 95541

Quote:
I'm using Kaspersky Internet Security 2010 which thus far is proving to be something of a pain in the arse. I got it because I'd heard it was less of a resource hog than Norton 360 but from several recent experiences I'm prepared to throw both of them out and look elsewhere. If anyone can give me specific information on how to exempt BOINC I'd be more than happy to do so; right now I don't have time to go digging through the rather sizable user manual trying to figure out what needs to be done.


If you do a forum search (top left corner of this page) for 'Kaspersky', you'll find this thread with a quite detailed description of what to do.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Michael Murray
Joined: 13 Jun 09
Posts: 5
Credit: 17083
RAC: 0

I've made a few tweaks, now to see what happens; thanks for the link (though the description given there didn't exactly match what I see under Kaspersky Internet Security 2010). Now I'm not getting any new work (says no jobs available) but that is preceded by a message I don't recall seeing before; it says "reached daily quota of 1 tasks". Wasn't aware of quotas, what is the significance of that?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2752648655
RAC: 1487008

RE: I've made a few tweaks,

Message 95544 in response to message 95543

Quote:
I've made a few tweaks, now to see what happens; thanks for the link (though the description given there didn't exactly match what I see under Kaspersky Internet Security 2010). Now I'm not getting any new work (says no jobs available) but that is preceded by a message I don't recall seeing before; it says "reached daily quota of 1 tasks". Wasn't aware of quotas, what is the significance of that?


It's a safety feature in BOINC. If you hadn't noticed that your computer was producing nothing but errors, and come here to ask for help (and congratulations on doing that - so many people don't) - your machine could have run wild and caused damage to the Einstein project (and your pocket if you have to pay bandwidth surcharges) by continually downloading replacement workunits.

When you start returning errors, BOINC gradually cuts you down until you reach that bottom limit of 1 task per day. If your Kaspersky fix works, and you're able to return tomorrow's task without errors, you'll immediately be allowed to download another one, and quota will quickly be restored to normal (16 per day) as you return successful results.

You can always see the current value for your quota by looking at your computer details page on this website, but you can't change it except by returning good work.
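The quota behaviour described in this post can be sketched in a few lines of Python. The exact rule is an assumption made for illustration (one step down per error, doubling per good result); only the floor of 1 and the normal value of 16 per day come from the post, and `update_quota` is a hypothetical name, not a BOINC function.

```python
MIN_QUOTA = 1   # bottom limit mentioned in the post
MAX_QUOTA = 16  # "normal" daily quota mentioned in the post


def update_quota(quota, task_succeeded):
    """Return the new daily quota after one returned task (illustrative rule only)."""
    if task_succeeded:
        # Assumed recovery rule: double on each good result, capped at the normal value.
        return min(MAX_QUOTA, quota * 2)
    # Assumed penalty rule: lose one task per day for each error, never below the floor.
    return max(MIN_QUOTA, quota - 1)


if __name__ == "__main__":
    quota = MAX_QUOTA
    for _ in range(20):      # a run of failed tasks
        quota = update_quota(quota, task_succeeded=False)
    print("after errors:", quota)
    for _ in range(4):       # a few good results in a row
        quota = update_quota(quota, task_succeeded=True)
    print("after successes:", quota)
```

Under these assumed rules a long run of errors drifts down to the floor slowly, while a handful of good results restores the full quota, which is the "gradually cut down, quickly restored" shape described above.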

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

You can also get the restarts from IBERCIVIS, DrugDiscovery and Hydrogen ... if you happen to be running them and have tasks from them ...

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921643
RAC: 16886

RE: ... (says no jobs

Message 95546 in response to message 95543

Quote:
... (says no jobs available) ...

Oh really? When was the last time you got this error message? This message usually comes from a bug that I thought we fixed.

BM


DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 1580

RE: You can always see the

Message 95547 in response to message 95544

Quote:

You can always see the current value for your quota by looking at your computer details page on this website, but you can't change it except by returning good work.

Can't you reset it by detaching from and reattaching to (reinstalling?) the project? I remember doing so a few years ago when Akos's optimized app was able to deplete my daily quota in slightly less than one day.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Not anymore with the present

Message 95548 in response to message 95547

Not anymore with the present server software and clients. You usually get the same hostID you had before, unless you completely delete your BOINC Data directory and everything in it (and even then it's possible you get the same hostID!), or install a whole new CPU (think Intel -> AMD) or OS (Windows -> Linux).
