The last 12 tasks that I've run were all flagged as receiving a client error, some of them after running for a very long time. What's going on? I'm running SETI@home along with Einstein and am not having this problem with their tasks. This started happening within the past month and I'm at a loss to explain the cause or what to do to fix it. My apologies for this post being identical to the one in Cafe Einstein, but I'd like to get back to contributing.
Copyright © 2024 Einstein@Home. All rights reserved.
Why all the errors?
I presume you have noticed these errors by looking at your list of tasks for your computer. If you observe that list, you will see that each listed task has a task ID which itself is a clickable link. If you follow the link you will get to see the error messages returned by your client to the server with each failed task. For example, here is part of the error message for the most recent failure visible at the time of this reply. This was for task ID 147839349.
too many exit(0)s
l]: Start of BOINC application '..\..\projects\einstein.phys.uwm.edu\einstein_S5R6_3.01_windows_intelx86_2.exe'.
Activated exception handling...
14:08:28 (2588): Can't acquire lockfile (32) - waiting 35s
14:09:03 (2588): Can't acquire lockfile (32) - exiting
14:09:03 (2588): Error: The process cannot access the file because it is being used by another process. (0x20)
The basic error message is that the science app exited normally (i.e. claimed it had finished processing) too many times. There is a limit of 100 exit(0)s (I think) before BOINC gives up. When any process exits, it sets an exit code, which is a number representing the status of the process when it exited. Zero means a normal exit. If that number is anything but zero, the process has exited abnormally and the number returned is an indication of what went wrong or what caused the abnormal exit. So ostensibly, the science app is claiming that it was finished and exited normally, whereas BOINC knows (through other indicators) that it really wasn't finished and so instructs the app to try again. After 100 such attempts, BOINC is programmed to give up and return the client error.
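To make the retry logic described above concrete, here is a minimal Python sketch of a supervisor that reruns a child process as long as it exits with code 0 without having finished. This is not BOINC's actual code: the function name `run_until_done`, the `is_finished` callback and the return strings are made up for illustration, and the limit of 100 is only the figure guessed at above.

```python
import subprocess
import sys


def run_until_done(cmd, is_finished, max_zero_exits=100):
    """Rerun cmd while it exits with code 0 but the work is not finished.

    is_finished is a hypothetical callback standing in for the "other
    indicators" the client uses to decide whether the task really completed.
    """
    zero_exits = 0
    while True:
        code = subprocess.run(cmd).returncode
        if code != 0:
            # A nonzero exit code signals an abnormal exit.
            return f"failed with exit code {code}"
        if is_finished():
            return "finished"
        # Exit code 0, but the work is not done: count it and try again.
        zero_exits += 1
        if zero_exits >= max_zero_exits:
            return "too many exit(0)s"


if __name__ == "__main__":
    # A child that always claims success without doing any work.
    child = [sys.executable, "-c", "import sys; sys.exit(0)"]
    print(run_until_done(child, is_finished=lambda: False, max_zero_exits=3))
```

With a child that always exits 0 and never finishes, the loop gives up once the limit is reached, which is the situation reported for the failed tasks.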
The science app itself gives a further clue to the real problem by telling us that it couldn't acquire a lockfile (i.e. exclusive access to a required file), even after waiting for 35 seconds, which is why it decided to exit with a zero exit status. It also reported that a file it needed to access was already being accessed (and therefore was locked) by another process. If you continue browsing the error output for the above task you will find approximately 100 iterations of the above sequence of messages about not being able to acquire a lock on a file.
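The "wait, then give up" behaviour in those log lines can be sketched roughly as follows. This is only an illustration of the general lockfile pattern, assuming an exclusive-create lock file; the science app's real locking mechanism is not shown in the log, and the names `acquire_lockfile` and `release_lockfile` are hypothetical.

```python
import os
import tempfile
import time


def acquire_lockfile(path, wait_seconds=35.0, poll_seconds=1.0):
    """Try to create path exclusively; return a file descriptor or None.

    Returns None if another holder still has the lock after wait_seconds.
    """
    deadline = time.monotonic() + wait_seconds
    while True:
        try:
            # O_EXCL makes the call fail if the file already exists.
            return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return None
            time.sleep(poll_seconds)


def release_lockfile(fd, path):
    """Close and delete the lock file so another process can take it."""
    os.close(fd)
    os.remove(path)


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "lockfile")
    first = acquire_lockfile(lock_path, wait_seconds=0.2, poll_seconds=0.05)
    # While the first holder keeps the lock, a second attempt times out.
    second = acquire_lockfile(lock_path, wait_seconds=0.2, poll_seconds=0.05)
    print("second attempt got the lock:", second is not None)
    release_lockfile(first, lock_path)
```

If something else (for example a scanner) keeps the file held for longer than the wait, every attempt ends the same way, which would match the repeated messages in the task output.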
This situation has come up a number of times recently and in a lot of cases it is due to an over-zealous anti-virus system locking some of BOINC's files. The solution is to exclude the BOINC tree from being scanned by your anti-virus system.
Can you please advise exactly what anti-virus system you are using and if you have made any changes recently that might have triggered this spate of errors? It could be a different cause but anti-virus activity is probably the most likely so let's rule that in or out first before looking for other possible causes.
Cheers,
Gary.
I'm using Kaspersky Internet
I'm using Kaspersky Internet Security 2010 which thus far is proving to be something of a pain in the arse. I got it because I'd heard it was less of a resource hog than Norton 360 but from several recent experiences I'm prepared to throw both of them out and look elsewhere. If anyone can give me specific information on how to exempt BOINC I'd be more than happy to do so; right now I don't have time to go digging through the rather sizable user manual trying to figure out what needs to be done.
RE: I'm using Kaspersky
If you do a forum search (top left corner of this page) for 'Kaspersky', you'll find this thread with a quite detailed description of what to do.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
I've made a few tweaks, now
I've made a few tweaks, now to see what happens; thanks for the link (though the description given there didn't exactly match what I see under Kaspersky Internet Security 2010). Now I'm not getting any new work (says no jobs available) but that is preceded by a message I don't recall seeing before; it says "reached daily quota of 1 tasks". Wasn't aware of quotas, what is the significance of that?
RE: I've made a few tweaks,
It's a safety feature in BOINC. If you hadn't noticed that your computer was producing nothing but errors, and come here to ask for help (and congratulations on doing that - so many people don't) - your machine could have run wild and caused damage to the Einstein project (and your pocket if you have to pay bandwidth surcharges) by continually downloading replacement workunits.
When you start returning errors, BOINC gradually cuts you down until you reach that bottom limit of 1 task per day. If your Kaspersky fix works, and you're able to return tomorrow's task without errors, you'll immediately be allowed to download another one, and quota will quickly be restored to normal (16 per day) as you return successful results.
You can always see the current value for your quota by looking at your computer details page on this website, but you can't change it except by returning good work.
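As a rough picture of how such a quota might move, here is a small Python sketch. The exact rule is not stated in this thread, so the update used here (subtract one per error, double per success) is an assumption for illustration; only the floor of 1 and the normal value of 16 per day come from the description above. The function name `update_quota` is hypothetical.

```python
def update_quota(quota, success, normal_quota=16, floor=1):
    """Return the new daily quota after one returned task.

    Assumed rule (not confirmed by the thread): an error lowers the quota
    by one down to the floor, a success doubles it up to the normal value.
    """
    if success:
        return min(quota * 2, normal_quota)
    return max(quota - 1, floor)


if __name__ == "__main__":
    quota = 16
    for _ in range(20):          # a long run of failed tasks
        quota = update_quota(quota, success=False)
    print("after errors:", quota)
    for _ in range(5):           # then a run of good results
        quota = update_quota(quota, success=True)
        print("after success:", quota)
```

Under this assumed rule a host stuck at the floor needs only a handful of good results to get back to the normal quota.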
You can also get the restarts
You can also get the restarts from IBERCIVIS, DrugDiscovery and Hydrogen ... if you happen to be running them and have tasks from them ...
RE: ... (says no jobs
Oh really? When was the last time you got this error message? This message usually comes from a bug that I thought we fixed.
BM
RE: You can always see the
Can't you reset it by detaching from and reattaching to (reinstalling?) the project? I remember doing so a few years ago when Akos's optimized app was able to deplete my daily quota in slightly less than one day.
Not anymore with the present
Not anymore with the present server software and clients. You usually get the same hostID you had before, unless you completely delete your BOINC Data directory and everything in it (and even then it's possible you get the same hostID!), or install a whole new CPU (think Intel -> AMD) or OS (Windows -> Linux).