computation errors on Linux and Win 10

Oliver
Joined: 22 Jul 05
Posts: 6
Credit: 918813
RAC: 0
Topic 210658

Hello,

I run BOINC mostly on Linux Mint 18.2, and also do some on a Win 10 laptop, but lately both have had several data crunching packets abort themselves early with computation errors, or get so stuck at one point that I abort them. Odd. I have had good success before on both systems, but a lot of terminations lately. Buggy packets or what? SETI is not affected and runs well on both.

Thanks, S

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Hi Oliver

If you could share some of the event log output that would be helpful.

Some of the online task logs show errors with the file JPLEPH.405, so perhaps have a look at this thread:

https://einsteinathome.org/content/md5-check-failed-1

Oliver
Joined: 22 Jul 05
Posts: 6
Credit: 918813
RAC: 0

yup, it just did it again: I put a few SETI notes in to show that BOINC is functioning well with SETI, but keeps crumping on E@H.

Thanks much. This is Linux Mint 18.2

Thu 02 Nov 2017 01:53:23 AM MDT | SETI@home | Computation for task 12fe07ac.7585.24205.4.31.22_1 finished
Thu 02 Nov 2017 01:53:23 AM MDT | SETI@home | Starting task 13ap08ab.31624.24203.11.38.236_0
Thu 02 Nov 2017 01:53:25 AM MDT | SETI@home | Started upload of 12fe07ac.7585.24205.4.31.22_1_r1499153236_0
Thu 02 Nov 2017 01:53:28 AM MDT | SETI@home | Finished upload of 12fe07ac.7585.24205.4.31.22_1_r1499153236_0
Thu 02 Nov 2017 01:53:33 AM MDT | SETI@home | Sending scheduler request: To report completed tasks.
Thu 02 Nov 2017 01:53:33 AM MDT | SETI@home | Reporting 1 completed tasks
Thu 02 Nov 2017 01:53:33 AM MDT | SETI@home | Not requesting tasks: "no new tasks" requested via Manager
Thu 02 Nov 2017 01:53:35 AM MDT | SETI@home | Scheduler request completed
Thu 02 Nov 2017 01:56:47 PM MDT | Einstein@Home | update requested by user
Thu 02 Nov 2017 01:56:49 PM MDT | Einstein@Home | Sending scheduler request: Requested by user.
Thu 02 Nov 2017 01:56:49 PM MDT | Einstein@Home | Requesting new tasks for CPU
Thu 02 Nov 2017 01:56:51 PM MDT | Einstein@Home | Scheduler request completed: got 2 new tasks
Thu 02 Nov 2017 01:56:53 PM MDT | Einstein@Home | Started download of LATeah0007F.dat
Thu 02 Nov 2017 01:56:58 PM MDT | Einstein@Home | Finished download of LATeah0007F.dat
Thu 02 Nov 2017 04:09:12 PM MDT | SETI@home | project suspended by user
Thu 02 Nov 2017 04:09:13 PM MDT | Einstein@Home | Starting task LATeah0007F_1352.0_191575_0.0_0
Thu 02 Nov 2017 04:09:13 PM MDT | Einstein@Home | Starting task LATeah0007F_1160.0_207612_0.0_2
Thu 02 Nov 2017 04:09:31 PM MDT | Einstein@Home | Computation for task LATeah0007F_1352.0_191575_0.0_0 finished
Thu 02 Nov 2017 04:09:31 PM MDT | Einstein@Home | Output file LATeah0007F_1352.0_191575_0.0_0_0 for task LATeah0007F_1352.0_191575_0.0_0 absent
Thu 02 Nov 2017 04:09:31 PM MDT | Einstein@Home | Output file LATeah0007F_1352.0_191575_0.0_0_1 for task LATeah0007F_1352.0_191575_0.0_0 absent
Thu 02 Nov 2017 04:09:32 PM MDT | Einstein@Home | Computation for task LATeah0007F_1160.0_207612_0.0_2 finished
Thu 02 Nov 2017 04:09:32 PM MDT | Einstein@Home | Output file LATeah0007F_1160.0_207612_0.0_2_0 for task LATeah0007F_1160.0_207612_0.0_2 absent
Thu 02 Nov 2017 04:09:32 PM MDT | Einstein@Home | Output file LATeah0007F_1160.0_207612_0.0_2_1 for task LATeah0007F_1160.0_207612_0.0_2 absent
Thu 02 Nov 2017 04:10:07 PM MDT | SETI@home | project resumed by user
Thu 02 Nov 2017 04:10:16 PM MDT | Einstein@Home | update requested by user
Thu 02 Nov 2017 04:10:21 PM MDT | Einstein@Home | Sending scheduler request: Requested by user.
Thu 02 Nov 2017 04:10:21 PM MDT | Einstein@Home | Reporting 2 completed tasks
Thu 02 Nov 2017 04:10:21 PM MDT | Einstein@Home | Requesting new tasks for CPU
Thu 02 Nov 2017 04:10:24 PM MDT | Einstein@Home | Scheduler request completed: got 0 new tasks
Thu 02 Nov 2017 04:10:24 PM MDT | Einstein@Home | No work sent
Thu 02 Nov 2017 04:10:24 PM MDT | Einstein@Home | (reached daily quota of 2 tasks)
Thu 02 Nov 2017 04:10:24 PM MDT | Einstein@Home | Project has no jobs available

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5888
Credit: 119637537612
RAC: 25031697

Unfortunately, there is nothing in the event log that tells you why this is happening.  All you can really see is that two tasks started (one per core), attempted to crunch for a very short time, and then were seen to have no output to return.  This is a classic sign of an early compute error before any output files were created.  You have had quite a few of these because the event log mentions that you have reached your (heavily reduced) daily limit - just one task per CPU core.  Each error task reduces your daily quota by one in order to stop a runaway trashing of tasks.
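As a rough illustration of the pattern described above, here is a minimal Python sketch that scans event log lines for "Output file ... absent" messages and groups them by task. It assumes the "timestamp | project | message" layout seen in the log posted earlier in this thread. The function name `tasks_with_absent_output` is made up for this example and is not part of BOINC.

```python
import re
from collections import defaultdict

# Matches the message text seen in the posted event log, e.g.
# "Output file <file> for task <task> absent"
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def tasks_with_absent_output(lines):
    """Group missing output files by (project, task name).

    Assumes each log line looks like "timestamp | project | message".
    Lines that do not fit that layout are skipped.
    """
    missing = defaultdict(list)
    for line in lines:
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        _timestamp, project, message = parts
        match = ABSENT.search(message)
        if match:
            output_file, task = match.group(1), match.group(2)
            missing[(project, task)].append(output_file)
    return dict(missing)
```

Feeding it the lines copied from the BOINC Manager event log should list each failed task once, with the output files that never got created. It only tells you which tasks died early, not why; for that you still need the stderr output.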

To get more information on the cause of the problem, it is best to consult the stderr information that is returned to the project for every task, good or bad.  The following gives you the task names involved:

Quote:

Thu 02 Nov 2017 04:09:13 PM MDT | Einstein@Home | Starting task LATeah0007F_1352.0_191575_0.0_0
Thu 02 Nov 2017 04:09:13 PM MDT | Einstein@Home | Starting task LATeah0007F_1160.0_207612_0.0_2

If you go to your account page and click on the link for the computer having the problem, you will find a details page with a link to all the tasks for that machine.  Click on the tasks link to open a list of all tasks (currently 14).  The two latest tasks (names listed above) are down at the bottom of the page.  Click on the taskID for either of those tasks and you can see exactly what was returned to the server.  Here is a snippet from the full stderr output that is most relevant.

Quote:

% Sky point 1/79
dpleph[initephem]: Cannot open file ../../projects/einstein.phys.uwm.edu/JPLEPH.405, result = 107
Error while initializing ephemeris; status: 104
BarycenterSingleDate failed! Returned time: 0+-0.000000
barycenter_photons() failed!
16:09:18 (20782): [CRITICAL]: ERROR: MAIN() returned with error '4'

I would guess (you would need one of the Devs to tell you for sure) that the standard ephemeris file JPLEPH.405 is not readable by the app.  If the file wasn't present, I imagine BOINC wouldn't have allowed the task to attempt to start in the first place.  If the file was corrupt, I would expect an MD5 checksum error rather than a 'cannot open' error.  You should check for yourself by looking in the Einstein project directory to see if the file is there.  I think it will be, but I'm wondering if it has the wrong ownership or permissions.  Is the file readable by the user ID running the app?
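If you have saved the stderr text from the task page, a small Python sketch like the one below can pull out which file the app could not open and the result code it reported. The pattern is based only on the single "Cannot open file" line quoted above, so treat it as an assumption about the message format rather than anything official. The function name is hypothetical.

```python
import re

# Based on the one line quoted from stderr:
# "dpleph[initephem]: Cannot open file <path>, result = <code>"
CANNOT_OPEN = re.compile(r"Cannot open file (\S+), result = (\d+)")


def find_unopenable_files(stderr_text):
    """Return a list of (path, result_code) pairs found in the stderr text."""
    return [
        (match.group(1), int(match.group(2)))
        for match in CANNOT_OPEN.finditer(stderr_text)
    ]
```

The path it reports is relative to the slot directory the task ran in, so it still has to be matched up by hand with the file in the project directory.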

If it is wrong ownership/permissions, just correct them to what they should be.  See if you can work out how this was caused so you don't let it happen again :-).   In your case, is everything supposed to be owned by a 'boinc' user ID and run from /var/lib/boinc?  I run everything from my home dir under my own user ID which I find a lot easier for when I want to tweak stuff.  If it's not an ownership/permissions problem, I don't know what else to suggest.
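For anyone wanting to check this without digging through `ls -l` output, here is a minimal Python sketch that reports whether a file exists, who owns it, its permission bits, and whether the current user can read it. The path is passed on the command line because the location of the BOINC data directory differs between installs. The function name `describe_file` is made up for this example. Note that the readability check applies to whoever runs the script, so it would need to be run as the same user ID that runs the science app to answer the question properly.

```python
import os
import stat
import sys
from pathlib import Path


def describe_file(path):
    """Return a dict with existence, ownership, mode and readability info."""
    p = Path(path)
    info = {"path": str(p), "exists": p.exists()}
    if not info["exists"]:
        return info
    st = p.stat()
    info["mode"] = stat.filemode(st.st_mode)  # e.g. "-rw-r--r--"
    info["uid"] = st.st_uid                   # numeric owner ID
    info["gid"] = st.st_gid                   # numeric group ID
    info["size"] = st.st_size                 # size in bytes
    # True if the user running this script may read the file
    info["readable_by_me"] = os.access(p, os.R_OK)
    return info


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python check_file.py /path/to/projects/einstein.phys.uwm.edu/JPLEPH.405")
    else:
        for key, value in describe_file(sys.argv[1]).items():
            print(f"{key}: {value}")
```

If `exists` is true but `readable_by_me` is false when run as the BOINC user, that would point to the ownership/permissions problem suspected above.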

 Good luck with sorting it out.

Cheers,
Gary.

Oliver
Joined: 22 Jul 05
Posts: 6
Credit: 918813
RAC: 0

Just for fun, I joined Asteroids@Home, and BOINC is crunching and returning those just fine.

That would lead me to suspect something is goofy in the E@H packets. Which is too bad, 'cuz I love the E@H screensaver. Asteroids doesn't have one.

S

Oliver
Joined: 22 Jul 05
Posts: 6
Credit: 918813
RAC: 0

OK, when all else fails, detach/reset project, reboot, re-attach to project, and guess what? E@H runs now. I guess a re-download of E@H specific items restored whatever was busted.

Good ol' uninstall/reinstall. How could I have forgotten you?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5888
Credit: 119637537612
RAC: 25031697

Did you happen to check the ownership and permissions for the JPLEPH.405 file?

It would be nice to know if that was the cause of the problem.

Cheers,
Gary.
