O1Spot1TLo 1.02 computation errors

floyd
Joined: 12 Sep 11
Posts: 133
Credit: 186610495
RAC: 0
Topic 207728

All running instances of einstein_O1Spot1TLo_1.02_x86_64-pc-linux-gnu__AVX are killed with "finish file present too long" when BOINC is restarted. This is on two hosts that were running 1.00 or 1.01 before. Debian jessie + BOINC 7.6.33 from backports.
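For readers unfamiliar with the message: taken at face value, "finish file present too long" suggests that the client found a marker file saying the app had finished, while the app process itself stayed around longer than some limit. The Python sketch below only illustrates that kind of age check. It is not BOINC's code; the function name, the marker path, and the 10-second limit are all assumptions made up for this example.

```python
import os
import time

# Hypothetical limit, chosen for illustration only (not a BOINC constant).
FINISH_FILE_TIMEOUT_S = 10.0


def finish_file_too_old(path, now=None, timeout_s=FINISH_FILE_TIMEOUT_S):
    """Return True if the marker file exists and is older than timeout_s seconds."""
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # No marker file: the app has not signalled that it finished.
        return False
    return (now - mtime) > timeout_s
```

A supervising process could poll such a check while the worker is still alive and treat a True result as an error, which would match the symptom described here.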

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

I also had one (Hi freq task).

https://einsteinathome.org/task/646634804

That happened yesterday after I rebooted that host. It had been running only one of those tuning searches at a time. That specific task had been running for almost 7 hours. I shut down BOINC and even waited about ten seconds for any write operations to complete before I hit reboot.

After the reboot, I let BOINC start working again and it said something like the files of that task do not exist... and a fresh task began from the beginning.

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 188569457
RAC: 170522

I looked at your tasks and the problem seems to be that when you restart BOINC the science app gets sent a SIGTERM, which aborts the computation. Your client shouldn't send a SIGTERM in the first place. We didn't change how the science app reacts to this kind of signal between 1.01 and 1.02.

A possible explanation is that the app takes too long to finish normally and the client tries to expedite this by sending a SIGTERM.
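To illustrate the general idea of an app that finishes cleanly on SIGTERM instead of being aborted, here is a minimal Python sketch. It is not the Einstein@Home science app or anything from the BOINC API; the class `GracefulStop`, the function `run_work_units`, and the checkpoint callback are hypothetical names introduced only for this example.

```python
import signal


class GracefulStop:
    """Records that SIGTERM arrived so the main loop can exit at a safe point."""

    def __init__(self):
        self.requested = False

    def install(self):
        # Replace the default SIGTERM action (terminate) with our handler.
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame):
        self.requested = True


def run_work_units(total_units, checkpoint, stop):
    """Process units until done or until SIGTERM; return the number completed."""
    stop.install()
    done = 0
    while done < total_units and not stop.requested:
        done += 1          # stand-in for one unit of real computation
        checkpoint(done)   # persist progress before checking for a stop request
    return done
```

The design choice in this sketch is that the handler only sets a flag, and the loop checks it after each checkpoint, so a stop request never interrupts a half-written state.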

We are investigating a possible build-system-related change between 1.01 and 1.02.

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 188569457
RAC: 170522

A version 1.03 for Linux that addresses this problem will be released soon.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

I now see version 1.04 on Linux.

floyd
Joined: 12 Sep 11
Posts: 133
Credit: 186610495
RAC: 0

Everything is back to normal with 1.04.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

I see a couple of these errors on v1.05 - see here after a daily restart.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

That's a different app though (Gamma-ray pulsar binary search) ...

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Good point! I have started a separate thread; hopefully the fix will be exactly the same.
