A reportable glitch, or just one of those things ... ?

Richard de Lhorbe
Joined: 15 Dec 05
Posts: 43
Credit: 9275812259
RAC: 659445
Topic 225124

This is a new one for me.

Got this download today, work unit 535203277

 

Task ID: LATeah4001L00_1204.0_0_0.0_12061535_1
Workunit ID: 535203277
Sent: 28 Mar 2021 22:32:14 UTC
Reported or deadline: 28 Mar 2021 22:36:44 UTC
Status: Timed out - no response
Run time: 0
CPU time: 0
Granted credit: 0
Application: Gamma-ray pulsar binary search #1 on GPUs v1.17 () x86_64-apple-darwin

Normally, you are given two weeks to process a WU, but in this case the time allowed was only four and a half minutes: not enough time to process or even start it, so of course it timed out. On this particular computer I currently keep only about 20 WUs in my queue, and average turnaround time is less than two hours, so this error jumped out.

I have never seen this type of glitch before, so I am not sure whether anyone at E@H is interested in it, or whether such things happen on occasion and we just live with it.

Regards

Richard

 

 

Keith Myers
Joined: 11 Feb 11
Posts: 4754
Credit: 17704376071
RAC: 5330977

Yes, it is common and happens frequently on all projects. The scheduler issued a backup task right at the same time as the original was reported and the two tasks "passed in the night" in the database.

So, as soon as it realized the original task was returned it either issues a fast cancellation or more commonly resets the deadline for an instant expiration.  Depends on the project which method is more common.

Your fast-deadline method was more common at Seti and here at Einstein, while the fast cancellation happens all the time at Universe on my hosts. It depends on the version of the BOINC server code.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3716
Credit: 34669989744
RAC: 26112005

Keith Myers wrote:

Yes, it is common and happens frequently on all projects. The scheduler issued a backup task right at the same time as the original was reported and the two tasks "passed in the night" in the database.

So, as soon as it realized the original task was returned it either issues a fast cancellation or more commonly resets the deadline for an instant expiration.  Depends on the project which method is more common.

That doesn't look to be the case here if you look at the timing of the hosts' activity for this WU. The _1 task timed out before the results of the original _0 task were received. The OP had the other half of the original work, so none of this should have happened anyway: this task was required for validation. Even if the server had received _0 back, it should still be waiting for _1 and would not have what it needs to send a cancellation request. Cancellation requests are also labelled differently: they say "Cancelled by server", not "Timed out".

 

Chronologically:

  • 28 Mar 2021 22:29:09 UTC - _0 sent out
  • 28 Mar 2021 22:32:14 UTC - _1 sent out with a 4.5 min deadline
  • 28 Mar 2021 22:36:44 UTC - _1 timed out, deadline expired
  • 28 Mar 2021 22:38:17 UTC - _2 sent out
  • 29 Mar 2021 08:36:05 UTC - _0 results received
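
As a quick sanity check on the timeline above, here is a minimal Python sketch that does the arithmetic. The timestamps are copied from the list; the dictionary keys and the `parse_utc` helper are just illustrative names, and the month abbreviation parsing assumes an English locale. It is not anything the BOINC server itself runs.

```python
from datetime import datetime, timezone

# Timestamp layout used on the task pages, e.g. "28 Mar 2021 22:32:14" (UTC).
FMT = "%d %b %Y %H:%M:%S"


def parse_utc(stamp):
    """Parse a task-page timestamp and mark it as UTC (hypothetical helper)."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)


# Events from the chronology above; the keys are illustrative labels only.
events = {
    "_0 sent": parse_utc("28 Mar 2021 22:29:09"),
    "_1 sent": parse_utc("28 Mar 2021 22:32:14"),
    "_1 deadline": parse_utc("28 Mar 2021 22:36:44"),
    "_2 sent": parse_utc("28 Mar 2021 22:38:17"),
    "_0 received": parse_utc("29 Mar 2021 8:36:05"),
}

# How long the _1 task was actually given before its deadline.
deadline_window = events["_1 deadline"] - events["_1 sent"]

# Did _1 expire before the server had the _0 result in hand?
timed_out_before_original_returned = events["_1 deadline"] < events["_0 received"]

print("Time allowed for _1:", deadline_window)
print("_1 expired before _0 was returned:", timed_out_before_original_returned)
```

The window works out to 4 minutes 30 seconds, and the _1 deadline passed roughly ten hours before the _0 result came back, so the server had no returned result to react to when _1 expired.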

 

_________________________________________________________________________

Keith Myers
Joined: 11 Feb 11
Posts: 4754
Credit: 17704376071
RAC: 5330977

OK, not exactly as I described.  But we saw exactly the same thing all the time at Seti.

Thus my response that this is nothing to worry about.  It is a BOINC scheduler server code problem that has existed for years and has never been fixed, though it has been reported often.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3716
Credit: 34669989744
RAC: 26112005

yeah I agree it's a common glitch.

_________________________________________________________________________
