timed out - no response

merle van osdol
Joined: 1 Mar 05
Posts: 513
Credit: 60724446
RAC: 0
Topic 197787

I just had a work unit called an error.

My workunit was sent out to me just six minutes prior to the time that the workunit was reported as valid by someone else. Any recourse on these types of situations?

merle

What is freedom of expression? Without the freedom to offend, it ceases to exist.

— Salman Rushdie

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

timed out - no response

If you're talking about WU#201192907 then the following seems to have happened:

14 Oct 2014 17:50:49 UTC task#460649709 was sent out to the first host
14 Oct 2014 18:22:41 UTC task#460649710 was sent out to the second host
16 Oct 2014 5:55:16 UTC task#460649710 reported an error so a new task was sent out to a third host
28 Oct 2014 17:50:49 UTC the deadline for task#460649709 was up, so a new task got sent to the fourth host (you) at 28 Oct 2014 17:56:55 UTC. Then the host with task#460649709 reported in at 28 Oct 2014 18:02:50 UTC, 12m01s past the deadline, and its result validated against the result from the third host.
Then on 7 Nov 2014 18:38:40 UTC your task missed its deadline.
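
The gaps in that timeline can be double-checked with a few lines of Python. This is only a sketch: the timestamps are copied from the list above, and the helper and variable names are made up for illustration. Parsing the month abbreviation with `%b` assumes an English (or C) locale.

```python
from datetime import datetime, timezone

FMT = "%d %b %Y %H:%M:%S"

def utc(stamp):
    """Parse a timestamp such as '28 Oct 2014 17:50:49' as UTC."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)

first_deadline = utc("28 Oct 2014 17:50:49")  # deadline of task#460649709
resent_to_you = utc("28 Oct 2014 17:56:55")   # replacement task sent to the fourth host
late_report = utc("28 Oct 2014 18:02:50")     # first host finally reports in

# How late the first host was, and how long the replacement
# task had been out when that late result arrived.
lateness = late_report - first_deadline
head_start = late_report - resent_to_you

print("reported past the deadline by:", lateness)
print("replacement task had been out for:", head_start)
```

The second figure comes out just under six minutes, which matches the "six minutes" mentioned in the original question.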

What I thought would happen is that, once the workunit was validated, the scheduler would use your computer's next contact with the server to tell BOINC it could abort the task if it had not started yet, so as not to waste time on a needless task. Maybe that function isn't turned on for all the searches here, I seem to remember it was some time ago but I'm not sure.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7056094931
RAC: 1606859

RE: Any recourse on these

Quote:
Any recourse on these types of situations?


Yes--adjust your general BOINC preferences to a much shorter queue length.

You did not send back the task for over a week, which is the primary issue here, not the other deadlines, task sendings, and credit rules.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109956040625
RAC: 31318718

RE: I just had a work unit

Quote:
I just had a work unit called an error.


No it wasn't :-). It was called a "Timed out - no response", which is also wrong, but it's not regarded as an "error". I don't know for sure, but I'm guessing that the highly customised but quite old server code running here doesn't quite know how to handle this situation. If you look at the sent and reported times, the task didn't exceed its 14-day deadline at all, so it certainly wasn't a normal 'deadline miss' situation. You received new work at pretty much the exact time that the task was 'timed out', so it seems likely that the work request that resulted in the 2 new tasks received at that time also invoked a response from the server to delete the task as it was no longer required.
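
As a rough check of the deadline arithmetic, here is a small Python sketch. It assumes the task was sent at 28 Oct 2014 17:56:55 UTC and marked 'timed out' at 7 Nov 2014 18:38:40 UTC (both times as quoted earlier in this thread) and that the deadline is exactly 14 days after sending. The variable names are illustrative only, and `%b` assumes an English (or C) locale.

```python
from datetime import datetime, timedelta

FMT = "%d %b %Y %H:%M:%S"

sent = datetime.strptime("28 Oct 2014 17:56:55", FMT)      # task sent to the fourth host
timed_out = datetime.strptime("7 Nov 2014 18:38:40", FMT)  # time shown for the 'timed out' state

# Assumption: the deadline is exactly 14 days after the send time.
deadline = sent + timedelta(days=14)
margin = deadline - timed_out

print("deadline:", deadline)
print("time still left when marked timed out:", margin)
```

Under those assumptions the deadline falls on 11 Nov 2014, so the task still had nearly four days left when it was marked.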

The puzzling bit is why the server took so long to recognise that the task was redundant. I'm guessing that the whole WU quorum to which the task belonged had become 'overdue for deletion' from the online database and was being held up by your redundant outstanding task. Perhaps this prompted the abort action. Whatever the reason, it wasn't an 'error' or any sort of problem for you to worry about. No crunch time was wasted on an unneeded task so all is good :-).

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4273
Credit: 245207413
RAC: 13173

RE: Maybe that function

Quote:
Maybe that function isn't turned on for all the searches here, I seem to remember it was some time ago but I'm not sure.

Einstein@Home never sends "cancel" requests for tasks to clients, e.g. where the workunits have been canceled. This is a standard feature in BOINC and is already available in the server code we use on Einstein@Home. However, turning it on puts an enormous load on our database server. Apparently it works well for a number of projects that have some 10k (or possibly even 100k) tasks in the DB, but for Einstein@Home, with 1-3M tasks and locality scheduling, this simply doesn't work, at least not in its current implementation.

BM

