Hi!
It seems that I got a WU without a deadline?
http://einsteinathome.org/workunit/218136984
It was sent early last night and has already timed out?
Screenshot of the website, since it will vanish in a month or so ...
http://abload.de/img/nodeadlineq9r0c.png
Immediate timeout? Missing deadline?
It says no response, not a deadline miss. It seems likely that the server thinks that some expected handshaking in the work assignment/download process was not detected, so it gave up on sending that one to your host and sent a copy to another host.
I recall composing a post similar to this one an hour or so ago, but don't see it, so am risking a double post.
The entry in column 4, which should show the deadline, is missing. The field is empty.
It is the same in the BOINC Manager: the deadline date is empty.
http://abload.de/img/boinco4qy2.png
So I think this is a server-oops.
Compare it with a task that has a real deadline:
http://einsteinathome.org/workunit/214259915
It shows the same entry, "Timed out - no response", but has a valid deadline date.
Today I got one of those too. The task is still running, but assuming the server won't accept the result, I think I'll abort it.
http://einsteinathome.org/workunit/220348353
The problem task was h1_0378.00_S6GC1__S6BucketFU2UBb_32310395_1
Unfortunately, the host has contacted the server again since then, and picked up another task:
It would be really interesting to catch and examine a server log for one of these immediate timeouts sometime, and try to work out what's going wrong. But you'd need to be quick about it.
Thanks for reporting this problem!
So far we have not been aware of it.
We are looking into it.
Currently we do have >1700 such tasks in the DB (send_time>0 and report_deadline=0), all of which belong to "einstein_S6BucketFU2UB", which makes me think that the reason is in the locality scheduler.
BM
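The condition quoted above (send_time>0 and report_deadline=0) can be tried out on a toy table. This is only a sketch: the table name `result`, its layout, and the sample rows are assumptions made for illustration, the column names are simply taken from the post, and the project's real database schema may differ.

```python
import sqlite3


def count_tasks_without_deadline(conn):
    """Count tasks that were sent out but carry no report deadline."""
    row = conn.execute(
        "SELECT COUNT(*) FROM result "
        "WHERE send_time > 0 AND report_deadline = 0"
    ).fetchone()
    return row[0]


# Toy in-memory table. Table name, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result ("
    "id INTEGER PRIMARY KEY, send_time INTEGER, report_deadline INTEGER)"
)
conn.executemany(
    "INSERT INTO result (send_time, report_deadline) VALUES (?, ?)",
    [
        (1300000000, 1301209600),  # sent, with a normal deadline
        (1300000500, 0),           # sent, but the deadline is missing
        (0, 0),                    # not sent yet, so not affected
        (1300000900, 0),           # sent, but the deadline is missing
    ],
)

print(count_tasks_without_deadline(conn))
```

Unsent rows also have a zero deadline in this toy data, which is why the condition checks the send time as well.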
I am about to miss my deadline on 2 tasks that have been running for 10 and 7 hrs. They have 2 and 2.75 hrs left. Due in 20min. What happens now? Do I miss the credits and end up wasting 18+hrs of CPU time?
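For reference, the size of the overrun in the question above can be worked out from the numbers given (2 and 2.75 hours of work left, deadline in 20 minutes). The helper name `overrun` is made up for this sketch, and it assumes the client's remaining-time estimates are accurate.

```python
from datetime import timedelta


def overrun(remaining: timedelta, until_deadline: timedelta) -> timedelta:
    """Return how far past the deadline a task would finish (zero if on time)."""
    late = remaining - until_deadline
    return max(late, timedelta(0))


until_deadline = timedelta(minutes=20)
for remaining in (timedelta(hours=2), timedelta(hours=2.75)):
    print(overrun(remaining, until_deadline))
```

With these figures the two tasks would finish roughly 1 h 40 min and 2 h 25 min after the deadline.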
It depends.
The system will arm itself to send out another copy to someone else, as you failed to respond in the allocated time. But it may not do so immediately, and the recipient likely will not start working on it immediately, won't finish immediately, and may not report it immediately. If you finish and report before they do, and your result has enough integrity and similarity to your first quorum partner's to validate, you'll get credit. If the third quorum partner then reports within their (later) deadline, they also get credit.
But there still is a loss. The project wasted the effort of that third partner (the credit is just symbolic), and that third partner also wasted their effort so far as useful science is concerned, the "consolation prize" of getting credit notwithstanding.
So it is a good idea to manage your queue and your participation so that you avoid missing deadlines, and not just to avoid losing credit.
Perhaps people reading this with better knowledge of the system for cancelling already distributed work will comment on the circumstances (if any) in which the third party won't waste time, because the software tells it not to run the already distributed work before it has started on it. But in at least a fraction of real-world cases that can't possibly happen in time to avoid all waste.
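The race described in this reply can be written down as a small decision function. It is a simplified model of the explanation given here, not the server's actual logic: the function name and parameters are invented for the sketch, and treating a late result that arrives after the replacement copy as uncredited is an assumption, since the reply does not cover that case.

```python
from typing import Optional


def late_result_credited(
    your_report_hours: float,
    replacement_report_hours: Optional[float],
    your_result_validates: bool,
) -> bool:
    """Return True if a late result earns credit in this simplified model.

    Times are hours after the original deadline. None means the
    replacement copy has not been reported yet.
    """
    if not your_result_validates:
        return False
    if replacement_report_hours is None:
        # Nobody has reported the replacement copy yet.
        return True
    # Assumption: arriving after the replacement means no credit.
    return your_report_hours < replacement_report_hours


print(late_result_credited(1.7, None, True))   # replacement not reported yet
print(late_result_credited(1.7, 30.0, True))   # you report first
print(late_result_credited(40.0, 30.0, True))  # replacement reports first
```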
Hello.
In the last few days I got some tasks without a deadline.
510204919
512267367
511293311
511314272
511314214
Some were calculated; others were immediately and automatically cancelled by the client. It looks like the problem described in the first post.
Bye, Grubix.
Next WU without a deadline: 511817146
Bye, Grubix.
My question is "what is good form in this case?"
If I know I'll miss a deadline by a couple of hours, should I abort the task and take the wasted compute hit, or should I just let it complete and report, and "waste" someone else's CPU/GPU cycles? [I've been doing the former for any task scheduled to complete in > 1 hour]
Is there some kind of grace period during which this WU will not be sent to another computer?
Also, I've noticed that sometimes tasks are marked as "missed deadline" but not always. Is the presence of this message some indication that the WU has been sent to a 3rd computer? Is there any such indication for these situations?