[Edit added 30 min later]
I found a script that I have used before, which I can use to grant credit to users/hosts/teams for workunits which I have cancelled. I am going to use this to grant credit to people who have had the misfortune of getting and doing work then having it cancelled.
Bruce
I'm very pleased that you have done that and it will be good for the silent majority who probably aren't even aware of the problem yet.
However, it's not my day today :). I took your advice and cancelled running work that was in many cases 80-90% complete!!! And I'm still not mad at you in the slightest :). I'd rather lose the credits than hold up the science by doing work that will only have to be repeated anyway, so cancelling the partly completed work was still the right thing to do.
Quote:
It must have been one of those nightmare days (and nights) for you :).
I aborted one ongoing h1_ WU and it's been granted the claimed credit, so I don't think you lose those credits. :)
Edit: Hmm.. 4.19.. Was there an abort/cancel button on those?
Hope they got reported. (I haven't read all the posts here; too long.)
Yep, you worked it out exactly!! There is no abort button in 4.19 which is why I reported my procedure earlier thinking I might be helping other 4.19ers. The computation on the WU gets zeroed when BOINC restarts after deleting the h1_nnnn file. So no credit will be coming for those.
However, it doesn't matter in the slightest, as it would be a waste of science to keep spending cycles on a WU that won't contribute.
Cheers,
Gary.
Good news -- I'm giving credit for cancelled and 'download error' work as well as successful and valid results. Since these problems were my fault it seems the least I can do.
I confess to being in a pretty foul mood for most of the day today!
Director, Einstein@Home
I just aborted "h1_0118.0__0118.1_0.1_T00_S4ha_0" from my machine:
06/28/2005 08:11:06 PM|Einstein@Home|Starting result l1_0315.5__0315.9_0.1_T00_S4lA_0 using einstein version 4.79.
Greg
I'd also like to say thanks for keeping us informed. Screw-ups happen, and I'm quite happy as long as I'm reasonably well informed.
Actually you deserve heaps of praise for the way you handled everything. I don't think you could have done more and the issue was completely defused before there were any nasty surprises and the accompanying flood of complaints that would normally be expected to follow.
It is this kind of professionalism that makes me proud to give my full support to this project. Well done, and many thanks for all your efforts!!
Cheers,
Gary.
I agree, good work from the country of cheese and Packers :-)
Especially for the good communication, I'll give an A++.
It wasn't till I saw this WU that I realised just how much Bruce had done to defuse anger: he has set things up so that people get credit for the part-worked WU they cancel part way through. At least, I think that is what this WU is telling us.
agreed^2
~~gravywavy
Your interpretation is entirely correct. I am giving credit for partial/aborted/failed/completed h1_* workunits. Note that this is not instantaneous and may take a few hours. I have to run the script by hand and only do it a few times per day.
Bruce
Director, Einstein@Home
Gary has pointed out to me that credit is not granted for WUs that are killed by stealing their files. On consideration this makes sense if the XML that held the CPU time has gone. If the client restarts the download when the files vanish, presumably it also deletes or overwrites the file that remembers the CPU time so far?
My thought is that it may be better, if running 4.19, to kill those WUs from the operating system while BOINC is actually crunching them. This assumes the OS has some kind of task manager (e.g. not Win-98).
On Win-XP, for example, hit Ctrl-Alt-Del and the task manager comes up. Highlight the Einstein task, right-click, and kill the process. The WU will report to BOINC that it ended with some error code that means killed. I think this means that BOINC will report it back with a 'client error' message and they will get credit.
On Linux: you probably already know how to use top or ps to get the PID, and how to use kill to abort. If not, I recommend the man pages on top, ps and kill.
Note: I have tried the Win-XP method in the past, but not on these WUs. If my suggestion won't work, please say so!
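For anyone who would rather script the Linux steps than do them by hand, here is a minimal Python sketch of the ps-then-kill idea. It is only an illustration under assumptions: the helper names `find_pids` and `kill_matching` are made up for this post, and it assumes the science app's process name contains "einstein" (suggested by the "using einstein version 4.79" log line, but not verified). Whether BOINC then reports the WU as a client error and credit follows is just as untested as the suggestion above.

```python
import os
import signal
import subprocess
from typing import List


def find_pids(ps_output: str, name_fragment: str) -> List[int]:
    """Return PIDs from `ps -eo pid,comm` style output whose command
    name contains name_fragment (hypothetical helper for illustration)."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the "PID COMMAND" header
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # ignore blank or malformed lines
        pid, command = int(parts[0]), parts[1]
        if name_fragment in command:
            pids.append(pid)
    return pids


def kill_matching(name_fragment: str, dry_run: bool = True) -> List[int]:
    """List matching processes and, unless dry_run is True, send each one
    SIGTERM (the same signal plain `kill <pid>` sends by default)."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = find_pids(ps_output, name_fragment)
    for pid in pids:
        if not dry_run:
            os.kill(pid, signal.SIGTERM)
    return pids
```

Calling `kill_matching("einstein")` only lists the matching PIDs; pass `dry_run=False` to actually send the signal. The dry-run default is deliberate, so you can check you are about to kill the right process first. Note that on Linux the `comm` column is usually truncated to 15 characters, so match on a short fragment.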
~~gravywavy