Hey, what is up with this WU?

roadrunner_gs
Joined: 7 Mar 06
Posts: 94
Credit: 3369656
RAC: 0
Topic 193972

I have an S5R3 task lurking in my pending list. The result ID page says "checked, but no consensus yet", while the WU detail page says "validate error".
I am puzzled...

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7214554931
RAC: 979875

Quote:
I have an S5R3 task lurking in my pending list. The result ID page says "checked, but no consensus yet", while the WU detail page says "validate error".
I am puzzled...


As it says in the Task details page:

Task was reported too late to validate

roadrunner_gs
Joined: 7 Mar 06
Posts: 94
Credit: 3369656
RAC: 0

No, no, that is Task-ID 105909710; I am speaking of Task-ID 103149059.

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9352143
RAC: 0

Message 86662 in response to message 86661

Quote:
No, no, that is Task-ID 105909710; I am speaking of Task-ID 103149059.

Simply put, the task result output from your host was not within the error tolerance band once the quorum formed (two tasks strongly similar) and the canonical task was selected.

Keep in mind in cases like this where the WU goes to more than the initial replication, all the other eligible tasks which remain after the quorum forms are compared to the canonical task and must be at least weakly similar to it to have credit granted.
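To make the two-stage check described above concrete, here is a minimal Python sketch. It is not the actual Einstein@Home or BOINC validator: the `Task` class, the single-number `value` standing in for a result file, the relative-difference comparison, and both tolerance constants are assumptions made purely for illustration. It only shows the shape of the logic: a quorum of strongly similar tasks picks the canonical task, and every remaining task must then be at least weakly similar to it to be granted credit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical tolerances; the real project thresholds are not stated in this thread.
STRONG_TOL = 1e-5
WEAK_TOL = 1e-3


@dataclass
class Task:
    task_id: int
    value: float  # illustrative stand-in for the contents of a result file


def rel_diff(a: float, b: float) -> float:
    """Relative difference between two numbers, safe when both are zero."""
    scale = max(abs(a), abs(b), 1e-300)
    return abs(a - b) / scale


def strongly_similar(a: Task, b: Task) -> bool:
    return rel_diff(a.value, b.value) <= STRONG_TOL


def weakly_similar(a: Task, b: Task) -> bool:
    return rel_diff(a.value, b.value) <= WEAK_TOL


def find_canonical(tasks: List[Task], min_quorum: int = 2) -> Optional[Task]:
    """Return the first task that agrees strongly with enough others to form a quorum."""
    for candidate in tasks:
        partners = [t for t in tasks
                    if t is not candidate and strongly_similar(candidate, t)]
        if len(partners) + 1 >= min_quorum:
            return candidate
    return None


def validate(tasks: List[Task], min_quorum: int = 2) -> Dict[int, str]:
    """Map each task ID to 'pending', 'valid', or 'invalid'."""
    canonical = find_canonical(tasks, min_quorum)
    if canonical is None:
        # No quorum yet, so nothing can be judged.
        return {t.task_id: "pending" for t in tasks}
    # Once a canonical task exists, every task is compared against it.
    return {
        t.task_id: "valid" if weakly_similar(t, canonical) else "invalid"
        for t in tasks
    }


if __name__ == "__main__":
    tasks = [Task(1, 1.0), Task(2, 1.0000001), Task(3, 1.5)]
    print(validate(tasks))
```

In this toy run, tasks 1 and 2 form the quorum, task 1 becomes canonical, and task 3 falls outside the weak tolerance, which is the situation a "validate error" on a late or odd result would correspond to under these assumptions.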

Fortunately, a case like the one you saw here is very rare on EAH, and I'm not quite sure what brought it about. This WU dates back to just before the shutdown to upgrade the backend for R4. So if I had to guess, I'd say that the output file from your host got 'lost'. That is why the WU was sent out for extra replications in the first place, since there was nothing for the validator to compare with when the wingman reported.

Alinator

roadrunner_gs
Joined: 7 Mar 06
Posts: 94
Credit: 3369656
RAC: 0

Will this one never disappear from my pending list?

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9352143
RAC: 0

No, 'Zombies' don't seem to be a problem with EAH, like they can be with SAH from time to time. Also keep in mind that updates to the pending list can be out of sync with the other summary pages by a bit. Sometimes a page refresh gets things moving.

AFAICT, this one is still in play, since the scheduler has already 221'ed the two extra replications it didn't need.

EAH is just less aggressive about purging the BOINC database than some other projects are. My guess is the project will poof this WU from the record in a few days.

On second thought after looking this one over again, it might be a zombie. It was definitely having some problems back during the transition over to R4. So if it doesn't go poof in another week, then it might be worthwhile bringing it to Bernd's attention.

Alinator

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

Message 86665 in response to message 86664

Quote:

On second thought after looking this one over again, it might be a zombie. It was definitely having some problems back during the transition over to R4. So if it doesn't go poof in another week, then it might be worthwhile bringing it to Bernd's attention.

This would also perhaps be something similar to what happened to Archae86 up in the power app thread...
