What's up with host 93643?

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0
Topic 190972

Any ideas on this situation?...
I've been getting assigned a lot of WUs (issue #4) because host 93643 isn't returning any. When I look at this host there are 928 results to his credit dating back to 27 Feb, and only 1 has been returned and received any credit. WUs continue to be issued but none get returned as far as I can tell. It doesn't really matter to me, since I'm getting credit and also helping 2 other hosts to finally get their credit. Still, why keep issuing WUs to someone who isn't going to return them? I don't want to interfere with someone else's business, just trying to understand what gives.

Honza
Joined: 10 Nov 04
Posts: 136
Credit: 3332354
RAC: 0

What's up with host 93643?

Hard to say what preferences apply to this host (WU cache size).
Anyway, it has reached a daily quota of 1 WU/CPU, which should stop it from draining WUs.

Well, it was running the outdated BOINC 4.26 back in February, when it completed the only result actually sent back to the server. This may be the issue...

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023834931
RAC: 1804847

RE: Any ideas on this

Quote:
Any ideas on this situation?...
I've been getting assigned a lot of WUs (issue #4) because host 93643 isn't returning any. When I look at this host there are 928 results to his credit dating back to 27 Feb, and only 1 has been returned and received any credit. WUs continue to be issued but none get returned as far as I can tell. It doesn't really matter to me, since I'm getting credit and also helping 2 other hosts to finally get their credit. Still, why keep issuing WUs to someone who isn't going to return them? I don't want to interfere with someone else's business, just trying to understand what gives.

If you look at the computer status instead of just its results status, you'll see that it has done much work in the past, but the low RAC suggests it stopped contributing weeks ago, consistent with the one result returned in late February. As the Maximum daily WU quota per CPU currently shows at 1/day, it appears that the system is "on to it" in some sense, and issuing new work at a very low rate (it appears 2/day in the last few days, consistent with 1/day/CPU).

I'd presume the system is set to maintain a very big work queue, but I don't understand why its failure to return work did not bump down its quota/day much earlier. For example, it got 35 new results on March 22, I suppose as backfill for results that had expired from its queue.

So I'll add my wonderment to NBIT's--is this the way it is supposed to work?

If so, people have another reason to avoid setting large queues, and to tidy up when reducing project priority. It would also help if the client bug that fails to consider resource share allocation when fetching work got fixed some day.

Personally, I wish folks who set very

Robert Everly
Joined: 18 Jan 05
Posts: 9
Credit: 10393199
RAC: 0

I've mentioned this on another project, but here's my $.02 worth.

I think all projects should have an "Outstanding work limit".

If you want to have a week's worth of work on your machine, fine. But prove to the project that you're going to return the work first.
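
To make the "Outstanding work limit" idea concrete, here is a minimal sketch of one way such a rule could work. Everything in it is hypothetical: `HostRecord`, `outstanding_limit`, `may_send_work`, and the `base` and `cap` values are made up for illustration and are not part of BOINC.

```python
from dataclasses import dataclass


@dataclass
class HostRecord:
    """Hypothetical per-host bookkeeping; not a real BOINC structure."""
    outstanding: int = 0   # results sent but not yet returned
    returned_ok: int = 0   # results returned successfully so far


def outstanding_limit(host: HostRecord, base: int = 4, cap: int = 200) -> int:
    """Allowed unreturned results: starts small, grows with proven returns."""
    return min(cap, base + host.returned_ok)


def may_send_work(host: HostRecord) -> bool:
    """Send more work only while the host is under its outstanding limit."""
    return host.outstanding < outstanding_limit(host)


# A host holding 928 unreturned results with 1 good return is refused,
# while a brand-new host is still allowed a small initial batch.
stuck = HostRecord(outstanding=928, returned_ok=1)
fresh = HostRecord()
print(may_send_work(stuck))  # False
print(may_send_work(fresh))  # True
```

The point of growing the limit with `returned_ok` is that a big cache has to be earned by actually returning work.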

David Hammer
Joined: 15 Oct 04
Posts: 360
Credit: 1672886
RAC: 0

For what it's worth I just emailed the user telling them they have a misconfigured host. Hopefully they will fix the problem.

Wurgl (speak^Wcrunching for Special: Off-Topic)
Joined: 11 Feb 05
Posts: 321
Credit: 140550008
RAC: 0

RE: For what its worth I

Message 26718 in response to message 26717

Quote:
For what it's worth I just emailed the user telling them they have a misconfigured host. Hopefully they will fix the problem.

Thanks David!

I think the BOINC developers should consider some means of preventing hosts from running amok as this one did: either a 'max unfinished results' quota, or some program that walks through the database and sends out mails automatically.

BTW: Was the outage the planned one, to prepare the double-speed clients?

Spare_Cycles
Joined: 18 Feb 06
Posts: 2
Credit: 20780
RAC: 0

It's a Xeon running a server version of Windows, so perhaps it's a corporate computer sitting behind a corporate firewall.

My guess is that a change in the firewall prevents WUs from downloading the files they need, but doesn't block WUs from being requested. Thus, the computer gets a WU, tries to download the files for it, gives up and requests another WU, and so on until it hits the quota limit.

Ingleside
Joined: 23 Jan 05
Posts: 33
Credit: 82104476
RAC: 5

Looking at the work assignments, it actually looks like this host has some kind of connection problem: it never successfully receives the server's answer to its RPC, so each request only generates "ghost" tasks. On the client this shows up as 1-minute backoffs at first, growing to a maximum of 4 hours, but the client also reverts to 1 minute after every 10 RPCs...
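
As a rough illustration of that backoff pattern, here is a small sketch. The 1-minute start, 4-hour cap, and reset after every 10 RPCs come from the description above; the doubling between attempts is purely an assumption for illustration, and `backoff_minutes` is a made-up helper, not BOINC client code.

```python
from typing import List


def backoff_minutes(n_rpcs: int, start: int = 1, cap: int = 240,
                    reset_every: int = 10) -> List[int]:
    """Delay (in minutes) preceding each of n_rpcs failed RPCs.

    Assumption: the delay doubles after each failure (the real growth
    rule may differ), is capped at `cap`, and drops back to `start`
    after every `reset_every` RPCs.
    """
    delays = []
    delay = start
    for i in range(n_rpcs):
        delays.append(delay)
        if (i + 1) % reset_every == 0:
            delay = start                 # revert to the initial backoff
        else:
            delay = min(cap, delay * 2)   # grow, but never past the cap
    return delays


print(backoff_minutes(12))
# [1, 2, 4, 8, 16, 32, 64, 128, 240, 240, 1, 2]
```

Under these assumptions the host never stays quiet for long: every reset brings a fresh burst of closely spaced requests, each of which can create more ghost tasks.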

Now, if this host is running v4.26, the client doesn't send any info about which tasks it already has, so instead of being re-sent the same tasks, as happens from v4.45 on, it gets assigned new tasks every time.

As for the daily quota, it only decreases when a task is reported as an error or is not returned by its deadline. From the look of things, the tasks started to time out and reduce his quota on 21-22 March, indicating the computer went "bad" on 7 March...
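
The date arithmetic behind that estimate can be checked in a couple of lines, assuming the 14-day deadline mentioned elsewhere in this thread. The year 2006 is an assumption; it does not affect the result.

```python
from datetime import date, timedelta

DEADLINE = timedelta(days=14)       # result deadline, as quoted in this thread
first_timeout = date(2006, 3, 21)   # first timeouts seen; year is an assumption

# A task that timed out on 21 March must have been sent 14 days earlier.
went_bad = first_timeout - DEADLINE
print(went_bad)  # 2006-03-07
```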

So the quota system works, but there is a delay before it kicks in.
Also, Einstein@home has enabled re-issuing of "lost" tasks, meaning only pre-v4.45 clients will continue to grab the full quota, while v4.45 and later will just be re-issued the same small group of tasks again and again...
Einstein@home could choose to set v4.45 as the minimum allowed BOINC client...
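
A toy model of the difference between the two client generations, just to illustrate the idea. `tasks_to_send` and the task names are hypothetical and this is not the actual scheduler code; passing `reported=None` stands in for an old client that tells the server nothing about the tasks it holds.

```python
from typing import List, Optional, Set


def tasks_to_send(assigned: Set[str], reported: Optional[Set[str]],
                  fresh: List[str], quota: int) -> List[str]:
    """Pick the tasks to hand out for one scheduler request.

    assigned: tasks the server believes this host holds
    reported: tasks the client says it holds, or None for an old client
    fresh:    new, unassigned tasks available to send
    """
    if reported is not None:
        lost = sorted(assigned - reported)   # "ghosts": assigned, never received
        if lost:
            return lost[:quota]              # re-send the same tasks
    return fresh[:quota]                     # otherwise hand out new tasks


assigned = {"wu_a", "wu_b"}
fresh = ["wu_c", "wu_d", "wu_e"]

print(tasks_to_send(assigned, None, fresh, quota=2))   # ['wu_c', 'wu_d']
print(tasks_to_send(assigned, set(), fresh, quota=2))  # ['wu_a', 'wu_b']
```

With the old client every failed request consumes fresh tasks, while the newer one keeps cycling over the same small set, which is why only the former drains work from other hosts.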

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

Ghosts or something... I now see several hundred results issued to this host on Mar 28... so I guess I don't understand why the quota parameter isn't preventing this. And two other hosts will have to wait 14 days before the results are re-issued to someone else and before credit is granted.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023834931
RAC: 1804847

RE: Ghosts or something...

Message 26722 in response to message 26721

Quote:
Ghosts or something... I now see several hundred results issued to this host on Mar 28... so I guess I don't understand why the quota parameter isn't preventing this. And two other hosts will have to wait 14 days before the results are re-issued to someone else and before credit is granted.

I also got alarmed on seeing that this morning, but then came to doubt the time stamps. It appears to me the total is about the same as before the outage, when only two/day were time-stamped for this host in the last couple of days. Now nearly everything shown bears the same time-stamp.

Seems likely this is an artifact of the late unpleasantness with the project servers, rather than a new outbreak of trouble in handling this particular host?

By the way, my thanks to the posters who have added insight on this thread.

Ingleside
Joined: 23 Jan 05
Posts: 33
Credit: 82104476
RAC: 5

RE: Ghosts or something...

Message 26723 in response to message 26721

Quote:
Ghosts or something... I now see several hundred results issued to this host on Mar 28... so I guess I don't understand why the quota parameter isn't preventing this. And two other hosts will have to wait 14 days before the results are re-issued to someone else and before credit is granted.

Looks like the user has finally upgraded his client, and he is now being re-assigned many of these "ghost" tasks. If you look a little closer, you'll see his work carries a new date, while the other users were assigned this work on 21.03 or thereabouts.

Now, even if he successfully gets all of this work, I'm not sure he'll manage to return everything before the deadline...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
