Black hole for tasks

Alicia, Dan, an...

Joined: 4 Jul 07

Posts: 7

Credit: 525867

RAC: 0

20 Oct 2007 14:28:22 UTC

Topic 193248

I noticed that one of the tasks I completed about a month ago is still "pending" credit so I checked it out. It is assigned to computer 964318. In pulling the thread I find that computer 964318 has a huge pile of assigned tasks, no recent completions, last task completed about a month ago. Appears there's a problem with this machine. Owner is "Anonymous". Don't know if you or owner are aware of the situation.

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 0

Black hole for tasks

20 Oct 2007 15:27:35 UTC

Message 74728

There's really not anything that the project can do. Eventually the WUs will time out and be reassigned; meanwhile, all the failures will slowly reduce the number of WUs the busted machine gets daily to only 1.

th3

Joined: 24 Aug 06

Posts: 208

Credit: 2208434

RAC: 0

Thats a funny coincidence, i

20 Oct 2007 17:49:19 UTC

Message 74729

That's a funny coincidence, I came across the exact same computer yesterday when checking an old pending R2 result. I concluded it was of little significance now since its CPU quota has reached 1 per day. It's still a good example of why a quota of 72 WUs per CPU is too high and has been too high ever since S5R1 ended: that computer has collected 412 WUs, most of them just sitting there waiting for timeout. With a somewhat smaller quota there's still some room for beta-app testing and experimenting without too much danger of running out of quota (except for those who have 5-10 days of work cache).

Team Philippines

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 714577268

RAC: 923029

RE: ... With a somewhat

20 Oct 2007 17:56:15 UTC

Message 74730 in response to message 74729

Quote:

... With a somewhat smaller quota there's still some room for beta-app testing and experimenting without too much danger of running out of quota (except for those who have 5-10 days of work cache).

A work cache of 5-10 days is nothing unusual, though. I think the work quota was raised some time ago as a result of public demand, although work units must have been somewhat smaller then.

CU
Bikeman

th3

Joined: 24 Aug 06

Posts: 208

Credit: 2208434

RAC: 0

yes, i was one of the public

20 Oct 2007 18:17:27 UTC

Message 74731

Yes, I was one of the public wanting a higher quota; during that time, the shortest WUs completed in around 20 minutes. But now, with the shortest WUs taking 5 hours, that's not an issue. Let's say a Penryn Extreme Edition can be clocked high enough to do it in 4 hours; that's still only 6 results per CPU per day, so a quota of 32 would be more now than 72 was back then.
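To put rough numbers on that comparison, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and it assumes one CPU crunching around the clock with every WU taking the shortest quoted runtime.

```python
def results_per_day(minutes_per_wu: float) -> float:
    """Most results one CPU can finish in 24 hours at a fixed runtime per WU."""
    return 24 * 60 / minutes_per_wu


def quota_headroom(daily_quota: int, minutes_per_wu: float) -> float:
    """How many times larger the quota is than what one CPU can actually finish."""
    return daily_quota / results_per_day(minutes_per_wu)


# Back then: quota of 72 with 20-minute WUs (72 results/day possible).
print(quota_headroom(72, 20))
# Now: a hypothetical quota of 32 with 4-hour WUs (6 results/day possible).
print(quota_headroom(32, 4 * 60))
```

Under those assumptions, 72 was exactly one day of work for 20-minute WUs, while 32 would be a bit over five days of work for 4-hour WUs.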

Team Philippines

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 0

People are pushing

20 Oct 2007 22:27:07 UTC

Message 74732

People are pushing engineering sample Penryns to >4.5 GHz on air. I don't have a Conroe, so I'm not sure what that'd equate to in terms of runtime per WU.

Erik

Joined: 14 Feb 06

Posts: 2815

Credit: 2645600

RAC: 0

RE: RE: ... With a

20 Oct 2007 22:45:20 UTC

Message 74733 in response to message 74730

Quote:

Quote:
... With a somewhat smaller quota there's still some room for beta-app testing and experimenting without too much danger of running out of quota (except for those who have 5-10 days of work cache).

A work cache of 5-10 days is nothing unusual, though. I think the work quota was raised some time ago as a result of public demand, although work units must have been somewhat smaller then.

CU
Bikeman

The work units were quite a bit smaller, and some people's computers were hitting the 32 quota, particularly those using Akos apps. Maybe the project should consider lowering the quota from 72?

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7214014931

RAC: 952984

RE: People are pushing

21 Oct 2007 1:04:06 UTC

Message 74734 in response to message 74732

Quote:

People are pushing engineering sample Penryns to >4.5 GHz on air. I don't have a Conroe, so I'm not sure what that'd equate to in terms of runtime per WU.

Conroe at 3 GHz is 8 to 9 hours for my diet of recent S5R3. Penryn claims some clocks-per-instruction improvement, has larger caches, and may, eventually, gain from SSE4. Even so, getting down below 4 hours for the results I'm processing now looks iffy to me.
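As a back-of-the-envelope check, here is a small sketch that scales those runtimes to the 4.5 GHz figure quoted above. It assumes runtime is simply inversely proportional to clock speed, which ignores caches, memory speed, and any per-clock gains, so treat it as a rough bound rather than a prediction; the function name is made up for illustration.

```python
def scaled_runtime_hours(base_hours: float, base_ghz: float, target_ghz: float) -> float:
    """Naive estimate: runtime shrinks in proportion to the clock increase."""
    return base_hours * base_ghz / target_ghz


for base_hours in (8, 9):
    at_45 = scaled_runtime_hours(base_hours, 3.0, 4.5)
    # Extra per-clock speedup still needed to get under a 4-hour target.
    extra = at_45 / 4.0
    print(f"{base_hours} h at 3 GHz -> {at_45:.2f} h at 4.5 GHz, "
          f"needs another {extra:.2f}x per clock for 4 h")
```

On that naive scaling, clock speed alone gets to roughly 5.3 to 6 hours, so the rest of the gap to 4 hours would have to come from per-clock improvements.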

What I don't know is how much S5R3 varies in compute requirement at the various frequencies. (Is that the right name for the second field in the result name, with values like 368.55?) Across S5R2 it varied a lot, and my hosts have only seen a few frequencies.

By the way, when you return bad ones, you get limited much more quickly than when you are a black hole. My host that stumbled on the 4.11 bug that just got fixed had a small queue, so just one or at most two Einstein units at a time. So in a total of under 45 minutes of downloading a result, starting it up, bombing out, and getting another one on a 75-second cycle, it stopped. It had started with the 72/day limit, but each bombed return reduced the limit, so when it had done 35, it was told it was overdrawn for its (reduced) limit for the day.

At this minute, it has a limit of 34, having bombed after 30,000 seconds running 4.07.
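For anyone curious how the shrinking limit plays out, here is a minimal simulation of the failing-host case described above. It assumes each bombed result lowers the daily limit by exactly one (never below 1) and that the host fetches one result per 75-second cycle; that penalty rule is my assumption for illustration, not something confirmed from the scheduler, and the function name is made up.

```python
def failures_until_limited(start_limit: int = 72, seconds_per_cycle: int = 75):
    """Count how many results a failing host receives before it is cut off.

    Assumed rule (hypothetical): every failed result reduces the daily
    limit by 1, with a floor of 1. The host is refused work once the
    number of results sent today reaches the current limit.
    """
    limit = start_limit
    sent = 0
    while sent < limit:
        sent += 1                  # host downloads one result
        limit = max(1, limit - 1)  # it bombs out, so the limit drops by one
    minutes = sent * seconds_per_cycle / 60.0
    return sent, limit, minutes


sent, limit, minutes = failures_until_limited()
print(f"cut off after {sent} results, limit now {limit}, about {minutes:.0f} minutes")
```

With a starting limit of 72 this sketch stops after 36 results and about 45 minutes, which is in the same range as the 35 results in under 45 minutes reported here. A black hole host that never reports anything would not trigger this penalty until its results time out.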

I'm not arguing a particular side here, just giving a little data.

Now I will argue a point for a moment. One of the really painful things for me on the SETI forums is the violence with which some participants argue that other participants should be thrown out (pick your cause--overclaiming clients, underclaiming clients, optimized applications, outdated clients, excessive queues, inadequate queues...) I hope Einstein remains a place where orderly reasoned disagreement and discussion (hoping this thread is an example) does not metastasize to that sort of thing.

Peace,
Peter

Jim Bailey

Joined: 31 Aug 05

Posts: 91

Credit: 1452829

RAC: 0

It can be some what

21 Oct 2007 5:12:10 UTC

Message 74735

It can be somewhat irritating at times. (Now there's an understatement.) If you can't contact the person, there's not much else you can do except grin and bear it. It's not worth getting your BP all cranked up; they will get done when they get done, and not before. I have several in pending right now that might be done this month, or next month; at least I hope they will!

I too hope that the boards here will remain as they have been. Quiet, easy going, and pleasant.

Alicia, Dan, an...

Joined: 4 Jul 07

Posts: 7

Credit: 525867

RAC: 0

I guess my question really is

21 Oct 2007 5:25:48 UTC

Message 74736

I guess my question really is why the 412 tasks assigned to this machine aren't or can't be reassigned to a "currently reliable" machine before they "time out" with the "black hole"?

Alinator

Joined: 8 May 05

Posts: 927

Credit: 9352143

RAC: 0

One reason is to that

21 Oct 2007 5:42:25 UTC

Message 74737

One reason is that arbitrarily issuing another result before the deadline means one of the two in-progress wingmen will be wasting their time (and power) running a trailer if both report on time.

The only way to avoid that is if you're running 5.10.something (I forget the exact build number), which has the 221 Auto Abort Redundant Results capability, the project has 221 functionality enabled, and the result has never been started at all.

Alinator
