Too many paused Gamma-Ray Pulsar Search # 1.03 and Gravitational Wave S6GC search 1

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22451438
RAC: 0
Topic 196076

I have about 10, sometimes more, Pulsar Search # 1. 0.23 and Gravitational Wave S6GC search 1.01 (SSE2) tasks
paused, while another 6 are running, on this host.

The ATI GPUs, 2x EAH5870, aren't being used at the moment.
Are these WUs meant to run on (NVidia) GPU(s), or is there some other explanation?

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 583184804
RAC: 148864

Any BOINC messages upon suspending them?

MrS

Scanning for our furry friends since Jan 2002

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22451438
RAC: 0

Quote:

Any BOINC messages upon suspending them?

MrS

No, just paused: 10 WUs, and it is now crunching another 6.
Some were just running (22%) and some are almost finished (96%)?!

I've seen this before, but over the last few days they've been running alongside Rosetta. Could the
high RAM use have some impact on this?

Well, I really haven't a clue...
Ehh, and I forgot to add the use of an NVidia GPU! (Although overwritten by
BOINC (6.12.34; 64-bit) on another host.)

It's host 4274431, an i7-2600
with 2 ATI 5870 GPUs. It only does CPU tasks at the moment.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5875
Credit: 118475138194
RAC: 26019907

Quote:
No, just paused: 10 WUs, now crunching another 6.
Some were just running (22%) and some are almost finished (96%)?!


For the tasks that are running, are they running in high priority mode? When panic mode is invoked, BOINC will often suspend the tasks in progress and start a whole bunch of new tasks. It's quite counter-intuitive as the newly started tasks often seem to be in much less deadline stress than the 'closer-to-deadline' suspended tasks. I think BOINC actually does know what it is doing despite the apparent paradox.

If tasks are running in panic mode, you need to find out why.
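To make the deadline check concrete, here is a minimal sketch of the kind of projection that can tip a client into high priority mode. It assumes a deliberately simplified model (tasks taken in earliest-deadline-first order, each placed on whichever core frees up first); this is not BOINC's actual scheduler, and the `Task` fields and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    remaining_hours: float  # estimated CPU time still needed (hypothetical)
    deadline_hours: float   # hours from now until the report deadline


def projected_misses(tasks, n_cores):
    """Return names of tasks projected to finish after their deadline.

    Simplified model, not BOINC's real simulation: tasks are processed
    in earliest-deadline-first order and each one is assigned to the
    core that becomes free soonest.
    """
    core_free_at = [0.0] * n_cores
    missed = []
    for task in sorted(tasks, key=lambda t: t.deadline_hours):
        core = min(range(n_cores), key=lambda i: core_free_at[i])
        finish = core_free_at[core] + task.remaining_hours
        core_free_at[core] = finish
        if finish > task.deadline_hours:
            missed.append(task.name)
    return missed


if __name__ == "__main__":
    queue = [Task("a", 5, 10), Task("b", 5, 10), Task("c", 5, 12)]
    print(projected_misses(queue, n_cores=2))
    # Same queue, but with every runtime estimate doubled.
    slow = [Task(t.name, t.remaining_hours * 2, t.deadline_hours) for t in queue]
    print(projected_misses(slow, n_cores=2))
```

In this toy model nothing about the queue changes between the two runs except the runtime estimates, which is enough to turn a comfortable queue into one with a projected miss.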

Cheers,
Gary.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7282131708
RAC: 2033235

Quote:
It's quite counter-intuitive as the newly started tasks often seem to be in much less deadline stress than the 'closer-to-deadline' suspended tasks. I think BOINC actually does know what it is doing despite the apparent paradox.


I agree with Gary's diagnosis, but not his prognosis.

Fred, your report reminds me of things I've seen when boincmgr thought I was in deadline trouble on a project. Sometimes it was right, and sometimes it was wrong because something had bumped up the completion time estimate way too high (such as suddenly thinking my host slow because of one fouled-up result, or thinking I was not going to use it at a decent rate in the near future because of a few days' vacation).
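As a rough illustration of how one fouled-up result can inflate every later estimate, here is a sketch of a runtime correction factor that jumps up immediately but comes back down slowly. The update rule and the `decay` value are assumptions made for illustration; the rule the BOINC client actually uses depends on the client version.

```python
def update_correction(factor, estimated_hours, actual_hours, decay=0.1):
    """Return a new correction factor after one finished task.

    Assumed rule (illustrative only): if the task ran longer than the
    current factor predicts, jump straight to the observed ratio;
    otherwise move only a fraction `decay` of the way down.
    """
    ratio = actual_hours / estimated_hours
    if ratio > factor:
        return ratio
    return factor + decay * (ratio - factor)


if __name__ == "__main__":
    factor = 1.0
    # One bad result: estimated 5 hours, took 50.
    factor = update_correction(factor, estimated_hours=5, actual_hours=50)
    print(factor)
    # Normal results afterwards only pull the factor down gradually.
    for _ in range(5):
        factor = update_correction(factor, estimated_hours=5, actual_hours=5)
        print(round(factor, 2))
```

Under this assumed rule, a single outlier multiplies the estimate for every queued task, and several well-behaved results are needed before the estimates look sane again.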

But the really, really strange thing is to see just what you described--task after task started up, usually NOT from the ones needed soonest, and then shortly paused to run yet another. I've seen this get above several dozen, which is pretty odd.

I, personally, have dealt with such episodes with direct intervention, even though I'm usually on the "don't fiddle" side of the endless argument. My preferred intervention in this case is to put the unstarted jobs on the project in trouble on suspend, so that boincmgr can't continue the mad dash through the list. Then I put all save a few of the soonest due on suspend as well, then exit boincmgr and restart, thus avoiding the memory and possibly other overhead of all that paused work.

Oh, yes, I suspend work fetch just in case boincmgr should be tempted to add to the chaos--probably don't need to, but I do.

Sadly, this approach requires some continuing followup attention during recovery, taking some tasks out of suspend enough in advance that it always has work available, still selecting stuff due soon, until the available work as currently estimated by boincmgr is little enough not to pose a threat.
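The manual triage described above can be sketched as a small script that only prints the commands rather than running them. It assumes the `boinccmd` forms `--project URL nomorework` and `--task URL task_name suspend` (check `boinccmd --help` on your client version); the project URL, task names, and deadlines below are hypothetical placeholders.

```python
def triage_commands(project_url, tasks, keep=4):
    """Build boinccmd command lines for a manual triage.

    tasks: list of (task_name, deadline_hours) tuples (hypothetical data).
    The `keep` soonest-due tasks stay runnable; the rest are suspended,
    and work fetch for the project is switched off first.
    """
    by_deadline = sorted(tasks, key=lambda t: t[1])
    to_suspend = by_deadline[keep:]
    commands = [f"boinccmd --project {project_url} nomorework"]
    commands += [
        f"boinccmd --task {project_url} {name} suspend"
        for name, _ in to_suspend
    ]
    return commands


if __name__ == "__main__":
    url = "http://example.org/project/"  # placeholder, not a real project URL
    queue = [("t1", 30), ("t2", 10), ("t3", 20)]
    for line in triage_commands(url, queue, keep=1):
        print(line)
```

Printing the commands instead of executing them leaves the decision with the user, which matches the hands-on follow-up this approach needs during recovery.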

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22451438
RAC: 0

They are indeed running in high priority mode, alongside Rosetta and LHC work. Rosetta work can take more than 700 MByte of RAM per job.
That may have been the cause of the trouble.
There are still 11 Einstein WUs waiting, paused?!
With HT turned on, that means 8 logical cores sharing 8 GByte of DDR3 1333 MHz DRAM. Up to 90% of RAM can be used
when the host is idle, but when I'm using it, memory use has to drop to 60%, as I've set in my preferences.
I've already noticed that copying large files (>1 GByte) between internal (SATA2) drives, or
to USB 2.0 or 3.0, crawls when BOINC is running. I use a fast 8 GByte SD card as ReadyBoost cache, which does help.
I've seen almost the full 8 GByte (7.6 GByte) in use, but only with 4 Rosetta WUs running
together with 4 Einstein WUs.
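The figures above can be turned into a quick back-of-the-envelope check. This sketch assumes 8 GByte means 8192 MByte and takes 700 MByte as the footprint of each Rosetta job (the post says "more than", so this is a lower bound); the function names are made up for illustration.

```python
def memory_budget_mb(total_mb, limit_percent):
    """RAM BOINC may use under a given percentage limit, in whole MByte."""
    return total_mb * limit_percent // 100


def headroom_mb(budget_mb, task_sizes_mb):
    """MByte left in the budget after the listed tasks are loaded."""
    return budget_mb - sum(task_sizes_mb)


if __name__ == "__main__":
    total = 8192           # assumed: 8 GByte expressed in MByte
    rosetta = [700] * 4    # lower bound per job, taken from the post

    idle_budget = memory_budget_mb(total, 90)
    in_use_budget = memory_budget_mb(total, 60)
    print("idle budget:", idle_budget, "MByte")
    print("in-use budget:", in_use_budget, "MByte")
    print("left for everything else while in use:",
          headroom_mb(in_use_budget, rosetta), "MByte")
```

With these assumptions, four Rosetta jobs alone take more than half of the 60% budget, so the remaining four tasks have to share whatever is left once the machine is in use.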

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 583184804
RAC: 148864

Maybe you were running out of memory while you were not using the machine (something else besides Windows wanted to squeeze into the 10% left by BOINC)? In that case the computations would start to crawl, which would make them take much longer than estimated and might have pushed your BOINC into panic mode, which, as others have already said, it doesn't handle very well.

MrS

Scanning for our furry friends since Jan 2002
