SETI Orphans. Can we go back to E@H?

cecht
Joined: 7 Mar 18
Posts: 1536
Credit: 2913468666
RAC: 2133034

Ole Kristian wrote:
Everything looked nice.  13 cores with CPU tasks and 3 GPU tasks running nicely.  Suddenly it goes from 90% complete on the GPU and starts again(?)  "Waiting for memory" on the CPU tasks.  I have 16 GB on the PC and 8 on the GPU.  I'll leave it on and just see what happens...

Yay! That's good to hear. As for the 90% completion pause, that's normal, and is covered here.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Ole Kristian
Joined: 30 Nov 17
Posts: 8
Credit: 159737450
RAC: 0

Seems to go OK.  GPU WUs mostly finish in about 50+ minutes.  One is still going, at 99.991% after 10 hours.  In other projects, a WU like this would reset to zero or so and I would end up aborting it.  CPU units are going but several are waiting for memory.  None of those have finished so far; I will check them later.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Ole Kristian wrote:
Seems to go OK.  GPU WUs mostly finish in about 50+ minutes.  One is still going, at 99.991% after 10 hours.

It won't finish properly. You could just abort it.

Quote:
In other projects, a WU like this would reset to zero or so and I would end up aborting it.  CPU units are going but several are waiting for memory.  None of those have finished so far; I will check them later.

I'm sure that the system would run tasks better if you tried not to use all 16 cores, but maybe 10-14. I would try limiting the number of CPU tasks to 9 and see how that would let the system breathe.

Ole Kristian
Joined: 30 Nov 17
Posts: 8
Credit: 159737450
RAC: 0

Richie wrote:
Ole Kristian wrote:
Seems to go OK.  GPU WUs mostly finish in about 50+ minutes.  One is still going, at 99.991% after 10 hours.

It won't finish properly. You could just abort it.

Quote:
In other projects, a WU like this would reset to zero or so and I would end up aborting it.  CPU units are going but several are waiting for memory.  None of those have finished so far; I will check them later.

I'm sure that the system would run tasks better if you tried not to use all 16 cores, but maybe 10-14. I would try limiting the number of CPU tasks to 9 and see how that would let the system breathe.

 

OK, I suspended it (99.997% after 11 hrs).  But what happened?  How can I prevent it?  Not all cores are in use anyway because several are waiting for memory.  But I have reduced the CPU load to 80% now.  I do not notice at all that BOINC is running; the system is responsive and fast (only the fan shows that it is running hot).

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Ole Kristian wrote:
OK, I suspended it (99.997% after 11 hrs).  But what happened?  How can I prevent it?

I don't really know what happens with them. I've had a couple of them with AMD cards, running more or less endlessly until a computation error with EXIT_TIME_LIMIT_EXCEEDED occurred. In late March they ran for 20k seconds. Examples: https://einsteinathome.org/task/934559958 and https://einsteinathome.org/task/934610443 .

A few days later an administrator mentioned this in a completely different topic regarding errors: "For the time being I doubled the "flops estimation" (and credit), which should also double the runtime limit (for newly generated workunits, sorry)."

I'm not sure what the time limit currently is or whether it's the same for all tasks, but I believe that your task would have errored out soon anyway. I see that this morning one of my tasks errored out with EXIT_TIME_LIMIT_EXCEEDED after only 10k seconds: https://einsteinathome.org/task/938062047

Currently I don't have more than those 3 eternal runner wannabes in my error record window (plus some more CL_MEM_OBJECT_ALLOCATION_FAILURE errors for 2GB Nvidia cards in addition). So they are quite rare after all.

Quote:
Not all cores are in use anyway because several are waiting for memory.  But I have reduced the CPU load to 80% now.  I do not notice at all that BOINC is running; the system is responsive and fast (only the fan shows that it is running hot).

Sounds good. I mentioned it just so you could avoid that "waiting for memory" situation, because it naturally doesn't add to productivity. More likely the opposite. It's better to prevent it from the start, so that more tasks don't try to run at once than the system can actually handle.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

Ole Kristian wrote:
Everything looked nice.  13 cores with CPU tasks and 3 GPU tasks running nicely.  Suddenly it goes from 90% complete on the GPU and starts again(?)  "Waiting for memory" on the CPU tasks.  I have 16 GB on the PC and 8 on the GPU.  I'll leave it on and just see what happens...

 

1 GB/core is on the low side for E@H CPU tasks.  Looking at what I've got running now on my main system, 2 tasks are running at 1,286 MB and the other 2 at 658 MB.  A second machine has 2 at 1,280 MB and 1 at 1,500 MB.  I don't know if the variation is due to relative progress or something intrinsic to the tasks themselves, but for the last few years I've had all my E@H boxes set up with 2 GB/core due to growing RAM demands.
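As a rough illustration of the memory arithmetic above, here is a minimal Python sketch that estimates how many CPU tasks fit in RAM at once. The function name `max_cpu_tasks` and the 2,048 MB reserved for the operating system and GPU task overhead are assumptions made for illustration only; the per-task figures are the ones quoted in this post, and real usage may differ from task to task.

```python
def max_cpu_tasks(total_ram_mb: int, per_task_mb: int, reserved_mb: int = 2048) -> int:
    """Estimate how many CPU tasks fit in RAM at the same time.

    reserved_mb is an ASSUMED allowance for the OS and everything else
    running on the machine; it is not a figure from BOINC or Einstein@Home.
    """
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0 or per_task_mb <= 0:
        return 0
    # Integer division: a task that only partly fits would end up
    # "waiting for memory", so it does not count.
    return usable_mb // per_task_mb


if __name__ == "__main__":
    total_ram_mb = 16 * 1024  # the 16 GB machine discussed in this thread
    for per_task_mb in (658, 1286, 1500):  # per-task sizes quoted above
        fits = max_cpu_tasks(total_ram_mb, per_task_mb)
        print(f"{per_task_mb} MB per task: about {fits} tasks fit")
```

Under these assumptions, a 16 GB machine fits about 9 tasks at 1,500 MB each and about 11 at 1,286 MB each, so running 13 CPU tasks would be expected to leave some of them waiting for memory.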

Cosmic_Ocean
Joined: 9 Apr 20
Posts: 3
Credit: 15082132
RAC: 12432

New S@H orphan here, too. Started in December 2000 there, never thought it would end.

Decided to come here and give this one a go.

First impression is that it needs a LOT more RAM than S@H did, that's for sure. But my main cruncher just does crunching and basically nothing else, so I had to adjust the settings to let it use 100% of memory and 50% of pagefile/swap to get around "waiting for memory".

I used to be a bit more active in the forum over there in the past and then somewhat more recently, I forgot about it often, so.. I'll be around from time to time, but not all day, every day.

Raistmer*
Joined: 20 Feb 05
Posts: 208
Credit: 181428947
RAC: 6029

Why such short deadlines?? Can't gravitational waves wait a month to be discovered??

I virtually never miss deadlines on SETI. Here I spent a day (actually, 2 days 11 hours) of computational time on a partially active host (a netbook) just to miss the deadline.

What's the hurry? And why such waste??

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117754722100
RAC: 34839358

Raistmer* wrote:
Why such short deadlines?? Can't gravitational waves wait a month to be discovered??

The deadline for Einstein CPU tasks has pretty much always been 14 days.  That's not small - it's just the standard.  Even for a task taking 3 days, there is still plenty of time to meet the deadline.
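To make the deadline arithmetic concrete, here is a small Python sketch, assuming a host that only crunches for some fraction of the day. The function names and the example fractions are hypothetical; only the 14-day deadline, the 3-day task, and the 2 days 11 hours of compute time come from this thread.

```python
def wall_clock_days(cpu_days: float, active_fraction: float) -> float:
    """Convert compute time into elapsed calendar time.

    active_fraction is the share of each day the host actually spends
    crunching (1.0 means around the clock). It is an illustrative input,
    not something the project reports.
    """
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in the range (0, 1]")
    return cpu_days / active_fraction


def meets_deadline(cpu_days: float, active_fraction: float,
                   deadline_days: float = 14.0) -> bool:
    """True if the task would finish within the deadline (default 14 days)."""
    return wall_clock_days(cpu_days, active_fraction) <= deadline_days


if __name__ == "__main__":
    # A 3-day task on a host that crunches a quarter of the time.
    print(meets_deadline(3.0, 0.25))
    # 2 days 11 hours of compute on a host active 15% of the time (hypothetical).
    print(meets_deadline(2 + 11 / 24, 0.15))
```

With these example inputs the 3-day task needs 12 calendar days and makes the deadline, while 2 days 11 hours of compute at 15% activity needs more than 16 days and misses it.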

It was a pretty big deal when GW was first directly measured (a BH-BH cataclysmic collision).  It will be an even bigger deal and a true triumph for those who have developed the technology when the much weaker continuous emissions from rapidly spinning massive objects like neutron stars are finally detected.  Hardly surprising that there is a race to be the first to make that detection! :-)

Cheers,
Gary.

Speedy
Joined: 11 Aug 05
Posts: 40
Credit: 23546889
RAC: 6031

Quote:
And why such waste??

When you say the above, may I ask what you are referring to?

 
