Einstein eats all available disk space

Gilrae
Joined: 9 Feb 05
Posts: 1
Credit: 840001640
RAC: 268762
Topic 210916

I'm running BOINC 7.8.3, x64 Win10.

 

Einstein doesn't remove its files (I assume each 2MB file is a project file?). They keep accumulating until the space allotted for all projects is used up, and then all the projects stop. SETI and GPUGRID clean up after themselves just fine.

garyabbott-[at]-comcast.net

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

The files are reused between tasks, so downloading them again for every task would be expensive in bandwidth.

See locality scheduling.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117624259593
RAC: 35231458

Gilrae wrote:
Einstein doesn't remove its files ...

Unfortunately, the GW (gravitational wave) searches have huge data requirements. Many tasks need to refer to the same sets of large data files, so locality scheduling (in some form) is a must. The scheduler does try to send tasks for the data you already have. Otherwise, the bandwidth needed to handle the data transfer would be much larger than it already is.
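To illustrate the idea, here is a toy sketch in Python. It is not the real scheduler code, just a simplified model of "prefer the task that needs the least new data". The task layout, file names and sizes are all made up for the example.

```python
def pick_task(tasks, files_on_host):
    """Toy model of locality scheduling (NOT the actual BOINC scheduler).

    tasks: list of dicts like {"name": str, "files": {file_name: size_in_bytes}}
    files_on_host: set of data file names the host already has on disk.
    Returns the task that would need the fewest bytes of new downloads.
    """
    def download_cost(task):
        # Only files the host does not already have must be transferred.
        return sum(size for name, size in task["files"].items()
                   if name not in files_on_host)

    return min(tasks, key=download_cost)


# Hypothetical tasks and file names, for illustration only.
example_tasks = [
    {"name": "task_A", "files": {"data_0100.bin": 2_000_000, "data_0101.bin": 2_000_000}},
    {"name": "task_B", "files": {"data_0500.bin": 2_000_000}},
]
```

With `files_on_host = {"data_0100.bin", "data_0101.bin"}` the sketch picks `task_A`, because nothing new has to be downloaded for it. That is also why the data files stay on disk after a task finishes: the next task is likely to need them again.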

The machines I use mostly have 20GB hard drives.  I only run Einstein and I can handle the GW search in that extremely limited (by today's standards) space.

How much space do you allow BOINC to use in total?  How much does each separate project consume?
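If you want exact numbers rather than a guess, something like the following Python sketch will add up the size of each project's folder. It assumes each project keeps its files in its own subfolder of a `projects` folder inside the BOINC data directory. The function names are just ones I made up, and you will need to supply the path for your own install.

```python
import os


def project_disk_usage(projects_dir):
    """Return {project_folder_name: total_bytes} for each subfolder of projects_dir."""
    usage = {}
    for entry in os.scandir(projects_dir):
        if not entry.is_dir():
            continue  # ignore stray files next to the project folders
        total = 0
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        usage[entry.name] = total
    return usage


def format_report(usage):
    """Return report lines, largest project first, sizes in GB."""
    ordered = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    return [f"{size / 1024**3:8.2f} GB  {name}" for name, size in ordered]
```

To use it, call `print("\n".join(format_report(project_disk_usage(path))))` where `path` is your projects folder. On a default Windows install that is probably something like `C:\ProgramData\BOINC\projects`, but that is an assumption, so check where your data directory actually lives. The Disk tab in BOINC Manager's advanced view shows a similar per-project breakdown.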

I've seen Einstein get to around 5 - 10GB after running for quite a while.  It might take many months to get that much.  If the space consumed is more than you are prepared to allow, there is a fairly simple technique to reclaim it.  It will of course start rising again but at least you will have some breathing space for a while :-).

All you need to do is set NNT (No New Tasks) for Einstein and allow the work on hand to finish and be returned.  When that finishes, just reset the project to clear out all the accumulated large data files.  Of course, all the apps get deleted as well but (if you wanted to) you could save copies beforehand (e.g. to a file share) and restore any non-large-data files to the project directory after the reset.  When you enable tasks once again, you can save the downloading of all those files if your client can see they already exist.  You will see "File exists, download skipped" messages for any files you have saved and restored.  If you're not worried about your bandwidth allowance, you may decide not to bother.  It's really just the large data files you want to get rid of.

Just be warned that you can't simply delete those data files by hand.  You need the reset to clear the entries for them out of the state file (client_state.xml); otherwise the scheduler will just send them all again.  With the entries gone from the state file and the files removed from disk, you will have your space back again (for a while anyway).  You will get new files with the next lot of tasks and they will gradually accumulate over time as before.
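If you would rather script this than click through BOINC Manager, the same three steps can be done with the `boinccmd` command line tool using its `nomorework`, `reset` and `allowmorework` project operations. Here is a rough Python sketch. The project URL is an assumption and must match what your client reports (`boinccmd --get_project_status` will show it), the helper names are my own, and the task counter assumes that `boinccmd --get_tasks` prints one `project URL: ...` line per task.

```python
import subprocess

# Assumption: replace with the exact project URL your client reports.
PROJECT_URL = "https://einsteinathome.org/"


def nnt_command(url, boinccmd="boinccmd"):
    """Step 1: set No New Tasks for the project."""
    return [boinccmd, "--project", url, "nomorework"]


def reset_command(url, boinccmd="boinccmd"):
    """Step 2: reset the project. Only do this once no tasks remain."""
    return [boinccmd, "--project", url, "reset"]


def allow_work_command(url, boinccmd="boinccmd"):
    """Step 3: allow new tasks again."""
    return [boinccmd, "--project", url, "allowmorework"]


def count_tasks_for(get_tasks_output, url):
    """Count tasks for url in the text printed by `boinccmd --get_tasks`.

    Assumes one 'project URL: <url>' line per task in that output.
    """
    wanted = "project URL: " + url
    return sum(1 for line in get_tasks_output.splitlines()
               if line.strip() == wanted)


def run_step(command):
    """Run one boinccmd command and raise if it fails."""
    subprocess.run(command, check=True)
```

The order matters: run `run_step(nnt_command(PROJECT_URL))` first, wait until `count_tasks_for(...)` returns 0 and the finished work has been reported, and only then run the reset followed by `allow_work_command`. A reset throws away any tasks still on board, which is the reason for waiting.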

 

Cheers,
Gary.
