Excessive Disk Usage

David
Joined: 18 Feb 10
Posts: 9
Credit: 50931
RAC: 0
Topic 194817

Hi,

Sorry, I posted this in the Einstein threads by mistake. Can the mods please delete that?

I have no Einstein tasks running or waiting to upload, but the disk usage is up at 90 MB. What's the go? Can I clear that disk usage? How do I find where it is, so I can clear it? Is this normal?

Cheers,
David

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

If all your Einstein tasks are uploaded and reported, you could reset the project. That would delete all Einstein@home-related files.

However, if you want to resume working for Einstein, all executables and (semi)static data files will have to be downloaded again.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

David
Joined: 18 Feb 10
Posts: 9
Credit: 50931
RAC: 0

Thanks Gundolf,

I'll download tomorrow morning when I'm on off-peak.

Cheers,
David

David
Joined: 18 Feb 10
Posts: 9
Credit: 50931
RAC: 0

Message 97272 in response to message 97271

Quote:

Thanks Gundolf,

I'll download tomorrow morning when I'm on off-peak.

Cheers,
David

Um... Does that also clear out my credits? Or do they remain intact?

Michael Karlinsky
Joined: 22 Jan 05
Posts: 888
Credit: 23502182
RAC: 0

Message 97273 in response to message 97272

Quote:

Um... Does that also clear out my credits? Or do they remain intact?

Credits remain.

Michael

David
Joined: 18 Feb 10
Posts: 9
Credit: 50931
RAC: 0

Cool... Thanks.

Zap.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250253103
RAC: 35458

I think that 90MB isn't that much for a S5R6 task. Locality scheduling should make sure that you'll get new tasks that will use files that already are on your machine as much as possible, to minimize your new download volume. If you delete the files now, you'll probably download a very similar set of files again with your next S5R6 task.

BM

David
Joined: 18 Feb 10
Posts: 9
Credit: 50931
RAC: 0

It's now up to 237MB after resetting the project... Is that normal?

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

It is 359.01 MB for me, second only to climateprediction.net at 477.67 MB.
Luckily, disk space is cheap.
Tullio

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117358747918
RAC: 35742231

Message 97278 in response to message 97276

Quote:
It's now up to 237MB after resetting the project... Is that normal?


It's perfectly normal, particularly at the current time when the old S5R6 run is being finished up and the new S5GCE run is in full flight. You could quite easily have several times the normal complement of large data files on board and if you wish to contribute to this project, you really need to have these files.

Several weeks ago when you first asked this question, you were advised that resetting the project was a way to remove data files if you were finished with them and all results had been completed and returned. Resetting is NOT a necessary operation and it will NOT give you a permanent reduction in disk usage. Resetting simply throws away all existing programs and data and (if you intend to continue crunching) allows you to download fresh copies of everything. At the end of it, you may very well have a greater disk usage than before, depending on what the scheduler decides to give you in the way of new large data files. If you have finished crunching and wish to withdraw from the project, then you could consider detaching rather than resetting, but if you have any intention to crunch more tasks in the future, you should do neither, as the files you already have on board will continue to be needed.

Bernd also mentioned (because E@H uses locality scheduling) that having 90MB of large data files was not at all unusual for just one task. If you don't know what locality scheduling is, try googling the term. The very top hit will take you to the BOINC documentation about it, which admittedly, is rather technical.

In essence, the E@H project analyses vast amounts of data and a way needed to be found to 'localise' data to groups of tasks. So the scheduler sends you (and a small number of other hosts as well) a particular subset of large data files and then tries very hard to send you all only the tasks that belong to that data. If you have a fast machine dedicated to the project, you can get very large numbers of tasks all belonging to the same data subset. If you have a slow machine or if your machine is shared amongst many other projects, you may find that the other hosts using your subset gobble up all the tasks before you have requested many and you may need to be getting new data subsets in order to get new tasks. It's quite possible to have a few data subsets on the go at any one time and each data subset can be more than 50MB. If you can't afford 250MB of disk space, E@H is really not the project for you.
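To make the idea a bit more concrete, here is a toy sketch in Python. It is not the actual BOINC scheduler code, just an illustration of the preference described above. The function name `pick_tasks` and the data file names are made up for the example.

```python
def pick_tasks(host_files, pending_tasks, wanted):
    """Choose up to `wanted` tasks, preferring data the host already holds.

    host_files:    set of data file names already on the host (hypothetical names)
    pending_tasks: list of (task_id, data_file) pairs waiting to be sent
    Returns (chosen_tasks, files_the_host_would_have_to_download).
    """
    # Tasks whose data file is already on the host come first...
    local = [t for t in pending_tasks if t[1] in host_files]
    # ...and tasks needing a fresh download are only used to fill the gap.
    remote = [t for t in pending_tasks if t[1] not in host_files]
    chosen = (local + remote)[:wanted]
    downloads = {data_file for _, data_file in chosen} - set(host_files)
    return chosen, downloads


# Illustrative data only: one subset on the host, three tasks pending.
host = {"h1_0500"}
pending = [("t1", "h1_0600"), ("t2", "h1_0500"), ("t3", "h1_0500")]
chosen, downloads = pick_tasks(host, pending, 2)
```

Asking for two tasks here is satisfied entirely from the subset already on disk, so nothing new is downloaded. Asking for three forces a second data subset onto the host, which is how disk usage creeps up when the tasks for your existing subset run out.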

Your computers are hidden so it's not possible to see what computers you have and how frequently it/they request new work or how many different data subsets might be involved. In any case, with disks being so large and disk space being so cheap these days, do you really need to be concerned about the odd 250MB?

Cheers,
Gary.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 167

Message 97279 in response to message 97278

Quote:
Resetting simply throws away all existing programs and data and (if you intend to continue crunching) allows you to download fresh copies of everything.


Actually, and I am not sure when this changed or if it ever changed, resetting will only throw away the executables and data (.dat) files. You'll be surprised how much gunk stays behind in your projects\einstein.phys.uwm.edu\ directory after a controlled No New Tasks followed by a project reset.

It won't throw away all picture files, libraries, or extensionless files. (In other words, all the real data files are kept.)

I found that out on a backup of my Data Directory, after I had set all projects to NNT, run what work I had, uploaded and reported all of it, and then done a reset on all projects. The total size of the directory at that time was still 270,843,455 bytes (~270MB); 7-zipped, it came out at 155,304KB.

Checking the different Projects\ sub-directories, I saw that both Einstein and CPDN (which I hadn't run for over a year) kept all that gunk that I tried to get rid of. Which has resulted in me requesting Trac ticket #978. :-)

(Manually removing everything from every projects\{project}\ directory, I still ended up with a 7-zipped file of 76,940KB (180,146,422 bytes unpacked))
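For anyone who wants to check where the space goes on their own machine, here is a minimal sketch in Python that adds up the file sizes under each sub-directory of the projects folder. The helper names (`dir_size`, `project_usage`, `report`) are made up for this example, and the location of your BOINC data directory is something you have to supply yourself, since it differs between installs.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files under `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def project_usage(projects_dir):
    """Map each immediate sub-directory name to its total size in bytes."""
    usage = {}
    for entry in os.scandir(projects_dir):
        if entry.is_dir(follow_symlinks=False):
            usage[entry.name] = dir_size(entry.path)
    return usage


def report(projects_dir):
    """Print one line per project, largest first."""
    by_size = sorted(project_usage(projects_dir).items(),
                     key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        print(f"{size / 1e6:10.2f} MB  {name}")
```

Calling `report()` with the path to your own projects folder lists the projects from largest to smallest. Running it before and after a reset shows what the reset actually removed.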
