Excessive Disk Usage

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 167

RE: If you want to free up

Message 97300 in response to message 97299

Quote:
If you want to free up disk space now and don't care about bandwidth, you could manually delete the files whose names end in _S5R7. Make sure the Client doesn't keep any reference (Gary or Richard may have more detailed instructions), or it will download the files again even if they are no longer needed.


So much for that then. I noted down what task I was running, exited BOINC, cleaned up the client_state.xml file, cleaned up the project directory, restarted BOINC and Einstein is resending half the lost tasks I just deleted. ;-)

Both with S5R4 and S5R7 names, but all of the same group.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250256693
RAC: 35326

RE: RE: If you want to

Message 97301 in response to message 97300

Quote:
Quote:
If you want to free up disk space now and don't care about bandwidth, you could manually delete the files whose names end in _S5R7. Make sure the Client doesn't keep any reference (Gary or Richard may have more detailed instructions), or it will download the files again even if they are no longer needed.

So much for that then. I noted down what task I was running, exited BOINC, cleaned up the client_state.xml file, cleaned up the project directory, restarted BOINC and Einstein is resending half the lost tasks I just deleted. ;-)

Both with S5R4 and S5R7 names, but all of the same group.

I may have been too vague there.

You will get back the tasks assigned to your hosts, and you will also download (again) all files that running these tasks might require. However, there may be files on your host that were once required by tasks that are long gone from your system, and for which no more work can be generated. Only these files aren't currently deleted properly by the client, and these are the ones you will get rid of with the described procedure.

BM

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 167

Oh, I know what I did wrong.

Message 97302 in response to message 97301

Oh, I know what I did wrong. I have task 0567.10 at this moment, but instead of only deleting all the others -- non-0567.xx -- I deleted all but the H1_0567.10 and L1_0567.10 files.

The data-set at hand is 0567, not 0567.10

So BOINC not only deleted the h1_0567.10_S5R4 in progress, it sent it back to me complete with all of the group of 0567, both S5R4 and S5R7, both H1_* and L1_*. I'm not complaining, it was a nice test. :-)

(using H1 and L1 with capitals for legibility, as l1 and 11 look alike).

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2952333515
RAC: 699113

RE: (Gary or Richard may

Message 97303 in response to message 97299

Quote:
(Gary or Richard may have more detailed instructions)


Not yet, but I'll see if I can work something out.

Listing the data files in a current Einstein directory (just the H variant to save on duplication), I see

 Directory of D:\BOINCdata\projects\einstein.phys.uwm.edu

19/03/2010 06:11 3,660,008 h1_0295.35_S5R7
19/03/2010 07:09 3,660,008 h1_0295.40_S5R7
19/03/2010 07:15 3,660,008 h1_0295.45_S5R7
19/03/2010 07:18 3,660,008 h1_0295.50_S5R7
19/03/2010 13:55 3,660,008 h1_0295.55_S5R7
19/03/2010 06:06 3,660,008 h1_0295.60_S5R7
19/03/2010 06:06 3,660,008 h1_0295.65_S5R7
29/03/2010 00:58 3,660,008 h1_0295.70_S5R7
29/03/2010 17:56 3,660,008 h1_0416.65_S5R7
29/03/2010 17:57 3,660,008 h1_0416.70_S5R7
29/03/2010 17:58 3,660,008 h1_0416.75_S5R7
29/03/2010 17:59 4,262,400 h1_0416.80_S5R4
29/03/2010 17:59 3,660,008 h1_0416.80_S5R7
29/03/2010 18:00 4,262,400 h1_0416.85_S5R4
29/03/2010 18:00 3,660,008 h1_0416.85_S5R7
29/03/2010 18:01 4,262,400 h1_0416.90_S5R4
29/03/2010 18:01 3,660,008 h1_0416.90_S5R7
29/03/2010 18:02 4,262,400 h1_0416.95_S5R4
29/03/2010 18:02 3,660,008 h1_0416.95_S5R7
13/04/2010 05:23 4,262,400 h1_0417.00_S5R4
13/04/2010 05:23 3,660,008 h1_0417.00_S5R7
13/04/2010 05:24 4,262,400 h1_0417.05_S5R4
13/04/2010 05:25 3,660,008 h1_0417.05_S5R7
13/04/2010 05:26 4,262,400 h1_0417.10_S5R4
13/04/2010 05:26 3,660,008 h1_0417.10_S5R7
15/01/2010 06:55 4,262,400 h1_1071.25_S5R4


Compare with the file set required by a single workunit (h1_0416.80_S5R4__110_S5GCEa, in this case):

 h1_0416.80_S5R4
 h1_0416.80_S5R7
 l1_0416.80_S5R4
 l1_0416.80_S5R7
 h1_0416.85_S5R4
 h1_0416.85_S5R7
 l1_0416.85_S5R4
 l1_0416.85_S5R7
 h1_0416.90_S5R4
 h1_0416.90_S5R7
 l1_0416.90_S5R4
 l1_0416.90_S5R7
 h1_0416.95_S5R4
 h1_0416.95_S5R7
 l1_0416.95_S5R4
 l1_0416.95_S5R7
 h1_0417.00_S5R4
 h1_0417.00_S5R7
 l1_0417.00_S5R4
 l1_0417.00_S5R7
 h1_0417.05_S5R4
 h1_0417.05_S5R7
 l1_0417.05_S5R4
 l1_0417.05_S5R7
 h1_0417.10_S5R4
 h1_0417.10_S5R7
 l1_0417.10_S5R4
 l1_0417.10_S5R7


So you need 28 files - seven sets of matched {H1, L1, R4, R7} foursomes - just to run one task.
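As a rough sketch of how that list is put together, the Python below rebuilds the 28 names from the starting frequency. The function name `files_for_workunit` is made up for illustration, and the count of seven steps and the 0.05 spacing are simply read off this one example; they are assumptions, not something the project has confirmed for every workunit.

```python
def files_for_workunit(start="0416.80", steps=7, step_hundredths=5):
    """List the data files for one workunit, following the pattern in the list above.

    Assumption: 'steps' and 'step_hundredths' are taken from this single
    example; other workunits may need a different range.
    """
    whole, frac = start.split(".")
    base = int(whole) * 100 + int(frac)  # work in hundredths to avoid float rounding
    names = []
    for i in range(steps):
        value = base + i * step_hundredths
        freq = f"{value // 100:04d}.{value % 100:02d}"
        for detector in ("h1", "l1"):
            for run in ("S5R4", "S5R7"):
                names.append(f"{detector}_{freq}_{run}")
    return names


needed = files_for_workunit("0416.80")
print(len(needed))  # 7 frequencies x 2 detectors x 2 runs
```

Comparing such a list against the directory listing shows at a glance which files the current task still depends on.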

From my first list, I can safely lose all the _0295, _0416.65 to _0416.75, and that pesky _1071.25 off the bottom. OK so far, Bernd?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250256693
RAC: 35326

RE: From my first list, I

Message 97304 in response to message 97303

Quote:
From my first list, I can safely lose all the _0295, _0416.65 to _0416.75, and that pesky _1071.25 off the bottom. OK so far, Bernd?


The trouble is only with _S5R7 files. As long as you have a matching _S5R4 file on your machine it will be taken care of by the new scheduler, i.e. it might be needed by tasks you have or might get and will be automatically deleted when the corresponding _S5R4 file gets deleted. h1_1071.25_S5R4 (_S5R4 file above 1000.00) is from S5R6, it should eventually get deleted automatically, too. Obsolete files that could be deleted manually are _S5R7 files of 0295 and the _0416 ones that don't have a corresponding _S5R4 file.
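The rule above (an _S5R7 file is obsolete only if no _S5R4 file with the same prefix is present) can be sketched in Python as follows. This is only an illustration working on a list of file names; the function name `orphaned_s5r7` is hypothetical and nothing here touches the disk or the client.

```python
def orphaned_s5r7(filenames):
    """Return the _S5R7 files that have no matching _S5R4 file in the same listing."""
    names = set(filenames)
    orphans = []
    for name in sorted(names):
        if name.endswith("_S5R7"):
            partner = name[: -len("_S5R7")] + "_S5R4"
            if partner not in names:
                orphans.append(name)
    return orphans


# A few names taken from the directory listing quoted in this thread.
listing = [
    "h1_0295.35_S5R7",
    "h1_0416.75_S5R7",
    "h1_0416.80_S5R4",
    "h1_0416.80_S5R7",
    "h1_1071.25_S5R4",
]
print(orphaned_s5r7(listing))
# -> ['h1_0295.35_S5R7', 'h1_0416.75_S5R7']
```

Note that h1_1071.25_S5R4 is not reported: it is an _S5R4 file, which should eventually be cleaned up automatically.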

BM

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2952333515
RAC: 699113

OK, I'm on the right

OK, I'm on the right track.

So, procedure could be:

1) Identify what files you've got, and what can be safely disposed of.

In Windows XP, I find the "Open Command Window Here" PowerToy from Microsoft PowerToys for Windows XP invaluable. A right-click in Windows Explorer, and you're at a command prompt where typing 'dir h*.* >dirlist.txt' will give you a reference list like the one in my last post. (In Windows Vista, and I presume Windows 7, the facility is built-in with shift-right-click)

Choose what you're ready to delete using Bernd's instructions: the datestamps in the directory listing are a useful cross-check too (that _1071.25 was really old).

2) Stop BOINC running - completely, not just the Manager. Make sure that everything is quiet, and take whatever backup precautions you feel are necessary. Neither I nor the project can be held responsible if you proceed beyond this point without a safety-net.

3) Using a plain-text editor in ANSI or ASCII mode, open the file client_state.xml: if you didn't already know where to find it, you probably shouldn't be undertaking this sort of procedure anyway (Read Jord's BOINC FAQs, linked from his signature).

For each file in your reference list of 'files to delete', there will be a section in client_state.xml which looks like this:

    <file_info>
        <name>h1_0295.35_S5R7</name>
        <nbytes>3660008.000000</nbytes>
        <max_nbytes>0.000000</max_nbytes>
        <md5_cksum>ca0848376c3f468b0d06225ef93a7f05</md5_cksum>
        <status>1</status>
        <sticky/>
        <report_on_rpc/>
        <url>http://einstein.astro.gla.ac.uk/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein.aei.mpg.de/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein-dl.phys.uwm.edu/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein.ligo.caltech.edu/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein.astro.gla.ac.uk/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein.aei.mpg.de/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein-dl.phys.uwm.edu/download/30b/h1_0295.35_S5R7</url>
        <url>http://einstein.ligo.caltech.edu/download/30b/h1_0295.35_S5R7</url>
    </file_info>


and a matching one for the l1_xxxx.xx_S5R7 version. You'll probably find that the h1 and l1 files are next to each other, and that they are roughly in the same order as they are in your directory listing, but don't take that for granted. Check each block carefully, noting that sometimes the filenames differ by only one letter or digit: and if it's a file you've chosen to delete, remove the entire block. Make sure that you remove the entire <file_info> and </file_info> lines, and everything in between: the file should 'close up' round the gap so there's no trace there was ever anything there.

Repeat until all the file references for the files you want to delete have been removed, then save and close client_state.xml

4) That's it. Now you can start up BOINC again. If you've been careful and steady-handed, everything should start as normal and continue where it left off. If not - well, you did take that backup, didn't you?

5) Now you can delete the actual files themselves from your hard disk. Again, the names are all very similar, so take your time and refer to your list - check everything before you delete it.
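
For anyone who would rather not edit by hand, the block-removal step could be sketched in Python along these lines. It assumes that each entry in client_state.xml is wrapped in <file_info> and </file_info> lines and carries its file name in a <name> element; the function name `remove_file_blocks` is made up for illustration. It only transforms text already read into a string, so stopping BOINC, taking the backup, and reading and writing the file are still up to you.

```python
import re


def remove_file_blocks(state_text, doomed):
    """Drop every <file_info>...</file_info> block whose <name> is in 'doomed'.

    Assumption: the opening and closing tags each sit on a line of their own.
    """
    doomed = set(doomed)
    kept = []
    block = None  # lines of the block currently being collected, if any
    for line in state_text.splitlines(keepends=True):
        stripped = line.strip()
        if block is None and stripped == "<file_info>":
            block = [line]
        elif block is not None:
            block.append(line)
            if stripped == "</file_info>":
                text = "".join(block)
                match = re.search(r"<name>(.*?)</name>", text)
                if not (match and match.group(1) in doomed):
                    kept.append(text)  # not on the delete list: keep unchanged
                block = None
        else:
            kept.append(line)
    if block is not None:
        kept.extend(block)  # unterminated block: leave it untouched
    return "".join(kept)


sample = (
    "<client_state>\n"
    "<file_info>\n"
    "    <name>h1_0295.35_S5R7</name>\n"
    "</file_info>\n"
    "<file_info>\n"
    "    <name>h1_0416.80_S5R7</name>\n"
    "</file_info>\n"
    "</client_state>\n"
)
cleaned = remove_file_blocks(sample, ["h1_0295.35_S5R7"])
print(cleaned)
```

Matching on the exact name rather than on a pattern is deliberate, since the file names differ by only a digit or a letter.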

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 167

Fun if you have tens of

Fun if you have tens of data-sets. I'm not sure, but do you really want to let users do this?

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

RE: 1) Identify what files

Message 97307 in response to message 97305

Quote:

1) Identify what files you've got, and what can be safely disposed of.

In Windows XP, I find the "Open Command Window Here" PowerToy from Microsoft PowerToys for Windows XP invaluable. A right-click in Windows Explorer, and you're at a command prompt where typing 'dir h*.* >dirlist.txt' will give you a reference list like the one in my last post. (In Windows Vista, and I presume Windows 7, the facility is built-in with shift-right-click)


Here sorting by name might be helpful. With 'dir h*.* /o:n>dirlist.txt' the solitary S5R7 files stand out pretty well.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250256693
RAC: 35326

RE: Fun if you have tens of

Message 97308 in response to message 97306

Quote:
Fun if you have tens of data-sets. I'm not sure, but do you really want to let users do this?

Not every user. The problem will solve itself when the disk limit set in preferences is reached (files will get deleted randomly until there is enough space for new work) and it will be solved altogether when we have the new scheme running. These instructions are just for the techs that need to free disk space fast.

Of course, if you don't care about bandwidth, the easiest way to get rid of the files is just to reset the project (update first to make sure completed tasks are reported). You will be resent the tasks that your client was assigned before, and you will download the files for these tasks again.

BM

David
Joined: 18 Feb 10
Posts: 9
Credit: 50931
RAC: 0

Um, I only asked a simple

Um,

I only asked a simple question out of curiosity. I didn't intend it to blow up like this...

My apologies if anyone has been upset. I have been otherwise busy and haven't checked on this forum lately (only realised how popular this thread had become when I received a note saying this had been made /sticky .....)

I've actually allocated 50GB to be used by BOINC and I am running 10 projects. It's just that I hate to see space wasted (I'm a dinosaur who remembers card sorters and punched tape). Agreed, 250MB isn't really anything, and, because I am a dinosaur, I am curious WHY that space is wasted. The current disk usage is now up to 334MB, but I do have a task running atm.

Cheers,
David
