Task progress question

Dyna66
Joined: 5 May 13
Posts: 7
Credit: 332414
RAC: 0
Topic 219108

I always shut down my PC before going to work. For the last few weeks every time I restart my computer and restart the E@H the task progress shows a decline of 10% or more from where it stood prior to shutting down the PC. Any idea of what causes this? Thanks. 

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 401
Credit: 10138663455
RAC: 26007658

As far as I know:

Tasks/WUs make checkpoints every so often.

When a WU (work unit) is suspended, all work done since the last checkpoint is lost.

So, when a WU is resumed, it starts from the last checkpoint.

This can cause considerable "work loss".

Check to see what you have set in the preferences for how often (i.e. after how many seconds) a checkpoint is to be made.

Hope I understood your question correctly and that my answer might be of help to you.

Have a nice day.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117529713527
RAC: 35380364

Dyna66 wrote:
I always shut down my PC before going to work.


This means that any progress made since the last checkpoint was written will be lost.  If checkpoints are written frequently, the loss would be insignificant.  Whilst BOINC does have a preference setting for the time between checkpoints (default is 60 secs I think) you can't actually reduce that time below what the app itself is designed to do.

For the app you are running (Gamma-ray pulsar search) the checkpoint interval is controlled by how many sky points in the data file need to be analysed.  For the current data file (LATeah0057F.dat) that happens to be 8.  A checkpoint is written at the end of each sky point.  The number of sky points could be quite different when the current file is finished and a new data file (perhaps LATeah0058F.dat) is released.  Data files can last for weeks and the current one was released about the time you noticed your 'problem' behaviour.

The calculations are done in two parts.  The main part is from 0% to ~90% and after that a followup stage is entered which sorts out the top 10 candidate signals found in the main part.  So here are some basic suggestions which will allow you to know how much you might lose if you shut down your computer at certain stages.

The main calculations will have 8 checkpoints.  One will be written every 90/8=11.25% of progress.  If you shut down your computer when the progress showed 11.0% then the first checkpoint wouldn't exist and when you next started up, the crunching would start right back at the beginning.  If you shut down when the progress showed 11.3% then you would lose virtually nothing and the restart would be from 11.25% - the first checkpoint.

For the current data file, checkpoints are written at 11.25%, 22.5%, 33.75%, 45.0%, 56.25%, ..... ~90.0%.
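
The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the numbers quoted in this post (8 sky points, main stage ending at roughly 90%); the function names are made up for the example and are not part of BOINC or the E@H app, and the behaviour inside the followup stage is an assumption.

```python
import math

def checkpoint_percentages(sky_points=8, main_stage_end=90.0):
    """Progress values (in %) at which a checkpoint would be written,
    assuming one checkpoint at the end of each sky point."""
    step = main_stage_end / sky_points
    return [step * i for i in range(1, sky_points + 1)]

def restart_point(progress, sky_points=8, main_stage_end=90.0):
    """Progress (in %) a task would resume from after a shutdown.

    Assumption: no further checkpoints are written during the followup
    stage, so anything past main_stage_end resumes from main_stage_end.
    """
    step = main_stage_end / sky_points
    completed = min(math.floor(progress / step), sky_points)
    return completed * step

print(checkpoint_percentages())
for shutdown_at in (11.0, 11.3, 44.9):
    resume = restart_point(shutdown_at)
    lost = shutdown_at - resume
    print(f"shut down at {shutdown_at}% -> resume at {resume}%, lose {lost:.2f}%")
```

A different data file with a different number of sky points would only need a different `sky_points` value.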

So, if you look at % progress before shutting down, you can easily plan for a good time to do the deed.  This would be a good way to avoid loss - the other is to allow the computer to keep running.  Of course, that's entirely up to you.

Dyna66 wrote:
For the last few weeks every time I restart my computer and restart the E@H the task progress shows a decline of 10% or more from where it stood prior to shutting down the PC.


That just goes to show how exceedingly bad your choices of when to shut down are :-) ;-).  With that sort of bad luck it would be best if you made a permanent resolution never to take part in any type of game of chance :-).  You'd be likely to always draw the short straw :-).

Seriously, what you are seeing is just the way it has to be for the moment.  You can avoid this behaviour either by letting your computer run for longer intervals or by choosing the GW search which (I think) checkpoints much more frequently - about every minute.  Your computer seems extremely slow - I saw one of your tasks (since removed from the online database) where the recorded times seemed to indicate that it took about 6-7 hours to complete 1 checkpoint.  I don't know how reliable those recorded times are but the indications were that the full 8 checkpoints would likely have taken around 50 hours of continuous running just to get to the followup stage (90% -> 100%).  The GW tasks are likely to take quite a bit longer, but they will checkpoint much more frequently.
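
As a rough check on the estimate above, here is a small Python sketch. It assumes the 6-7 hours per checkpoint and the 8 checkpoints mentioned in this post, and it ignores any startup overhead; the function names are hypothetical and exist only for this example.

```python
def hours_to_followup(hours_per_checkpoint, sky_points=8):
    """Continuous running time needed to reach the followup stage."""
    return hours_per_checkpoint * sky_points

def session_makes_progress(session_hours, hours_per_checkpoint):
    """A session that restarts from a checkpoint only saves new work
    if it runs long enough to reach the next checkpoint."""
    return session_hours >= hours_per_checkpoint

for hours in (6.0, 7.0):
    total = hours_to_followup(hours)
    print(f"{hours} h per checkpoint -> about {total} h to reach the followup stage")

for session in (5.0, 8.0):
    ok = session_makes_progress(session, 6.5)
    print(f"{session} h session at 6.5 h per checkpoint saves progress: {ok}")
```

Under these assumptions, a computer that is always shut down before a full checkpoint interval has elapsed would keep restarting from the same checkpoint.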

Cheers,
Gary.
