What to do with orphaned tasks?

cecht
Joined: 7 Mar 18
Posts: 1,509
Credit: 2,805,927,104
RAC: 2,128,242
Topic 215684

I've removed the crunching GPU from a host and am only running it for CPU tasks. What's the best way to deal with the orphaned GPU tasks that are flagged in the BOINC Manager as "GPU missing, Ready to start...." or "GPU missing, Ready to run..."?

Ideas are not fixed, nor should they be; we live in model-dependent reality.

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139,002,861
RAC: 34

If you’re not likely to be putting the GPU back you might as well abort them.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 115,909,334,270
RAC: 35,432,146

cecht wrote:
... the best way to deal with the orphaned GPU tasks ...

... is to not let them get orphaned in the first place. ;-)  Sorry, couldn't resist!

If there is any prospect that you will be putting the GPU back, you could keep them, as they will resume crunching as soon as you do.  If not, you should change your settings so that no more can be sent and then abort those that you have.  That way you are being kind to your 'wingman' - the other person waiting for your results for validation purposes.  Your tasks can immediately be sent to someone else for crunching without having to wait 2 weeks for the deadline to kick in.
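
For anyone who prefers a command line to BOINC Manager, the two steps above (stop new work, then abort what's left) can be sketched with boinccmd, the command-line tool that ships with BOINC. Treat this as a rough sketch only: the project URL and task names are placeholders rather than values from this thread, and the script just prints the commands unless dry_run is switched off.

```python
import subprocess

# Placeholders for illustration only; substitute the project URL shown in
# BOINC Manager and the real names of the tasks flagged "GPU missing".
PROJECT_URL = "https://project.example/"
ORPHANED_TASKS = ["example_gpu_task_0", "example_gpu_task_1"]


def no_new_tasks_cmd(project_url):
    """boinccmd call that sets 'No new tasks' for one project."""
    return ["boinccmd", "--project", project_url, "nomorework"]


def abort_task_cmd(project_url, task_name):
    """boinccmd call that aborts a single task of that project."""
    return ["boinccmd", "--task", project_url, task_name, "abort"]


def build_commands(project_url, task_names):
    """Stop new work first, then abort each leftover task."""
    commands = [no_new_tasks_cmd(project_url)]
    commands += [abort_task_cmd(project_url, name) for name in task_names]
    return commands


def run(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(build_commands(PROJECT_URL, ORPHANED_TASKS))
```

Setting 'No new tasks' first matters: otherwise the client may simply fetch replacements for whatever was aborted. The aborted tasks are reported back at the next scheduler contact.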

For any others who might be reading, if you know in advance that there is a change coming that will cause you to stop crunching, either fully or partly, please set 'No new tasks' in BOINC Manager at the appropriate time so that there aren't a bunch of excess tasks still sitting there when the change happens.  I'm thinking in particular of things like going on holidays, a business trip, weather that is too hot to keep going, etc.  Even if you make the decision at short notice, just abort and return what's left so that the tasks can be passed on immediately.

Quite a high percentage of issued tasks never get returned.  I'm not talking about failed tasks - just tasks that completely disappear.  A small number would be caused by sudden hardware failure, particularly a disk failure.  The bulk are due to people just shutting down without thinking.  Please abort and return before shutting down.

 

Cheers,
Gary.

cecht

Okay. Now that I see there are no respectable technical solutions to lack of planning and laziness :-), I've moved the GPU back over to the original host to let the queued GPU tasks finish while not allowing new tasks for the project.

On a related issue, is there a way to set local config files for "no new tasks" for specific applications, or must that be done only through Project web prefs?

I'm still a bit fuzzy on the interplay between local prefs, config.xml files, and web prefs. It seems that even when local prefs are selected in BOINC Manager, Project web prefs (but not Computing web prefs?) still have some influence. For example, I had set my Work Set in Project web prefs to not request CPU tasks and not run the Continuous Gravitational Wave search O2 All-Sky app, thinking that it would apply only to Work hosts that were using web prefs, but then another Work host that is set to use local prefs said it couldn't get new work (or something to that effect) because of my web preference settings. I changed the web prefs to allow O2 All-Sky work and the supposedly locally configured host started getting CPU work again (it's not capable of GPU work).

Do the local config files then just fine tune only some web prefs? It seems things would be easier (for this bear with very little brain) if config files could control all aspects of work request, GPU usage, and CPU usage.

 


Gary Roberts

cecht wrote:
Okay. Now that I see there are no respectable technical solutions to lack of planning and laziness :-), ...

I certainly wasn't accusing you of either of those - I saw an 'educational opportunity' and decided to use it :-).

I don't know how 'respectable' you might consider it, but it is quite possible when you make decisions like this - i.e. move a GPU from one machine to another - to move the whole BOINC tree with it.  You would just need to ensure that a suitable driver was already set up on the recipient machine.  BOINC doesn't care that the motherboard/CPU is different - or even the complete OS for that matter.

Many years ago, when I first started using Linux, it was common practice for me to shut down a Windows machine, save the complete BOINC tree, partly crunched tasks and all, wipe Windows and install Linux, put the BOINC tree back and replace the executables (boinc.exe, boincmgr.exe, boinccmd.exe, and the project apps) with their Linux equivalents and then resume crunching from saved checkpoints that had been created under the previous OS.  I expected problems but didn't really find any.

The key thing to understand is that everything needed to resume crunching from a particular intermediate stage is saved in 'checkpoint' files that are updated on disk at reasonably regular intervals.  The checkpoint file is (usually) quite agnostic to the OS, the BOINC version or the science app version.  Launching a different version of the app usually makes no difference to the successful restarting of crunching from that particular saved point.  It would only matter if there were a change in the checkpoint content/format along with the version change.  I vaguely remember an example of that in the dim, distant past but there would be a very clear warning from the project if that were ever to happen again.
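
Saving and restoring the complete BOINC tree, as described above, amounts to copying one directory while the client is stopped. Here is a minimal Python sketch of that idea. The function names are made up for illustration, the location of the BOINC data directory depends on the OS and on how BOINC was installed, and the sketch assumes the client has already been shut down so that checkpoint files are not changing underneath it.

```python
import shutil
from pathlib import Path


def archive_boinc_tree(data_dir, archive_base):
    """Pack the whole BOINC data directory into <archive_base>.tar.gz.

    Stop the BOINC client before calling this, so that checkpoint files
    are not rewritten while they are being copied.
    """
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no BOINC data directory at {data_dir}")
    # root_dir=parent and base_dir=name keep the top-level folder name
    # inside the archive, so it unpacks as a single directory.
    return shutil.make_archive(
        str(archive_base),
        "gztar",
        root_dir=str(data_dir.parent),
        base_dir=data_dir.name,
    )


def restore_boinc_tree(archive_path, target_parent):
    """Unpack a saved tree under target_parent on the recipient machine."""
    shutil.unpack_archive(str(archive_path), str(target_parent))
```

Call archive_boinc_tree with the path of your own data directory and a destination outside that directory, then restore_boinc_tree on the other machine. If the recipient runs a different OS, the executables still have to be swapped for their equivalents, as in the Windows-to-Linux case above.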

Quote:
... I've moved the GPU back over to the original host to let the queued GPU tasks finish while not allowing new tasks for the project.

I'm sorry if my previous comments prompted this - you could have just aborted them.  All the same, thanks for taking the trouble to finish them off.

Quote:
On a related issue, is there a way to set local config files for "no new tasks" for specific applications, or must that be done only through Project web prefs?

NNT is a project setting that applies to all project applications.  The only real way to permanently stop a particular search from sending more tasks is to 'unselect' it in the website prefs.

Quote:
I'm still a bit fuzzy on the interplay between ....

So am I, so welcome to the club :-).

Here's how I *think* the bit you're interested in works.  When you first launch BOINC and attach to a particular project, your BOINC client communicates certain information to the project's 'scheduler' by sending a scheduler request.  At any time there is a copy of the most recent scheduler request stored in your BOINC directory, which you can easily browse.  The scheduler will respond with a scheduler reply and a copy of that is also kept (until replaced by a subsequent one).  For Einstein these files are called sched_request_einstein.phys.uwm.edu.xml and sched_reply_einstein.phys.uwm.edu.xml respectively.  The client parses the reply, which includes all the website preference settings, and stores them in a file called global_prefs.xml.

If any values are different from the default values the client started with, the client's values will be adjusted at that time.  Adjustments made after that on the website need a further request/reply cycle before any change can be adopted by the client.  It gets more complicated if you support multiple projects and you make changes at different websites.  The best policy is to pick a 'main' project and always make changes there.  Changes will propagate to other projects automatically over time.

A subset of the full suite of settings can be managed locally through the BOINC Manager interface (Advanced view).  An even smaller subset is available through the simple view.  If you're interested in having a bit of control and being able to really see what is happening, you really need the advanced view.  Further comments refer to advanced view.

If you click on the 'preferences' option in BOINC Manager, you can see different tabs with groups of settings you can change.  Pay particular attention to the comments next to the warning symbol at the top.  The words have changed over different BOINC versions and they will probably change further in the future.  The warning is quite terse.  If you ever just look at what is available locally without intending to change anything, make sure you exit with 'cancel'.  If you don't, your global_prefs.xml file will be overridden with a new file called global_prefs_override.xml.  Once that file has been created, it will always override any changes you make to website settings that are listed in the override file.  Over the years, many have been caught by suddenly finding that particular website prefs don't seem to work anymore :-).  In the warning area at the top there is a button to click which deletes the override file and allows website prefs to be active once again.

If you want to see exactly what is under local control and what is under website control, just browse the two files.  If a setting is in both files, the local override wins and changing it on the website will have no effect on that host.  If there is no setting in the override file, you will be able to change the setting on the website and have the change happen locally after the next scheduler contact.  You can trigger a contact through the 'update' button in BOINC Manager.  There is a User Manual for BOINC that is worthwhile reading through for more complete information.  There is a specific section that deals with client and application configuration.
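
Browsing the two files by hand works fine, but the comparison can also be sketched in a few lines of Python. This assumes each setting is a direct child element of the root in both global_prefs.xml and global_prefs_override.xml. The sample XML and its values are made up for illustration, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def setting_names(xml_text):
    """Return the tag names of the root's direct children."""
    root = ET.fromstring(xml_text)
    return {child.tag for child in root}


def split_control(global_xml, override_xml):
    """Classify settings as locally overridden or website-controlled."""
    web = setting_names(global_xml)
    local = setting_names(override_xml)
    return {
        "overridden_locally": sorted(web & local),
        "website_controlled": sorted(web - local),
    }


# Made-up sample contents standing in for global_prefs.xml and
# global_prefs_override.xml.
GLOBAL_PREFS = """<global_preferences>
  <max_ncpus_pct>100</max_ncpus_pct>
  <work_buf_min_days>0.5</work_buf_min_days>
  <run_if_user_active>1</run_if_user_active>
</global_preferences>"""

OVERRIDE_PREFS = """<global_preferences>
  <max_ncpus_pct>50</max_ncpus_pct>
</global_preferences>"""

if __name__ == "__main__":
    for group, names in split_control(GLOBAL_PREFS, OVERRIDE_PREFS).items():
        print(group, names)
```

To try it on a real host, read the two files from the BOINC data directory and pass their text in place of the samples. Any bookkeeping elements in the real file that are not settings would be listed as well, so read the output with that in mind.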

If there's something I haven't mentioned and you can't see it covered in the manual, don't hesitate to ask :-).

 

Cheers,
Gary.

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 1,055,935,564
RAC: 0

Gary Roberts wrote:

Quote:
I'm still a bit fuzzy on the interplay between ....

So am I, so welcome to the club :-).

Here's how I *think* the bit you're interested in works.

A small addendum to Gary's excellent post is that there are actually two different types of preferences:

Computing Preferences - Deal with how BOINC can run and use your computer/device: when to run work, how many processors to use, how much disk space to use, etc. These preferences are global and apply to all projects, and they can be set locally in BOINC. If you run multiple projects, choose one at which to edit these prefs and they will propagate to the others.

Project Preferences - Deal with how the project can use your computer/device. They apply only to the project where they are set. They control what apps/searches to run, what resources (available to BOINC) the project can use, how the graphics will look (if available), etc. They can't be set locally in BOINC and need to be set at every project you support (if the defaults don't suit you ;-)).

cecht

Thank you very much for the detailed reply, Gary. You've covered my questions nicely, and then some!

Gary Roberts wrote:
...but it is quite possible when you make decisions like this - i.e. move a GPU from one machine to another - to move the whole BOINC tree with it.

This is really good to know.  How to transfer work between systems is something I would have asked about eventually.

Your overview of the scheduler and preference system and their associated files is something I will refer back to repeatedly.

In the meantime, I will remember to click "Cancel" instead of the little X in the corner of the preferences window and, last but not least, read the manual.

 


cecht

Holmis wrote:
A small addendum to Gary's excellent post is that there are actually two different types of preferences:

Thanks for that. I think with this and what Gary wrote, I'm well on my way to getting my hosts to run smoothly.

