No work available (reached daily quota)

Burned
Joined: 25 Jun 21
Posts: 24
Credit: 43,198,649
RAC: 148

It's my limited understanding that there is a hard quota of 256 GPU tasks per day per GPU.  How are you getting around that?  All the tweaking up to this point has apparently only resulted in additional CPU quota.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 887
Credit: 5,644,186,508
RAC: 32,140,907

Burned wrote:

It's my limited understanding that there is a hard quota of 256 GPU tasks per day per GPU.  How are you getting around that?  All the tweaking up to this point has apparently only resulted in additional CPU quota.

GPU quota and CPU quota are not separate. There is one "per host" quota, which is the sum of the CPU and GPU allowances.

If you do not run CPU tasks, then the CPU allowance adds to your effective GPU allowance. That's why people get around this issue by just increasing their CPU count.

It only looks like you are filling up on CPU work because the GPU gets through its tasks at a faster rate. Eventually you run out of GPU work, and then you only have CPU tasks left. You could remedy this by increasing your cache size (days of work), but that's really just moving the goalposts: the process of running out of GPU work will simply start all over again.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,492
Credit: 63,721,514,047
RAC: 53,742,555

Here is a completely untried suggestion (so no guarantees) for a scheme that might give Burned a viable solution.  As pointed out by archae86, running CPU tasks is quite inefficient in terms of output per watt, so I too have deliberately chosen not to run them and have no need for hacks other than increasing ncpus if necessary.

My recollection may be faulty but I seem to remember that it was project_max_concurrent rather than just the max_concurrent setting for an individual app that was causing the problem.  Since I have no need for either setting, I have no experience of the details of what is really happening.  It would be useful to see if setting up max_concurrent for just the CPU app would have a better outcome.
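For anyone wanting to try that comparison, a per-app limit might look roughly like the sketch below. This is untested and makes some assumptions: the app name is a placeholder (it has to match the short name of the actual CPU app, which isn't given in this thread), and the value 8 is just the CPU task limit mentioned later in this post.

```xml
<app_config>
  <app>
    <!-- Placeholder: replace with the real short name of the CPU app -->
    <name>YOUR_CPU_APP_NAME</name>
    <!-- Limit applies to this app only; 8 is an assumed value -->
    <max_concurrent>8</max_concurrent>
  </app>
  <!-- Deliberately no project_max_concurrent element here,
       since that is the setting suspected of causing the problem -->
</app_config>
```

The file goes in the project's folder inside the BOINC data directory and is picked up after re-reading the config files or restarting the client.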

However, I'm suggesting something quite different.  Burned has a 16-thread CPU.  The host currently shows 48 CPUs, which is a bit excessive, so I suggest that gets reduced to whatever is really needed.  With the following tweaks to app_config.xml, 16 threads should be enough.

Rather than using a max_concurrent to limit the number of CPU tasks to 8, why not use the gpu_usage and cpu_usage values to do it instead?  That way, ncpus would not need to be increased at all.

In app_config.xml, with gpu_usage set to 0.5 and cpu_usage set to 4.0, the two running GPU tasks would budget for 8 threads to be reserved and so the CPU tasks would be limited to 8 for a 16 thread system.  The BOINC setting for % of cores would be 100% so the full quota of 16x32 tasks would be preserved, giving an overall daily quota of 512+256=768 - if I'm remembering the allowances correctly.  There may be other factors I'm overlooking so please comment if you spot a flaw in the reasoning.
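As a rough sketch only (untested, like the rest of this suggestion), the settings described above might be written as follows. The app name is a placeholder and an assumption on my part; it needs to be the short name of the GPU app actually being run. The comments assume a single GPU.

```xml
<app_config>
  <app>
    <!-- Placeholder: replace with the real short name of the GPU app -->
    <name>YOUR_GPU_APP_NAME</name>
    <gpu_versions>
      <!-- 0.5 of a GPU per task, so 2 tasks run on one GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- Each GPU task budgets for 4 CPU threads:
           2 tasks x 4.0 = 8 threads reserved,
           leaving 16 - 8 = 8 threads for CPU tasks -->
      <cpu_usage>4.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

Note that cpu_usage only affects how BOINC budgets threads when scheduling; it doesn't change how many threads the GPU app really uses.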

As I said, I've never tried this so have no idea if it would work.  There might be some sort of limit on how high cpu_usage can be set.  However it's pretty easy to set up and test :-).

The other thing that could go wrong is that if CPU running had a higher priority, 4 extra CPU tasks might start and prevent a GPU task from running.  You would need to ensure that excessive numbers of CPU tasks on board didn't cause BOINC to 'panic' and elevate the CPU task priority.  I don't know for certain but I think having 2 running GPU tasks and limiting the number of CPU tasks to the non-reserved cores only would be BOINC's 'normal' (i.e., non-panic) operating mode.

Since Burned has a *lot* of CPU tasks at the moment, BOINC may already be in 'panic' mode.  There are two ways to ensure that this doesn't hamper a test of the above suggestion.  Either abort the excess tasks so they can't interfere or just suspend them if you did intend to crunch them eventually.

I know from other experience (quite a while ago when I was running CPU tasks) that suspending the excess does get BOINC out of panic mode - at least it did with the older versions I was using at the time.  I remember having a huge excess of CPU tasks and progressively resuming them in small batches whilst keeping BOINC out of panic mode until the excess had been crunched :-).

Cheers,
Gary.

archae86
Joined: 6 Dec 05
Posts: 3,012
Credit: 4,878,696,387
RAC: 3,320,237

Gary Roberts wrote:
I know from other experience (quite a while ago when I was running CPU tasks) that suspending the excess does get BOINC out of panic mode

It is still true that suspending a sufficient number of tasks gets BOINC out of panic mode on the versions I run (7.16.5 and 7.16.11).  It is worth mentioning that having even a single task in suspension precludes fetch requests for additional work.
