Reached a maximum quota

Skillz
Joined: 5 May 17
Posts: 16
Credit: 3523638163
RAC: 37387
Topic 229530

Why is this a thing?

 

May 15 18:08:14 boinc[2847]: 15-May-2023 18:08:14 [Einstein@Home] (reached daily quota of 2304 tasks)

 

The host can easily do well above 2304 tasks per day with 4 GPUs, each completing 2 tasks in 200 seconds. That's 8 tasks in 200 seconds, or 1 task every 25 seconds, which is around 3,456 tasks per day.
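As a quick check of that arithmetic, here is a minimal sketch in Python. The function name and parameters are made up for illustration; the only inputs are the figures quoted above (4 GPUs, 2 tasks per GPU, 200 seconds per batch).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def tasks_per_day(gpus: int, tasks_per_gpu: int, batch_seconds: float) -> float:
    """Tasks finished per day if each GPU completes `tasks_per_gpu`
    tasks every `batch_seconds` seconds (hypothetical helper)."""
    concurrent = gpus * tasks_per_gpu             # 8 tasks in flight
    seconds_per_task = batch_seconds / concurrent  # 25 s per task overall
    return SECONDS_PER_DAY / seconds_per_task


estimate = tasks_per_day(gpus=4, tasks_per_gpu=2, batch_seconds=200)
print(estimate)          # 3456.0
print(estimate > 2304)   # True: throughput exceeds the quota in the log line
```

This ignores upload, report and download gaps between tasks, so the real figure would be somewhat lower.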

 

The host currently has 1515 pending tasks, 1077 valid tasks and 6 invalids. It also has 8 tasks in progress, and that's the most it will ever get: 2 tasks per GPU, replaced only when a task completes, uploads and reports.

Keith Myers
Joined: 11 Feb 11
Posts: 5049
Credit: 19059351541
RAC: 6464936

Most projects put a limit on tasks per day. Usually it is a combination of so many tasks per core and so many tasks per GPU, up to a max GPU limit of 16.

This is to protect the servers from bad hosts that produce nothing but errors on every task sent to them.

The most common way to get around any limit is to raise the CPU count reported via the cc_config.xml file, spoofing the actual core count.

Change <ncpus>-1</ncpus> to 

<ncpus>200</ncpus>

or similar.

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118760919903
RAC: 21206003

Skillz wrote:
Why is this a thing?

E@H enforces daily limits that are no longer large enough for the highest performing GPUs.  Your computers are hidden so we can't see exactly what you are running and so can't give specific advice.

The easiest way around this is to fudge the number of CPUs, making sure to limit BOINC (if necessary) so it doesn't try to run CPU tasks on non-existent CPUs.  In the options section of cc_config.xml, just set a higher number of CPUs than you actually have - perhaps 8 times your current core count.  As an example if you had a real 8 threads you could add something like:-

<ncpus>64</ncpus>

as an updated options entry, and then force BOINC to reread the configuration (click the reread config files in BOINC Manager).  This should allow you to get a bunch of new GPU tasks.
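To show where that entry sits in the file, here is a rough Python sketch that sets <ncpus> inside the <options> section of a cc_config.xml document. The helper name set_ncpus and the sample input are illustrative assumptions, not part of BOINC; it only manipulates the XML text and does not touch the client.

```python
import xml.etree.ElementTree as ET


def set_ncpus(config_xml: str, ncpus: int) -> str:
    """Return cc_config XML with <options><ncpus> set to `ncpus`.

    Hypothetical helper for illustration; creates <options> and
    <ncpus> if they are missing.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    node = options.find("ncpus")
    if node is None:
        node = ET.SubElement(options, "ncpus")
    node.text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


# Sample input assumed for the example: a config using the default of -1.
example = "<cc_config><options><ncpus>-1</ncpus></options></cc_config>"
print(set_ncpus(example, 64))
```

Back up the real file before editing it, and have the client reread its config files afterwards so the change takes effect.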

Cheers,
Gary.

Skillz
Joined: 5 May 17
Posts: 16
Credit: 3523638163
RAC: 37387

I faked the CPUs to 128, but still didn't get any more tasks than just 2 tasks per GPU at any given time. Does this take time to happen?

 

My host is reporting on E@H as having 128 CPUs now.

 

I also ran 

 

boinccmd --host localhost  --passwd password1 --read_cc_config

To force it to re-read the config file. Please no one steal my password. It's top secret.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118760919903
RAC: 21206003

Skillz wrote:
I faked the CPUs to 128 ...

That should be fine.

Skillz wrote:
... but still didn't get any more tasks than just 2 tasks per GPU at any given time.

Are you talking about tasks running or new tasks downloading per work fetch request?  What is your work cache size?

Skillz wrote:
Does this take time to happen?

No - assuming you forced an 'update' to make the server aware of the new setting.

Your hosts are hidden so we can't see what is actually happening.

Cheers,
Gary.

Skillz
Joined: 5 May 17
Posts: 16
Credit: 3523638163
RAC: 37387

I mean the host will never have more than 8 tasks at a time. When those complete, the server sends no more than 8 to replace them.

 

So out of 8, if one completes, then I'll get 1 in return. If 3 complete, then I'll get 3 in return.

 

I've changed the work cache days to various different settings with no change.

 

10/10 days

5/5 days

1/1 days

1/.5 days

10/0 days

10/5 days

 

And many more I've experimented with. 

 

Resource share is at 100 and no other tasks are running on the host.

 

I can PM you the host if you would like.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118760919903
RAC: 21206003

Skillz wrote:
So out of 8, if one completes, then I'll get 1 in return. If 3 complete, then I'll get 3 in return.

That sounds like you have <fetch_minimal_work> set in cc_config??  Please check all the options.

 

Skillz wrote:
I've changed the work cache days to various different settings with no change.

Please don't set large multi-day values.  If it suddenly starts working, your client will go berserk.  Unless you really want a large hysteresis effect with BOINC in panic mode, I would suggest setting 0.1/0 since that will give you plenty to start with.  Once that settles, gradually increase to something like 0.2/0, 0.3/0, etc., until you have whatever number in reserve that you want.  So as not to risk quickly exceeding your new daily limit, you'll need to increase gradually anyway.

I'm guessing you aren't running the manager since you used boinccmd to reread the changes to cc_config.  So are you changing preferences locally or on the website?  Are you making sure that both sides become aware of changes you make?  If both sides did become aware of the work cache changes, it seems like you must have <fetch_minimal_work> set.  That's all I can think of immediately.

 

Skillz wrote:
I can PM you the host if you would like.

Sure, if you wish.

Cheers,
Gary.

Skillz
Joined: 5 May 17
Posts: 16
Credit: 3523638163
RAC: 37387

No, <fetch_minimal_work> is not in the cc_config.

 

If the 2304 quota is an actual quota then the instance is doing more than 2304 tasks per 24 hour period. 

 

The host has been running BOINC for a little over 1.2 days. It has received 3104 tasks in that time frame.

Of those 3104 tasks, 8 are "in progress", 1589 are "pending", 1442 are "valid" and 8 are "invalid".

So if Einstein@Home will only send a host 2304 tasks in a single 24-hour period, and this host can crunch and return more than 2304 tasks, what happens then?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118760919903
RAC: 21206003

Skillz wrote:
If the 2304 quota is an actual quota ...

If that's what you saw when you first got the "exceeded daily quota of ..." message, it no longer applies since your host now clearly shows as having 4 GPUs and 128 CPUs.  The daily quota is calculated as something like (A x ngpus + B x ncpus) where A is something rather large like 256 or 384 and B is something like 32.  I don't know the current values of A and B.  There will now be a much larger daily quota than 2304, depending on the difference between the 'real' and the 'fake' cores and the B value.
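To illustrate how the fake core count feeds into a formula of that shape, here is a small Python sketch. The function name is made up, and the A and B values are only the guesses mentioned above (256 or 384 per GPU, 32 per CPU), not confirmed server settings.

```python
def daily_quota(ngpus: int, ncpus: int, per_gpu: int, per_cpu: int) -> int:
    """Quota of the form A x ngpus + B x ncpus (hypothetical helper).

    per_gpu (A) and per_cpu (B) are assumptions, not known project values.
    """
    return per_gpu * ngpus + per_cpu * ncpus


# 4 GPUs and 128 reported CPUs, with both guessed values of A and B = 32.
for per_gpu in (256, 384):
    quota = daily_quota(ngpus=4, ncpus=128, per_gpu=per_gpu, per_cpu=32)
    print(f"A={per_gpu}, B=32 -> {quota} tasks/day")
```

Under either guess the result (5120 or 5632) is well above 2304, which is the point being made; the true figure depends on values only the project knows.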

Now that I have the host ID, I can verify hardware details and can see what the scheduler thinks about a work request.  You can too, using the last scheduler contact link for that host on the website.  The last request I looked at showed:-

2023-05-16 03:36:24.4000 [PID=810862] [send] work_req_seconds: 1.00 secs

So it was only requesting 1 sec of new work.  In other words, whatever cache size you had set, the client thought that the in-progress tasks were only short by 1 sec.  There was also this line:-

2023-05-16 03:36:24.4485 [PID=810862] [send] est. duration for WU 730378439: unscaled 102941.18 scaled 10296.19

For some reason, the scheduler thinks the estimate for the new task should be 102,941 secs and even when scaled (by things like DCF and on_frac) it would still be 10,296 secs which seems stupidly large.

I see you are running the Petri optimised app (anonymous platform) so I have no idea how the scheduler works out estimates for that app (I have no suitable GPUs so have no knowledge about it).  Maybe one of the people running that app might have a clue as to why your tasks are still being estimated at nearly 3 hours after correcting for a 0.1 DCF.  Speaking of that, earlier on there is a line showing a DCF of 0.01 being adjusted by the scheduler to 0.1.  I've never noticed that adjustment before.  Maybe the scheduler considers 0.01 to be impossibly low and 'fixes' it :-).
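For reference, the numbers in that scheduler log line can be checked with a few lines of Python. This only restates the two logged values and the 200 seconds per pair of tasks reported earlier in the thread; it makes no claim about how the scheduler actually computes the scaling.

```python
unscaled = 102941.18   # "unscaled" estimate from the log line, in seconds
scaled = 10296.19      # "scaled" estimate from the same line, in seconds
reported_runtime = 200  # seconds per pair of tasks, as reported by the host owner

factor = scaled / unscaled           # effective scaling applied
hours = scaled / 3600                # scaled estimate in hours
overestimate = scaled / reported_runtime

print(f"effective scale factor: {factor:.4f}")
print(f"scaled estimate: {hours:.2f} hours")
print(f"estimate vs reported runtime: about {overestimate:.0f}x")
```

The effective factor comes out very close to 0.1, and the scaled estimate is a little under 3 hours, roughly fifty times the runtime the host is actually seeing.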

Are new tasks, when they arrive, being estimated at close to 3 hours???

Cheers,
Gary.

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4525
Credit: 3303867108
RAC: 1958072

Skillz wrote:

No, <fetch_minimal_work> is not in the cc_config.

Check also that the resource share you have for Einstein@Home is higher than 0. With 0 resource share, Einstein is used as a backup project, and it will download and run only 1 task at a time, and only if you don't have work from other projects or don't have the host attached to other projects.

Scrooge McDuck
Joined: 2 May 07
Posts: 1098
Credit: 18481772
RAC: 10807

Skillz wrote:
The host has been running BOINC for a little over 1.2 days. It has received 3104 tasks in that time frame.

Skillz started up a high-performance host 1-2 days ago on a user account with a RAC of almost ZERO, and it has since skyrocketed to a RAC of 18.5M. If there is some mechanism that keeps new users or new hosts from requesting too many tasks initially, it might look like this. Maybe one has to wait some days for quotas to be lifted incrementally? Just my 2 cents.
