Reached a maximum quota

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117657939403
RAC: 35193455

Harri Liljeroos wrote:
Check also that the resource share you have for Einstein@home is higher than 0.

In an earlier message he stated that resource share was set to 100.

It seems strange that he mentions trying a variety of work cache sizes and there was no effect.  Maybe he is setting the cache size for a location other than the one the host is using.  I've been assuming that with a single host just starting up, it would be in the default (generic) location.  Maybe he is changing cache size for a different location (home, work, or school).

I looked at a new scheduler contact where 3 completed tasks were returned and 3 new tasks were sent as replacements.  After that there was the following comment from the scheduler:-

2023-05-16 21:02:50.7023 [PID=2862386] [send] don't need more work

This implies that he has a cache size of effectively zero.  If there is an available resource (e.g. 0.5 of a GPU when running x2), the scheduler will send a single task to keep it working.
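To make the "effectively zero cache" reasoning concrete, here is a minimal sketch of a toy buffer model. It is an assumption for illustration only, not BOINC's actual work fetch algorithm: the function name, its parameters, and the one-second token request are all hypothetical.

```python
SECONDS_PER_DAY = 86400


def work_request_seconds(min_days, extra_days, queued_seconds, idle_instances=0):
    """Seconds of work to ask for under a simplified (hypothetical) buffer model.

    min_days / extra_days stand in for the two work cache settings,
    queued_seconds for the estimated runtime already on the host.
    """
    target = (min_days + extra_days) * SECONDS_PER_DAY
    shortfall = max(0.0, target - queued_seconds)
    if shortfall == 0.0 and idle_instances > 0:
        # Assumption: with nothing to top up but an idle resource,
        # ask for a token amount so the resource gets one task.
        return 1.0
    return shortfall


# Zero cache with work still queued: nothing is requested.
print(work_request_seconds(0.0, 0.0, queued_seconds=3600.0))
# Zero cache with an idle resource: only a token request is made.
print(work_request_seconds(0.0, 0.0, queued_seconds=0.0, idle_instances=1))
# A one-day cache with one hour queued: the request covers the shortfall.
print(work_request_seconds(0.5, 0.5, queued_seconds=3600.0))
```

Under this model a host with a zero cache never asks for more than it needs to keep one resource busy, which matches the "don't need more work" message followed by single-task replacements.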

 

@skillz - Only you (the owner) can see what location a host is assigned to.  Check on the website.  Are you changing the cache settings for the correct location?

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117657939403
RAC: 35193455

Scrooge McDuck wrote:
... If there's some mechanism which enforces new users or new hosts to not request too many tasks initially ...

I'm not aware of any such mechanism.

It seems like the scheduler is just making sure that an available computing resource has a task to continue working on.  I've never tested it but it seems like this would be the behaviour if the work cache size was actually set to zero.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117657939403
RAC: 35193455

Scrooge McDuck wrote:
Skillz started up a high performance host from a user account with a RAC of almost ZERO (1..2 days ago) skyrocketing it until now to a RAC of 18.5M.

The account must have a number of high performing hosts since the one we are discussing has a RAC of less than 1M at the moment - although it is rising very fast :-).

Cheers,
Gary.

Skillz
Joined: 5 May 17
Posts: 16
Credit: 3521701690
RAC: 3680008

The location is home and I have been changing the home location.

I also avoided the web preferences and just made an override file. No fix.

We believe the issue might be client related as the client seems to only request 1 second worth of tasks each time.

If I let the work finish on the host then reset the project it will download more work for a short while before reverting back to the issue.
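For anyone checking their own override file, here is a minimal sketch that reads the two work cache settings from a global_prefs_override.xml. The sample values are made up for illustration, and on Debian/Ubuntu package installs the file is assumed to live in the BOINC data directory (typically /var/lib/boinc-client). The helper name cache_days is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative content only; the values are not taken from the host in this thread.
SAMPLE_OVERRIDE = """<global_preferences>
   <work_buf_min_days>0.5</work_buf_min_days>
   <work_buf_additional_days>0.25</work_buf_additional_days>
</global_preferences>"""


def cache_days(xml_text):
    """Return (min_days, additional_days) found in an override file.

    A missing tag is reported as 0.0. That is this sketch's choice,
    not necessarily the value the client would fall back to.
    """
    root = ET.fromstring(xml_text)

    def read(tag):
        node = root.find(tag)
        return float(node.text) if node is not None and node.text else 0.0

    return read("work_buf_min_days"), read("work_buf_additional_days")


min_days, extra_days = cache_days(SAMPLE_OVERRIDE)
print(f"store at least {min_days} days, plus up to {extra_days} additional days")
```

If both values come back as expected and the client still asks for almost nothing, that points away from the preferences and towards the client's work fetch logic.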

mikey
Joined: 22 Jan 05
Posts: 12689
Credit: 1839095224
RAC: 3720

Skillz wrote:

The location is home and I have been changing the home location.

I also avoided the web preferences and just made an override file. No fix.

We believe the issue might be client related as the client seems to only request 1 second worth of tasks each time.

If I let the work finish on the host then reset the project it will download more work for a short while before reverting back to the issue.

What does your cc_config.xml file look like? Is it full of settings changed from the defaults, or do you run it fairly clean with only minor things in it? Also, what version of BOINC are you running? Your PCs are hidden so I can't see that; Gary said he thought you were running Petri's tweaked version for Linux. The one other thing is the GPU driver version.
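As a point of comparison, a fairly clean cc_config.xml that only turns on extra logging could be generated like this. It is a minimal sketch: the helper name build_cc_config is hypothetical, and the choice of the work_fetch_debug and sched_op_debug log flags is an assumption about what would be useful for seeing how much work the client asks for.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build a cc_config.xml string that enables the given log flags."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each log flag is an element whose text "1" switches it on.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


xml_text = build_cc_config(["work_fetch_debug", "sched_op_debug"])
print(xml_text)
```

With flags like these enabled, the event log should show the details of each work request, which would help confirm whether the client really is asking for only about a second of work.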

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3956
Credit: 46963262642
RAC: 64689345

mikey wrote:

Gary said he thought you were running Petri's tweaked version for Linux.

Running the special application has nothing to do with work fetch, and some people are having the same issue with the stock application.

My theory is that it's a bug in the client work fetch logic, and that it likely centers around the modifications that were made to work around the work fetch bug when using the project_max_concurrent and max_concurrent statements in app_config. What seems to happen is that a fast host initially works fine and loads up on work. Then *something* triggers the host to go into a minimum work request mode, even though there are no settings or any other indication that it's set this way. It seems to be something that gets set in the background and/or internal to the BOINC logic. The only fix seems to be resetting the project on the client, but that's just a band-aid; it will pop up again. What that *something* is remains unknown.

I had this exact same thing happen to me on PrimeGrid, so it's not specific to Einstein; a lot of people are just experiencing it with Einstein because of the Pentathlon contest going on. It doesn't seem to impact slow hosts.

Skillz isn't going to unhide his hosts. He's aware of it and it's intentional; it's a strategy for competition. But I'm sure he'll provide you with a link to the host(s) in question or any other details you like.

mikey
Joined: 22 Jan 05
Posts: 12689
Credit: 1839095224
RAC: 3720

Ian&Steve C. wrote:

mikey wrote:

Gary said he thought you were running Petri's tweaked version for Linux.

Running the special application has nothing to do with work fetch, and some people are having the same issue with the stock application.

My theory is that it's a bug in the client work fetch logic, and that it likely centers around the modifications that were made to work around the work fetch bug when using the project_max_concurrent and max_concurrent statements in app_config. What seems to happen is that a fast host initially works fine and loads up on work. Then *something* triggers the host to go into a minimum work request mode, even though there are no settings or any other indication that it's set this way. It seems to be something that gets set in the background and/or internal to the BOINC logic. The only fix seems to be resetting the project on the client, but that's just a band-aid; it will pop up again. What that *something* is remains unknown.

I had this exact same thing happen to me on PrimeGrid, so it's not specific to Einstein; a lot of people are just experiencing it with Einstein because of the Pentathlon contest going on. It doesn't seem to impact slow hosts.

Skillz isn't going to unhide his hosts. He's aware of it and it's intentional; it's a strategy for competition. But I'm sure he'll provide you with a link to the host(s) in question or any other details you like.

Oh I don't need to see his hosts, they are his and not mine and he can hide or unhide them as he chooses.

I was just trying to think big picture and come up with some of the things that can affect the BOINC client. That involves the cc_config file when BOINC starts; then, as it runs, the app_config file adds more BOINC configuration options. Add in the various versions of both BOINC and the GPU drivers, and there are a lot of moving parts that all need to communicate efficiently for BOINC to work the way we want it to. And as you said, the "project_max_concurrent and max_concurrent statements in app_config" can also affect how BOINC runs, especially on the newer versions where it knows you are using those files and can adjust itself to work with them.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117657939403
RAC: 35193455

mikey wrote:
...The one other thing is the gpu driver version.

Mikey,
The host is returning large numbers of validated tasks so how does that suggest there's a problem with "gpu driver version"??

How could it even be remotely possible for the driver to interfere with work fetch??

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117657939403
RAC: 35193455

Ian&Steve C. wrote:
...my theory is that it's a bug in the client work fetch logic. and likely centers around all the modifications that were made to work around the work fetch bug when using the project_max_concurrent and max_concurrent statements in app_config.

I tend to agree but had discounted the max_concurrent modifications because the client is listed as 7.16.6.  I don't know for sure but doesn't that version pre-date those modifications?

I'm now wondering if it might not be a standard 7.16.6 build but perhaps one with some additional performance tweaks that might be mis-behaving?

I don't follow what is happening in the 'BOINC Pentathlon' world so wasn't even aware there was one going on.  Yep, I know, I'm living under a rock :-).  However that certainly adds additional context as to what might be causing the problem.

Cheers,
Gary.

Skillz
Joined: 5 May 17
Posts: 16
Credit: 3521701690
RAC: 3680008

The version of BOINC is the one downloaded from the distro's package manager. No modifications on my end. Installed by simply typing:

sudo apt install -y boinc boinctui

It's running on Ubuntu 20.3.

Edit:

I have since just created two new instances and loaded E@H on both of them with an app_config that only runs 1 task per GPU. The problem has not happened again on my end, but one of my teammates has experienced the same issue using this method as well.
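For reference, an app_config.xml that runs 1 task per GPU could be generated with something like the following minimal sketch. The application name "example_app" is a placeholder (the real name depends on which E@H application the host runs), and the helper build_app_config is hypothetical. It deliberately leaves out max_concurrent and project_max_concurrent, since those statements are the ones suspected earlier in the thread.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage=1.0, cpu_usage=1.0):
    """Build an app_config.xml string for a single GPU application.

    gpu_usage=1.0 means each task claims a whole GPU, i.e. 1 task per GPU;
    0.5 would correspond to running x2.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# "example_app" is a placeholder, not a real Einstein@Home application name.
app_config_xml = build_app_config("example_app")
print(app_config_xml)
```

The resulting file would go in the project's directory under the BOINC data directory, and the client has to re-read its config files (or be restarted) before it takes effect.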

What we have begun to do now is set no new tasks, then wait for the tasks to complete, upload and report. Once all tasks are done and gone, we issue a project reset.

Then we get another full cache (~1000 tasks) downloaded in a few minutes.
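The workaround above can be sketched as a sequence of boinccmd calls. This is only an outline under assumptions: the project URL shown is assumed and should be replaced by whatever URL the client lists for the project, the helper names are hypothetical, and the final allowmorework step is included only in case the "no new tasks" setting survives the reset. The sketch builds the command lists without running them.

```python
# Assumed project URL; use the one your client actually reports.
PROJECT_URL = "https://einsteinathome.org/"


def project_op(url, op):
    """Argument list for a boinccmd project operation."""
    return ["boinccmd", "--project", url, op]


def reset_cycle(url):
    """Commands for the drain-and-reset workaround, in order."""
    return [
        project_op(url, "nomorework"),      # set "no new tasks"
        ["boinccmd", "--get_tasks"],        # repeat until no tasks remain
        project_op(url, "reset"),           # reset the project
        project_op(url, "allowmorework"),   # allow work fetch again
    ]


for command in reset_cycle(PROJECT_URL):
    print(" ".join(command))
```

On a host where boinccmd is on the PATH, each list could be passed to subprocess.run(command, check=True), with a wait between the second and third steps until the task list is empty.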

 

After the Pentathlon, when I get some time, I will try installing a newer version of BOINC on the host and running E@H on it, to see if the issue remains or if it's indeed a problem with such an old client.
