Latest data file for FGRPB1G GPU tasks

mountkidd
Joined: 14 Jun 12
Posts: 176
Credit: 12616032555
RAC: 8009289

I have two hosts that are running Boinc 7.6.33 and are hitting the daily quota limit very early in the day.  Changing ncpus doesn't seem to have an effect on these hosts, unlike my other hosts that are on Boinc 7.12.0/7.14.0.

Tasks get reported and some amount of new wu's downloaded as long as I keep hitting the Projects/Update button, but once the queue runs dry it has been near impossible to get downloads going once again until the change in UTC day.  I have done combinations of reboots, NNT/Allow, Activity Suspend/Run mostly without success.

Does anyone have any other suggestions on how to get around this daily quota problem? 

Both affected hosts are on Ubuntu 14.04 LTS and restricted to AMD fglrx drivers (7970, R9 280X).  Boinc 7.9.3 is available as an upgrade, but I don't know if that will solve the problem and changing involves effort to convert and reconfigure the existing roll-your-own 7.6.33 installation.

Does the daily quota come from the server side?  There doesn't appear to be a way to query the quota value and I only see the value when it is exceeded.  And I can't find anything in the local config files that resembles the value.


Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117759378734
RAC: 34779131

mountkidd wrote:
I have two hosts that are running Boinc 7.6.33 and are hitting the daily quota limit very early in the day.  Changing ncpus doesn't seem to have an effect on these hosts, unlike my other hosts that are on Boinc 7.12.0/7.14.0.

I run a lot of hosts on Linux, some with 7.6.33 (and amdgpu driver) and others still with 7.2.42 with fglrx as the driver.  By putting the ncpus line into a cc_config.xml file, I am able to get a quota increase on all of these.  When I look at such hosts on the website they do show the ncpus setting as the number of processors.  When I look at your two, (both genuine quad core CPUs) they show as 4 processors.  It looks like you haven't been able to make the ncpus setting stick - for whatever reason.

It should be possible.  To convince myself I'm not dreaming, I just checked a Tahiti GPU host of mine with a dual core CPU.  Yep, 16 processors showing on the website and no quota limit exceeded messages yet.  All I have done with mine is install a cc_config.xml file and hit the 'read config files' option in BOINC Manager.

Obviously, if these fast tasks keep coming for a while, I too will see problems at some point.  With Holmis' earlier report about 96 processors, there seems to be plenty of scope for further quota increases :-).

 

Cheers,
Gary.
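For anyone who wants to script the approach described above, here is a minimal sketch that produces a cc_config.xml carrying only an ncpus override. The value 16 simply mirrors the dual-core example in the post. The function name `build_cc_config` is made up for illustration, and the sketch assumes you save the output into the data directory your running client actually uses and then tell the client to re-read its config files.

```python
import xml.etree.ElementTree as ET

# Minimal cc_config.xml layout with a single <ncpus> option.
TEMPLATE = """<cc_config>
  <options>
    <ncpus>{ncpus}</ncpus>
  </options>
</cc_config>
"""


def build_cc_config(ncpus: int) -> str:
    """Return the text of a cc_config.xml that overrides the CPU count.

    build_cc_config is a hypothetical helper, not part of BOINC.
    """
    if ncpus < 1:
        raise ValueError("ncpus must be a positive integer")
    text = TEMPLATE.format(ncpus=ncpus)
    ET.fromstring(text)  # sanity check: raises ParseError if the XML is malformed
    return text


if __name__ == "__main__":
    # Print the file contents; redirect to cc_config.xml in your own data directory.
    print(build_cc_config(16), end="")
```

Whether the override took effect can be checked the same way as in the post: the host's page on the website should show the new processor count.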

mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3422836540
RAC: 3739222

mountkidd wrote:

I have two hosts that are running Boinc 7.6.33 and are hitting the daily quota limit very early in the day.  Changing ncpus doesn't seem to have an effect on these hosts, unlike my other hosts that are on Boinc 7.12.0/7.14.0.

Tasks get reported and some amount of new wu's downloaded as long as I keep hitting the Projects/Update button, but once the queue runs dry it has been near impossible to get downloads going once again until the change in UTC day.  I have done combinations of reboots, NNT/Allow, Activity Suspend/Run mostly without success.

Does anyone have any other suggestions on how to get around this daily quota problem? 

Both affected hosts are on Ubuntu 14.04 LTS and restricted to AMD fglrx drivers (7970, R9 280X).  Boinc 7.9.3 is available as an upgrade, but I don't know if that will solve the problem and changing involves effort to convert and reconfigure the existing roll-your-own 7.6.33 installation.

Does the daily quota come from the server side?  There doesn't appear to be a way to query the quota value and I only see the value when it is exceeded.  And I can't find anything in the local config files that resembles the value.

You can try running more than 1 client. I do this to split CPU work from GPU work on all my PCs but you could run 1 client per GPU as well. Or if you're running 2x tasks at once you could run 1x per client.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

I've done some trial-and-error research on the daily quotas and here are my findings:

System: CPU i7 3770K (4 cores, 8 threads), Intel iGPU detected but not used, GPU: AMD Vega 56 and only running GPU tasks.

I'm trying out different ncpus settings in cc_config.xml and finding some unintuitive results:

ncpus   daily quota
8       576
15      608
16      640
17      640
18      640
19      640
20      672
21      672
22      672
23      672
24      704
25      704
26      704
27      704
28      736
32      768

There seems to be some rounding going on here!
As it's past my bedtime I won't try to draw any conclusions from this, other than that if we want to figure this out (without the help of an admin ;-)) we might need something similar from a system with, say, 2 cores/4 threads to compare (comprising a CPU, one iGPU and one GPU), or one with a 4C/8T CPU, one GPU and no iGPU.
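One way to look at the rounding in the table above is to test a simple step function against it. The sketch below is only a curve fit to this one host's numbers: the base of 512 and the step of 32 per four ncpus are inferred from the data, not taken from the server code, and may well not hold for hosts with a different CPU/GPU mix.

```python
# (ncpus, daily quota) pairs copied from the table in the post
OBSERVED = [
    (8, 576), (15, 608), (16, 640), (17, 640), (18, 640), (19, 640),
    (20, 672), (21, 672), (22, 672), (23, 672), (24, 704), (25, 704),
    (26, 704), (27, 704), (28, 736), (32, 768),
]


def fitted_quota(ncpus: int) -> int:
    """Empirical fit only: a base of 512 plus 32 for every full group of 4 ncpus."""
    return 512 + 32 * (ncpus // 4)


# Collect every row where the fit disagrees with the observed quota.
mismatches = [(n, q, fitted_quota(n)) for n, q in OBSERVED if fitted_quota(n) != q]
print("mismatches:", mismatches)
```

If the fit reflects anything real, the quota on this host only moves when ncpus crosses a multiple of 4, which would explain why 16 through 19 all gave 640.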

mountkidd
Joined: 14 Jun 12
Posts: 176
Credit: 12616032555
RAC: 8009289

Thanks for the heads-up, Gary!  Knowing that 7.6.33 should work was a very important first step.  I checked my cc_config.xml files to ensure correct structure and syntax and, in spite of a few embedded CRs (from the original Windows install), all versions of the BOINC file parser read the file correctly.  In my custom build/install from 4 years ago, cc_config.xml was in a non-standard location.  As it turns out, I had been making changes to the cc_config.xml file in the standard PPA install location rather than the one my client actually reads. Dummy me!

We're back up and running now with all cylinders firing - RAC on!

Mmonnin - thanks for the suggestion.
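The mix-up described in this post (editing a cc_config.xml that the running client never reads) is easy to check for by listing every copy of the file on the machine. A minimal sketch follows; the helper name `find_config_copies` is made up, and which directories to search is something you have to supply for your own install.

```python
import os
from typing import Iterable, List


def find_config_copies(roots: Iterable[str], name: str = "cc_config.xml") -> List[str]:
    """Return the sorted paths of every file called `name` under the given roots.

    Hypothetical helper for illustration; missing or unreadable directories
    are silently skipped by os.walk.
    """
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if name in filenames:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling it with the directories where BOINC might live on your system and printing the result shows whether more than one copy exists. If it does, the copy that matters is the one in the data directory of the client that is actually running.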

mountkidd
Joined: 14 Jun 12
Posts: 176
Credit: 12616032555
RAC: 8009289

Holmis: In my fumbling around last night, my 3770k showed a quota of 1024 with ncpus = 24.  With the non-linear response to the changes I was making, I suspected there might be some power-of-2 thing going on.

In the Tech News forum, Richard Haselgrove quoted Bernd as saying the quota level is currently "set at 32*8=128 per core".  This doesn't quite compute, as 32*8 really is 256, which for a 4-core machine should then yield a quota of 1024.  And Bernd was talking about upping the quota to 32*16 for the duration of the high-pay tasks.

There seems to be a fair degree of complexity, and the quota calculation isn't very straightforward...
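Spelling out the arithmetic from this post, under the assumption (mine, not confirmed by the project) that the quota is simply the per-core figure multiplied by the number of cores:

```python
def simple_quota(per_core: int, cores: int) -> int:
    """Naive model: daily quota = per-core quota times core count (an assumption)."""
    return per_core * cores


current_per_core = 32 * 8    # the quoted "32*8", which evaluates to 256, not 128
raised_per_core = 32 * 16    # the proposed temporary increase

print(current_per_core, simple_quota(current_per_core, 4))
print(raised_per_core, simple_quota(raised_per_core, 4))
```

For a 4-core machine this naive model gives 1024 at the current setting and 2048 at the raised one. The first figure happens to match the 1024 seen on the 3770k, but the non-linear response to ncpus suggests the real calculation has more terms.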

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 984
Credit: 25171438
RAC: 23

Hi guys,

finally I have an official statement for you: we messed things up O:-)

It appears we accidentally sent out an old data set that was probably targeted at CPUs, not GPUs, which explains all your findings. The reason for this mishap was a flawed moving procedure, caused by a lack of proper cleanup after we cancelled that old data set last year, and a workunit generator that wasn't able to detect that naming collision. We sorted things out and as of last night we're sending out a new set, 2004L, which exhibits the expected runtime behavior. We also see that our infrastructure load is back to normal again. We're going to reissue the new 2003L set under a different name (2103L) later on.

Sorry for the confusion. At least we paid a nice compensation in terms of credits ;-)

Cheers, thanks for 2018 and have a nice festive season!

Oliver

Einstein@Home Project

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Oliver Behnke wrote:
We also see that our infrastructure load is back to normal again.

Hi Oliver! That's alright :D

But hosts haven't been able to upload for the last 10 hours. There are tons of communications backing off. I hope that staff there is aware of this final problem before holidays...

https://einsteinathome.org/content/i-cant-upload

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2960592689
RAC: 706136

Richie wrote:
I hope that staff there is aware of this final problem before holidays...

Yes, they are. Bernd has restarted the upload process, and checked with the rest of the team as to possible common causes (it's the same problem as last month).

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Richard Haselgrove wrote:
Yes, they are. Bernd has restarted the upload process, and checked with the rest of the team as to possible common causes (it's the same problem as last month).

Sounds great! :-) Yep, uploading is working again. Huge thanks to all 'staff' people of this project! Comfortable next week and so on...
