CPU tasks.

Tom M
Tom M
Joined: 2 Feb 06
Posts: 5589
Credit: 7675269359
RAC: 1833593

Gary Roberts wrote:

Tom M wrote:
I have been running cache sizes from 0.1 to 0.25 the entire time ...
I'm sorry, but you can't have been.

Oh, well. Since I have been running 0.1, I won't dispute your assertion that I "must have" had it set above 0.25, because the observed downloads are the observed downloads. I have no explanation for the observed download volume.

I do take credit for the huge number of aborted tasks. :(

I have implemented some changes that should reduce the huge volume of CPU task downloads I have been getting and, I hope, stop that flood.

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)

Gary Roberts
Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109404461302
RAC: 35479568

Tom M wrote:
... I have no explanation for the observed download volume.

I had another quick look through the last 50+ pages (page 250 to page 305) of your aborted CPU tasks - more than 1,000 of the 6,000+ tasks still showing.  Those tasks were received in a steady stream between 5:00am UTC and 11:30am UTC on July 24.  Can you think of any changes you made before that time which could have resulted in such a large work fetch?

They all seem to have been aborted around 7:00pm UTC that same day.  Just that fraction of the total tasks you had, at a crunch time of 3 hrs per task and 32 cores to crunch them, represents 4 days of work.

As I've already pointed out, there are two possible explanations.  The first is that your client thought that the cache setting was way higher than 0.1 days.  The second is that the client thought that the estimated crunch time was way lower than 3 hrs.  To request 4 days of work to completely fill a 0.1 day cache size, based on a bad estimate, that estimate would have needed to be around 4.6 minutes.
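Gary's figures can be sanity-checked with a few lines of arithmetic (all numbers are taken from the posts above; nothing here is new data):

```python
# Sanity-check of the work-fetch arithmetic, using the figures quoted above.
tasks = 1000          # aborted tasks received in the July 24 flood
crunch_hours = 3.0    # actual crunch time per CPU task
cores = 32            # CPU tasks crunched concurrently

# Total work represented by those tasks, spread across 32 cores:
work_days = tasks * crunch_hours / cores / 24
print(f"{work_days:.1f} days of work")          # -> 3.9 days of work

# For a 0.1-day cache to trigger a request that large, the client's
# per-task estimate would have had to be roughly:
cache_days = 0.1
est_minutes = cache_days * cores * 24 * 60 / tasks
print(f"{est_minutes:.1f} minutes per task")    # -> 4.6 minutes per task
```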

That just seems extremely unlikely so we are left with the conclusion that the client had a multi-day cache setting.  Do you pay any attention to what task estimates show when you look on the tasks tab of BOINC Manager?  I asked you what estimate showed in BOINC Manager and you haven't supplied an answer.

Do you know what 'location' (previously called 'venue') you have set for your computer?  It will be one of the 4 locations - generic (default) or home or school or work.  Did you check the cache settings for the location your computer is actually assigned to? 

Tom M wrote:
I have implemented some changes that should reduce the huge volume of cpu task downloads I have been getting which, I hope, will stop that flood.

It would be a very good idea to actually mention exactly what was changed.  If you expect to get help, you have to be prepared to share those and many other details so that anyone trying to help you is working with the full picture.

Cheers,
Gary.

Keith Myers
Keith Myers
Joined: 11 Feb 11
Posts: 4704
Credit: 17547859901
RAC: 6423604

I finally bugged Tom enough times into using our proprietary BOINC client, which prevents getting more work than the exact number of tasks desired. He is starting with 25 CPU tasks and 125 GPU tasks. He should not have any future issues with downloading too much work, since conventional cache sizes are not relevant anymore.
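The client Keith describes is proprietary and not public, so purely as a hedged illustration of the idea (the function and names below are made up, not the actual implementation), capping work fetch by an exact task count instead of a cache measured in days might look like:

```python
# Conceptual sketch only: top the queue up to an exact task count,
# rather than requesting enough work to fill an N-day cache.
def tasks_to_request(in_progress: int, desired: int) -> int:
    """Request exactly enough tasks to bring the queue up to `desired`."""
    return max(0, desired - in_progress)

# e.g. Tom's starting point of 25 CPU tasks:
print(tasks_to_request(in_progress=18, desired=25))    # -> 7
print(tasks_to_request(in_progress=130, desired=125))  # -> 0 (over target)
```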

mikey
mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828165831
RAC: 203220

Keith Myers wrote:

I finally bugged Tom enough times into using our proprietary BOINC client, which prevents getting more work than the exact number of tasks desired. He is starting with 25 CPU tasks and 125 GPU tasks. He should not have any future issues with downloading too much work, since conventional cache sizes are not relevant anymore.

Is it a client-side tweak that the BOINC developers could use? If so, you might consider asking the Team about sharing it with the developers.

Keith Myers
Keith Myers
Joined: 11 Feb 11
Posts: 4704
Credit: 17547859901
RAC: 6423604

The code tweaks and controls would have to be toned down considerably and severely constrained. We refer to the client as the Pandora's Box release. It allows extreme control over the amount of work that can be pulled from a project and over how the scheduler connections are handled, so it can cause major upset to a project database if abused. We don't think releasing that kind of control to the general public would be wise without setting some upper limit on the number of tasks that could be pulled. Then you would get people complaining that the arbitrary limit the developers placed on the amount of work is too low, and everyone would want a higher limit.

Then the server-side limits would have to be adjusted individually by the project admins to protect the databases, and you would get a vicious feedback loop. There are just too many adjustments to both the client-side and server-side code branches for it to be practical and wise.

Remember, the client was developed for the Seti project and the Seti "special sauce" GPU application. You can't have everyone asking for a 10,000-task GPU cache. The databases can't handle it.

Tom M
Tom M
Joined: 2 Feb 06
Posts: 5589
Credit: 7675269359
RAC: 1833593

Gary Roberts wrote:

Tom M wrote:
... I have no explanation for the observed download volume.

I had another quick look through the last 50+ pages (page 250 to page 305) of your aborted CPU tasks - more than 1,000 of the 6,000+ tasks still showing.  Those tasks were received in a steady stream between 5:00am UTC and 11:30am UTC on July 24.  Can you think of any changes you made before that time which could have resulted in such a large work fetch?

They all seem to have been aborted around 7:00pm UTC that same day.  Just that fraction of the total tasks you had, at a crunch time of 3 hrs per task and 32 cores to crunch them, represents 4 days of work.

As I've already pointed out, there are two possible explanations.  The first is that your client thought that the cache setting was way higher than 0.1 days.  The second is that the client thought that the estimated crunch time was way lower than 3 hrs.  To request 4 days of work to completely fill a 0.1 day cache size, based on a bad estimate, that estimate would have needed to be around 4.6 minutes.

That just seems extremely unlikely so we are left with the conclusion that the client had a multi-day cache setting.  Do you pay any attention to what task estimates show when you look on the tasks tab of BOINC Manager?  I asked you what estimate showed in BOINC Manager and you haven't supplied an answer.

Do you know what 'location' (previously called 'venue') you have set for your computer?  It will be one of the 4 locations - generic (default) or home or school or work.  Did you check the cache settings for the location your computer is actually assigned to? 

Tom M wrote:
I have implemented some changes that should reduce the huge volume of cpu task downloads I have been getting which, I hope, will stop that flood.

It would be a very good idea to actually mention exactly what was changed.  If you expect to get help, you have to be prepared to share those and many other details so that anyone trying to help you is working with the full picture.

I was not sure which details of the changes I could or couldn't talk about. Thank you, Keith, for being willing to discuss them. The Pandora code is limited to Linux.

And apparently I haven't managed to shoot myself in the foot on my Windows box (fingers crossed), since it is using a straight stock client.

The Linux client I was using before Pandora was left over from the Seti@Home setup, so that is another possible reason for the odd flooding behavior.

Anyway, I don't expect to have another overflow problem unless something else breaks.

And the RAC is upward bound again. I hope that system will make it onto the Top systems list too.

======edit=======

P.S. Sorry I missed this: "...Do you pay any attention to what task estimates show when you look on the tasks tab of BOINC Manager? I asked you what estimate showed in BOINC Manager and you haven't supplied an answer..." I will switch back over to the Linux box and see if I can wrestle up some info.

The GR #5 (CPU) tasks are currently claiming 20 minutes of processing time left while still at "ready to start", which is "highly" unlikely. :)

The currently running CPU tasks are showing estimates between 6 hours and 12 hours. Depending on the E@H CPU task, I have seen the actual processing time dip below 4 hours.

I just installed the same make/model of RAM that I have in this Linux box into my Windows 10 box (also AMD), so I expect a little speed-up in processing there too.

Tom M

Tom M
Tom M
Joined: 2 Feb 06
Posts: 5589
Credit: 7675269359
RAC: 1833593

I acquired an AMD A-10 more or less by accident yesterday.

After running 3 out of 4 cores and getting maybe 5-6 hours per task, I just added an RX 560 and suddenly the iGPU was running too.

As usual, based on prior experience, the CPU threads get starved for resources when this particular design of APU runs a heavy iGPU load (A-4, A-6, A-8, A-10, as far as I know).

I have juggled the parameters around and now have 2 CPU tasks + 2 GPU tasks running, which is beginning to show some progress on the CPU %.

I forgot to confirm that I have two RAM modules on the MB.

And there is a perfectly good chance I will need to suppress the iGPU processing completely to actually maximize total system production.

Tom M


mikey
mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828165831
RAC: 203220

Tom M wrote:

I acquired an AMD A-10 more or less by accident yesterday.

After running 3 out of 4 cores and getting maybe 5-6 hours per task, I just added an RX 560 and suddenly the iGPU was running too.

As usual, based on prior experience, the CPU threads get starved for resources when this particular design of APU runs a heavy iGPU load (A-4, A-6, A-8, A-10, as far as I know).

I have juggled the parameters around and now have 2 CPU tasks + 2 GPU tasks running, which is beginning to show some progress on the CPU %.

I forgot to confirm that I have two RAM modules on the MB.

And there is a perfectly good chance I will need to suppress the iGPU processing completely to actually maximize total system production.

Tom M 

Try here: https://einsteinathome.org/account/prefs/project and hopefully that leads to your settings, where you can set the iGPU option to "No" so it isn't used.

Keith Myers
Keith Myers
Joined: 11 Feb 11
Posts: 4704
Credit: 17547859901
RAC: 6423604

Not sure how using the project preferences will prevent using the GPU embedded in the CPU. It is not an Intel chip. If he checks the option to not use an AMD GPU, he won't be able to use his RX 560 discrete GPU either.

mikey
mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828165831
RAC: 203220

Keith Myers wrote:

Not sure how using the project preferences will prevent using the GPU embedded in the CPU. It is not an Intel chip. If he checks the option to not use an AMD GPU, he won't be able to use his RX 560 discrete GPU either.

Whoops... you are right, that won't work. My new laptop has a GPU that can crunch, so I stopped and restarted BOINC and got this:

8/15/2020 5:42:33 AM |  | CUDA: NVIDIA GPU 0: GeForce GTX 1660 Ti (driver version 442.94, CUDA version 10.2, compute capability 7.5, 4096MB, 3556MB available, 4884 GFLOPS peak)
8/15/2020 5:42:33 AM |  | OpenCL: NVIDIA GPU 0: GeForce GTX 1660 Ti (driver version 442.94, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 4884 GFLOPS peak)
8/15/2020 5:42:33 AM |  | OpenCL: AMD/ATI GPU 0: AMD Radeon(TM) Graphics (driver version 3004.4 (PAL,HSAIL), device version OpenCL 2.0 AMD-APP (3004.4), 6241MB, 6241MB available, 1434 GFLOPS peak)

His RX 560 should show up as either GPU 1 or GPU 0, and then he can use a cc_config.xml file to exclude the iGPU (in my case GPU 0), i.e.:

<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>

    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>0</device_num>
    </exclude_gpu>
  </options>
</cc_config>

Of course, he will have to keep adding exclude sections as he goes from project to project with his RX 560, to keep the CPU's iGPU from being used. One thing I've discovered in the past is to ALWAYS use the project URL from the BOINC client, as the one the project itself publishes often doesn't work. I had a heck of a time trying to exclude a few projects until I switched to the URLs the BOINC client used when attaching to each project; then everything went smoothly.
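Extending that cc_config to cover several projects might look like the sketch below. The second URL is a deliberately fake placeholder; the real URLs should be copied from what the BOINC client logs when attaching to each project:

```xml
<!-- Sketch only: one <exclude_gpu> block per project, as described above.
     The second <url> is a hypothetical placeholder, not a real project. -->
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>

    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>0</device_num>
    </exclude_gpu>

    <exclude_gpu>
      <url>http://example-project.invalid/</url>  <!-- hypothetical -->
      <device_num>0</device_num>
    </exclude_gpu>
  </options>
</cc_config>
```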
