Workunit flooding - how to stop?

Magiceye04

Joined: 18 Feb 06

Posts: 31

Credit: 828298735

RAC: 435636

30 Aug 2021 16:59:11 UTC

Topic 225934

Hello,

On one of my PCs, Einstein is flooding me with hundreds of work units.

It has an RX460 GPU, which needs about 50 minutes per WU; only GPU work is allowed.

https://einsteinathome.org/de/host/12523728

The BOINC Manager shows a realistic expected run time (50 minutes).

The buffer is set to 0.01 days + 0.

Every 60 seconds the BOINC Manager requests new work units and receives exactly 1 new WU.

I now have more than 500, and the count keeps growing.

The duration correction factor in client_state.xml was about 2.08...; I have lowered it to 0.99.

Why is Einstein asking for more and more and more work?

And why does the server deliver more and more?

How can I stop this? (Not by pressing "no new work"; I want a realistic buffer of 2 or 3 WUs.)

Best Regards

MagicEye

Harri Liljeroos

Joined: 10 Dec 05

Posts: 4589

Credit: 3342401263

RAC: 1898219

30 Aug 2021 17:51:00 UTC

Message 188684

Do you have an app_config.xml file for Einstein in which you limit the number of workunits with max_concurrent? That would cause this, because of a bug in the BOINC client that makes it request new tasks over and over again.
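For reference, a minimal sketch of the kind of app_config.xml meant here, with a per-application max_concurrent limit. The application name below is only a placeholder (an assumption, not taken from this thread); the real short name depends on which Einstein application your host runs.

```xml
<app_config>
   <app>
      <!-- APP_SHORT_NAME is a placeholder; replace it with the short name of the application -->
      <name>APP_SHORT_NAME</name>
      <!-- per-application limit: this is the setting associated with the repeated work requests -->
      <max_concurrent>1</max_concurrent>
   </app>
</app_config>
```

If your file contains a block like this, it is a likely candidate for the cause of the flooding.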

Magiceye04

Joined: 18 Feb 06

Posts: 31

Credit: 828298735

RAC: 435636

30 Aug 2021 19:03:00 UTC

Message 188685

Yes, that was the reason.

Thank you!

I used it to crunch one Einstein and one WCG-OPN WU in parallel.

Harri Liljeroos

Joined: 10 Dec 05

Posts: 4589

Credit: 3342401263

RAC: 1898219

30 Aug 2021 21:09:15 UTC

Message 188691

If you use only your GPU for Einstein, you can use the project_max_concurrent tag in app_config.xml to limit the number of tasks. Note that it goes in a different place in that file: https://boinc.berkeley.edu/wiki/Client_configuration#Project-level_configuration
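As a rough sketch of that placement (the value 1 is only an example, not a recommendation from this thread): project_max_concurrent sits directly under app_config, outside any app block.

```xml
<app_config>
   <!-- project-wide limit on concurrently running tasks; the value 1 is illustrative -->
   <project_max_concurrent>1</project_max_concurrent>
</app_config>
```

The file belongs in the Einstein project directory inside the BOINC data directory. The client has to re-read its config files (or be restarted) before the change takes effect.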

Eugene Stemple

Joined: 9 Feb 11

Posts: 67

Credit: 405521373

RAC: 416392

3 Sep 2021 17:06:26 UTC

Message 188807

@magiceye

An optional fix is to revert to boinc-manager 7.14. That version was running fine for me with the <project_max_concurrent> line in app_config. I am running a mix of CPU and GPU tasks, which is known to be badly managed with regard to cache size, but with limits of 1 + 0.1 it was not overcommitting the CPU. At least not past the typical 14-day deadlines.

I recently upgraded the Linux system to Debian 11.0.0 and thought I might as well upgrade to the 7.16 boinc packages. BAD IDEA... I had drained the cache before the transition and (foolishly) resumed E@H late at night, expecting it to refill the cache and resume normal operation. The next morning I had 1000 tasks downloaded! It had hit the 512 workunit limit before midnight and then got another 512 the "next day." Looking at the event log, it was fetching 3 or 4 work units every (60-second) cycle with total disregard for the cache limit. The thought occurred to me that the new 7.16 had no run-time history to base its estimates on; however, after letting it run for two days I tried enabling new tasks and it immediately downloaded 4 more CPU and 20 GPU tasks. (Those for the GPU were expected, as all GPU tasks in the cache had been completed.) I've switched back to boinc 7.14, and now when I do a work fetch to get more GPU work, it does NOT fetch any CPU work. Alas, I'll have to abort a big bunch of CPU work, as there's no way it'll get done before the deadline.

OT: the 7.16 boinc-manager is missing the "shut down connected client" control option. Not a deal breaker, but it makes it inconvenient to close BOINC gracefully for a system upgrade or such.

Harri Liljeroos

Joined: 10 Dec 05

Posts: 4589

Credit: 3342401263

RAC: 1898219

3 Sep 2021 17:56:08 UTC

Message 188808

The problem is with the <max_concurrent> tag; <project_max_concurrent> should work OK.
