Not running multiple gpu tasks when cpu used by another project

gBaker
Joined: 2 Apr 17
Posts: 3
Credit: 2508394975
RAC: 2
Topic 223036

Hello,

Sorry if this is a question that's been answered elsewhere.

My main office desktop (Vega VII gpu and threadripper 1950x cpu, https://einsteinathome.org/host/12796547) has been running both Einstein@home on the gpu and Rosetta@home on the cpu for a while now (since November in the computer's current hardware configuration). I have the gpu utilization factor set to 0.5, and previously had no issues running 2 einstein tasks at once.

However, recently I noticed that it's now only running a single gpu task at a time.

When I suspend the Rosetta cpu tasks, it will run 2 einstein gpu tasks as expected, so it seems like there's some sort of fight for cpu cores going on. When Rosetta is running, it runs 32 tasks at once using all available threads. The einstein gpu tasks seem to be fighting to use the last thread at the same time.

The einstein tasks have also started taking much longer (though reducing the number of threads boinc is allowed to use seems to solve this problem).

I've tried reducing both the number of threads and the cpu time boinc is allowed. This just leaves some threads idle, while leaving einstein and rosetta fighting for the last thread. I've also set the resource share for einstein to 1000, vs 100 for rosetta, with no change.

I have einstein set to only run Gamma-ray pulsar binary search gpu tasks since the gravitational wave search O2 gpu tasks greatly underutilize the gpu (I assume they're more cpu-bound than the gamma-ray pulsar tasks). I did briefly try running the gravitational wave search tasks, and they had no trouble running x2 at once (though still barely loading the gpu).

Any help would be appreciated. I'm happy to provide any further info if needed.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7024934931
RAC: 1802712

Could it be that your system thinks your Rosetta work is in some deadline trouble?

Just possibly a smaller task queue might help.

Another thought:
If you feel like going the app_config.xml route, you might find it helpful in your specific circumstances to specify that the Einstein GRP task is expected to use, for example, 0.45 of a CPU. I think that when one is running, the system would think itself able to start another GPU task, but not another Rosetta task, in the fractional CPU.
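A rough sketch of what that file might look like, placed in the Einstein project folder under the BOINC data directory. The app name hsgamma_FGRPB1G is my assumption for the gamma-ray pulsar binary search GPU app, so check it against the names in your client_state.xml before using it. The 0.5 GPU share just mirrors the utilization factor you already have set, and 0.45 is the example CPU fraction from above, not a tuned value.

```xml
<app_config>
  <app>
    <!-- Assumed short name of the GRP GPU app; verify on your host -->
    <name>hsgamma_FGRPB1G</name>
    <gpu_versions>
      <!-- Each task claims half a GPU, so two can run per card -->
      <gpu_usage>0.5</gpu_usage>
      <!-- Each task budgets 0.45 of a CPU thread; two tasks stay under one full thread -->
      <cpu_usage>0.45</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After saving it, have the client re-read its config files (or restart BOINC) for the change to take effect.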

But I'm waving my hands in the dark, as I don't mix projects, and don't mix applications.

Keith Myers
Joined: 11 Feb 11
Posts: 4704
Credit: 17550393623
RAC: 6435601

You are running the 1950 overcommitted. You just can't run gpu tasks without at least some part of a cpu thread to support them. Remember you are time-slicing processes on every thread.

When trying to run doubles on a gpu card, you should attempt to give it two threads to support the tasks.

I suggest using max_concurrent statements in the app_config.xml files for Einstein and Rosetta cpu tasks to limit them to only 30 concurrent tasks combined.

I actually would recommend dropping that even further, down to 28 so you keep two threads from crunching to run the desktop and background processes.

Your crunch times for both cpu and gpu will improve.

gBaker
Joined: 2 Apr 17
Posts: 3
Credit: 2508394975
RAC: 2

Thanks for the suggestion. Setting the <project_max_concurrent> for rosetta fixed the problem. I do typically run boinc with it limited to 28 cores, which vastly speeds up einstein, but that alone still didn't allow 2 simultaneous tasks.

Looking at the rosetta tasks, they do have shorter deadlines than I recall in the past, so perhaps there was some change in that project that affected priorities. 

Either way, limiting it in app_config.xml fixed the issue. Many thanks for all your help.
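
For anyone finding this later, a minimal sketch of the kind of app_config.xml involved, placed in the Rosetta project folder under the BOINC data directory. The value 30 is only an illustration taken from the suggestion above (32 threads minus two kept free for the gpu tasks), not necessarily the right number for another machine.

```xml
<app_config>
  <!-- Cap on how many Rosetta tasks may run at once, across all Rosetta apps. -->
  <!-- 30 is an illustrative value: 32 threads minus 2 left free to feed the gpu tasks. -->
  <project_max_concurrent>30</project_max_concurrent>
</app_config>
```

Because the cap is per project, it needs no app names, and the client picks it up after re-reading its config files.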
