I'm in the process of moving E@H to a backfill Condor job. The good news is that we'll have more cores processing; the bad news is that I can't figure out how to run multiple jobs on each GPU, and I'll probably have a smaller percentage of time allocated to E@H.
How is this done in BRP4? Do we have something special that allows more than one WU per GPU, or can I just run 2 or 3 jobs per GPU?
Has anyone implemented this already who can explain how to do it?
Joe
Copyright © 2024 Einstein@Home. All rights reserved.
BOINC, Condor, CUDA and multiple tasks per GPU
In the old days this called for an app_info.xml file, so it was not so easy. Now, however, just go to your account at Einstein@Home, select Einstein@Home preferences, and edit the following parameter for the location (a.k.a. venue) of your host:
GPU utilization factor of BRP apps
A value of 1.0 gives one active WU, 0.5 gives two, and 0.33 gives three. You are not likely to find more than three helpful (the big leap is often from one to two), and may not find it possible.
The current Einstein BRP app requires rather a lot of support from the associated CPU task. Many of us have found total system output to be higher if we restrict BOINC to fewer than the maximum number of CPU tasks when running BRP on a GPU. The reduction in latency-imposed waiting of the GPU task in these cases increases productivity more than enough to pay for the loss of CPU work. The detailed tradeoffs are quite dependent on host characteristics, so experimentation is key. Happily the Einstein BRP jobs seem to be very consistent in computation requirement--so a small sample can be enough in many cases.
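For reference, the old app_info.xml route mentioned above did the same job with a fractional <count> in the <coproc> element. A rough sketch only -- the app and file names here are placeholders, not the real Einstein@Home file names:

```xml
<!-- Hypothetical app_info.xml sketch; names are placeholders. -->
<app_info>
  <app>
    <name>einsteinbinary_BRP4</name>
  </app>
  <file_info>
    <name>example_brp4_cuda_app</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>einsteinbinary_BRP4</app_name>
    <version_num>100</version_num>
    <avg_ncpus>0.2</avg_ncpus>
    <coproc>
      <type>CUDA</type>
      <!-- 0.5 GPU per task, i.e. two tasks share one GPU -->
      <count>0.5</count>
    </coproc>
    <file_ref>
      <file_name>example_brp4_cuda_app</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
```

The project preference simply spares you from maintaining a file like this by hand through every app update.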
Thanks archae86. I have that set up when running BOINC as a regular user, but my question is really "will that work if I'm running BOINC as a Condor backfill task?"
Condor (http://research.cs.wisc.edu/condor/) is similar to BOINC in theory but much more flexible (read: more complicated to set up). It assigns jobs to cores or slots, and as far as I can tell so far a GPU has to be assigned to one and only one of these.
We think the Einstein developers did something special to get more than one WU on a GPU, and if so I wonder how special.
Joe
Joe--sorry for misconstruing your question--and I know nothing about the actual one.
What is your setup like? AFAIK Condor is a kind of cluster, so are you running a separate instance of BOINC in each core/slot, or are you running BOINC on a kind of virtual supercomputer made of all the available resources? (I guess you are using BOINC, right?)
Anyway, the apps have nothing special; it's the BOINC client that reads the utilization factor (or the tags in app_info.xml) and then starts as many instances of the GPU app as it can until the whole GPU is used (or the unused fraction is not enough for another instance).
The only special thing at Einstein is that they use a customized version of the BOINC server software that allows setting the utilization factor on the preferences page, which saves us from the burden of app_info.xml maintenance...
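In other words, the client just keeps packing instances until the GPU is "full". A minimal Python sketch of that arithmetic (my own illustration, not actual BOINC code):

```python
def max_instances(gpu_usage: float) -> int:
    """Number of task instances that fit on one GPU, given the GPU
    fraction each task claims (the project's utilization factor)."""
    if gpu_usage <= 0:
        raise ValueError("gpu_usage must be positive")
    # Keep starting instances while the unused fraction is still big
    # enough for another one; the tiny epsilon absorbs float error.
    return int((1.0 + 1e-9) // gpu_usage)

print(max_instances(1.0))   # 1 WU on the GPU
print(max_instances(0.5))   # 2 WUs
print(max_instances(0.33))  # 3 WUs
```

This also matches archae86's note that a factor of 0.33 (not exactly 1/3) still yields three instances: the leftover 0.01 of the GPU is simply not enough for a fourth.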
Thanks Horacio,
Just getting started with this. My current setup has each hyperthread assigned to a slot. BOINC will be added as backfill, so when a slot is idle BOINC will run as a separate instance in that slot.
I believe we can assign multiple cores/hyperthreads to some slots, but I haven't implemented that yet.
I'll give it a try with 2 or 3 slots advertising that they have a GPU available and see what happens.
Joe
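A setup along the lines Joe describes might be sketched in condor_config roughly as below. The backfill knob names come from the HTCondor manual's BOINC backfill section, but the paths and the custom GPU attribute are placeholders/assumptions, not a tested configuration -- check the manual for your Condor version:

```
## Hypothetical condor_config sketch; paths and the HasGPU
## attribute are placeholders, not tested settings.

# Turn on backfill and select BOINC as the backfill system
ENABLE_BACKFILL  = TRUE
BACKFILL_SYSTEM  = BOINC

# Where the BOINC client lives and how to start it
BOINC_HOME       = /opt/boinc
BOINC_Executable = $(BOINC_HOME)/boinc_client
BOINC_InitialDir = $(BOINC_HOME)
BOINC_Owner      = boinc

# Advertise a custom ClassAd attribute so selected slots
# claim to have the GPU available
HasGPU       = True
STARTD_ATTRS = $(STARTD_ATTRS) HasGPU
```

Note that Condor itself only advertises the attribute; whether two or three backfill BOINC instances can actually share the GPU is still governed by the utilization factor discussed above.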
Try this:
https://nmi.cs.wisc.edu/node/1753
Alexander
Thanks Alexander, that's one of the better descriptions of backfill that I've seen.
Joe