As we continue to have problems generating enough BRP4 work, we will for now disable the BRP4 CPU application versions and ship BRP4 work only to CUDA machines, to keep at least these fed.
I've done everything I could on the software side, but I'm afraid the problems with work generation can only be resolved by a serious hardware upgrade of the machines running the workunit generator and serving the data files. This will need coordination (it's vacation time in Germany), planning, ordering new hardware, etc., and thus won't happen in the next few days.
BM
Limiting BRP4 to CUDA machines
That's OK. There is plenty of work for CPUs; nobody needs to worry about running out of CPU work.
No problem. I'd rather
No problem.
I'd rather find a gravity wave than a pulsar anyway ;)
As far as I know BRP's are
As far as I know, BRPs are eligible g-wave candidates. So go on, Einstein@Home!
OTOH, may be it is better to
OTOH, maybe it is better to generate workunits only for CPU machines instead of CUDA ones, since they consume fewer WUs per second. Am I right, Bernd?
RE: OTOH, may be it is
Except that would mean the volunteer GPU resource would go completely unused. Everyone would take their CUDA cards somewhere else if there were no work here for long stretches of time. I doubt that is what you intend, though.
RE: RE: OTOH, may be it
Yes, this will lead to GPUs moving somewhere else, where the work queue is stable. But we here are not cobblestone freaks; we just do the science. So the BRP search will move a little more slowly, but it will help us eliminate the problem with the WU cache and concentrate on more pressing things. When the new machines are installed (I suppose it will take no more than half a year), it will become possible to build new WU generators that can produce enough work to support the CUDA machines even with new searches.
* As a short term relief we
* As a short-term relief we hastily set up a new machine that allows us to run another six WUG instances. This means we are now sending out about 1500 BRP4 tasks per hour (compared to 1000 previously). Unfortunately this still doesn't seem to be enough to feed even our GPUs (rough arithmetic below).
* We plan to implement a new WUG that should scale much better, but this will take some more time. With all the things currently going on at the AEI (around the BOINC workshop) I would not expect this to be done before the end of next week.
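Just to put rough numbers on those figures, purely as illustration and assuming every WUG instance contributes about the same throughput (an assumption on my part, not something the project has confirmed):

# Back-of-envelope from the figures in the post above; illustrative only.
added_instances = 6
rate_before = 1000.0   # BRP4 tasks per hour before the new machine (from the post)
rate_after = 1500.0    # BRP4 tasks per hour after adding six WUG instances (from the post)

per_instance = (rate_after - rate_before) / added_instances   # ~83 tasks/hour per instance
implied_old_count = rate_before / per_instance                # ~12 instances before the upgrade

print(f"~{per_instance:.0f} tasks/hour per WUG instance")
print(f"~{implied_old_count:.0f} instances were presumably running before")
print(f"each further instance should add ~{per_instance:.0f} tasks/hour, if scaling stays linear")

How many more instances would be needed depends on how fast the GPU fleet actually drains the queue, which isn't stated here; hence the better-scaling WUG Bernd mentions.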
BM
In the meantime, there's
In the meantime, there's nothing to stop us leaving CUDA cards connected to Einstein to download work whenever it might be available, but also attaching to other CUDA projects to share the resource around.
I have, however, increased the 'Task Switch Interval' on my Windows 7 machines so that BRP4 tasks run to completion in a single session, in an attempt to minimise the downclocking and other problems caused by the non-thread-safe exit in the current app. I had one GPU become unusable (until reboot) yesterday after a BRP4 task was switched out in favour of another project.
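For anyone wanting to do the same per machine rather than through the web preferences, here is a minimal sketch (not official BOINC tooling; the 240-minute value and the Windows data-directory path are my assumptions, so adjust both, and note it overwrites any existing override file):

# Sketch: write a global_prefs_override.xml so the BOINC client switches tasks less often.
# WARNING: this replaces an existing override file - merge by hand if you already use one.
from pathlib import Path

BOINC_DATA_DIR = Path(r"C:\ProgramData\BOINC")  # default on Windows Vista/7; check your install
SWITCH_MINUTES = 240                            # pick something longer than your slowest BRP4 task

override = (
    "<global_preferences>\n"
    f"   <cpu_scheduling_period_minutes>{SWITCH_MINUTES}</cpu_scheduling_period_minutes>\n"
    "</global_preferences>\n"
)

(BOINC_DATA_DIR / "global_prefs_override.xml").write_text(override)
print("Done - now tell the BOINC Manager to read the local prefs file.")

This is the same setting as 'Switch between applications every X minutes' in the web preferences; the local override just keeps it to one machine.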
Too bad we can't distribute
Too bad we can't distribute the WU generation to the Grid, but I suspect the size of the datasets needed is too big to make that practical. I am curious, though, what kind of equipment is used to generate the WUs and how much on-line data is needed?
I used to use my daily credit as a sign there were problems to be addressed, but it seems that is also very dependent on outside factors, such as how much GPU work is available and how long it takes to validate a returned result. I now have 765 tasks waiting to be validated, the oldest of which was returned on July 7.
I do enjoy participating in this project. I am more interested in the E@H work than in the number of credits I get. Now, if we could redeem those credits for something like BOINC T-shirts or pocket protectors, it might be different.
Joe
The original instrument data
The original instrument data is about 2 GB per beam. You probably don't want to download that, especially not for just a few hours of computing time. Furthermore, the dedispersion takes a lot of memory that the average user doesn't have, or at least doesn't want to donate.
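Just to give a sense of what 2 GB per beam would mean on the client side (the line speeds below are my assumptions, not project figures):

# Illustrative only: download time for one 2 GB beam at a few assumed connection speeds.
beam_gigabytes = 2.0

for label, mbit_per_s in [("6 Mbit/s DSL", 6), ("16 Mbit/s DSL", 16), ("50 Mbit/s cable", 50)]:
    seconds = beam_gigabytes * 8 * 1000 / mbit_per_s   # decimal GB -> megabits -> seconds
    print(f"{label}: ~{seconds / 60:.0f} minutes per beam, before any processing starts")

And that is before the memory-hungry dedispersion step, so keeping dedispersion and compression on the project side, as described below, looks like the better trade-off.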
The mid-term plan is to use parts of the Atlas cluster for dedispersion and compression, but for this to work, parts of the current workunit generation need to be made more scalable. Rest assured that we're working on that.
BM