Hi!
Today the next search for gravitational waves was launched on Einstein@Home. Scientifically it is similar to the previous "Multi-Directed" search and will aim for basically the same targets (G347.3, CasA, VelaJr), but it uses the more sensitive data from LIGO's second observation run, "O2".
Technically we will start this as a "CPU-only" run, and will continue to validate our GW GPU application with the results from the CPU versions.
There is a long weekend ahead in Germany, during which we will have very limited resources for watching over this new run. So until Monday:
- only a few workunits will be available
- validation will not run; it will be started on Monday on the results that have been reported by then
- there will be no GPU app versions. On top of the problem that is fixed in the 1.09 O2AS app version, we found another possible problem in the code. It certainly won't affect the O2AS calculation, but it might affect O2MD1, so we need to check this first.
BM
Update:
- Validation has started and looks good so far.
- The possible problem in the GPU app code will affect the O2MD1 results, so we'll need to develop a fix or workaround and build new App binaries.
- Most of the client errors we got back so far originate from the old 'compatibility' Linux apps running on newer systems (glibc >= 2.15), where they produce a segfault. The libc version is something we can't detect automatically before running an app, so for this case we added a project-specific preference ("Run Linux app versions built with LIBC 2.15"). However, you had to opt in manually for that to work. This was OK as long as hosts with newer libcs were a rare minority, but they aren't anymore. Furthermore, the GW search is still the most demanding of our searches in terms of memory and computation time, so older hosts (the ones that would run older Linux systems) can hardly finish these workunits within the deadline. Bottom line: we'll drop compatibility for pre-libc 2.15 Linux hosts in O2MD1 altogether and make the "LIBC215" Linux app the only Linux app. People who run older Linux systems and have trouble with the new apps (segfaults etc.) should de-select the "O2MD1" application. FGRP5 (Gamma-Ray pulsar search) will still run on their machines.
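For anyone unsure which side of the libc 2.15 line their host falls on, here is a minimal sketch of a local check. It is not part of the project's software; it only uses Python's standard `platform.libc_ver()` call, and the helper names `parse_version` and `meets_requirement` are made up for illustration. Versions are compared as number tuples, because a plain string comparison would wrongly rank "2.9" above "2.15".

```python
import itertools
import platform

# Minimum glibc version named in the post above.
REQUIRED = (2, 15)

def parse_version(text):
    """Turn a version string like '2.31' into a tuple like (2, 31).

    Hypothetical helper: stops at the first component without leading digits.
    """
    parts = []
    for piece in text.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def meets_requirement(version_text, required=REQUIRED):
    """Return True if the given libc version is at least `required`."""
    return parse_version(version_text) >= required

if __name__ == "__main__":
    # On glibc-based Linux this returns e.g. ('glibc', '2.31');
    # on other systems both strings may be empty.
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        print("Could not determine a glibc version on this system.")
    elif meets_requirement(version):
        print(f"glibc {version}: new enough for the LIBC215 app.")
    else:
        print(f"glibc {version}: older than 2.15, consider de-selecting O2MD1.")
```

Running `ldd --version` in a terminal should report the same glibc version on most Linux distributions.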
BM
Another update:
There are now GPU app versions for O2MD1. Preliminary tests show that the speedup compared to the CPU versions is even larger in O2MD1 than it was in O2AS. Internal tests show pretty good validation, but they can't cover more than a few data points (workunits).
BM
Have the GPU versions been released to the general community? If so I am not seeing them.
We found a problem in the app code that will negatively impact the sensitivity of the search. We will re-start that run with new apps shortly.
BM
BTW: GPU app version will remain 'beta test' versions until further notice.
BM
So I should abort any pending O2MD tasks since they won't produce any science?
Does this mean that crunch times are likely to increase if the next app does a more sensitive search?
Do the large data files remain unchanged or do the existing ones need to be scrapped?
Thanks.
Cheers,
Gary.
Our internal test showed a runtime increase of about 20% (for both CPU and GPU). We will use the same data files; only the application will change.
BM
A number of mine over the weekend took a very long time to execute and then failed outright. I also aborted one this morning because it was still at 2% after 6 hours. Don't know if they are the CPU or GPU versions. One did validate which had "GWnew" in the name.
If you click on the task ID link for each failed task, you will see the reason for the failure. In your case it is "197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED". In other words, your old GPU is not able to crunch the tasks in what is regarded as a reasonable time. They were taking way longer than they should, so BOINC pulled the plug on them.
One way of telling whether a task is a CPU or a GPU task is to look at the tasks tab in BOINC Manager. A GPU task (which the failed ones were) will show both CPU and GPU resources being used. Another way is to look at the application name. The version 2.01 app has (GW-opencl-ati) attached, which shows the use of an AMD/ATI GPU. The CPU version is different (2.00) and has (GWnew) attached to the name instead. If you look at how long your CPU tasks were taking compared to the GPU tasks (the CPU tasks were about 3 times faster), you can understand why a "time limit exceeded" was invoked.
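As a rough illustration of the two checks above, here is a small Python sketch. The function names are hypothetical and the GPU test is only a heuristic, assuming that GPU plan classes mention "opencl" or "cuda" in the name; the only values taken from this thread are the exit code 197 and the two names GW-opencl-ati and GWnew.

```python
# Exit code quoted from the failed task report: 197 == 0xC5.
EXIT_TIME_LIMIT_EXCEEDED = 197

def format_exit_code(code):
    """Show an exit code in the 'decimal (hex)' style used in task reports."""
    return f"{code} (0x{code:08X})"

def is_gpu_plan_class(plan_class):
    """Heuristic (an assumption, not a BOINC rule): GPU plan classes
    contain 'opencl' or 'cuda' somewhere in their name."""
    name = plan_class.lower()
    return "opencl" in name or "cuda" in name

if __name__ == "__main__":
    print(format_exit_code(EXIT_TIME_LIMIT_EXCEEDED))
    for plan_class in ("GW-opencl-ati", "GWnew"):
        kind = "GPU" if is_gpu_plan_class(plan_class) else "CPU"
        print(f"{plan_class}: {kind} app")
```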
If you've successfully used your GPU for the FGRPB1G search you should continue to use it there. Unless you have access to a more modern GPU, you probably should just use CPU cores for the new GW tasks.
Cheers,
Gary.