Multi-Directional Gravitational Wave Search on O3 data (O3MD1/F)

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,807,594
RAC: 33,112
Topic 228655

We are finally launching our new Gravitational Wave search! Like the previous O3 All-Sky search, it runs on LIGO O3 data, and it again targets the three most promising sources that you may remember from the previous MD search "O2MD1": G347.3, CasA and VelaJr, this time in only two frequency segments (20-500 Hz and 500-1500 Hz).

As last time, there will be parts that run on CPUs ("O3MD1") and parts that use GPUs ("O3MDF"). The first GPU search, "O3MDFG1" (G347.3, low frequency), should be launched today, initially with a few thousand WUs (Beta test App). We hope to launch the first CPU search next week, depending on how the GPU part goes.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,807,594
RAC: 33,112

The GPU computation has become pretty efficient and fast by now. However, this comes at a price: some information that we need at the end is buried so deep in the GPU-optimized code that we can't keep it and extract it efficiently while computing the statistics for millions of templates in each task. Our solution is to reconstruct ("recalculate") this information at the end, only for the few thousand "candidates" that are actually returned as the result. As of now, this calculation can't be done efficiently on the GPU, so we're doing it on the CPU while the GPU sits idle. Depending on your combination of GPU and CPU, this last step on the CPU may take a significant share of the overall task time. You can see this in the stderr of your task: there should be a line "Recalculating statistics for the final toplist..." when the CPU part starts.
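To make the two-pass idea concrete, here is a minimal Python sketch. It is not the actual application code: `cheap_statistic`, `expensive_details` and the toy scoring formula are hypothetical stand-ins. It only shows the shape of the approach, a fast main pass that keeps a small toplist, followed by a recalculation that runs for the surviving candidates only.

```python
import heapq

def cheap_statistic(template):
    # Hypothetical stand-in for the fast per-template detection statistic
    # computed in the main (GPU) pass. This toy score peaks at template 700.
    return -(template - 700) ** 2

def expensive_details(template):
    # Hypothetical stand-in for the extra information that is too costly
    # to keep for every template and is recomputed for candidates only.
    return {"template": template, "stat": cheap_statistic(template)}

def search(templates, toplist_size):
    # Main pass: score every template, keep only the best few.
    toplist = heapq.nlargest(toplist_size, templates, key=cheap_statistic)
    # Recalculation pass: run the expensive step on the toplist alone.
    return [expensive_details(t) for t in toplist]

results = search(range(100_000), toplist_size=5)
```

The expensive step runs 5 times here instead of 100,000 times, which is why it can pay off even if that step is much slower per template.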

We are working on doing this part on the GPU as well and intend to bring out an improved application version during this run. But I'm afraid this will not reach the same efficiency as the "main calculation" part; the memory access pattern required for this is pretty random, and _random_ memory access is pretty slow on GPU (no, it's not the bandwidth that matters here).
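A rough way to picture why random access hurts is to count how many separate memory segments a group of GPU threads touches in one step. The sketch below is a simplified model, not a measurement: the 128-byte segment size, the 32-thread group and the 4-byte element size are assumptions chosen for illustration, and real hardware behaves in more complicated ways.

```python
import random

SEGMENT_BYTES = 128  # assumed size of one memory transaction
ELEMENT_BYTES = 4    # assumed element size (e.g. a 32-bit float)
GROUP_SIZE = 32      # assumed number of threads issuing loads together

def segments_touched(indices):
    """Count the distinct memory segments needed to serve these element indices."""
    return len({(i * ELEMENT_BYTES) // SEGMENT_BYTES for i in indices})

# Neighbouring threads read neighbouring elements.
contiguous = list(range(GROUP_SIZE))

# Each thread reads an unrelated element of a large array (fixed seed for repeatability).
rng = random.Random(0)
scattered = [rng.randrange(1_000_000) for _ in range(GROUP_SIZE)]

contiguous_count = segments_touched(contiguous)
scattered_count = segments_touched(scattered)
```

In this model the contiguous pattern is served by a single segment, while the scattered pattern needs close to one segment per thread, so the same amount of useful data costs many more round trips to memory.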

BM

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,578
Credit: 306,914,202
RAC: 178,279

Yeah ! Sounds neat. ;-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,911
Credit: 43,722,935,976
RAC: 63,162,686

I knew this was coming when I saw O3MDF pop up on the server page. :) sounds cool.

I know you guys were really trying to get CUDA apps with BRP7 (settled on OpenCL for Linux), is that something in the cards for GW? do you have the capability to recompile the app with the CUDA target instead? that's just a curiosity for me, not a hard suggestion.

Quote:
and _random_ memory access is pretty slow on GPU (no, it's not the bandwidth that matters here)

memory speed and latency will likely be king here. so GDDR6X models and HBM models.

_________________________________________________________________________

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,807,594
RAC: 33,112

For this search we're using a different format for the result file, and we're still working on a validator for it. There will be no validation (and no crediting) until that is done. Sorry about that, but the App is in Beta test.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,807,594
RAC: 33,112

The first app with the GPU "recalc" step will likely be a CUDA one. I can easily compile it, but so far I have trouble linking it in a way that it will run on other machines than it was built on (more precisely: a system with a different libc version than the app was built on).

BM

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,911
Credit: 43,722,935,976
RAC: 63,162,686

Bernd Machenschalk wrote:

The first app with the GPU "recalc" step will likely be a CUDA one. I can easily compile it, but so far I have trouble linking it in a way that it will run on other machines than it was built on (more precisely: a system with a different libc version than the app was built on).

Ah. Nice to know. looking forward to that.

For libc compatibility, I maintain an older compile environment (Ubuntu 18.04). it's old enough that the compiles are compatible with most systems, and new enough to work with the latest CUDA toolkits without much hassle. but I understand it may not be practical for you to do that if you have OS/security restrictions

_________________________________________________________________________

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,807,594
RAC: 33,112

Our build machines are kept pretty old for maximum compatibility; IIRC the CUDA one runs Ubuntu 14.04.

The problem here is that we need to find the right combination of static and dynamic linking for the various libraries involved. Static linking is ideal in many cases, as we don't want to distribute large numbers of (shared) libraries with our app. But if a library uses, e.g., dlopen(), static linking requires the correct version of the libc at runtime on the target system (and libdl is pretty picky about that). This is not simple to solve in a software stack as huge as LSC's LALSuite. A small application developed from scratch for a single purpose, like our BRP or FGRP apps, is much easier to handle.
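The difference between the two linking cases can be written down as two simple version rules. This is a simplified reading of the constraint described above, not a statement about how the project's builds are actually checked: the function names are hypothetical, the rules are assumptions (a dynamically linked binary needs a runtime glibc at least as new as the build one; a static binary that calls dlopen() needs the same version), and the version numbers are example values.

```python
def parse_version(text):
    """Turn a version string such as '2.19' into a comparable tuple (2, 19)."""
    return tuple(int(part) for part in text.split("."))

def dynamic_link_ok(build_glibc, runtime_glibc):
    # Assumed rule: the runtime glibc must be the same or newer than
    # the one the binary was dynamically linked against.
    return parse_version(runtime_glibc) >= parse_version(build_glibc)

def static_with_dlopen_ok(build_glibc, runtime_glibc):
    # Assumed rule: a statically linked binary that calls dlopen()
    # needs the same glibc version at runtime as at link time.
    return parse_version(runtime_glibc) == parse_version(build_glibc)

# Example values only: built against 2.19, run on a host with 2.27.
old_build_dynamic = dynamic_link_ok("2.19", "2.27")
old_build_static_dlopen = static_with_dlopen_ok("2.19", "2.27")
```

Under these assumed rules, building on an old system covers the dynamic case for most newer hosts, but it does not help a static binary that still reaches for the host's libc through dlopen(). On a real host, Python's `platform.libc_ver()` can report the runtime libc version.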

BM

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,911
Credit: 43,722,935,976
RAC: 63,162,686

ah yeah, I was comparing to the experience compiling the BRP apps, those seem to compile quickly and easily. but I understand the issue better now.

Hey, btw. I see that some tasks are "in progress" via the server status page. are these going out in the wild to volunteers? or just internally? I tried several configurations for project preferences, but the scheduler has responded that no work is available. there is no discrete selection on the project preferences page for O3MDF. I have O3AS selected with Beta tasks allowed, and have tried both yes/no for libc 2.15 compatibility without success.

could you add a discrete O3MDF selection box in the project preferences? Or provide configuration guidance for how to get these tasks?

Thanks!

_________________________________________________________________________

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,807,594
RAC: 33,112

Sorry, the editing of the project prefs template seems to be broken; I'll fix that tomorrow. Currently the only option is to select _all_ applications (which will technically eliminate any restrictions on apps), at least for the venue your host is in.

BM

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,911
Credit: 43,722,935,976
RAC: 63,162,686

thanks!

_________________________________________________________________________
