Gravitational Wave search O1 all-sky tuning (O1AS20-100T)

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6534
Credit: 284737525
RAC: 103325

Both my Linux and win64 hosts are doing well: no invalids after dozens of units. The Windows machine takes about twice as long per unit as the Fedora one (but now more consistently so). Specifically, no unit has taken more than 25 hours.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244934268
RAC: 16197

In cleaning up the "tuning" run (that this thread is originally about) we just granted credit for the results of early app versions (before 1.04) that "lost" validation to the results of a 1.04 version.

Being able to do this was the only reason we kept these workunits in the DB, we'll now purge that run from the system.

BM

Maximilian Mieth
Joined: 4 Oct 12
Posts: 128
Credit: 9885704
RAC: 2055

Quote:

In cleaning up the "tuning" run (that this thread is originally about) we just granted credit for the results of early app versions (before 1.04) that "lost" validation to the results of a 1.04 version.

Being able to do this was the only reason we kept these workunits in the DB, we'll now purge that run from the system.

BM


Thank you! That worked for three of the four tasks I crunched, but one did not get credit. Any reason for that?

Sebastian M. Bobrecki
Joined: 20 Feb 05
Posts: 63
Credit: 1529581785
RAC: 84

So if the app uses FFTW, maybe it would be easy to use cuFFTW (the CUDA FFTW compatibility mode) to offload some of the work.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686220445
RAC: 548689

Quote:
So if the app uses FFTW, maybe it would be easy to use cuFFTW (the CUDA FFTW compatibility mode) to offload some of the work.

Actually all our searches on Einstein@Home (Binary Pulsar Search, Fermi Gamma-Ray Pulsar Search and the GW search) use the FFT, and all of them would benefit from offloading the FFT to the GPU.

However, the Binary Radio Pulsar search code is by far the most optimized for GPUs: we get a speed-up (GPU compared to CPU only) of well over 10, depending of course on the individual GPU and CPU. For the GW search, the FFT part of the computation takes only roughly half the computing time on CPUs, so offloading it to the GPU can speed up the computation by at most a factor of 2.

So currently, the best use for the GPUs on E@H is to do the Binary Radio Pulsar search, and that search only.

We may change this decision later depending on science priorities, tho.
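
The factor-of-2 ceiling for the GW search follows from Amdahl's law: if a fraction p of the runtime is accelerated by a factor s, the overall speed-up is 1 / ((1 - p) + p / s). Here is a minimal Python sketch, assuming p = 0.5 ("roughly half" in the post above). The function name and the sample GPU factors are illustrative assumptions, not measurements from any E@H app.

```python
def overall_speedup(fft_fraction: float, fft_speedup: float) -> float:
    """Amdahl's law: overall speed-up when only the FFT part is accelerated.

    fft_fraction -- share of total CPU runtime spent in the FFT (0..1)
    fft_speedup  -- how much faster the FFT alone runs after offloading (> 0)
    """
    if not 0.0 <= fft_fraction <= 1.0:
        raise ValueError("fft_fraction must be between 0 and 1")
    if fft_speedup <= 0.0:
        raise ValueError("fft_speedup must be positive")
    # The non-FFT part is unchanged; the FFT part shrinks by fft_speedup.
    return 1.0 / ((1.0 - fft_fraction) + fft_fraction / fft_speedup)


if __name__ == "__main__":
    # Hypothetical GPU FFT speed-ups; the overall gain approaches 2 but never reaches it.
    for s in (2, 10, 100, 1_000_000):
        print(f"FFT {s}x faster -> whole task {overall_speedup(0.5, s):.3f}x faster")
```

Even an arbitrarily fast GPU FFT leaves the other half of the runtime untouched, which is why the overall gain stays below 2 unless the rest of the code is ported as well.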

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686220445
RAC: 548689

For those who experience surprisingly poor performance of the GW search on their hardware (say more than 14 hrs with a recent CPU), and who like to experiment a bit, there is a "hidden" way to force the app to try a bit harder to fine-tune the FFT computation to their particular hardware.

You can set two environment variables so that the E@H science app sees them (e.g. you could define them systemwide for Windows or in the startup options for BOINC on Linux):

Environment variable          Value
=====================================
LAL_FSTAT_FFT_PLAN_MODE       PATIENT
LAL_FSTAT_FFT_PLAN_TIMEOUT    120

This will tell FFTW to spend (roughly) up to two minutes (120s) just on optimizing the FFT computation for your particular hardware. You can play around with even longer durations.

We do not expect this to have a dramatic effect on most hosts, and it can even lead to slightly worse runtime in some cases, so we did not enable this by default. It might help on some hosts tho where the default settings lead to very suboptimal runtime.
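
For those who prefer scripting this over a systemwide setting, here is a minimal Python sketch of the general mechanism: a child process inherits the environment of the process that starts it. The helper names `fft_tuning_env` and `child_sees` are hypothetical, and the child here is just a stand-in Python process that echoes the variables. It is not the BOINC client or the science app, whose launch command depends on your installation.

```python
import os
import subprocess
import sys


def fft_tuning_env(timeout_s: int = 120, mode: str = "PATIENT") -> dict:
    """Return a copy of the current environment with the two FFT tuning variables set."""
    env = dict(os.environ)
    env["LAL_FSTAT_FFT_PLAN_MODE"] = mode
    env["LAL_FSTAT_FFT_PLAN_TIMEOUT"] = str(timeout_s)  # seconds, passed as text
    return env


def child_sees(env: dict) -> str:
    """Start a stand-in child process and report the values it actually sees."""
    code = (
        "import os;"
        "print(os.environ.get('LAL_FSTAT_FFT_PLAN_MODE'),"
        " os.environ.get('LAL_FSTAT_FFT_PLAN_TIMEOUT'))"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees(fft_tuning_env(timeout_s=120)))
```

The same idea applies when the variables are set in whatever environment starts BOINC: the science app only sees them if they were set before the client launched it.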

HB

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3293394367
RAC: 1513109

Does it mean you're currently not thinking about releasing a GW GPU application, because the current priority for GPUs is BRP ?
Do you know how much of the GW code could be ported to GPUs, or a very approximate possible speed-up factor on GPUs ?

-----

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686220445
RAC: 548689

Quote:
Does it mean you're currently not thinking about releasing a GW GPU application, because the current priority for GPUs is BRP ?
Do you know how much of the GW code could be ported to GPUs, or a very approximate possible speed-up factor on GPUs ?

Correct, there are no GPU plans for the O1 GW search. As I wrote, the speed-up would be limited to about a factor of 2 for a rather straightforward offloading of the FFT (compared to a factor of >>10 for the BRP app, which by now runs almost completely on the GPU).

I'm quite sure the other parts of the computation (besides FFT) can also be ported to GPUs, but we have no plans to do that in the near future.

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3293394367
RAC: 1513109

Thanks for the clarification.
Hoping you'll change your decision later ;-) BRP6 should be finished this year, depending on how much BRP4G work is there (which is currently quite a lot), so I suppose a new GPU app will be required...

-----

rbpeake
Joined: 18 Jan 05
Posts: 266
Credit: 968536998
RAC: 1158502

Quote:

....This will tell FFTW to spend (roughly) up to two minutes (120s) just on optimizing the FFT computation for your particular hardware. You can play around with even longer durations.

HB


Does FFTW perform this check for each work unit as it starts, or just once for the hardware system, which it then remembers for that hardware system into the future?

Thanks!
