Gravitational Wave search O1 all-sky tuning (O1AS20-100T)

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 109

Quote:
Thanks for the clarification.
Hoping you'll change your decision later ;-) BRP6 should be finished this year, depending on how much BRP4G work is there (which is currently quite a lot), so I suppose a new GPU app will be required...

Both BRP projects are getting more data periodically; they've spent about half of the last few years looking like they'd run out within a few months or less before getting topped up again.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 109

Quote:
Quote:
Thanks for the clarification.
Hoping you'll change your decision later ;-) BRP6 should be finished this year, depending on how much BRP4G work is there (which is currently quite a lot), so I suppose a new GPU app will be required...

Both BRP projects are getting more data periodically; they've spent about half of the last few years looking like they'd run out within a few months or less before getting topped up again.

... and if they ever actually do run out of new data to analyze, they can re-run the old data looking for different (wider, more eccentric) orbits than have been searched for so far. It's a diminishing-returns game but, like the GW search, it is a situation where the limiting factor for what can be searched for is how much compute time can be thrown at the problem.

poppageek
Joined: 13 Aug 10
Posts: 259
Credit: 2473733122
RAC: 0

Quote:

For those who experience surprisingly poor performance of the GW search on their hardware (say more than 14 hrs with a recent CPU), and who like to experiment a bit, there is a "hidden" way to force the app to try a bit harder to fine-tune the FFT computation to their particular hardware.

You can set two environment variables so that the E@H science app sees them (e.g. you could define them systemwide for Windows or in the startup options for BOINC on Linux):

env. variable                 value
=====================================
LAL_FSTAT_FFT_PLAN_MODE         PATIENT
LAL_FSTAT_FFT_PLAN_TIMEOUT       120

This will tell FFTW to spend (roughly) up to two minutes (120s) just on optimizing the FFT computation for your particular hardware. You can play around with even longer durations.

We do not expect this to have a dramatic effect on most hosts, and it can even lead to slightly worse runtime in some cases, so we did not enable this by default. It might help on some hosts tho where the default settings lead to very suboptimal runtime.

HB

Curious if anyone has tried this. I have an AMD 3 GHz six-core that is getting 66k+ seconds per task. I am trying it on Linux, hoping for a bit of a boost.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Quote:
Curious if anyone has tried this. I have an AMD 3 GHz six-core that is getting 66k+ seconds per task. I am trying it on Linux, hoping for a bit of a boost.


No, I forgot about this. I have just tried it on this host, so don't expect to see any difference until ~60K seconds have passed.

poppageek
Joined: 13 Aug 10
Posts: 259
Credit: 2473733122
RAC: 0

Times are 1300-2000 seconds better, mostly closer to 2k so far. I like it.
AMD 960T quad core unlocked to 6 cores @ 3 GHz under Linux.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

I can concur, with a small saving: the average over 20 tasks dropped from 61.5K to 60K seconds, so about a 2.5% improvement with LAL_FSTAT_FFT_PLAN_TIMEOUT=20.

The interesting thing with these values is the startup timing in the task log. BEFORE:

2016-04-23 08:33:10.4455 (16405) [normal]: Reading input data ... 2016-04-23 08:33:18.9566 (16405) [normal]: Search FstatMethod used: 'ResampGeneric'
2016-04-23 08:33:18.9566 (16405) [normal]: Recalc FstatMethod used: 'DemodSSE'

AFTER, I noticed the extra time between the same two log entries:

2016-04-28 21:37:02.1569 (24033) [normal]: Reading input data ... 2016-04-28 21:39:15.1586 (24033) [normal]: Search FstatMethod used: 'ResampGeneric'
2016-04-28 21:39:15.1586 (24033) [normal]: Recalc FstatMethod used: 'DemodSSE'
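
To put numbers on that extra time, here is a minimal Python sketch that pulls the two timestamps out of each "Reading input data" line and subtracts them. The helper name `startup_gap_seconds` is made up for illustration; the log lines are copied from above. It gives roughly 8.5 s before and roughly 133 s after, which is presumably (not confirmed here) the time spent on FFT planning.

```python
import re
from datetime import datetime

# Matches timestamps such as "2016-04-23 08:33:10.4455"
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+")
FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def startup_gap_seconds(line: str) -> float:
    """Seconds between the first two timestamps found on one log line."""
    first, second = TIMESTAMP.findall(line)[:2]
    delta = datetime.strptime(second, FORMAT) - datetime.strptime(first, FORMAT)
    return delta.total_seconds()


before = ("2016-04-23 08:33:10.4455 (16405) [normal]: Reading input data ... "
          "2016-04-23 08:33:18.9566 (16405) [normal]: Search FstatMethod used: 'ResampGeneric'")
after = ("2016-04-28 21:37:02.1569 (24033) [normal]: Reading input data ... "
         "2016-04-28 21:39:15.1586 (24033) [normal]: Search FstatMethod used: 'ResampGeneric'")

print(f"before: {startup_gap_seconds(before):.1f} s")
print(f"after:  {startup_gap_seconds(after):.1f} s")
```

Against a task of about 60K seconds, a couple of minutes of extra startup is small, so it only has to buy a fraction of a percent in the main computation to pay for itself.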

I'll try some different values and report back. I think I'll go large, say 200. Place your bets now...

btw: if anyone is interested, the easiest way to set this on Debian-style distros is to edit /etc/default/boinc-client and add these lines

export LAL_FSTAT_FFT_PLAN_MODE=PATIENT
export LAL_FSTAT_FFT_PLAN_TIMEOUT=20


and restart boinc-client
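
To confirm after the restart that a running process really sees the variables, one option on Linux is to read its environment from /proc/<pid>/environ, which holds NUL-separated KEY=VALUE entries. The following is a minimal Python sketch under that assumption; the helper names `parse_environ` and `fft_settings` are made up for illustration, and you would need to look up the PID of the science app yourself (and have permission to read that process's /proc entry).

```python
from pathlib import Path

FFT_VARS = ("LAL_FSTAT_FFT_PLAN_MODE", "LAL_FSTAT_FFT_PLAN_TIMEOUT")


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE block found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def fft_settings(pid: int) -> dict:
    """Return the two FFT planning variables as seen by process `pid` (None if unset)."""
    env = parse_environ(Path(f"/proc/{pid}/environ").read_bytes())
    return {name: env.get(name) for name in FFT_VARS}


# Demonstration on a made-up environment block, so the sketch runs anywhere
sample = (b"PATH=/usr/bin\0"
          b"LAL_FSTAT_FFT_PLAN_MODE=PATIENT\0"
          b"LAL_FSTAT_FFT_PLAN_TIMEOUT=20\0")
sample_env = parse_environ(sample)
print({name: sample_env.get(name) for name in FFT_VARS})
```

On a real host you would call `fft_settings(pid)` with the PID of a running GW task; a value of `None` means that process did not inherit the variable.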

poppageek
Joined: 13 Aug 10
Posts: 259
Credit: 2473733122
RAC: 0

I used:
LAL_FSTAT_FFT_PLAN_TIMEOUT=120

But I think I too will try 200.

JohnMD
Joined: 11 May 12
Posts: 5
Credit: 26039195
RAC: 0

Quote:
Possibly. The AVX versions are still experimental, we'll see how much speedup we get, and whether it's worth the effort to make a version for the relatively small OSX population on E@H.
BM

Adam Socki
Joined: 7 Mar 16
Posts: 26
Credit: 49102640
RAC: 0

Quote:

btw: if anyone is interested, the easiest way to set this on Debian-style distros is to edit /etc/default/boinc-client and add these lines

export LAL_FSTAT_FFT_PLAN_MODE=PATIENT
export LAL_FSTAT_FFT_PLAN_TIMEOUT=20

and restart boinc-client

Is there an easy way to try this on the Mac OS? I can't figure out where to add those lines to try it.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Quote:

I can concur, with a small saving: the average over 20 tasks dropped from 61.5K to 60K seconds, so about a 2.5% improvement with LAL_FSTAT_FFT_PLAN_TIMEOUT=20.

I'll try some different values and report back. I think I'll go large, say 200. Place your bets now...

The average over 20 tasks was 60.5K seconds, so no improvement with LAL_FSTAT_FFT_PLAN_TIMEOUT=200. Results were more varied: some a lot quicker (56K), most slightly slower.
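
For comparing the runs so far, a small Python sketch of the arithmetic. It assumes the 61.5K figure is the average with the default settings (no variables set); the function name `percent_saved` is made up for illustration.

```python
def percent_saved(baseline_s: float, tuned_s: float) -> float:
    """Runtime saved relative to the baseline, in percent (negative means slower)."""
    return 100.0 * (baseline_s - tuned_s) / baseline_s


# 20-task averages in seconds, as reported in this thread
BASELINE = 61_500  # assumed to be the default-settings average
averages = {
    "LAL_FSTAT_FFT_PLAN_TIMEOUT=20": 60_000,
    "LAL_FSTAT_FFT_PLAN_TIMEOUT=200": 60_500,
}

for setting, seconds in averages.items():
    print(f"{setting}: {percent_saved(BASELINE, seconds):.1f}% saved")
```

That works out to about 2.4% for a timeout of 20 and about 1.6% for 200, both measured against the original 61.5K, so "no improvement" for 200 is relative to the run with 20. With differences this small and only 20 tasks per setting, run-to-run variation could easily account for the gap.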

OK, let's try 60. Place your bets now...
