Gravitational Wave search O1 all-sky tuning (O1AS20-100T)

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 109

RE: Thanks for the

20 Mar 2016 16:33:29 UTC

Message 136977 in response to message 136975

Quote:

Thanks for the clarification.
Hoping you'll change your decision later ;-) BRP6 should be finished this year, depending on how much BRP4G work is there (which is currently quite a lot), so I suppose a new GPU app will be required...

Both BRP projects are getting more data periodically; they've spent about half of the last few years looking like they'd run out within a few months or less before getting plussed up again.

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 109

RE: RE: Thanks for the

20 Mar 2016 17:50:35 UTC

Message 136978 in response to message 136977

Quote:

Quote:
Thanks for the clarification.
Hoping you'll change your decision later ;-) BRP6 should be finished this year, depending on how much BRP4G work is there (which is currently quite a lot), so I suppose a new GPU app will be required...

Both BRP projects are getting more data periodically; they've spent about half of the last few years looking like they'd run out within a few months or less before getting plussed up again.

... and if they ever actually do run out of new data to analyze, they can re-run the old data looking for different (wider, more eccentric) orbits than have been searched for so far. It's a diminishing-returns game, but, like the GW search, it is a situation where the limiting factor for what can be searched for is how much compute time can be thrown at the problem.

poppageek

Joined: 13 Aug 10

Posts: 259

Credit: 2473733122

RAC: 0

RE: For those who

24 Apr 2016 19:57:02 UTC

Message 136979 in response to message 136972

Quote:

For those who experience surprisingly poor performance of the GW search on their hardware (say more than 14 hrs with a recent CPU), and who like to experiment a bit, there is a "hidden" way to force the app to try a bit harder to fine-tune the FFT computation to their particular hardware.

You can set two environment variables so that the E@H science app sees them (e.g. you could define them systemwide for Windows or in the startup options for BOINC on Linux):
env. variable                 value
=====================================
LAL_FSTAT_FFT_PLAN_MODE       PATIENT
LAL_FSTAT_FFT_PLAN_TIMEOUT    120
This will tell FFTW to spend (roughly) up to two minutes (120s) just on optimizing the FFT computation for your particular hardware. You can play around with even longer durations.

We do not expect this to have a dramatic effect on most hosts, and it can even lead to slightly worse runtime in some cases, so we did not enable this by default. It might help on some hosts, though, where the default settings lead to very suboptimal runtime.

HB

Curious if anyone has tried this. I have an AMD 3 GHz six-core that is getting 66k+ seconds. I am trying it on Linux, hoping for a bit of a boost.
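
Setting the variables in BOINC's startup environment works because a child process inherits the environment of whatever launched it. The Python sketch below only illustrates that inheritance; it does not start BOINC or the science app. The variable names and values are the ones from the quoted post, while the helper `child_sees` is a hypothetical name made up for this illustration.

```python
import json
import os
import subprocess
import sys

# Names and values taken from the quoted post above.
TUNING = {
    "LAL_FSTAT_FFT_PLAN_MODE": "PATIENT",
    "LAL_FSTAT_FFT_PLAN_TIMEOUT": "120",
}

def child_sees(extra_env):
    """Start a child Python process with extra_env added and report what it sees.

    Hypothetical helper for illustration only; a stand-in for BOINC
    launching the science app.
    """
    env = dict(os.environ)
    env.update(extra_env)
    # The child prints the matching variables from its own environment as JSON.
    code = (
        "import json, os; "
        "print(json.dumps({k: v for k, v in os.environ.items() "
        "if k.startswith('LAL_FSTAT_FFT_')}))"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

seen = child_sees(TUNING)
print(seen["LAL_FSTAT_FFT_PLAN_MODE"], seen["LAL_FSTAT_FFT_PLAN_TIMEOUT"])
```

The same reasoning suggests that variables exported only in an interactive shell would not reach a BOINC client started as a system service, since that shell is not the service's parent.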

AgentB

Joined: 17 Mar 12

Posts: 915

Credit: 513211304

RAC: 0

RE: Curious if anyone has

24 Apr 2016 20:43:14 UTC

Message 136980 in response to message 136979

Quote:

Curious if anyone has tried this. I have an AMD 3 GHz six-core that is getting 66k+ seconds. I am trying it on Linux, hoping for a bit of a boost.

No, I forgot about this. I have just tried it on this host.

So don't expect any differences until ~60K seconds pass.

poppageek

Joined: 13 Aug 10

Posts: 259

Credit: 2473733122

RAC: 0

Times are from 1300-2000

26 Apr 2016 3:21:55 UTC

Message 136981

Times are 1300-2000 seconds better, mostly closer to 2k so far. I like it.
AMD 960T quad-core unlocked to 6 cores @ 3 GHz under Linux.

AgentB

Joined: 17 Mar 12

Posts: 915

Credit: 513211304

RAC: 0

I can concur with a small

29 Apr 2016 23:06:02 UTC

Message 136982 in response to message 136981

I can concur with a small saving: the average over 20 tasks dropped from 61.5K to 60K seconds, so about 2.5% improvement with LAL_FSTAT_FFT_PLAN_TIMEOUT=20.

The interesting thing with these values: BEFORE

2016-04-23 08:33:10.4455 (16405) [normal]: Reading input data ... 2016-04-23 08:33:18.9566 (16405) [normal]: Search FstatMethod used: 'ResampGeneric'
2016-04-23 08:33:18.9566 (16405) [normal]: Recalc FstatMethod used: 'DemodSSE'

I noticed the extra time AFTER

2016-04-28 21:37:02.1569 (24033) [normal]: Reading input data ... 2016-04-28 21:39:15.1586 (24033) [normal]: Search FstatMethod used: 'ResampGeneric'
2016-04-28 21:39:15.1586 (24033) [normal]: Recalc FstatMethod used: 'DemodSSE'
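
To put a number on that extra time, the gap between the two timestamps on each "Reading input data" line can be computed directly. This is a minimal Python sketch, assuming the timestamp format seen in the two excerpts above; the function name `seconds_between_stamps` is made up for this example, and the sketch only measures the gap without saying what the app was doing during it.

```python
import re
from datetime import datetime

# Matches timestamps such as "2016-04-23 08:33:10.4455" from the excerpts above.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+")

def seconds_between_stamps(line):
    """Return the seconds between the first and second timestamp on one log line."""
    stamps = STAMP.findall(line)
    if len(stamps) < 2:
        raise ValueError("expected at least two timestamps on the line")
    first, second = (
        datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f") for s in stamps[:2]
    )
    return (second - first).total_seconds()

before = ("2016-04-23 08:33:10.4455 (16405) [normal]: Reading input data ... "
          "2016-04-23 08:33:18.9566 (16405) [normal]: Search FstatMethod used: 'ResampGeneric'")
after = ("2016-04-28 21:37:02.1569 (24033) [normal]: Reading input data ... "
         "2016-04-28 21:39:15.1586 (24033) [normal]: Search FstatMethod used: 'ResampGeneric'")

print(round(seconds_between_stamps(before), 1))  # 8.5
print(round(seconds_between_stamps(after), 1))   # 133.0
```

So the start-up phase went from roughly 8.5 seconds to roughly 133 seconds per task in these two samples, which is small next to a runtime of about 60K seconds.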

I'll try some different values and report back. I think I'll go large, say 200. Place your bets now...

btw: if anyone is interested, the easiest way to set this on Debian-style distros is to edit /etc/default/boinc-client and add these lines

export LAL_FSTAT_FFT_PLAN_MODE=PATIENT
export LAL_FSTAT_FFT_PLAN_TIMEOUT=20

and restart boinc-client
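
After the restart it can be worth checking that the variables actually reached the running process. On Linux, `/proc/<pid>/environ` holds the environment a process was started with, as NUL-separated `NAME=value` entries. The sketch below is a rough Python illustration under that assumption; `parse_environ` and `lal_settings` are hypothetical helper names, and the PID has to be looked up separately (for example with `top`). Reading another user's process usually requires root.

```python
from pathlib import Path

def parse_environ(raw):
    """Turn the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def lal_settings(pid):
    """Return the LAL_FSTAT_FFT_* variables of a running process (Linux only)."""
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    return {
        name: value
        for name, value in parse_environ(raw).items()
        if name.startswith("LAL_FSTAT_FFT_")
    }

# Example with made-up bytes in the same layout as /proc/<pid>/environ:
sample = b"PATH=/usr/bin\0LAL_FSTAT_FFT_PLAN_MODE=PATIENT\0LAL_FSTAT_FFT_PLAN_TIMEOUT=20\0"
print(parse_environ(sample)["LAL_FSTAT_FFT_PLAN_TIMEOUT"])  # 20
```

Calling `lal_settings(pid)` with the PID of a running science task should return both variables if the edit to /etc/default/boinc-client took effect; an empty dict would suggest the client was not restarted or does not read that file on the system in question.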

poppageek

Joined: 13 Aug 10

Posts: 259

Credit: 2473733122

RAC: 0

I

29 Apr 2016 23:57:59 UTC

Message 136983

I used:
LAL_FSTAT_FFT_PLAN_TIMEOUT=120

But I think I too will try 200.

JohnMD

Joined: 11 May 12

Posts: 5

Credit: 26039195

RAC: 0

Possibly. The AVX

30 Apr 2016 18:58:10 UTC

Message 136984 in response to message 136833

Quote:

Possibly. The AVX versions are still experimental, we'll see how much speedup we get, and whether it's worth the effort to make a version for the relatively small OSX population on E@H.
BM

Adam Socki

Joined: 7 Mar 16

Posts: 26

Credit: 49102640

RAC: 0

RE: btw: if anyone is

6 May 2016 17:53:17 UTC

Message 136985 in response to message 136982

Quote:

btw: if anyone is interested, the easiest way to set this on Debian-style distros is to edit /etc/default/boinc-client and add these lines
export LAL_FSTAT_FFT_PLAN_MODE=PATIENT
export LAL_FSTAT_FFT_PLAN_TIMEOUT=20
and restart boinc-client

Is there an easy way to try this on Mac OS? I can't figure out where to add those lines.

AgentB

Joined: 17 Mar 12

Posts: 915

Credit: 513211304

RAC: 0

RE: I can concur with a

6 May 2016 18:59:58 UTC

Message 136986 in response to message 136982

Quote:

I can concur with a small saving: the average over 20 tasks dropped from 61.5K to 60K seconds, so about 2.5% improvement with LAL_FSTAT_FFT_PLAN_TIMEOUT=20.

I'll try some different values and report back. I think I'll go large, say 200. Place your bets now...

Average over 20 tasks: 60.5K seconds, so no improvement with LAL_FSTAT_FFT_PLAN_TIMEOUT=200. Results were more varied: some a lot quicker (56K), most slightly slower.

OK, let's try 60. Place your bets now...
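
For anyone keeping score, here is a small Python sketch that puts the averages reported so far side by side. It assumes the 61.5K-second average from before the change is a fair baseline for both runs. Each figure is an average over only 20 tasks, so differences of this size may well be noise.

```python
# Average runtimes in seconds over 20 tasks each, as reported in this thread.
BASELINE = 61_500  # before setting the variables
RUNS = {20: 60_000, 200: 60_500}  # LAL_FSTAT_FFT_PLAN_TIMEOUT -> average runtime

def percent_saved(before, after):
    """Runtime reduction relative to `before`, in percent."""
    return 100.0 * (before - after) / before

for timeout, average in RUNS.items():
    print(timeout, round(percent_saved(BASELINE, average), 1))
# 20 2.4
# 200 1.6
```

On these numbers, both timeouts come out slightly ahead of the original 61.5K average, with 200 doing no better than 20.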
