upcoming GW run S6Bucket (was S6GC1)

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921393
RAC: 16900
Topic 195672

Planning for the next GW run is ongoing, but some decisions have already been made and can be publicized:

* We will analyze data from S6 using the same semi-coherent method that we used in past runs, and essentially the same application code. The run will therefore be named "S6GC1".

* As there is less usable data from S6 than there was from S5, the data volume per workunit will decrease considerably. This will mainly reduce the download volume at the beginning of a run and the amount of data that has to be stored on the client's disk.

* To increase the sensitivity, we will use fewer but longer coherent segments. This is very technical, but the consequence is that the new run will be strongly dominated by the coherent computation (FStat); the incoherent combination will account for 20% or less of the computing time (for comparison, in S5GC1HF the split is roughly 50/50). There is a sketch of this structure right after this list.

* With the current application code (which refines only one dimension of the parameter space, the spindown), the memory requirement will also be lower with the reduced number of stacks. A refinement in three dimensions (spindown and sky position; for comparison, the old incoherent "Hough" code refined only in sky position) would be possible and would further increase the sensitivity. It is currently under investigation whether we could implement this in the remaining time, how much sensitivity we would gain, and how much memory it would cost.

* Also under investigation is whether and how to add "vetoing" already in the application, which could filter out e.g. known instrumental artifacts. Currently this is done later, in post-processing. As a limiting factor on the sensitivity of the search is the number of candidates we can send back from the clients, eliminating candidates before sending them back might therefore increase the sensitivity of the search, too. One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.
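
To make the coherent/incoherent split concrete, here is a toy sketch in Python. This is not the actual application code (which is C, built on LALSuite); all names and numbers are invented, and compute_fstat() merely stands in for the expensive coherent F-statistic.

    import random

    def compute_fstat(segment, template):
        # Stand-in for the coherent F-statistic of one template on one
        # data segment. In the real application this step dominates the
        # run time, and its cost grows with the segment length.
        random.seed(hash((segment, template)))
        return random.expovariate(0.5)  # fake statistic value

    def semi_coherent_statistic(segments, template):
        # Coherent part: one F-statistic value per segment (the ~80%).
        per_segment = [compute_fstat(seg, template) for seg in segments]
        # Incoherent part: combine across segments (the remaining <=20%).
        return sum(per_segment)

    # One template described by (frequency, spindown, sky position):
    print(semi_coherent_statistic(range(50), template=(101.2, -1e-9, 0.7)))

With fewer but longer segments, each compute_fstat() call gets more expensive while the final combination stays the same size, which is where the 80/20 split above comes from.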

I intend to continue posting to this thread as the preparation for S6GC1 progresses.

BM

astro-marwil
Joined: 28 May 05
Posts: 511
Credit: 402260833
RAC: 1070936

Hallo BM !
Many thanks for keeping us informed! I'm somewhat disappointed by the small response to your thread. I don't look through the forums every day, but I will be very pleased to hear more soon about the progress and plans for this project.

Kind regards
Martin

pabliedung
Joined: 5 Dec 10
Posts: 1
Credit: 11838587
RAC: 0

Thanks a lot for the updates! They are much appreciated.

telegd
Joined: 17 Apr 07
Posts: 91
Credit: 10212522
RAC: 0

Yes, thanks for the update.

Quote:
...eliminating candidates before sending them back might therefore increase the sensitivity of the search, too. One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.

Does this mean the work-unit would abort mid-task when a known artifact is encountered? I assume that this would only reduce the time taken on any given task, not increase it?

If so, would this only be an issue for calculating credit? Sorry if I am misinterpreting your explanation!

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6534
Credit: 284700169
RAC: 113726

Quote:

Yes, thanks for the update.

Quote:
...eliminating candidates before sending them back might therefore increase the sensitivity of the search, too. One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.

Does this mean the work-unit would abort mid-task when a known artifact is encountered? I assume that this would only reduce the time taken on any given task, not increase it?


No, it's just some processing on the client side that would otherwise have been done on the server side. So it will, if anything, increase the WU time, and that in turn will depend upon whether said artifacts turn up in the particular WU. The increased search sensitivity is achieved by having more 'real' candidates returned on a fixed-size 'bus back to town', i.e. by eliminating known dead wood (so it's possible that, without this extra veto work, 'quieter' astronomical candidates previously didn't make the list because of the room taken up by 'noisier' instrumental sources).
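
A toy Python illustration of the 'bus' (all names and numbers are invented; only the mechanism follows the description above): the toplist has a fixed number of seats, and vetoing known artifacts before boarding changes who comes back, not how many.

    import heapq

    TOPLIST_SIZE = 3  # the real application returns far more per task

    def build_toplist(candidates, veto=False):
        # Min-heap of (strength, name): the weakest rider sits at heap[0].
        toplist = []
        for strength, name, is_artifact in candidates:
            if veto and is_artifact:
                continue  # a known instrumental artifact never boards
            if len(toplist) < TOPLIST_SIZE:
                heapq.heappush(toplist, (strength, name))
            elif strength > toplist[0][0]:
                heapq.heapreplace(toplist, (strength, name))  # bump weakest
        return sorted(toplist, reverse=True)

    candidates = [(9.0, "60 Hz line", True), (8.0, "violin mode", True),
                  (7.0, "signal A", False), (6.5, "signal B", False),
                  (2.0, "signal C", False)]
    print(build_toplist(candidates))             # artifacts crowd out B and C
    print(build_toplist(candidates, veto=True))  # quieter real candidates board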

I hope I got that right Bernd! :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921393
RAC: 16900

Quote:
One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.

I have to admit that this was too short an explanation of a very technical detail.

The GW application currently reports the 10,000 most promising "candidates" found in every workunit. This list (the "toplist") is built and maintained during the computation of a task. Whether or not a new candidate makes it into the list depends on what's already in it. Replacing a candidate doesn't take much time; although this time is data-dependent, it is negligible in the current application.

Calculating the veto statistics for a given candidate takes a significant computational effort. However, information from the "normal" (F-statistic) computation could be re-used while still available in memory, reducing the additionally required computation time.

For implementing the veto we considered three options:

1. Calculate the veto statistics for each candidate before entering it into the toplist. (+) Allows re-using FStat values; easiest to implement. (-) Highest computational effort (millions of candidates).

2. Calculate the veto statistics only for those candidates that would make it into the toplist. (+) Lower computational effort than option 1; FStat values can be re-used. (-) Data-dependent, thus unpredictable computational effort (you don't know in advance how often a candidate will be replaced in the toplist during a task).

3. Calculate the veto statistics at the end, only for the candidates in the toplist. (+) Lowest computational effort; depends only on the length of the toplist, not on the number of templates searched. (-) FStat values have to be re-computed for the candidates (the old values aren't stored due to memory limitations); the most implementation effort; the veto shortens the toplist (in the worst case to an empty list).

At the time of writing the initial post we intended to go for option 2. However, we found that we won't be able to gain enough confidence in the veto algorithm and its implementation in the remaining time of the current run to actually let it automatically filter out candidates.

So we'll go for a (modified) option 3: we'll calculate the veto statistics for each candidate of the toplist at the end and add the result to the candidate list (as an additional column) that is reported back. In S6GC1, no candidate will actually be "vetoed" (i.e. removed from the list), but calculating the statistics on the client computers will save time during post-processing and help us gain confidence in the veto.

Bottom line: as the plans currently stand, the point "data-dependent computation time" will not apply to S6GC1.
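
In Python pseudocode, that modified option 3 could look like the sketch below. Everything here is invented for illustration (in particular, the veto formula is a placeholder); only the flow matches the plan: the veto statistic is computed once, at the very end, over the fixed-length toplist, and stored as an extra column without removing any candidate, so the extra cost is predictable.

    def recompute_fstat(cand):
        # Per option 3, the old FStat values were not kept in memory,
        # so they have to be recomputed here (stand-in value).
        return cand["fstat"]

    def line_veto_statistic(cand):
        # Invented placeholder for the real veto statistic: weighs the
        # signal hypothesis against a stationary instrumental line.
        return recompute_fstat(cand) / (1.0 + cand["line_power"])

    def append_veto_column(toplist):
        # Runs once over the finished toplist: the cost depends only on
        # its (fixed) length, never on the data. Nothing is removed.
        for cand in toplist:
            cand["veto_stat"] = line_veto_statistic(cand)
        return toplist

    toplist = [{"freq": 101.2, "fstat": 42.0, "line_power": 3.0},
               {"freq": 310.7, "fstat": 35.5, "line_power": 0.1}]
    for cand in append_veto_column(toplist):
        print(cand)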

BM

telegd
Joined: 17 Apr 07
Posts: 91
Credit: 10212522
RAC: 0

Thanks Bernd for the technical explanation. It is much appreciated.

Quote:
having more 'real' candidates returned on a fixed size 'bus back to town'


Thanks Mike, I liked this metaphor!

So, going with modified Option 3 means that the riders on the bus don't change. It just pre-tags them with a warning if it seems like they contain artifacts? How much of a savings on the server side will you be able to realize? I am guessing you are hoping to eventually get something like Option 2 in a future application?

I assume that differences in run time, and therefore in credit calculations, make life more complex for the project.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921393
RAC: 16900

Einstein@home is currently distributing the S6 data to the download mirrors all over the world. This should not affect the clients' downloads of S5GC1HF data files too much, but you never know.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921393
RAC: 16900

The new S6 GW analysis run will have the internal name "S6Bucket" (because of the bucket shape of the sensitivity curve used to select the data from S6).
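
For the curious, a toy Python picture of what "bucket-shaped" means (the formula and threshold below are invented, not the real noise curve): the detector noise is high at both low and high frequencies and lowest in between, so selecting the data where the curve dips below a threshold keeps the flat bottom of the bucket.

    def noise_floor(freq_hz):
        # Invented bucket shape: steep low-frequency wall, flat bottom,
        # rising high-frequency wall.
        return 50.0 / freq_hz + (freq_hz / 3000.0) ** 2 + 0.01

    THRESHOLD = 0.2  # invented selection threshold
    selected = [f for f in range(20, 2001, 10) if noise_floor(f) < THRESHOLD]
    print(f"selected band: {selected[0]} Hz .. {selected[-1]} Hz")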

I hope to send out the first few hundred test workunits of the new run within the next few hours.

BM

Rechenkuenstler
Joined: 22 Aug 10
Posts: 138
Credit: 102567115
RAC: 0

Does anybody have an example of how to incorporate this search into the app_info.xml file?
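
No official example appears in this thread, but the general shape of a BOINC anonymous-platform app_info.xml entry is sketched below. The app name, file name, and version number are guesses for illustration only; the real values have to be taken from your client_state.xml (or the project directory) after the client has downloaded the application.

    <app_info>
        <app>
            <name>einstein_S6Bucket</name>  <!-- guessed app name: verify -->
        </app>
        <file_info>
            <name>einstein_S6Bucket_1.01_windows_intelx86.exe</name>  <!-- guessed -->
            <executable/>
        </file_info>
        <app_version>
            <app_name>einstein_S6Bucket</app_name>
            <version_num>101</version_num>  <!-- must match the server's version -->
            <file_ref>
                <file_name>einstein_S6Bucket_1.01_windows_intelx86.exe</file_name>
                <main_program/>
            </file_ref>
        </app_version>
    </app_info>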

[AF>EDLS] Polynesia
Joined: 1 Apr 09
Posts: 24
Credit: 2273003
RAC: 0

Hello,

Would someone simply explain to the uninitiated what this new application is? Thank you!
